Hacker News

abdullin, yesterday at 7:03 PM

Working on a benchmark arena for AI agents with my wife.

We grab interesting business problems and turn them into fun challenges where hundreds of AI engineers try to find the best architecture. Insights are shared back with the community.

It is a fun learning process with unexpected scaling challenges.