What it feels like to work with Mythos

137 points • by swolpers • today at 5:17 PM • 135 comments

Comments

What I find fascinating is that there is so little substance in this article about the quality of the produced code and the medium. Is the code documented and tested? Is it understandable and extendable? Is it secure? What language, framework, database was used? The author mentions judgement and taste - well, is the code tasteful? Will the model rearchitect the entire thing if I ask it to add new functionality, spending another 9.5h in tokens? I assume that the research part is domain knowledge, i.e. how different types of travel translate to time, and making it presentable; how did the author verify this?

These questions are not even about AI: if I were to give money to a human agency and were given something they tell me works, I would ask the same questions. If I did not know how to evaluate, I would hire people that do. With LLMs the verification part is what bothers me the most.

JumpCrisscross • today at 7:18 PM

Anecdote: I fed Fable some models I’ve been hand verifying (basically, I sketch out a scenario for Opus to model, it builds it, I ask it to show me the math, I correct it, we iterate like this, then I double check its code to make sure the math matches the model logic). Fable found almost every error I found, and then had some interesting suggestions for additional variables.

It also burned through my usage quota like a late-90s Hummer.

ecocentrik • today at 7:57 PM

Reading the first few paragraphs of what he calls "the most sophisticated academic social science paper I have yet seen from an AI" does not impress as much as I hoped.

"Posterior beliefs about market demand are purely reference-dependent: holding dollars raised constant, they track only performance relative to the founder’s self-chosen goal—jumping half a standard deviation at the threshold, responding steeply for the first ten points past it, and flattening thereafter"

Humans generally don't verbalize data this way. The summary document is also very fluffy.

olafmol • today at 8:00 PM

This little line from the article scares me: "but a software engineer would iron out the remaining potential bugs that I could not find quickly"

Every sw dev knows this is a very dangerous, and unrealistic, assumption.

mohsen1 • today at 7:43 PM

I have been using it for less than an hour, so take this with a grain of salt: I may just be excited about the new tech.

In a project like mine (https://github.com/tsz-org/tsz) I am constantly frustrated that models were not doing enough research and were not taking into account other situations. Again and again models would produce code that would fix one thing and break 2 other tests that were "unrelated".

With Fable it seems like tasks are taking much longer (I have not seen a pull request from Fable sessions yet) but reading the transcription of those sessions I can see how it is doing the right thing by not leaving any stone unturned.

As the article says, it's hard to communicate this "feeling" about models because it is very project specific, but I thought I'd share.

selfawareMammal • today at 7:25 PM

What are people working on that they see such a substantial difference between Mythos and Opus? I'd say I'm working with advanced stuff and more often than not Deepseek is more than enough. Why is everybody a genius in here?

gopalv • today at 5:56 PM

> It worked for nine and a half hours.

> Again, it wasn’t perfect. As an expert, I was able to spot some errors and omissions (some as a result of the design I had asked for) that I had the AI correct

That's the bit that stuck out to me - that's longer than I would expect to work on a problem in a day or even expect to go back & fix the output of something that has a core reward loop of hours.

My customers are currently clamoring to push down my agent response times from 85 seconds down to below the 20s mark.

At the same time, it is very dissonant to see the industry heading towards hour+ long workflows with an agent.

ElijahLynn • today at 10:11 PM

Loved the article!

And I'm excited to try it, but also have a fear that I will like it too much and then won't have access to it in 2 weeks... But maybe I will and maybe it will be worth it and I'll just pay a bunch of extra for it and it'll be great!

I think the article could be improved by actually sharing more feelings. I clicked on the article for feelings but I didn't see that many feelings described.

thepasch • today at 7:28 PM

What it feels like to work with Fable:

> Switched to Opus 4.8: Fable 5 has safety measures that flag messages on most cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Send feedback or learn more.

theturtletalks • today at 7:24 PM

This is what he built:

https://isochronic-passage-chart.netlify.app/

Doesn’t work too well on mobile but looks interesting

neaden • today at 7:57 PM

Man, that poem it made is terrible. Like just incredibly bad. Sure it's neat that software can make an incredibly bad poem but there is enough bad poetry in the world that we don't need it.

pu_pe • today at 9:06 PM

The isochrone maps are quite beautiful [1], and go beyond the scope and refinement of some earlier human attempts I could find [2][3][4].

[1] https://isochronic-passage-chart.netlify.app/

[2] https://mapitout.welcome-to-nl.nl/

[3] https://commutetimemap.com/

[4] https://andrewding.ca/flightisochrones/

ElijahLynn • today at 10:04 PM

> The work has shifted from process to outcome. I no longer steer; I commission.

recursivedoubts • today at 6:21 PM

would it be possible for mythos to make the space bar scroll the pages on your website properly?

ComplexSystems • today at 9:51 PM

Who can afford to use this damn thing though? They're pricing everyone out of the market with stuff like this.

wxw • today at 8:20 PM

I am… underwhelmed by the artifacts in the post.

I don’t see why working longer is a pro. The results don’t seem much better than you’d get from putting Opus in a long loop.

mjamesaustin • today at 8:35 PM

The snake game is legit very fun. Once I got the ability to pick up the apples and plant apple trees, I was sold.

Aperocky • today at 7:46 PM

> This is a map that shows the distance you can travel in a given length of time, and the first one was created in 1881 showing travel times from London.

The first item on the article, the first thing it showed, was wrong though.

It is 100% faster to go from London to New York in 1881 than to Volgograd. Or any of the Russian hinterland colored green or Turkey or Egypt.

asdK120 • today at 5:41 PM

Mollick runs the Generative AI Lab at Wharton, with all the corporate sponsors.

He is a professor but sadly also an AI shill. He should switch to advertising washing powder.

mawadev • today at 8:04 PM

Isn't it weird that we started to gauge the quality of a model by checking the vibe of the vibe coding?

vb-8448 • today at 8:24 PM

Nice, but I'm really curious about how many tokens have been used.

There is only one hint: 475k tokens in the screenshot when OP asked the model to fix some behaviour, but it would be fascinating to know the total token count.

steve1977 • today at 8:35 PM

> it is indicative of AI solving a hard problem involving research, math, visual development, taste, judgement, complex coding, and more.

Is it a hard problem or is it just labor intensive?

382hi • today at 5:48 PM

I think Qwen 3.7-Plus is better at reasoning than Mythos, and I've used both for quite a while.

PaulHoule • today at 8:32 PM

My wife likes to say "feelings aren't facts"

LogicFailsMe • today at 8:38 PM

I'm using Fable this afternoon and it's definitely a step up from Opus 4.8, finding and fixing things Opus 4.8 was blind to even perceiving. The next 13 days are going to be fun IMO. And Opus 4.8 was less annoying than Opus 4.7 FWIW.

Edit: A couple hours in and I just got my first gaslighting attempt from the model. Good times!

root_axis • today at 5:39 PM

I just can't stand this type of fawning language.

catigula • today at 8:12 PM

>Ethan Mollick

Just an FYI this guy is an AI hype-beast. Some of his tweets are truly out there.

zb3 • today at 7:44 PM

Was the condition of being granted early access to this castrated model writing a post praising it?

zuzululu • today at 7:40 PM

> First, how good is Fable? In experiment after experiment I conducted, it outperformed basically every other public model I have used by a considerable margin.

What makes me excited is that GPT 5.6 (it's actually GPT 6) is going to be crazy

ThejaCH • today at 7:56 PM

What it feels like to work with Mythos? Feels like I am poor

younglunaman • today at 8:11 PM

>What it feels like to work with Mythos

>Looks Inside

>So I did this with fable...

What?

the_doctah • today at 5:53 PM

More Mythos Marketing.

honeycrispy • today at 7:56 PM

Reading it, I can't help but feel he's being paid to write this. Or maybe he hopes to be paid. The language he uses makes him sound like he's fawning over the lost days of his childhood. Pardon me for being skeptical, but a trillion dollar company running a net-loss is hoping to IPO, and needs to sway public opinion by any means necessary. I would imagine that no dirty marketing scheme is off of the table, even from the self-proclaimed "good guys".

et-al • today at 5:46 PM

[flagged]
