
nl · today at 2:03 AM

> Instead of running the model once (flash) or multiple times (thinking/pro) in its entirety

I'm not sure what you mean here, but there isn't a difference in the number of times a model runs during inference. In every mode the model does one forward pass per generated token; a thinking model simply emits more tokens (its reasoning trace) before the final answer, so it's the same decode loop run for longer, not the model being run "multiple times."
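To make that concrete, here's a minimal sketch of autoregressive decoding (a generic PyTorch-style interface; `model`, `eos_id`, and the shapes are illustrative assumptions, not any vendor's actual API). "Thinking" is this same loop run for more iterations:

    import torch

    @torch.no_grad()
    def generate(model, ids: torch.Tensor, max_new_tokens: int, eos_id: int) -> torch.Tensor:
        # ids: 1-D tensor of prompt token ids.
        # Assumes model(batch) returns logits of shape (batch, seq, vocab).
        for _ in range(max_new_tokens):
            logits = model(ids.unsqueeze(0))    # one forward pass per generated token
            next_id = logits[0, -1].argmax()    # greedy pick of the next token
            ids = torch.cat([ids, next_id.view(1)])
            if next_id.item() == eos_id:        # stop at end-of-sequence
                break
        return ids

A thinking model just spends more of these iterations on reasoning tokens before the final answer; each iteration is still exactly one forward pass.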