Holy cow, their chat app demo!!! For the first time I thought I'd mistakenly pasted the answer. It was literally in the blink of an eye!!
With this speed, you can keep looping and generating code until it passes all tests. If you have tests.
Generate lots of solutions and mix and match. This opens up a new way of looking at LLMs.
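Something like this rough sketch: `generate()` is a placeholder for whatever API the chip ends up exposing, and the pytest command and `solution.py` path are just example assumptions, not anything from the demo.

```python
import pathlib
import subprocess

def generate(prompt: str) -> str:
    """Placeholder for the model call -- swap in the real client."""
    raise NotImplementedError

def solve(prompt: str,
          solution_path: str = "solution.py",
          test_cmd: tuple[str, ...] = ("pytest", "-q", "tests/"),
          max_attempts: int = 50) -> str | None:
    """Keep regenerating until the existing test suite passes (or we give up)."""
    feedback = ""
    for _ in range(max_attempts):
        code = generate(prompt + feedback)
        # Write the candidate to the module the tests import.
        pathlib.Path(solution_path).write_text(code)
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return code  # all tests green
        # Feed the failure output back into the next attempt.
        feedback = "\n\nPrevious attempt failed these tests:\n" + result.stdout[-2000:]
    return None
```

At thousands of tokens per second the bottleneck becomes running the tests, not generating the candidates, which is what makes this loop (or a best-of-N variant that keeps the pieces that pass) suddenly plausible.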
OK investors, time to pull out of OpenAI and move all your money to ChatJimmy.
I dunno, it pretty quickly got stuck; the "attach file" didn't seem to work, and when I asked "can you see the attachment" it replied to my first message rather than my question.
I got 16,000 tokens per second ahaha
That… what…
Well, it got all 10 wrong when I asked for the top 10 catchphrases from a character in Plato's books. It confused the baddie with Socrates.
Fast, but stupid.
Me: "How many r's in strawberry?"
Jimmy: There are 2 r's in "strawberry".
Generated in 0.001s • 17,825 tok/s
The question is not about how fast it is. The real questions are:
1. How is this worth it over diffusion LLMs? (No mention of diffusion LLMs at all in this thread, and this also assumes diffusion LLMs will get faster.)
2. Will Talaas also work with reasoning models, especially those beyond 100B parameters, while keeping the output correct?
3. How long will it take to turn newer models into silicon? (This industry moves faster than Talaas.)
4. How does this work when one needs to fine-tune the model but still wants the speed advantages?

I asked, "What are the newest restaurants in New York City?"
Jimmy replied with, “2022 and 2023 openings:”
0_0
It's super fast but also super inaccurate; I'd say not even GPT-3 level.
It is incredibly fast, on that I agree, but even the simple queries I tried got very inaccurate answers. Which makes sense: it's essentially a trade-off against how much time you give it to "think", but if it's fast to the point of having no accuracy, I'm not sure I see the appeal.
I asked it to design a submarine for my cat, and literally the instant my finger touched return, the answer was there. And that's factoring in the round-trip time for the data, too. Crazy.
The answer wasn't dumb like others are getting. It was pretty comprehensive and useful.