Hacker News

janalsncm yesterday at 8:39 AM

I worked on it for a more specialized task (query rewriting). It’s blazing fast.

A lot of inference code is set up for autoregressive decoding now. Diffusion is less mature. Not sure if Ollama or llama.cpp supports it.


Replies

philipportner yesterday at 11:53 AM

Did you publish anything you could link wrt. query rewriting?

stavros yesterday at 10:02 AM

How was the quality?
