logoalt Hacker News

hirako2000yesterday at 10:13 PM1 replyview on HN

Also because it's a large PR. Also because the maintainer has better things to do than taking longer and more energy to review than the author spent to write it, just to find that multiple optimisations will be requested, which the author may not be able to take on.

the creator of llama.cc can hardly be suspected to be reluctant or biased towards GenAI.


Replies

adefayesterday at 10:47 PM

Absolutely -- it's perfectly understandable. I wanted to be completely upfront about AI usage and while I was willing and did start to break the PR down into parts, it's totally OK for the maintainers to reject that too.

I wanted to see if Claude Code could port the HF / MLX implementation to llama.cpp and it was successful -- in my mind that's wild!

I also learned a ton about GPU programming, how omni models work, and refined my approach to planning large projects with automated end to end integration tests.

The PR was mostly to let people know about the code and weights, since there are quite a few comments requesting support:

https://github.com/ggml-org/llama.cpp/issues/16186

show 1 reply