Hacker News

Aurornis, yesterday at 8:48 PM (2 replies)

This is a Claude Code-generated repo that implements some ideas from research papers. If you follow this space, every paper release spawns tens or hundreds of vibecoded repos like this that get spammed to Reddit, Hacker News, and other sites.

It's generally best to overlook the vibecoded repos and go closer to the source for up-to-date information. In this case, z-lab already showed Qwen3.5-27B with DFlash last month: https://huggingface.co/z-lab/Qwen3.5-27B-DFlash

This repo is an example of what you get if you point Claude Code at the upstream repo and have it iterate with some other objective (loading GGUF). They also included DDTree in there somewhere.

You also need to look closely at the claims. A classic trick in these repos is to cherry-pick numbers that make the work in the repo look extraordinary until you start reading the details. From my quick read, this repo is using Q4 quantization on the KV cache, which does not produce good results. Someone who reads everything in detail might find more tricks. This is par for the course for these demo repos because the goal is to impress casual viewers with big numbers.
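To give a rough sense of why a 4-bit KV cache is a meaningful trade-off, here is a minimal sketch that compares round-trip error at 4 and 8 bits. Everything in it is an assumption for illustration: the symmetric per-block scheme, the block size of 32, and the synthetic Gaussian values are not taken from the repo or from llama.cpp's actual quantization formats, and numeric error on random data is only a proxy for output quality.

```python
import math
import random


def quantize_roundtrip(values, bits, block_size=32):
    """Quantize then dequantize using a hypothetical symmetric per-block scheme."""
    qmax = 2 ** (bits - 1) - 1  # 7 levels each side for 4-bit, 127 for 8-bit
    restored = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # One scale per block, chosen so the largest magnitude maps to qmax.
        scale = max(abs(v) for v in block) / qmax
        if scale == 0.0:
            restored.extend(0.0 for _ in block)
            continue
        for v in block:
            q = max(-qmax, min(qmax, round(v / scale)))
            restored.append(q * scale)
    return restored


def rms_error(original, restored):
    """Root-mean-square difference between two equal-length sequences."""
    total = sum((a - b) ** 2 for a, b in zip(original, restored))
    return math.sqrt(total / len(original))


random.seed(0)
# Synthetic stand-in for cached key/value activations (assumption, not real data).
kv_values = [random.gauss(0.0, 1.0) for _ in range(4096)]

err4 = rms_error(kv_values, quantize_roundtrip(kv_values, bits=4))
err8 = rms_error(kv_values, quantize_roundtrip(kv_values, bits=8))
print(f"4-bit RMS error: {err4:.4f}")
print(f"8-bit RMS error: {err8:.4f}")
```

With the same per-block scale, the 4-bit grid is about 18 times coarser than the 8-bit one (127 versus 7 levels per side), so its rounding error is correspondingly larger. Whether that is acceptable for a given model has to be measured on real outputs, not inferred from a toy like this.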

I'm trying to find where they get the 207 tok/s number, but it only appears in their headline claim. If you read deeper, the real numbers are half that or less.

There are also several (possibly vibecoded, I haven't checked) draft PRs and forks to use these techniques on upstream llama.cpp that would be much more useful for experimenting. One example I picked at random: https://github.com/ggml-org/llama.cpp/pull/22105


Replies

j45, yesterday at 9:18 PM

Appreciate the reading material and the pointers to go learn more from.

I've been learning about Qwen 3.5, and also about how Gemma 4 appears to be unique (relatively speaking) and how Apple may be using some type of Gemma model on-device. I think that will also help fill in how to track local-model and local-device capabilities, which could serve as additional measures/KPIs.

GreenGames, yesterday at 9:53 PM

[flagged]
