anima-core · today at 1:25 AM · 2 replies

I’ve been working independently on a method that replaces full-transformer inference with a low-rank “meaning field” extracted from internal activations.

The core result: a frozen Llama-3.3-70B can be distilled into a 256-dimensional field representation, giving 224× compression and slightly higher accuracy on several benchmarks. A small student model then learns to directly generate these fields from text, removing the transformer from the inference path.
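As a rough illustration of what "distilling activations into a low-dimensional field" could look like (the paper's actual method is not specified here, so all names, shapes, and the choice of truncated SVD are assumptions on my part, not the author's code):

```python
import numpy as np

# Hypothetical sketch: project high-dimensional hidden activations
# (e.g. 8192-dim for a 70B-class Llama) down to a 256-dim "field"
# via a low-rank basis learned from collected teacher activations.
rng = np.random.default_rng(0)
hidden_dim, field_dim, n_samples = 8192, 256, 1000

# Stand-in for activations captured from a frozen teacher model
activations = rng.standard_normal((n_samples, hidden_dim))

# One plausible way to get a low-rank basis: truncated SVD
_, _, vt = np.linalg.svd(activations, full_matrices=False)
basis = vt[:field_dim]              # (256, 8192) projection basis

fields = activations @ basis.T      # (1000, 256) field codes
reconstructed = fields @ basis      # map back to activation space

print(fields.shape)
```

A student model would then be trained to predict `fields` directly from text, which is presumably where the claimed inference savings come from; note that 8192→256 alone is only 32× per vector, so the 224× figure must come from additional factors not shown here.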

The Zenodo link contains the full paper, statistical results, and methodology. A reference implementation (non-optimized) is here: https://github.com/Anima-Core/an1-core

Production variants (AN1-Turbo, FPU work, etc.) are not included.

I’m an outsider to academia so I’m posting this openly to get technical feedback, replication attempts, and critique from people who understand this space.


Replies

broretore · today at 6:51 AM

Ten pages for a concept this supposedly groundbreaking is just embarrassing. It's barely an outline.

"confirming that 40× compression preserves field geometry with minimal distortion. Over 95% of samples achieve similarity above 0.90."

I smell Grok. Grok 3, maybe Grok 4 Fast.

> "Implementation details. Optimal configurations are task and architecture-dependent. Production systems require task-specific tuning beyond baseline heuristics provided in reference implementation."

"Implementation? Idk, uhh, it's task specific or something." Come on, dude. You're better than this.

4.4 Student/Teacher evaluation. What even is the benchmark? You give percentage values but no indication of which benchmark they come from. Seems made up.

4.5 Computational Analysis. Why multiply out trivial "savings" from 1B tok/day to $700M/year? This reads like a GPT advertisement hallucinating performance.

A three-sentence conclusion restating the title?

ForOldHack · today at 4:46 AM

Technical feedback: every announcement like this, especially one claiming compression, needs to state the minimum machine requirements. If a 64 GB model is compressed 224×, shouldn't it then run on a 292 MB video card?
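The arithmetic behind that question is straightforward (assuming binary-prefix gigabytes, i.e. 1 GB = 1024 MB):

```python
# Sanity-check the implied memory footprint of a 224x-compressed
# 64 GB model, using 1 GB = 1024 MB.
model_gb = 64
compression = 224
compressed_mb = model_gb * 1024 / compression
print(f"{compressed_mb:.1f} MB")  # 292.6 MB
```

So a genuine 224× compression of the weights would imply a sub-300 MB artifact, which is exactly the kind of concrete, falsifiable claim the announcement should state.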
