Hacker News

Show HN: Hekate – A Zero-Copy ZK Engine Overcoming the Memory Wall

4 points by y00zzeek today at 3:52 AM | 7 comments

Most ZK proving systems are optimized for server-grade hardware with massive RAM. When scaling to industrial-sized traces (2^20+ rows), they often hit a "Memory Wall" where allocation and data movement become a larger bottleneck than the actual computation.

I have been developing Hekate, a ZK engine written in Rust that utilizes a Zero-Copy streaming model and a hybrid tiled evaluator. To test its limits, I ran a head-to-head benchmark against Binius64 on an Apple M3 Max laptop using Keccak-256.

The results highlight a significant architectural divergence:

At 2^15 rows: Binius64 is faster (147ms vs 202ms), but Hekate already uses roughly 9x less memory (44MB vs ~400MB).

At 2^20 rows: Binius64 hits 72GB of RAM usage, entering swap hell on a laptop. Hekate processes the same workload in 4.74s using just 1.4GB of RAM.

At 2^24 rows (16.7M steps): Hekate finishes in 88s with a peak RAM of 21.5GB. Binius64 is unable to complete the task due to OOM/Swap on this hardware.

The core difference is "Materialization vs. Streaming". While many engines materialize and copy massive polynomials in RAM during Sumcheck and PCS operations, Hekate streams them through the CPU cache in tiles. This shifts the unit economics of ZK proving from $2.00/hour high-memory cloud instances to $0.10/hour commodity hardware or local edge devices.
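To make the "Materialization vs. Streaming" contrast concrete, here is a minimal sketch, not Hekate's actual code. It computes the same reduction (a sum of products over two trace columns) two ways: once by building both full columns in memory, and once by refilling two fixed-size tiles. Everything named here (`trace_cell`, `TILE`, both function names) is hypothetical, and wrapping `u64` arithmetic stands in for the binary-field arithmetic a real prover would use.

```rust
/// Hypothetical trace generator: the value of column `col` at row `row`.
/// A stand-in for real witness generation, not part of any real API.
fn trace_cell(col: u64, row: u64) -> u64 {
    // Cheap deterministic mixing so each cell is non-trivial.
    (row.wrapping_mul(0x9E37_79B9_7F4A_7C15) ^ col).rotate_left(17)
}

/// Materializing approach: allocate both full columns, then reduce.
/// Peak memory grows linearly with `rows`.
fn inner_product_materialized(rows: u64) -> u64 {
    let a: Vec<u64> = (0..rows).map(|r| trace_cell(0, r)).collect();
    let b: Vec<u64> = (0..rows).map(|r| trace_cell(1, r)).collect();
    a.iter()
        .zip(&b)
        .fold(0u64, |acc, (x, y)| acc.wrapping_add(x.wrapping_mul(*y)))
}

/// Tile size in elements (assumed value, chosen so two tiles fit in cache).
const TILE: usize = 4096;

/// Streaming approach: refill two fixed-size tiles and fold each tile into
/// the accumulator. Peak memory is constant regardless of `rows`.
fn inner_product_streamed(rows: u64) -> u64 {
    let mut a = [0u64; TILE];
    let mut b = [0u64; TILE];
    let mut acc = 0u64;
    let mut start = 0u64;
    while start < rows {
        // The last tile may be shorter than TILE.
        let len = (rows - start).min(TILE as u64) as usize;
        for i in 0..len {
            a[i] = trace_cell(0, start + i as u64);
            b[i] = trace_cell(1, start + i as u64);
        }
        for i in 0..len {
            acc = acc.wrapping_add(a[i].wrapping_mul(b[i]));
        }
        start += len as u64;
    }
    acc
}
```

Both functions return the same value for any `rows`, because wrapping addition is associative; only the peak memory differs. The materialized version holds 16 bytes per row, while the streamed version holds two 32KB buffers no matter how tall the trace is.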

I am looking for feedback from the community, especially those working on binary fields, GKR, and memory-constrained SNARK/STARK implementations.


Comments

y00zzeek today at 7:33 AM

Since the edit window is closed, I want to clarify the AIR structure for those asking about the "row" definition.

In Hekate's Keccak AIR, the relationship is ~25 trace rows per 1 Keccak-f[1600] permutation.

2^24 rows = the raw size (height) of the execution trace matrix.

~671k permutations = the actual cryptographic workload (equivalent to hashing ~90MB of data).

The benchmark compares the cost to prove the same cryptographic work, regardless of internal AIR row mapping.
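The row-to-workload conversion can be checked with a few lines of arithmetic. This is a rough sketch under two assumptions: the ratio is exactly 25 rows per permutation (the post says "~25"), and the ~90MB figure comes from the Keccak-256 rate of 136 bytes absorbed per permutation (1088 of the 1600 state bits). The constant and function names are made up for illustration.

```rust
/// Trace rows per Keccak-f[1600] permutation (approximate, per the post).
const ROWS_PER_PERMUTATION: u64 = 25;

/// Keccak-256 rate in bytes: (1600 - 2 * 256) / 8 = 136.
const KECCAK256_RATE_BYTES: u64 = 136;

/// Number of whole permutations that fit in a trace of `rows` rows.
fn permutations_for_rows(rows: u64) -> u64 {
    rows / ROWS_PER_PERMUTATION
}

/// Bytes of input absorbed by `permutations` calls at the Keccak-256 rate.
fn absorbed_bytes(permutations: u64) -> u64 {
    permutations * KECCAK256_RATE_BYTES
}
```

With `rows = 1 << 24`, `permutations_for_rows` gives 671,088 and `absorbed_bytes` gives 91,267,968, which lines up with the ~671k permutations and ~90MB quoted above.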

THE MANIFESTO: https://github.com/oumuamua-corp/hekate

y00zzeek today at 4:02 AM

My motivation for building Hekate is simple: I am done watching well-funded teams with 50+ people and a busload of PhDs produce engineering trash.

There is a massive, widening gap between academic brilliance and silicon-level implementation. You can write the most elegant paper in the world, but if your prover requires 100GB of RAM to execute a basic trace, you haven't built a protocol, you've built a research project that collapses under its own weight.

I don't have "strategic planning" committees or HR-mandated consensus. If Hekate's core doesn't meet my performance standards, I rewrite it in 48 hours. This agility is a weapon. I want to prove that a single engineer, driven by physics and zero-copy principles, can wreck the unit economics of a multi-million dollar venture-backed startup.

Disrupting inefficient financial models is more than fun—it's necessary. The current "safe" hiring meta (US-only, HR-compliant, resume-padded candidates) is a strategic failure. While industry leaders focus on compliance, state-sponsored actors like Lazarus are eating their lunch.

You don't need "safe" candidates. You need predators. You need the difficult, inconvenient outliers who don't need a visa to outcode your entire department. Hekate is a reminder that in deep-tech, capital is noise, but performance is the only signal that matters.

SERSI-S today at 4:00 AM

Interesting work. This seems highly relevant for ZK systems that need to generate large proofs on commodity hardware. Streaming-first proving could be a key enabler for permissionless ZK infrastructure.
