Hacker News

notpushkin today at 4:08 PM

This is really cool. I probably won’t be using it directly, but will definitely study some architecture and implementation decisions.

> Compliance Core: Immutable audit logs with blockchain-style hashing (prev_hash) for integrity.

Had this in the back of my mind for a while now, too. In terms of prior art, Keybase had been doing something similar, but with Merkle trees.

> I’d love feedback on the DSL implementation

Could you explain in a bit more detail why you decided to go with your own DSL here? :)


Replies

lalitgehani today at 4:44 PM

Great question!

Keybase's Merkle approach is elegant for their use case (efficient proofs without revealing the full chain), but I went simpler with a linear chain because:

1. Audit trails are inherently sequential - they're ordered by time and typically read/written in order. Merkle trees shine for unordered data where you need efficient inclusion proofs.

2. Verification simplicity - with a linear chain, integrity verification is just "walk the sequence and check that each entry's previous_hash matches the prior entry's checksum." O(n) and dead simple.

3. Storage efficiency - each entry stores two SHA-256 hashes as strings. No tree overhead.

4. Regulatory fit - for GxP/CFR Part 11 compliance, the requirement is tamper detection, not zero-knowledge proofs. A linear chain detects any modification equally well.
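For anyone who wants to see the shape of that check, here is a minimal sketch of a linear hash chain. It is not the code in audit_log_repository.py: the previous_hash and checksum field names come from the description above, but the AuditEntry class, the all-zero genesis value, the JSON canonicalization, and the way the two inputs are concatenated before hashing are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional

GENESIS = "0" * 64  # assumed previous_hash for the very first entry


@dataclass(frozen=True)
class AuditEntry:  # hypothetical record shape
    payload: dict
    previous_hash: str
    checksum: str


def compute_checksum(payload: dict, previous_hash: str) -> str:
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: List[AuditEntry], payload: dict) -> AuditEntry:
    previous_hash = chain[-1].checksum if chain else GENESIS
    entry = AuditEntry(payload, previous_hash,
                       compute_checksum(payload, previous_hash))
    chain.append(entry)
    return entry


def verify_chain(chain: List[AuditEntry]) -> Optional[int]:
    """Walk the chain once; return the index of the first bad entry, or None."""
    expected_previous = GENESIS
    for index, entry in enumerate(chain):
        if entry.previous_hash != expected_previous:
            return index  # link to the prior entry is broken
        if entry.checksum != compute_checksum(entry.payload, entry.previous_hash):
            return index  # entry contents no longer match its checksum
        expected_previous = entry.checksum
    return None
```

Editing an entry in place breaks its own checksum; recomputing that checksum to hide the edit breaks the next entry's previous_hash instead, so a single O(n) pass catches either case.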

That said, if I ever need selective verification (prove entry #500 is valid without transmitting the full chain), I'd revisit Merkle. The implementation is in src/snackbase/infrastructure/persistence/repositories/audit_log_repository.py if you're curious.
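To make "selective verification" concrete, here is a rough sketch of a Merkle inclusion proof. This is a generic textbook construction, not Keybase's scheme and not anything in SnackBase; the function names, the leaf/node domain-separation bytes, and the choice to promote an unpaired node unchanged are all assumptions for illustration.

```python
import hashlib
from typing import List, Tuple


def _leaf(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()  # prefix separates leaves from nodes


def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(entries: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, leaves first, root last."""
    if not entries:
        raise ValueError("need at least one entry")
    level = [_leaf(e) for e in entries]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(_node(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def inclusion_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(entry: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    acc = _leaf(entry)
    for sibling, sibling_is_left in proof:
        acc = _node(sibling, acc) if sibling_is_left else _node(acc, sibling)
    return acc == root
```

The verifier only needs the entry, a proof of roughly log2(n) hashes, and a root obtained from somewhere it trusts, which is the property a linear chain cannot offer.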

On the custom DSL: This was the biggest architectural decision in SnackBase. Here's the honest breakdown:

Why not just use existing options? Each approach, and why I didn't choose it:

- Python eval(): Security nightmare - can't safely let users store arbitrary code in the database
- CEL (Google's Common Expression Language): Battle-tested, but heavy dependency and less control over semantics
- JEXL/JSONLogic: Another runtime to learn, harder to integrate with my macro system
- Pure JSON rules: Becomes unreadable for complex expressions

What drove the decision:

1. Permissions are database-storable - rules live as strings in the permissions table, editable via API and admin UI. I needed something safe to parse and evaluate at runtime.

2. Sandboxed execution - the DSL only exposes specific operations (==, in, @has_role(), etc.). No imports, no file access, no arbitrary code. Even if someone compromises the admin UI, they can only express logic within the vocabulary I provide.

3. Syntax for non-programmers - "@has_role('admin') and @owns_record()" is more approachable than Python lambdas when you're building permissions in a web UI.

4. Macro integration - the @ prefix ties into my SQL macro system, letting users define reusable business logic like @is_department_head() that executes database queries.
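As a rough illustration of the "closed vocabulary" idea, here is a toy tokenizer, recursive-descent parser, and evaluator for a rule language of this shape. It is a sketch under assumptions, not the SnackBase implementation: the grammar, the tuple-based AST, RuleSyntaxError, and the macro calling convention (context first, then arguments) are invented here, and it is synchronous where the real evaluator is described as async.

```python
import re

TOKEN_RE = re.compile(r"""
    (?P<str>'[^']*') | (?P<num>\d+) | (?P<macro>@[A-Za-z_]\w*)
  | (?P<name>[A-Za-z_][\w.]*) | (?P<op>==|!=|[()\[\],])
""", re.VERBOSE)
RESERVED = {"and", "or", "not", "in"}
CONSTANTS = {"true": True, "false": False, "null": None}

class RuleSyntaxError(ValueError):
    pass

def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise RuleSyntaxError(f"unexpected character at position {pos}")
        tokens.append((m.lastgroup, m.group(), pos))
        pos = m.end()
    tokens.append(("end", "", len(text)))
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens, self.i = tokenize(text), 0

    def peek(self):
        return self.tokens[self.i]

    def take(self, expected=None):
        kind, value, pos = self.tokens[self.i]
        if expected is not None and value != expected:
            raise RuleSyntaxError(f"expected {expected!r} at position {pos}")
        self.i += 1
        return kind, value, pos

    def parse(self):
        node = self.binary("or", self.and_expr)
        if self.peek()[0] != "end":
            raise RuleSyntaxError(f"unexpected token at position {self.peek()[2]}")
        return node

    def binary(self, keyword, operand):  # left-associative chain of and/or
        node = operand()
        while self.peek()[:2] == ("name", keyword):
            self.take()
            node = (keyword, node, operand())
        return node

    def and_expr(self):
        return self.binary("and", self.not_expr)

    def not_expr(self):
        if self.peek()[:2] == ("name", "not"):
            self.take()
            return ("not", self.not_expr())
        left = self.atom()
        if self.peek()[:2] in {("op", "=="), ("op", "!="), ("name", "in")}:
            return (self.take()[1], left, self.atom())
        return left

    def sequence(self, close):  # comma-separated expressions up to `close`
        items = []
        if self.peek()[1] != close:
            items.append(self.binary("or", self.and_expr))
            while self.peek()[1] == ",":
                self.take()
                items.append(self.binary("or", self.and_expr))
        self.take(close)
        return items

    def atom(self):
        kind, value, pos = self.take()
        if kind == "str":
            return ("lit", value[1:-1])
        if kind == "num":
            return ("lit", int(value))
        if kind == "macro":
            self.take("(")
            return ("macro", value[1:], self.sequence(")"))
        if kind == "name" and value in CONSTANTS:
            return ("lit", CONSTANTS[value])
        if kind == "name" and value not in RESERVED:
            return ("var", value)
        if value == "(":
            node = self.binary("or", self.and_expr)
            self.take(")")
            return node
        if value == "[":
            return ("list", self.sequence("]"))
        raise RuleSyntaxError(f"unexpected token at position {pos}")

def evaluate(node, context, macros):
    ev = lambda n: evaluate(n, context, macros)
    tag = node[0]
    if tag == "lit":
        return node[1]
    if tag == "var":  # dotted lookup into plain dicts only
        value = context
        for part in node[1].split("."):
            value = value[part]
        return value
    if tag == "list":
        return [ev(n) for n in node[1]]
    if tag == "macro":  # only macros registered by the host can run
        return macros[node[1]](context, *[ev(n) for n in node[2]])
    if tag == "not":
        return not ev(node[1])
    if tag == "and":  # Python's and/or give short-circuiting for free
        return bool(ev(node[1])) and bool(ev(node[2]))
    if tag == "or":
        return bool(ev(node[1])) or bool(ev(node[2]))
    left, right = ev(node[1]), ev(node[2])
    return {"==": left == right, "!=": left != right}[tag] if tag != "in" else left in right
```

Something like `evaluate(Parser("@has_role('admin') and @owns_record()").parse(), context, macros)` then runs only what the grammar and the macros dict allow; a stored string such as `import os` fails at parse time with a position rather than executing.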

The trade-off:

It's 700+ lines of lexer/parser/evaluator code I have to maintain. Every edge case (null handling, type coercion, short-circuit evaluation) needs explicit test coverage. Debugging a failed rule means returning syntax errors at position X rather than a stack trace.

If I were starting fresh today, I'd give CEL harder consideration. But since permissions are a core differentiator for SnackBase, having full control over the semantics has been worth it—especially for field-level access control and the wildcard collection system.

Implementation files if you're interested in it:

- src/snackbase/core/rules/lexer.py - tokenizer
- src/snackbase/core/rules/parser.py - recursive descent → AST
- src/snackbase/core/rules/evaluator.py - async evaluation with short-circuiting

Happy to go deeper on any of this!