Hacker News

Show HN: Xmloxide – an agent-made Rust replacement for libxml2

54 points by jawiggins yesterday at 11:44 PM | 56 comments

Recently several AI labs have published experiments where they tried to get AI coding agents to complete large software projects.

- Cursor attempted to make a browser from scratch: https://cursor.com/blog/scaling-agents

- Anthropic attempted to make a C Compiler: https://www.anthropic.com/engineering/building-c-compiler

I have been wondering if there are software packages that can be easily reproduced by taking the available test suites and tasking agents to work on projects until the existing test suites pass.
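A minimal sketch of that loop, for illustration only. It is not the workflow used for this project: `SuiteResult`, `iterate_until_green`, and the two callbacks are hypothetical names. `run_suite` stands in for running the existing test suite and `revise` stands in for handing the failures to an agent.

```rust
/// Outcome of one run of the existing test suite (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct SuiteResult {
    pub passed: usize,
    /// Names of the tests that still fail.
    pub failed: Vec<String>,
}

/// Re-run the suite and ask the agent to revise until nothing fails
/// or the round budget is spent.
///
/// Returns `Ok(rounds_used)` once the suite is green, or `Err(last_result)`
/// if failures remain after `max_rounds` revisions.
pub fn iterate_until_green<R, A>(
    max_rounds: usize,
    mut run_suite: R,
    mut revise: A,
) -> Result<usize, SuiteResult>
where
    R: FnMut() -> SuiteResult,
    A: FnMut(&[String]),
{
    let mut result = run_suite();
    for round in 0..max_rounds {
        if result.failed.is_empty() {
            return Ok(round);
        }
        // Hand the current failures to the agent, then measure again.
        revise(&result.failed);
        result = run_suite();
    }
    if result.failed.is_empty() {
        Ok(max_rounds)
    } else {
        Err(result)
    }
}
```

The round budget matters in practice: without it, a suite the agent cannot satisfy would loop forever.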

After playing with this concept by having Claude Code reproduce Redis and SQLite, I began looking for software packages where an agent-made reproduction might actually be useful.

I found libxml2, a widely used open-source C library for parsing, creating, and manipulating XML and HTML documents. Three months ago it became unmaintained, with the update: "This project is unmaintained and has [known security issues](https://gitlab.gnome.org/GNOME/libxml2/-/issues/346). It is foolish to use this software to process untrusted data."

With a few days of work, I was able to create xmloxide, a memory-safe Rust replacement for libxml2 that passes the compatibility suite as well as the W3C XML Conformance Test Suite. Performance is similar on most parsing operations and better on serialization. It comes with a C API so that it can replace existing uses of libxml2.

- crates.io: https://crates.io/crates/xmloxide

- GitHub release: https://github.com/jonwiggins/xmloxide/releases/tag/v0.1.0

While I don't expect people to cut over to this new and unproven package, I do think there is something interesting to think about here in how coding agents like Claude Code can quickly iterate given a test suite. It's possible the legacy code problem that COBOL and other systems present will go away as rewrites become easier. If so, ongoing maintenance, such as fixing CVEs and updating to later package versions, becomes a larger share of software package management work.


Comments

wooptoo today at 1:40 AM

A comment on libxml, not on your work: Funny how so many companies use this library in production and not one steps in to maintain this project and patch the issues. What a sad state of affairs we are in.

Imustaskforhelp today at 8:32 AM

Can this work with XLSX (the Office Open XML format) and .odt files, though those also use zip? It would be interesting to see whether this could help build a Rust GUI app with very basic XLSX document editing as an alternative to OpenOffice/LibreOffice.

yobbo today at 7:56 AM

The code might be a little verbose, which is tiresome for humans to read and follow. Structure and functions look idiomatic. It seems to be using XML parser idioms, which makes it readable.

It could be doing double checks in both tokeniser and parser and things like that.

Actually looks like a good starting point and reference for someone working on XML parsers in Rust.

kburman today at 1:18 AM

Amazing work! I'd love to hear more details about your workflow with Claude Code.

As a side note, and this isn't a knock on your project specifically: I think the community needs to normalize disclaimers for "vibe-coded" packages. Consumers really need to understand the potential risks of relying on agent-generated code upfront.

hrtla today at 4:59 AM

Yes, you can rip off any sucker who published a test suite when the AI is trained on existing code as well. Congratulations, you will be showered with praise and AI mafia money.

agentifysh today at 7:21 AM

lot of weird comments here getting upset AI was used but thanks for doing this

libxml2 is always one of those libraries that i used to have trouble with for different platforms

I think it's great that more and more OSS projects get attention now with AI coding agents

alexhans today at 2:03 AM

> I do think there is something interesting to think about here in how coding agents like Claude Code can quickly iterate given a test suite.

This is a point I've tried to advocate for a while, especially to empower non-coders and make them see that we CAN approach automation with control.

Some aspects will be the classic unit or integration tests for validation. Others will be AI evals [1], which to me could be the common language for product design across different families/disciplines who don't quite understand how to collaborate with each other.

The amount of progress in a short time is amazing to see.

- [1] https://ai-evals.io/

blegge today at 12:58 AM

> arena-based tree with zero unsafe in the public API

Why "in the public API"? Does this imply it's using unsafe under the hood? If so, what for?
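For context on the quoted phrase, here is a minimal sketch of what an arena-based tree generally looks like in Rust. It is not xmloxide's actual code, and it says nothing about whether xmloxide uses unsafe internally. `NodeId`, `Node`, and `Tree` are hypothetical names. Nodes live in one `Vec` and refer to each other by index, so parent and child links need no raw pointers and no `unsafe`.

```rust
/// Handle to a node: an index into the arena, not a pointer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(usize);

#[derive(Debug)]
struct Node {
    name: String,
    parent: Option<NodeId>,
    children: Vec<NodeId>,
}

/// All nodes are owned by a single Vec (the arena).
#[derive(Debug, Default)]
pub struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append a node and link it under `parent`, if one is given.
    /// Panics if `parent` does not belong to this tree.
    pub fn add(&mut self, name: &str, parent: Option<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(Node {
            name: name.to_string(),
            parent,
            children: Vec::new(),
        });
        if let Some(p) = parent {
            self.nodes[p.0].children.push(id);
        }
        id
    }

    pub fn name(&self, id: NodeId) -> &str {
        &self.nodes[id.0].name
    }

    pub fn parent(&self, id: NodeId) -> Option<NodeId> {
        self.nodes[id.0].parent
    }

    pub fn children(&self, id: NodeId) -> &[NodeId] {
        &self.nodes[id.0].children
    }
}
```

The trade-off in this design is that a stale or foreign `NodeId` causes a bounds-check panic rather than undefined behavior.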

mkj today at 4:36 AM

Intriguing work! Does it panic on any bad inputs? That's better than the memory unsafety of libxml2, but still a DoS concern for some servers.

nicoburns today at 1:09 AM

How does it compare to the original in terms of source code size (number of lines of code)?

fourthark today at 1:04 AM

Does it fix the security flaws that caused the original project to be shut down?

benatkin today at 3:20 AM

It would be interesting to try this approach out with mQuickJS, QuickJS or MicroPython. They could potentially run rings around the ones that were first coded in Rust, such as Boa or RustPython.

dmitrygr today at 7:55 AM

cool, now do it without the test suite that some human made for you

mdavid626 today at 6:17 AM

Can you add “made with AI” to the GitHub repo?

It’s time to make this mandatory.

Nothing against AI - just to inform people about the quality, maintainability and future of this library. No human has a mental model of the code, so don’t waste your time creating one - the original author didn’t either.
