Automatically translating C to unsafe Rust is pointless, the resultant code is harder to read and th...

veltas • yesterday at 12:11 PM • 6 replies

Automatically translating C to unsafe Rust is pointless, the resultant code is harder to read and there's no improvement in understanding how to get the code maintainable and safe, that requires tons of manual work by someone with a deep understanding of the codebase.

Generally the Rust community as well don't seem to have an answer on how to do this incrementally. In business terms, we have no idea how to slice the work into pieces with demonstrable value, so there is no way to keep this on track and cut losses if it becomes too much work. This also strongly indicates you're 'stuck' with Rust when you're done: if a better and more idiomatic C++ killer comes along later, it sounds like you're either going to have to rewrite the whole thing or give up.

I'm definitely open to wisdom on this if anyone disagrees because it is valuable to me and probably most of the readers of this comment section.

Replies

pizza234 • yesterday at 12:30 PM

> Automatically translating C to unsafe Rust is pointless, the resultant code is harder to read and there's no improvement in understanding how to get the code maintainable and safe, that requires tons of manual work by someone with a deep understanding of the codebase.

I have experience with a (nontrivial) translation of a "very unsafe" C codebase to Rust, and it's not true that there is no value in this type of work.

The first step, automatic translation from C to Rust via tools, immediately revealed bugs in the original codebase. This step alone is worth spending some time on the operation.

Ports from C to Rust aren't a binary choice between "all safe" and no port at all. Some projects, for example ClamAV, are adopting a mixed approach: some or most new code is written in Rust, and some existing functionality is translated to Rust.

In general, I think that automatic porting of C to Rust is, in the real world, an academic exercise. This is because C codebases designed without safety in mind simply need to be redesigned, so the domain is not really "how to port C to Rust"; it's "how to redesign an unsafe C codebase into a safe one" first of all. Additionally, I believe that in such cases, maintaining the implementation details is impossible: unsafety is a design, after all.

I personally advocate for very precisely scoped ports, where they can be beneficial (safety and stability); where that's not possible, I agree, better to abandon early.

Diggsey • yesterday at 3:37 PM

IMO, safety and "idiomatic-ness" of Rust code are two separate concerns, with the former being easier to automate.

In most C code I've read, the lifetimes of pointers are not that complicated. They can't be that complicated, because complex lifetimes are too error prone without automated checking. That means those lifetimes can be easily expressed.

In that sense, a fairly direct C to Rust translation that doesn't try to generate idiomatic Rust, but does accurately encode the lifetimes into the type system (i.e. replacing pointers with references and Box) is already a huge safety win, since you gain automatic checking of the rules you were already implicitly following.

Here's an example of the kind of unidiomatic-but-safe Rust code I mean: https://play.rust-lang.org/?version=stable&mode=debug&editio...
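
A minimal sketch of what such a direct translation could look like. This is a hypothetical illustration, not the playground example linked above; the `Node`, `push`, and `sum` names and the C originals shown in the comments are assumptions made up for this sketch.

```rust
// Hypothetical C original:
//   struct node { int value; struct node *next; };
// The owning, nullable `next` pointer becomes Option<Box<Node>>:
// NULL maps to None, and freeing happens automatically on drop.
pub struct Node {
    pub value: i32,
    pub next: Option<Box<Node>>,
}

// Hypothetical C original:
//   struct node *push(struct node *head, int value);
// The caller hands over ownership of the old head and gets the new head back.
pub fn push(head: Option<Box<Node>>, value: i32) -> Option<Box<Node>> {
    Some(Box::new(Node { value, next: head }))
}

// Hypothetical C original:
//   int sum(const struct node *n);
// A borrowed, nullable pointer becomes Option<&Node>. The loop keeps the
// shape of the C `while (n) { ...; n = n->next; }` instead of using iterators.
pub fn sum(mut n: Option<&Node>) -> i32 {
    let mut s = 0;
    while let Some(node) = n {
        s += node.value;
        n = node.next.as_deref();
    }
    s
}
```

The code is deliberately unidiomatic (no iterator, no `impl` block), but it contains no `unsafe`, so the compiler checks the ownership rules that the C version only followed by convention.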

If that can be automated (which seems increasingly plausible) then the need to do such a translation incrementally also goes away.

Making it idiomatic would be a case of recognising higher level patterns that couldn't be abstracted away in C, but can be turned into abstractions in Rust, and creating those abstractions. That is a more creative process that would require something like an LLM to drive, but that can be done incrementally, and provides a different kind of value from the basic safety checks.

zozbot234 • yesterday at 12:18 PM

> Generally the Rust community as well don't seem to have an answer on how to do this incrementally.

You can very much translate C to Rust on a function-by-function basis; the only issue is at the boundary, where you're either left with unsafe interfaces or a "safe" but slow interop. But this is inherent, since soundness is a global property: even a tiny bit of wrong unsafe code can spoil it all unless you do things like placing your untrusted code in a separate sandbox. So you can do the work incrementally, but much of the advantage accrues at the end.
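
To make the boundary issue concrete, here is a rough sketch of one function ported in isolation. The function names and the checksum itself are hypothetical; the point is only the shape: a safe core plus a thin `unsafe` shim with the old C signature. In a real port the shim would also need to be exported under the original symbol name (via the `no_mangle` attribute), which is left out here.

```rust
/// Safe core: the ported logic, operating on a slice.
pub fn checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

/// Boundary shim keeping the (hypothetical) original C signature:
///   uint32_t checksum_c(const uint8_t *data, size_t len);
///
/// # Safety
/// `data` must point to `len` readable bytes, or be null.
/// The compiler cannot check this; the remaining C callers must uphold it.
pub unsafe extern "C" fn checksum_c(data: *const u8, len: usize) -> u32 {
    if data.is_null() || len == 0 {
        return checksum(&[]);
    }
    // The only unsafe step: trusting the caller's pointer and length.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    checksum(bytes)
}
```

Rust callers use `checksum` directly and get full checking; C callers go through `checksum_c`, where a wrong length is still undefined behaviour. That is the sense in which the safety benefit only fully arrives once the callers are ported too.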

the__alchemist • yesterday at 3:07 PM

My 2c: What we need isn't a translator, but painless FFI. The available FFI tools like cc and bindgen produce working results most of the time, but the output needs [manual] wrapping.

It's kind of a similar situation (although a bit more complicated) when exposing Rust libs in Python; PyO3/maturin do the job, but you have to wrap manually.

So... I would like tools that call C code from Rust, but with slices etc. instead of pointers.
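
For reference, this is roughly the manual wrapping that such a tool would have to generate. It is a sketch under assumptions: `scale_in_place` is a made-up stand-in for a C function (written in Rust here so the snippet compiles without a C toolchain; with bindgen it would be a generated declaration instead), and `scale` is the hand-written slice-based wrapper.

```rust
use std::os::raw::c_int;

// Stand-in for a hypothetical C function:
//   int scale_in_place(int32_t *buf, size_t len, int32_t factor);
// Returns 0 on success and -1 if `buf` is NULL.
unsafe extern "C" fn scale_in_place(buf: *mut i32, len: usize, factor: i32) -> c_int {
    if buf.is_null() {
        return -1;
    }
    for i in 0..len {
        unsafe {
            let p = buf.add(i);
            *p = (*p).wrapping_mul(factor);
        }
    }
    0
}

/// Safe wrapper: a `&mut [i32]` carries pointer and length together,
/// so the caller cannot pass a mismatched length or a dangling pointer.
pub fn scale(buf: &mut [i32], factor: i32) -> Result<(), c_int> {
    // Sound because the slice guarantees `len` valid, exclusively borrowed elements.
    let rc = unsafe { scale_in_place(buf.as_mut_ptr(), buf.len(), factor) };
    if rc == 0 {
        Ok(())
    } else {
        Err(rc)
    }
}
```

The hard part for a tool is knowing that `buf` and `len` belong together and that the C side does not keep the pointer after returning; neither fact is visible in the C signature.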

IshKebab • yesterday at 4:50 PM

It's not pointless. For a start it frees you from the C toolchain so things like cross-compilation and WASM become much easier.

Secondly, it's a sensible first step in the tedious manual work of idiomatic porting. I'm guessing you didn't read the article but it's about automating some of this step too.
