About 2 months ago I would have said the same as the author, but I kept running against the hard edg...

miningape • 11/08/2024 • 18 replies • view on HN

About 2 months ago I would have said the same as the author, but I kept running against the hard edges of Rust: the borrow checker. I realised that while I really liked using algebraic data types (e.g. Enums) and pattern matching, the borrow checker and the low level memory concerns meant I spent a lot of time fighting the borrow checker instead of fighting the PL issues at the heart of my project. So while tokenising/parsing was nice, interpreting and typechecking became the bane of my existence

With that realisation I started looking for another more suitable language - I knew the FP aspects of Rust are what I was looking for so at first I considered something like F# but I didn't like that it's tied to microsoft/.NET. Looking a bit further I could have gone with something like Zig/C but then I lose the FP niceness I'm looking for. I also spent a fair amount of time looking at Go, but eventually decided that 1. I wanted a fair amount of syntax sugar, and 2. golang is a server side language, a lot of its features and library are geared towards this use case.

Finally I found OCaml, what really convinced me was seeing the syntax was like a friendly version of Haskell, or like Rust without lifetimes. In fact the first Rust compiler was written in OCaml, and OCaml is well known in the programming language space. I'm still learning OCaml so I'm not sure I can give a fair review yet, but so far it's exactly what I was looking for.

Replies

krick • 11/08/2024

Bringing up goland always annoys me for some reason. Like, it's really practical, it is fast, but not actually low-level, it compiles fast, and most importantly it is very popular and has all the libraries. It seems like I should use it. But I just almost irrationally hate the language itself. Everything about it is just so ugly. It's a language invented in 2009 by some C-people who are apparently oblivious to everything that excited PL design folks for the last 20 years (as of 2009). PHP in 2009 was already a more modern and better designed language than goddamn golang. And golang didn't really improve since. I just cannot let it go somehow.

➕ show 13 replies

rtpg • 11/08/2024

I think one core thing that you have to do with ASTs in Rust is to absolutely not store strings of the like in your AST. Instead you want to use something like static string libraries (so you get cheap clones and interning) and for things like positions in text you want to use just indices. Absolutely avoid storing references if you can avoid it!

The more your stuff is held in things that are cheap/free to clone, the less you have to fight the borrow checker… since you can get away with clones!

And for actual interpretation there’s these libraries that can help a lot for memory management with arenas and the like. It’s super specialized stuff but helps to give you perf + usability. Projects like Ruffle use this heavily and it’s nice when you figure out the patterns

Having said that OCaml and Haskell are both languages that will do all of this “for free” with their built in reference counting and GC… I just like the idea of going very fast in Rust

radicalbyte • 11/08/2024

I've been writing a lot of Golang in the last year and I wouldn't use it for writing a parser. It's just a modernised C, the model it provides is very simple (coming from C# the simplicity actually made it harder to learn!) and is very well suited to small, focused applications where a low conceptual load are beneficial and the trade off of verbosity are acceptable.

F# or even the latest version of C# are what I would recommend. Yes Microsoft are involved but if you're going to live in a world where you won't touch anything created by evil corporations then you're going to have a hard time. Java, Golang, Python, TypeScript/Javascript and Swift all suffer from this. That leaves you with very little choice.

I'd be interested in hearing your thoughts over OCaml after a year or so of using it. The Haskell-likes are very interesting but Haskell itself has a poor learning curve / benefit ratio for me (Rust is similar there actually; I mastered C# and made heavy use of the type system but that involved going very very deep into some holes and I don't have the time to do that with Rust).

➕ show 3 replies

ThePhysicist • 11/08/2024

Lots of folks use Golang on the client side, even on mobile (for which Go has really great support with go-mobile). Of course it adds around 10-20 MB to your binary and memory footprint but in todays world that's almost nothing. I think Tailscale e.g. uses Golang as a cross-platform Wireguard layer in their mobile and desktop apps, seems to work really well. You wouldn't build a native UI using Golang of course but for low-level stuff it's fantastic. Tinygo even allows you to write Golang for microcontrollers or the web via Webassembly, lots of things aren't supported there but a large part of the standard library is.

➕ show 1 reply

ralegh • 11/08/2024

I wouldn't call Go a 'server side' language. The Go compiler is written in Go, for example! Cross compilation and (relatively) small binaries make it super easy for distribution. Syntax sugar is a fair point though, it doesn't lend itself to functional-y pattern matching.

➕ show 2 replies

materielle • 11/08/2024

I love Go for writing servers. And in fact, I do it professionally. But I totally agree that for parsers, it’s not the right tool for the job.

First off, the only way to express union types is with runtime reflection. You might as well be coding in Python (but without the convenient syntax sugar).

Second off, “if err != nil” is really terrible in parsers. I’m actually somewhat of a defender of Go’s error handling approach in servers. Sure, it could have used a more convenient syntax. But in servers, I almost never return an error without handling it or adding additional context. The same isn’t true in parser’s though. Almost half of my parser code was error checks that simply wouldn’t exist in other languages.

For Rust, I think the value proposition is if you are also writing a virtual machine or an interpreter, your compiler front end can be written in the same language as your backend. Your other alternatives are C and C++, but then you don’t have sum types. You could write the front end in Ocaml, but then you would have to write the backend and runtime in some other language anyways.

FrustratedMonky • 11/08/2024

FSharp is OCaml to great extent. So if you don't have the need to stay away from MS/.NET, it is more 'open source' than the rest of MS products. MS did release Fsharp with an Open Source License.

But, it does still run on .NET.

At this point, isn't every major language controlled by one main corporate entity?

Except Python? But Python doesn't have algebraic types, or very complete pattern matching.

➕ show 2 replies

pimeys • 11/08/2024

OCaml has been pretty common tool to write parsers for many years. Not a bad choice.

I've written parsers professionally with Rust for two companies now. I have to say the issues you had with the borrow checker are just in the beginning. After working with Rust a bit you realize it works miracles for parsers. Especially if you need to do runtime parsing in a network service serving large traffic. There are some good strategies we've found out to keep the borrow checker happy and at the same time writing the fastest possible code to serve our customers.

I highly recommend taking a look how flat vectors for the AST and using typed vector indices work. E.g. you have vector for types as `Vec<Type>` and fields in types as `Vec<(TypeId, Field)>`. Keep these sorted, so you can implement lookups with a binary search, which works quite well with CPU caches and is definitely faster than a hashmap lookup.

The other cool thing with writing parsers with Rust is how there are great high level libraries for things like lexing:

https://crates.io/crates/logos

The cool thing with Logos is it keeps the source data as a string under the surface, and just refers to a specific locations in it. Now use these tokens as a basis for your AST tree, which is all flat data structures and IDs. Simplify the usage with a type:

    #[Clone, Copy]
    struct Walker<'a, Id> {
        pub id: Id,
        pub ast: &'a Ast,
    }

    impl<'a, Id> Walker<'a, Id> {
        pub fn walk<T>(self, other_id: T) -> Walker<'a, T> {
            Walker { id: other_id, ast: self.ast }
        }
    }

Now you can specialize these with type aliases:

    type TypeWalker<'a> = Walker<'a, TypeId>;

And implement methods:

    impl<'a> TypeWalker<'a> {
        fn as_ref(&self) -> &'a Type {
            &self.ast[self.id]
        }
        
        fn name(&self) -> &'a str {
            &self.as_ref().name
        }
    }

From here you can introduce string interning if needed, it's easy to extend. What I like about this design is how all the IDs and Walkers are Copy, so you can pass them around as you like. There's also no reference counting needed anywhere, so you don't need to play the dance with Arc/Weak.

I understand Rust feels hard especially in the beginning. You need to program more like you write C++, but with Rust you are enforced to play safe. I would say an amazing strategy is to first write a prototype with Ocaml, it's really good for that. Then, if you need to be faster, do a rewrite in Rust.

➕ show 3 replies

kemaru • 11/08/2024

You wouldn't be losing FP niceness with Zig, and the pattern matching and enum situation is also similar to Rust. Even better, in a few areas, for example arbitrary-width integers and enum tagging in unions/structs. Writing parsers and low level device drivers is actually quite comfortable in Zig.

almostdeadguy • 11/08/2024

With some patience and practice, I think reasoning about borrows becomes second nature. And what it buys you with lexing/parsing is the ability to do zero-copy parsing.

balencpp • 11/08/2024

Did you discover Scala 3 and give it a thought? I think of it as Rust with an _overall_ stronger type-system, but where you don't have to worry about memory management. It has an amazing standard library, particularly around collections. You get access to the amazing JVM ecosystem. And more. Martin Odersky in fact sees Scala's future lying in being a simpler Rust.

Also, regarding F#. It runs on .NET, and indeed, since the ecosystem and community are very small, you need to rely on .NET (basically C#) libraries. But it's really not "tied" to Microsoft and is open source.

packetlost • 11/08/2024

Go is not a great language to write parsers in IMO, it just doesn't have anything that makes such a task nice. That being said, people seem to really dislike Go here, which is fine, but somewhat surprising. Go is extremely simple. If you take a look at it's creators pedigree, that should make a ton of sense: they want you to make small, focused utilities and systems that use message passing (typically via a network) as the primary means of scaling complexity. What I find odd about this is that it was originally intended as a C++ replacement.

➕ show 2 replies

PittleyDunkin • 11/08/2024

The borrow checker is definitely a pain, but it stops being such a pain once you design your types around ownership and passing around non-owned pointers or references or indexes.

➕ show 1 reply

neonsunset • 11/08/2024

> I considered something like F# but I didn't like that it's tied to microsoft/.NET.

Could you explain your thought process when deciding to not use F# because it runs on top of .NET? (both of which are open-source, and .NET is what makes F# fast and usable in almost every domain)

➕ show 1 reply

MrMcCall • 11/08/2024

(OCaml Question Ahead)

I agree on F#. It changed my C && OO perspective in fantastic ways, but I too can't support anything Microsoft anymore.

But, seeing as OCaml was the basis for F#, I have a question, though:

Does OCaml allow the use of specifically sized integer types?

I seem to remember in my various explorations that OCaml just has a kind of "number" type. If I want a floating point variable, I want a specific 32- or 64- or 128-bit version; same with my ints. I did very much like F# having the ability to specify the size and signedness of my int vars.

Thanks in advance, OCaml folks.

➕ show 2 replies

wyager • 11/08/2024

Having written probably several hundred kloc of both Haskell and OCaml, I strongly prefer Haskell. A very simple core concept wrapped in an extremely powerful shell. Haskell is a lot better for parsing tasks because (among other considerations) its more powerful type system can better express constraints on grammars.

omginternets • 11/08/2024

I had a similar journey of enlightenment that likewise led me to OCaml. Unless you're doing low-level systems programming, OCaml will give you the "if it compiles, it's probably right" vibe with much less awkward stuff to type.

smolder • 11/08/2024

I think you missed something if you felt the borrow checker made things too hard. You can just copy and move on. Most languages do less efficient things anyway.

➕ show 1 reply

alt Hacker News

Replies