Hacker News

GNU recutils: Plain text database

144 points by polyrand last Sunday at 7:08 PM | 44 comments

Comments

emil-lp last Sunday at 9:31 PM

https://news.ycombinator.com/item?id=22153665 505 points, 143 comments, 6 years ago

https://news.ycombinator.com/item?id=31832564 155 points, 52 comments, 3 years ago

https://news.ycombinator.com/item?id=15302035 105 points, 46 comments, 8 years ago

somat last Sunday at 11:48 PM

On the topic of plain text databases, I have been playing around with WordNet (basically the backbone data structure of a thesaurus), and while there are a lot of perfectly good libraries to handle the data, I was having fun building my own. One interesting thing is that the original data format has the byte offset of every record and link baked into the data. This makes it trivial to avoid loading the whole thing into memory: you can seek directly to the record in question, which makes it a plain text database. Admittedly, it is one only good for reading, as writes would have to rebuild all the indexes.

https://wordnet.princeton.edu/documentation/wndb5wn

Honestly, nowadays the whole thing can be trivially loaded into memory, but back when the project was started this was much more of a concern. I do know that once I figured this out, I started rewriting my program to see how little memory I could use. It was a lot of fun to use access patterns other than "load the whole thing into memory".
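
A minimal sketch of the seek-by-offset access pattern described above, assuming a hypothetical file with one record per line. The functions `build_index` and `read_record` are illustrative names, not part of WordNet or any WordNet library. The real WordNet files already carry the offsets inside the data, so the index-building step here only stands in for that.

```python
def build_index(path):
    """Map the first token of each line to the byte offset where the line starts."""
    index = {}
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            line = f.readline()
            if not line:
                break
            if line.strip():
                key = line.split(None, 1)[0].decode("ascii")
                index[key] = offset
    return index


def read_record(path, offset):
    """Seek straight to a record; only that one line is read into memory."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.readline().decode("ascii").rstrip("\n")
```

Once the offsets are known, `read_record` touches a single line of the file, which is why memory use stays flat regardless of file size. Any write that changes a record's length shifts every later offset, which matches the read-only caveat above.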

qubex last Sunday at 10:56 PM

Amazingly rugged design that somehow places man and machine on equal footing by throwing us back to the time of cassette futurism.

Amazingly poor choice of logo.

saulpw last Sunday at 9:23 PM

I use the .rec format whenever I want a database maintained in git/github. The format is ideal if you want reasonable data diffs.

ndegruchy last Sunday at 10:38 PM

I love recutils. The database format is simple enough, it has a bunch of options for constraints, and it has Bash integration and a great Emacs mode to search, edit and verify the integrity of the database.

Sure, it's not as fast as SQLite or bigger systems, but often it's enough for smaller projects.
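
To give a feel for how simple the format is, here is a rough sketch of a parser for a small subset of it: `Name: value` fields, blank lines between records, `#` comment lines, and `+` continuation lines. The function `parse_rec` and the sample data are made up for illustration. It ignores record descriptors, types, and constraints, so it is no substitute for the real tools.

```python
def parse_rec(text):
    """Parse a simplified subset of the rec format into a list of records.

    Each record is a list of (field_name, value) pairs, because a field
    name may repeat within one record.
    """
    records, current = [], []
    for line in text.splitlines():
        if line.startswith("#"):              # comment line
            continue
        if not line.strip():                  # blank line ends the current record
            if current:
                records.append(current)
                current = []
            continue
        if line.startswith("+") and current:  # continuation of the previous value
            name, value = current[-1]
            extra = line[1:]
            if extra.startswith(" "):
                extra = extra[1:]
            current[-1] = (name, value + "\n" + extra)
            continue
        name, _, value = line.partition(":")
        current.append((name, value.strip()))
    if current:
        records.append(current)
    return records


# Hypothetical sample data, not taken from any real database.
SAMPLE = """\
# Contacts
Name: Ada
Email: ada@example.org

Name: Grace
Note: first line
+ second line
"""

records = parse_rec(SAMPLE)
```

Here `records` ends up holding two records, and the `Note` field of the second one spans two lines.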

wuming2 last Monday at 2:22 AM

With recutils csv2rec I have my invoices list.

With recutils recsel | recfmt -f template.fodt I have my invoices.

soffice and curl to generate .pdfs and email them off.

With recutils recset I have my invoices status updated. Done.

binaryturtle last Sunday at 9:54 PM

Those who get blocked by gnu.org with a 403 (older Firefox) or an even sillier "Too Many Requests" error (older Safari) need to override their user agent string to "curl" to make the site load again.

c7b last Monday at 4:45 AM

I like plaintext database files, and this looks way more readable than CSV. But what I'd find even more interesting than improved readability would be features like atomicity. There are many tools for joins and other basic operations on tables stored in various text formats, but the real gap to 'proper' databases is the ACID properties.
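
For the atomicity part at least, one common building block for text files is to write a temporary file and rename it over the original. The sketch below assumes a single writer and whole-file replacement. `atomic_write` is an illustrative name, and nothing here is claimed to be how recutils itself behaves. It covers only atomic replacement of one file, not isolation between concurrent writers or multi-file transactions.

```python
import os
import tempfile


def atomic_write(path, text):
    """Replace `path` with `text` so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the data to disk before the rename
        os.replace(tmp_path, path)   # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A crash before `os.replace` leaves the old file untouched, and a crash after it leaves the new one in place, so a reader never sees a half-written database.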

setheron last Sunday at 10:07 PM

In 2010 I remember people being very proficient with this at Amazon.

I really enjoyed the toolset for querying logs, etc.

Good memories.

novoreorx last Monday at 3:32 AM

Reminds me of https://news.ycombinator.com/item?id=45458455

In the AI era, the rec file seems to be a great choice for formatting text that will be fed into LLMs. Imagine converting an HTML table into a RAG file: the context will be much clearer.
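
As a rough illustration of that idea, the sketch below turns already-extracted table rows into rec-style text. `rows_to_rec` is a hypothetical helper, the data is invented, and it assumes the header cells are already valid field names (letters, digits, and underscores). HTML parsing is left out entirely.

```python
def rows_to_rec(header, rows):
    """Render table rows as rec-style records, one `Name: value` line per cell."""
    blocks = []
    for row in rows:
        lines = []
        for name, value in zip(header, row):
            # Multi-line values continue on lines that start with "+".
            value = str(value).replace("\n", "\n+ ")
            lines.append(f"{name}: {value}")
        blocks.append("\n".join(lines))
    # Records are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


# Invented example data.
text = rows_to_rec(["Item", "Price"], [["tea", 3], ["coffee", 4]])
```

Every value ends up next to its field name, so a model reading the text does not have to count columns to work out what a cell means.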

lloydatkinson last Sunday at 10:54 PM

Tortoise sex is a bold choice for a logo, but certainly memorable.

1718627440 last Sunday at 10:49 PM

Thanks for the submission. I had no idea that existed, but I am definitely going to use this now.

adius last Monday at 12:49 AM

As a similar, yet more powerful data format I started using Nickel (https://nickel-lang.org). It has very sophisticated typing and transformation features. Highly recommended!

bsndjdkd last Sunday at 9:21 PM

[flagged]

nrclark last Monday at 5:01 AM

It's a bummer that the library and utilities are GPLv3 - really limits adoption, because it means that product developers can't build it into the kinds of small embedded Linux systems where it would really shine.
