logoalt Hacker News

n_etoday at 8:02 AM4 repliesview on HN

I process TB-size ndjson files. I want to use jq to do some simple transformations between stages of the processing pipeline (e.g. rename a field), but it so slow that I write a single-use node or rust script instead.


Replies

nchmytoday at 8:10 AM

This isn't for you then

> The query language is deliberately less expressive than jq's. jsongrep is a search tool, not a transformation tool-- it finds values but doesn't compute new ones. There are no filters, no arithmetic, no string interpolation.

Mind me asking what sorts of TB json files you work with? Seems excessively immense.

show 2 replies
erutoday at 8:17 AM

This reminds me of someone who wrote a regex tool that matches by compiling regexes (at runtime of the tool) via LLVM to native code.

You could probably do something similar for a faster jq.

messetoday at 8:11 AM

Now I'm really curious. What field are you in that ndjson files of that size are common?

I'm sure there are reasons against switching to something more efficient–we've all been there–I'm just surprised.

show 1 reply