Hacker News

haeseong, today at 12:14 PM (2 replies)

The query speed deserves the praise, but the JSON ingestion path has quiet footguns nobody mentions here. 64-bit integer columns come back as strings over JSONEachRow when output_format_json_quote_64bit_integers is on, so a forgotten Number() cast silently turns arithmetic into string concatenation. With input_format_skip_unknown_fields enabled, a single typo in a column name drops that field with no error at all. It is worth wiring an assertion into CI that inserts a row and reads it back before trusting the dashboards.
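
A minimal sketch of both footguns, using made-up sample data rather than a live server. The column names (`user_id`, `bytes_in`, `bytes_out`) and the helper `unknownFields` are hypothetical, and the sample line only assumes that numeric values arrived quoted:

```typescript
// One line of JSONEachRow-style output (hypothetical sample data, not from a real server).
// The numeric values are quoted, which is the case the comment warns about.
const line = '{"user_id":"42","bytes_in":"1500","bytes_out":"250"}';

// JSON.parse returns `any`, so the compiler cannot flag the mistake below.
const row = JSON.parse(line);

// Footgun: "+" on two strings concatenates instead of adding.
const wrongTotal = row.bytes_in + row.bytes_out;

// With explicit casts the arithmetic is numeric.
const rightTotal = Number(row.bytes_in) + Number(row.bytes_out);

// Guard for the insert side: list every key that is not a known column,
// so a typo fails in the app instead of being skipped by the server.
function unknownFields(
  candidate: Record<string, unknown>,
  knownColumns: readonly string[],
): string[] {
  const known = new Set(knownColumns);
  return Object.keys(candidate).filter((key) => !known.has(key));
}

// Hypothetical column list; in practice it would come from the table definition.
const columns = ["user_id", "bytes_in", "bytes_out"];

// "byte_in" is a typo for "bytes_in" and is reported instead of vanishing.
const typos = unknownFields({ user_id: 42, byte_in: 1500, bytes_out: 250 }, columns);

console.log(wrongTotal, rightTotal, typos);
```

Running the same `unknownFields` check before every insert, and comparing a read-back row against what was written, is one way to turn the CI assertion into something concrete.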


Replies

charrondev, today at 1:00 PM

We’ve done our JSON ingestion by keeping a schema in the app for all the types we expect, and injecting the types into the query builder.

Then as needed we have materialized columns on our different tables.
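
One possible reading of the app-side schema idea, sketched for the read path only. The names `Schema`, `eventsSchema`, and `coerceRow` are hypothetical and not taken from the setup described above; the query-builder side is not shown:

```typescript
// Hypothetical app-side schema: column name -> the type the app expects.
type ColumnType = "int" | "float" | "string";
type Schema = Record<string, ColumnType>;

const eventsSchema: Schema = { user_id: "int", latency_ms: "float", path: "string" };

// Coerce one decoded row according to the schema and fail loudly on
// a missing column or a value that does not parse as the expected type.
function coerceRow(
  raw: Record<string, unknown>,
  schema: Schema,
): Record<string, number | string> {
  const out: Record<string, number | string> = {};
  for (const [column, type] of Object.entries(schema)) {
    if (!(column in raw)) {
      throw new Error(`missing column: ${column}`);
    }
    const value = raw[column];
    if (type === "string") {
      out[column] = String(value);
      continue;
    }
    // Note: Number("") and Number(null) both give 0, so stricter checks may be needed.
    const parsed = Number(value);
    if (!Number.isFinite(parsed) || (type === "int" && !Number.isInteger(parsed))) {
      throw new Error(`column ${column} is not a valid ${type}: ${String(value)}`);
    }
    out[column] = parsed;
  }
  return out;
}

// Quoted numerics are converted once, at the boundary, instead of at every use site.
const typed = coerceRow({ user_id: "42", latency_ms: "12.5", path: "/home" }, eventsSchema);
console.log(typed);
```

Values above 2^53 lose precision when passed through `Number()`, so a real schema would likely need a separate case (for example `BigInt`) for large 64-bit integers.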

ignoramous, today at 12:58 PM

[dead]