Hacker News

Elasticsearch Was Never a Database

48 points by jamesgresql last Sunday at 8:54 PM | 44 comments

Comments

roywiggins today at 5:57 PM

> Elastic has been working on this gap. The more recent ES|QL introduces a similar feature called lookup joins, and Elastic SQL provides a more familiar syntax (with no joins). But these are still bound by Lucene’s underlying index model. On top of that, developers now face a confusing sprawl of overlapping query syntaxes (currently: Query DSL, ES|QL, SQL, EQL, KQL), each suited to different use cases, and with different strengths and weaknesses.

I suppose we need a new rule: "Any sufficiently successful data store eventually sprouts at least one ad hoc, informally-specified, inconsistency-ridden, slow implementation of half of a relational database."

speedgoose today at 5:58 PM

Accenture managed to build a data platform for my company with Elasticsearch as the primary database. I raised concerns early in the process, but their software architect told me they had never had any issues. I assume he didn’t lie. I was only a user, so I didn’t fight it and decided not to make my work rely on theirs.

PedroBatista today at 5:53 PM

I really never understood how people could store very important information in ES like it was a database.

Even if they don't understand what ES is and what a "normal" database is, I'm sure some of those people ran into issues where their "db" either got corrupted or lost data, even while testing and building their system around it. This is and was general knowledge at the time; it was no secret that from time to time things got corrupted and indexes needed to be rebuilt.

It doesn't happen all the time, but it happens far more often than never, and that's understandable: Lucene is not a DB engine or a "DB grade" storage engine, and its developers had other, more important things to solve in their domain.

So when I read stories of data loss and things going south, I don't have sympathy for anyone involved other than the unsuspecting final clients. These people knew, or more or less knew, and chose to ignore it and be lazy.

lvspiff today at 5:51 PM

Everything is a database if you believe hard enough

Feel like the kid from A Christmas Story --

>simplicity, and world-class performance, get started with XXXXXXXX.

A crummy commercial?

cluckindan today at 6:02 PM

”That means a recently acknowledged write may not show up until the next refresh.”

Which is why you supply the parameter

  refresh: "wait_for"

in your writes. The request then waits until a refresh has made the change visible before it completes; it does not force an immediate refresh (refresh: true does that).
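
For illustration, a minimal sketch of where that parameter goes on a single-document write. It only builds the request line and does not talk to a cluster; the index name "orders" and document id "42" are made up, and the helper `index_request` is hypothetical, not part of any client library.

```python
from urllib.parse import urlencode

# Values the refresh query parameter accepts on write requests.
VALID_REFRESH = {"true", "false", "wait_for"}


def index_request(index, doc_id, refresh=None):
    """Return (method, path) for indexing one document over the REST API."""
    path = f"/{index}/_doc/{doc_id}"
    if refresh is not None:
        if refresh not in VALID_REFRESH:
            raise ValueError(f"unsupported refresh value: {refresh!r}")
        # wait_for: respond only once a refresh has made the write searchable.
        path += "?" + urlencode({"refresh": refresh})
    return "PUT", path


method, path = index_request("orders", "42", refresh="wait_for")
print(method, path)
```

The trade-off is latency: each such write can wait up to the index's refresh interval before it returns.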

”schema migrations require moving the entire system of record into a new structure, under load, with no safety net”

Use index aliases. Create a new index using the new mapping, then make a reindex request from the old index to the new one. When it finishes, change the alias to point to the new index.
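
Roughly, those three steps map onto three REST calls. The sketch below only builds the request bodies in order (create index, `_reindex`, `_aliases`) so the shape is visible; the alias and index names ("orders", "orders_v1", "orders_v2") and the example mapping are invented, and `migration_requests` is a hypothetical helper. It also ignores writes that reach the old index while the reindex is running, which a real migration would have to handle.

```python
import json


def migration_requests(alias, old_index, new_index, new_mapping):
    """Return the REST calls for an alias-based migration, in execution order."""
    return [
        # 1. Create the new index with the new mapping.
        ("PUT", f"/{new_index}", {"mappings": new_mapping}),
        # 2. Copy documents from the old index into the new one.
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        # 3. Repoint the alias; both actions go in one _aliases request,
        #    so readers never see the alias missing.
        ("POST", "/_aliases", {
            "actions": [
                {"remove": {"index": old_index, "alias": alias}},
                {"add": {"index": new_index, "alias": alias}},
            ]
        }),
    ]


steps = migration_requests(
    "orders", "orders_v1", "orders_v2",
    {"properties": {"total": {"type": "double"}}},
)
for method, path, body in steps:
    print(method, path, json.dumps(body))
```

Applications keep querying the alias throughout, so nothing on the client side changes when the swap happens.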

The other criticisms are more valid, but not entirely: for example, no database ”just works” without carefully tuning the memory-related configuration for your workload, schema and data.

aaroninsf today at 8:24 PM

We use ES like a DB, but not with SQL, and most importantly, it's not the source of truth or primary store. It's operational truth and best-effort.

toenail today at 5:56 PM

I think Elastic has always clearly documented that you should expect "eventual consistency"; they never claimed to be a "database" in the sense that TFA defines.

jamesgresql last Sunday at 8:54 PM

I know it sounds obvious, but some people are pretty determined to use it that way!

gmuslera today at 7:52 PM

... for a particular, opinionated definition of what a database should be.

unethical_ban today at 5:45 PM

I work in infosec and several popular platforms use elasticsearch for log storage and analysis.

I would never. Ever. Bet my savings on ES being stable enough to always be online to take in data, or predictable in retaining the data it took in.

It feels very best-effort, and as a consultant I recommend orgs use some other system for retaining their logs, even a raw filesystem with rolling zips, before relying on ES, unless you have a dedicated team constantly monitoring it.
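
As a rough idea of what "a raw filesystem with rolling zips" can look like, here is a sketch using Python's standard `logging.handlers.RotatingFileHandler` with a gzip rotator. The file name "ingest.log", the logger name, the tiny size limit, and the sample events are all placeholders chosen for the demo, and it writes to a temporary directory rather than a real log volume.

```python
import gzip
import logging
import logging.handlers
import os
import shutil
import tempfile


def gzip_namer(name):
    # Rotated files get a .gz suffix, e.g. ingest.log.1.gz
    return name + ".gz"


def gzip_rotator(source, dest):
    # Compress the file being rotated out, then delete the uncompressed copy.
    with open(source, "rb") as src, gzip.open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(source)


def make_rolling_logger(path, max_bytes, backups):
    """Return (logger, handler) writing to path, rolling into gzip archives."""
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.namer = gzip_namer
    handler.rotator = gzip_rotator
    logger = logging.getLogger("raw_log_archive")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    return logger, handler


with tempfile.TemporaryDirectory() as tmp:
    # 200 bytes per file is unrealistically small; it just forces rollovers.
    logger, handler = make_rolling_logger(os.path.join(tmp, "ingest.log"), 200, 3)
    for i in range(20):
        logger.info("event %d: login attempt from 10.0.0.%d", i, i)
    handler.close()
    logger.removeHandler(handler)
    archives = sorted(f for f in os.listdir(tmp) if f.endswith(".gz"))
    with gzip.open(os.path.join(tmp, archives[0]), "rt") as fh:
        first_archived_line = fh.readline().strip()

print(archives)
```

Note that `backupCount` caps how many archives are kept, so retention here is bounded by count, not by time; a log-retention setup would need to size that deliberately.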

stefanon today at 6:37 PM

Yep!

this_user today at 6:46 PM

I mean, it is called "ElasticSEARCH", not "Elasticdatabase".

alittletooraph2 today at 5:55 PM

[dead]