Databases in 2025: A Year in Review

653 points • by viveknathani_ • yesterday at 7:14 AM • 175 comments

Comments

Maybe off-topic but,

If you're not familiar with the CMU DB Group you might want to check out their eccentric teaching style [1].

I absolutely love their gangsta intros like [2] and pre-lecture dj sets like [3].

I also remember a video where he was lecturing with someone sleeping on the floor in the background for some reason. I can't find that video right now.

Not too sure about the context or Andy's biography, I'll research that later, I'm even more curious now.

[1] youtube.com/results?search_query=cmu+database

[2] youtu.be/dSxV5Sob5V8

[3] youtu.be/7NPIENPr-zk?t=85

beders • yesterday at 10:44 AM

While the author mentions that he just doesn't have the time to look at all the databases, none of the reviews of the last few years mention immutable and/or bi-temporal databases.

Which looks more like a blind spot to me honestly. This category of databases is just fantastic for industries like fintech.

Two candidates stand out: https://xtdb.com/blog/launching-xtdb-v2 (2025) and https://blog.datomic.com/2023/04/datomic-is-free.html (2023).

TekMol • yesterday at 10:16 AM

From my perspective on databases, two trends continued in 2025:

1: Moving everything to SQLite

2: Using mostly JSON fields

Both started already a few years back and accelerated in 2025.

SQLite is just so nice and easy to deal with, with its no-daemon, one-file-per-db and one-type-per-value approach.

And the JSON arrow operators (-> and ->>) make it a pleasure to work with flexible JSON data.
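As a rough illustration of the arrow operators mentioned above, here is a minimal sketch using Python's built-in sqlite3 module. The `events` table, its `payload` column, and the sample rows are made up for this example. The `->>` operator needs SQLite 3.38.0 or newer, so the sketch falls back to `json_extract` on older builds.

```python
import json
import sqlite3

# Hypothetical schema: one table with a TEXT column holding JSON documents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

sample_rows = [
    {"user": {"name": "ada", "plan": "pro"}, "tags": ["db", "sqlite"]},
    {"user": {"name": "bob", "plan": "free"}, "tags": []},
]
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(json.dumps(row),) for row in sample_rows],
)

# ->> extracts a JSON value as a plain SQL value (SQLite 3.38.0+).
# Older builds get the equivalent json_extract() call instead.
if sqlite3.sqlite_version_info >= (3, 38, 0):
    query = (
        "SELECT payload ->> '$.user.name' FROM events "
        "WHERE payload ->> '$.user.plan' = 'pro'"
    )
else:
    query = (
        "SELECT json_extract(payload, '$.user.name') FROM events "
        "WHERE json_extract(payload, '$.user.plan') = 'pro'"
    )

pro_users = [name for (name,) in conn.execute(query)]
print(pro_users)
```

The single-arrow `->` form returns a JSON representation of the value rather than a plain SQL value, which is handy when chaining into nested objects.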

A1aM0 • yesterday at 7:56 AM

Pavlo is right to be skeptical about MCP security. The entire philosophy of MCP seems to be about maximizing context availability for the model, which stands in direct opposition to the principle of Least Privilege.

When you expose a database via a protocol designed for 'context', you aren't just exposing data; you're exposing the schema's complexity to an entity that handles ambiguity poorly. It feels like we're just reinventing SQL injection, but this time the injection comes from the system's own hallucinations rather than a malicious user.

cluckindan • today at 5:45 PM

"I still haven't met anybody who is actively using Dgraph."

That’s because it is mostly used in national security and military applications in several countries.

p2hari • yesterday at 8:28 AM

The author mentions Gel in the context of the name change from EdgeDB to Gel. However, it could also have been added to the acquisitions landscape: Gel joined Vercel [1].

[1] https://www.geldata.com/blog/gel-joins-vercel

felipelalli • yesterday at 7:06 PM

I think it's time for a big move towards immutable databases, which weren't even mentioned in this article. I've already worked with Datomic and immudb: Datomic is very good, but extremely complex and exotic, with a steep learning curve before you can tune it well. immudb is definitely not ready for production and starts having problems with mere hundreds of thousands of records. There's nothing too serious yet.

ComputerGuru • yesterday at 10:34 PM

Pg18 is an absolutely fantastic release. Everyone talks about the async IO worker support, but there's so much more: built-in Unicode locales, unique indexes/constraints/FKs that can be added in an unvalidated state, generated virtual (expression) columns, skip scans on btree indexes (absolutely huge), uuidv7 support, and more.

lvl155 • yesterday at 11:47 AM

I want to thank Andy and the entire DB Group at CMU. They've done a great job of making databases accessible to so many people. They are world class.

zjaffee • yesterday at 5:34 PM

What an amazing set of articles. One thing I think he's missed is the clear multi-year trends.

Over the past 5 years there have been significant changes and several clear winners. Databricks and Snowflake have really demonstrated an ability to stay resilient despite strong competition from the cloud providers themselves, often through the privatization of what was previously open source. This is especially relevant given that the article also mentions how Cloudera and Hortonworks failed to make it.

I also think the quiet execution of databases like ClickHouse has been extremely impressive; they have filled a niche that previously had no obvious solution.

throw0101d • yesterday at 3:59 PM

Regarding distributed(-ish) Postgres, does anyone know if something like MySQL/MariaDB's multi-master Galera† is around for Pg?

> MariaDB Galera Cluster provides a synchronous replication system that uses an approach often called eager replication. In this model, nodes in a cluster synchronize with all other nodes by applying replicated updates as a single transaction. This means that when a transaction COMMITs, all nodes in the cluster have the same value. This process is accomplished using write-set replication through a group communication framework.

* https://mariadb.com/docs/galera-cluster/galera-architecture/...

This isn't necessarily about being "web scale", but having a first-party, fairly automated replication solution would make HA much simpler for a number of internal-only systems.

† Yes, I am aware: https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster

qinchencq • today at 10:29 AM

Was hoping to read about graph databases and AI-related changes, but didn't expect this: "I almost died in the spring semester...surprisingly hard to concentrate on important things like databases when you can't breathe." Hope Prof. Pavlo has been breathing better. Stellar review.

backtogeek • yesterday at 11:40 AM

I can't believe that article has no mention of SQLite ??

jereze • yesterday at 1:28 PM

No mention of DuckDB? Surprising.

divan • yesterday at 2:49 PM

> Acquisitions ... Gel → Vercel

is a bit misleading. Gel (formerly EdgeDB) is sunsetting its development. The (extremely talented) team is joining Vercel to work on other things.

That was a hard hit for me in December. I loved working with EdgeQL so much.

santiagobasulto • yesterday at 9:27 AM

I love these yearly review posts. Thanks Andy and team.

gr4vityWall • yesterday at 11:55 AM

Didn't know MongoDB was suing the company behind FerretDB. That's disgusting.

bzGoRust • yesterday at 1:05 PM

I would like to mention that vector databases like Milvus got lots of new features to support RAG and agent development, such as BM25 and hybrid search.

tiemster • yesterday at 2:12 PM

Also emmer (which is perhaps too niche to get mentioned in an article like this), which focuses more on being a quick/flexible 'data scratchpad' rather than just on scale.

https://hub.docker.com/r/tiemster/emmer

alexpadula • today at 6:54 AM

Been reading these for a few years. I enjoy them, thank you Andy. I hope you’re doing better.

thesurlydev • yesterday at 11:52 PM

Supabase seems to be killing it. I read somewhere they are used by ~70% of Y Combinator startups. I wonder how many of those eventually move to self-hosted.

npalli • yesterday at 5:07 PM

Andy is probably the only person who adores Larry Ellison (Oracle) unironically.

shrx • yesterday at 9:32 AM

Nothing about time series-oriented databases?

pjmlp • yesterday at 10:39 AM

Over here, it is DB2, SQL Server or Oracle if using a plain RDBMS, or whatever DB abstraction layer is provided on top of a SaaS product, where we get to query through some kind of ORM abstraction that prevents raw SQL, or through GraphQL, without knowing the implementation details.

codeulike • yesterday at 1:04 PM

Barely any mention of Oracle or MS SQL Server, commonly reckoned to be the #1 and #3 most used databases in the world.

https://db-engines.com/en/ranking

andersmurphy • yesterday at 7:52 PM

With a trend towards immutable, single-writer databases, mmap seems like a massive win.

jimmar • yesterday at 1:43 PM

> "The Dominance of PostgreSQL Continues"

It seems like the author is more focused on database features than user base. Every metric I can find online says that MySQL/MariaDB is more popular than PostgreSQL. PostgreSQL seems "better" (more features, better standards compliance) but MySQL/MariaDB works fine for many people. Am I living in a bubble?

cloutiertyler • yesterday at 3:28 PM

How is SpacetimeDB not mentioned here?

dmarwicke • yesterday at 3:16 PM

we had to restrict ours to views only because it kept trying to run updates. still breaks sometimes when it hallucinates column names but at least it can't do anything destructive
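The kind of restriction described above can be sketched with SQLite's authorizer hook, exposed in Python as `Connection.set_authorizer`. This is only an illustration under assumptions, not the commenter's actual setup: the `accounts` table, the `account_summary` view, and the `run_agent_sql` helper are hypothetical, and the callback allows reads in general rather than limiting access strictly to views.

```python
import sqlite3

# Action codes that a read-only query needs; everything else is denied.
READ_ONLY_ACTIONS = {
    sqlite3.SQLITE_SELECT,
    sqlite3.SQLITE_READ,
    sqlite3.SQLITE_FUNCTION,
}

def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Allow SELECT-style access and deny writes, DDL, PRAGMA, etc."""
    if action in READ_ONLY_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY

# Hypothetical data, created before the restriction is switched on.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER);
    INSERT INTO accounts (owner, balance) VALUES ('ada', 100), ('bob', 50);
    CREATE VIEW account_summary AS SELECT owner, balance FROM accounts;
    """
)
conn.set_authorizer(read_only_authorizer)

def run_agent_sql(sql):
    """Run model-generated SQL; return rows, or a rejection message on failure."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"rejected: {exc}"

print(run_agent_sql("SELECT owner, balance FROM account_summary ORDER BY owner"))
print(run_agent_sql("UPDATE accounts SET balance = 0"))       # denied by authorizer
print(run_agent_sql("SELECT nickname FROM account_summary"))  # hallucinated column
```

A hallucinated column name still fails, as the comment notes, but it fails as an ordinary query error and leaves the data untouched.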

quotemstr • yesterday at 6:43 PM

Why does "database" in surveys like this not include DuckDB and SQLite, which are great [1] embedded answers to ClickHouse and PostgreSQL? Both are excellent and useful databases; DuckDB's reasonable syntax, fast vectorized everything, and support for ingesting the hairiest of data as in-DB ETL make me reach for it first these days, at least for the things I want to do.

Why is it that in "I'm a serious database person" circles, the popular embedded databases don't count?

[1] Yes, I know it's not an exact comparison.

shekispeaks • yesterday at 6:54 PM

TiDB has gained some momentum in Silicon Valley, with companies looking to adopt it. Does he have any commentary on TiDB, which is an OLTP and OLAP hybrid?

SchwKatze • yesterday at 2:01 PM

Can we even say that Anyblox is a file format? By my understanding of the project it's "just" a decoder for other file formats to solve the MxN problem.

cryptica • yesterday at 3:47 PM

It's so weird how everyone nowadays is using Postgres. It's not like end users can see your database.

It's disturbing how everyone is gravitating towards the same tools. This started with React and has kept getting worse. Software development sucks nowadays.

All technical decisions about which tools to use are made by people who don't have to use the tools. There is no nuance anymore. There's a blanket solution for every problem and there isn't much to choose from. Meanwhile, software is less reliable than it's ever been.

It's like a bad dream. Everything is bad and getting worse.
