logoalt Hacker News

liampullestoday at 2:54 PM7 repliesview on HN

Understanding algorithmic complexity (in particular, avoiding rework in loops), is useful in any language, and is sage advice.

In practice though, for most enterprise web services, a lot of real world performance comes down to how efficiently you are calling external services (including the database). Just converting a loop of queries into bulk ones can help loads (and then tweaking the query to make good use of indexes, doing upserts, removing unneeded data, etc.)

I'm hopeful that improvements in LLMs mean we can ditch ORMs (under the guise that they are quicker to write queries and the inbetween mapping code with) and instead make good use of SQL to harness the powers that modern databases provide.


Replies

laughing_mantoday at 7:29 PM

I've always found ORMs to be performance killers. It always worked out better to write the SQL directly. The idea that you should have a one-to-one correspondence between your data objects and your database objects is disastrous unless your data storage is trivial.

show 1 reply
j-vogeltoday at 3:34 PM

Author here. DB and external service calls are often the biggest wins, thanks for calling that out.

In my demo app, the CPU hotspots were entirely in application code, not I/O wait. And across a fleet, even "smaller" gains in CPU and heap compound into real cost and throughput differences. They're different problems, but your point is valid. Goal here is to get more folks thinking about other aspects of performance especially when the software is running at scale.

Seattle3503today at 5:26 PM

> I'm hopeful that improvements in LLMs mean we can ditch ORMs (under the guise that they are quicker to write queries and the inbetween mapping code with) and instead make good use of SQL to harness the powers that modern databases provide.

Maybe we can ditch active models like those we see in sqlalchemy, but the typed query builders that come with ORMs are going to become more important, not less. Leveraging the compiler to catch bad queries is a huge win.

show 1 reply
ackfoobartoday at 6:16 PM

> ditch ORMs ... make good use of SQL

I think Java (or other JVM languages) are then best positioned, because of jooq. Still the best SQL generation library I've used.

show 2 replies
cogman10today at 4:04 PM

Easy to get wrong as well.

There's a balance with a DB. Doing 1 or 2 row queries 1000 times is obviously inefficient, but making a 1M row query can have it's own set of problems all the same (even if you need that 1M).

It'll depend on the hardware, but you really want to make sure that anything you do with a DB allows for other instances of your application a chance to also interact with the DB. Nothing worse than finding out the 2 row insert is being blocked by a million row read for 20 seconds.

There's also a question of when you should and shouldn't join data. It's not always a black and white "just let the DB handle it". Sometimes the better route to go down is to make 2 queries rather than joining, particularly if it's something where the main table pulls in 1000 rows with only 10 unique rows pulled from the subtable. Of course, this all depends on how wide these things are as well.

But 100% agree, ORMs are the worst way to handle all these things. They very rarely do the right thing out of the box and to make them fast you ultimately end up needing to comprehend the SQL they are emitting in the first place and potentially you end up writing custom SQL anyways.

show 1 reply
sigbottletoday at 4:51 PM

> Understanding algorithmic complexity (in particular, avoiding rework in loops), is useful in any language, and is sage advice.

I recently fixed a treesitter perf issue (for myself) in neovim by just dfsing down the parse tree instead of what most textobject plugins do, which is:

-> walk the entire tree for all subtrees that match this metadata

-> now you have a list of matching subtrees, iterate through said subtree nodes, and see which ones are "close" to your cursor.

But in neovim, when I type "daf", I usually just want to delete the function right under my cursor. So you can just implement the same algorithm by just... dfsing down the parse tree (which has line numbers embedded per nodes) and detecting the matches yourself.

In school, when I did competitive programming and TCS, these gains often came from super clever invariants that you would just sit there for hours, days, weeks, just mulling it over. Then suddenly realize how to do it more cleverly and the entire problem falls away (and a bunch of smart people praise you for being smart :D). This was not one of them - it was just, "go bypass the API and do it faster, but possibly less maintainably".

In industry, it's often trying to manage the tradeoff between readability, maintainability, etc. I'm very much happy to just use some dumb n^2 pattern for n <= 10 in some loop that I don't really care much about, rather than start pulling out some clever state manipulation that could lead to pretty "menial" issues such as:

- accidental mutable variables and duplicating / reusing them later in the code

- when I look back in a week, "What the hell am I doing here?"

- or just tricky logic in general

I only noticed the treesitter textobject issue because I genuinely started working with 1MB autogen C files at work. So... yeah...

I could go and bug the maintainers to expose a "query over text range* API (they only have query, and node text range separately, I believe. At least of the minimal research I have done; I haven't kept up to date with it). But now that ties into considerations far beyond myself - does this expose state in a way that isn't intuitive? Are we adding composable primitives or just ad hoc adding features into the library to make it faster because of the tighter coupling? etc. etc.

I used to think of all of that as just kind of "bs accidentals" and "why shouldn't we just be able to write the best algorithms possible". As a maintainer of some systems now... nah, the architectural design is sometimes more fun!

I may not have these super clever flashes of insight anymore but I feel like my horizons have broadened (though part of it is because GPT Pro started 1 shotting my favorite competitive programming problems circa late 2025 D: )

philipwhiuktoday at 4:29 PM

> external services (including the database)

Or even the local filesystem :)

CPU calls are cheap, memory is pretty cheap, disk is bad, spinning disk is very bad, network is 'good luck'.

You can O(pretty bad) most of the time as long as you stay within the right category of those.