This is as much an indictment of AWS compute as it is anything else.
When I teach, I use "big data" for data that won't fit on a single machine. "Small data" fits in a single machine's memory, and "medium data" on its disk.
Having said that, DuckDB is awesome. I recently ported a 20-year-old Python app to modern Python. I made the backend swappable between Polars and DuckDB. Got a 40-80x speed improvement. Took 2 days.
> The cloud instances have network-attached disks
Props for identifying the issue immediately, but armed with that knowledge, why not redo the benchmark on a different instance type that has local storage? E.g. why not try a `c8id.2xlarge` or `c8id.4xlarge` (which bracket the `c6a.4xlarge`'s cost)?
as a broke ecologist, this little computer can do everything I need in R and word and is a phenomenal build for the price. I'm really enjoying it thus far.
Would it not also work on a Raspberry Pi?
With I/O streaming and efficient transformation I do big data on my consumer PC and good old cheap HDDs just fine.
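The streaming idea boils down to processing one record at a time so memory stays flat no matter how big the file is. A minimal sketch (the CSV layout and `value` column are made up for illustration):

```python
import csv
import tempfile

def stream_sum(path, column):
    """Sum one column of a CSV, reading one row at a time so the
    whole file is never held in memory."""
    total = 0.0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total += float(row[column])
    return total

# tiny demo file standing in for the big-data case
demo = tempfile.NamedTemporaryFile("w", suffix=".csv",
                                   delete=False, newline="")
demo.write("id,value\n1,2.5\n2,7.5\n")
demo.close()
print(stream_sum(demo.name, "value"))  # prints 10.0
```

The same pattern works against an HDD: sequential reads are the one thing spinning disks are still good at.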
I think it’s relevant to first read [1] to see why they’re doing this. It’s basically done as a meme.
I adore DuckDB.
Did a PoC on an AWS Lambda for data that was gzipped in an S3 bucket.
It was able to replace about 400 C# LoC with about 10 lines.
Amazing little bit of kit.
I would have benchmarked with an instance that has local nvme, like c8gd.4xlarge.
The DuckDB team benchmarked with an r7i.16xlarge which uses EBS - that's the expected bottleneck. A fairer comparison would be an i4i or c8gd with local NVMe, where you'd likely see the laptop and cloud instance much closer in practice.
This is awesome.
I wish more companies would do showcases like this of what kind of load you can expect from commodity-ish hardware.
That's not Big Data. If you "need to process Big Data on the move" - what you need is a network.
You could get a laptop with an Nvidia GPU, 16 GB RAM, 512 GB SSD... or a 'cheap' MacBook.
I totally understand if you need to compile for iPhones. We need to make apps for the lower- and middle-class people who think a $40/mo cellphone is a status symbol. I get it.
But if you are not... why? I hate Windows, but we have Fedora... and you get an Nvidia GPU. Is it just a status symbol? And I have a hard time believing people who tell me stories about low power consumption, because no one cared about that until Apple pretended people did.
Funny, just yesterday I almost bought one but got cold feet and opted for a low-range MacBook with an M5 chip. The Apple sales rep was not convinced it would be enough when I described using it for vibecoding and deploying, so he kind of talked me out of getting the Neo. I normally use a mix of LLMs, then connect to GitHub and do a one-click deploy on CreateOS. Do you think I overreacted? The price of the Neo is SO attractive, a clean half price compared to what I got.
> compared to 3–5 GB/s
Their numbers are a bit outdated. M5 MacBook Pro SSDs are literally 5x that speed. It's wild.
For the TPC-DS results it would also have been nice to show how the MacBook Neo compares to the AWS instances.
Or am I missing something?
> TL;DR: How does the latest entry-level MacBook perform on database workloads? We benchmarked it to find out.
That's not a TL;DR, that's just a subheader.
That c8g.metal-48xl instance costs $7.63008/hour on demand[1], so for the price of the laptop, you could run queries on it for ~90 hours.
:shrug: as to whether that makes the laptop or the giant instance the better place to do one's work…
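The back-of-the-envelope math, assuming roughly a $700 laptop price (the thread doesn't state the exact figure):

```python
# Hours of c8g.metal-48xl on-demand time you could buy
# for the price of the laptop (price is an assumption).
instance_per_hour = 7.63008  # $/hour, from the comment above
laptop_price = 700           # assumed, not stated in the thread
hours = laptop_price / instance_per_hour
print(round(hours))  # ~92, i.e. roughly the ~90 hours cited
```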
I'm interested in one (not for big data), but only 8 GB of RAM is kinda sad.
My good old LG Gram (from 2017? 2015? I don't even remember) already had 24 GB of RAM. That was 10 years ago.
A decade later I can't see myself using a laptop with a third of that memory.
This has a phone-class CPU and memory.
other test:
2025-09-08 : "Big Data on the Move: DuckDB on the Framework Laptop 13"
"TL;DR: We put DuckDB through its paces on a 12-core ultrabook with 128 GB RAM, running TPC-H queries up to SF10,000."
https://duckdb.org/2025/09/08/duckdb-on-the-framework-laptop...
Cue the endless blog posts about running tech on the potato MacBook and being stunned it's functional with massive trade-offs. Groundbreaking stuff.
Mind blown: if you need to handle "big" data on the move, the MacBook Neo is not the right choice. Who would have guessed that outcome?
That's an awesome way to end up with a bricked MacBook Neo really fast, because those idiots soldered the SSD in.
Seems completely unnecessary; there is probably zero overlap between people who buy a cheap MacBook and people running DuckDB locally.
>Can I expect good performance from the MacBook Neo with Slack, Microsoft Office, and Google Chrome signed into Atlassian and a CRM, all running simultaneously?
No.
>Do I reject a world where all of the above is necessary to realize value from an entry-level MacBook?
In theory, yes.
I’ve been tempted to buy one and do “real dev work” on it just to show people it’s not this handicapped little machine.
I built multiple iOS apps and went through two startup acquisitions with my M1 MBA as my primary computer, as a developer. And the Neo is better than the M1 MBA. I edited my 30-45 minute 4K race videos in FCP on that Air just fine.