I’ll plug my series of project ideas that have also been discussed here on HN over the years: Challenging programming projects every programmer should try
This is from codecrafters.io, a platform that facilitates working on projects like these, essentially providing integration tests to keep you honest, as well as some community. You work through well-defined requirements to reach the full implementation. I’m currently working on their Build Your Own Redis project. It’s quite fun.
I don’t think this is AI-generated. They ask the community for new project ideas; this list is probably made up of those they’ve received, alongside plugs for the challenges they’ve already implemented.
Highly recommend writing a BitTorrent client. The spec is easy to grok, it has a bunch of fun subproblems that you can go as deep or as shallow as you want into, and it's super rewarding being able to download something like a Debian ISO after all of your hard work. Magnet links and seeding are two fun things to tackle after the basic implementation. It also got me really interested in peer-to-peer systems and DHTs like Chord!
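One of the first subproblems the spec hands you is bencode, the serialization format used by .torrent files and tracker responses. It has only four types: integers ("i42e"), byte strings ("4:spam"), lists ("l...e"), and dicts ("d...e"). A minimal decoder is only a few dozen lines; the sketch below uses made-up helper names, not the API of any particular client:

```python
# Minimal bencode decoder (a sketch; function names are hypothetical).
# Bencode types: i<int>e, <len>:<bytes>, l<items>e, d<key><val>...e

def bdecode(data: bytes):
    value, rest = _decode(data)
    if rest:
        raise ValueError("trailing data after bencoded value")
    return value

def _decode(data: bytes):
    if data[0:1] == b"i":                      # integer: i<digits>e
        end = data.index(b"e")
        return int(data[1:end]), data[end + 1:]
    if data[0:1] == b"l":                      # list: l<items>e
        items, rest = [], data[1:]
        while rest[0:1] != b"e":
            item, rest = _decode(rest)
            items.append(item)
        return items, rest[1:]
    if data[0:1] == b"d":                      # dict: d<key><val>...e
        result, rest = {}, data[1:]
        while rest[0:1] != b"e":
            key, rest = _decode(rest)
            val, rest = _decode(rest)
            result[key] = val
        return result, rest[1:]
    if data[0:1].isdigit():                    # string: <len>:<bytes>
        colon = data.index(b":")
        length = int(data[:colon])
        start = colon + 1
        return data[start:start + length], data[start + length:]
    raise ValueError("invalid bencode prefix")
```

From there the natural next steps are bencoding (the inverse), hashing the info dict for the tracker announce, and the peer wire protocol.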
This is a strange list. #58 is make your own malloc, OK. That's a moderately difficult project for a new developer (harder still if they don't know what malloc actually does under the hood; you may need to study up on operating systems and a few other things before you even start). It's followed by #59, where they suggest you build your own streaming protocol from scratch...
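For anyone wondering what "under the hood" means here: the core of a simple malloc is free-list bookkeeping over one big region of memory. A real implementation is C over memory obtained from sbrk()/mmap(); the sketch below just simulates the first-fit idea in Python over a fixed bytearray, with made-up names:

```python
# Toy first-fit allocator: hands out offsets into one fixed "heap"
# and tracks (offset, size) holes in a sorted free list.
# A simulation of the bookkeeping only, not a real malloc.

class ToyHeap:
    def __init__(self, size=1024):
        self.heap = bytearray(size)
        self.free_list = [(0, size)]   # (offset, size) holes, sorted
        self.allocated = {}            # offset -> size

    def malloc(self, size):
        for i, (off, hole) in enumerate(self.free_list):
            if hole >= size:           # first fit wins
                if hole == size:
                    self.free_list.pop(i)
                else:                  # shrink the hole from the front
                    self.free_list[i] = (off + size, hole - size)
                self.allocated[off] = size
                return off
        return None                    # out of memory

    def free(self, off):
        size = self.allocated.pop(off)
        self.free_list.append((off, size))
        self.free_list.sort()
        # coalesce adjacent holes so the heap doesn't fragment forever
        merged = [self.free_list[0]]
        for o, s in self.free_list[1:]:
            mo, ms = merged[-1]
            if mo + ms == o:
                merged[-1] = (mo, ms + s)
            else:
                merged.append((o, s))
        self.free_list = merged
```

Porting exactly this logic to C, with headers stored inline before each block, is basically the #58 project; alignment, thread safety, and mmap for large allocations are where the real difficulty lives.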
There are some good projects in there, but the levels of difficulty are all over the place.
I'm curious about HN's opinions on 4chan's /g/ programming challenges. IMO the difficulty ratings feel a bit arbitrary - for example, claiming that a basic bootloader is harder than a C compiler.
https://camo.githubusercontent.com/a4ce28d9d68f8d5443aef3123...
Build something intentionally small and complete: a tiny tool or protocol you can understand end-to-end. The satisfaction comes from clarity, constraints, and finishing the whole arc, not from scale.
Some of these could take a day, like random tree / forest.
Others are easily within the scope/size of an undergrad final project, or even a master's thesis.
Looking through this list makes me feel as if I am not a terribly good programmer, as these all feel well beyond my capabilities.
This reads a bit like the build-your-own-x list:
https://github.com/codecrafters-io/build-your-own-x
Feels like one of those things that gets a lot of talk but very little doing...
I'm currently working through the second Ray Tracing in One Weekend book. Fun stuff.
I think the most reliable way to understand a system is to directly implement the internals of a library.
In particular, hands-on experience with networks and file systems is incredibly helpful when writing high-level code.
AI usage verboten (forbidden)? Or erlaubt (allowed)?
This list seems almost certainly AI-generated.
I asked Gemini 3 Pro about the relative difficulty of each project in the list and got the following (parenthesized notes are also Gemini's). Gemini noted that the time estimates assume you already understand the theory (the estimates would vary wildly otherwise) and only cover pure PoC implementation and debugging. The numbers look reasonable at a quick glance, but of course YMMV.
[Difficulty: Low]
42. Twitter Trends 5--10h (If you understand the probabilistic math)
2. Wordle Solver 5--10h (Pure logic/algorithm)
17. BMP Codec 5--10h
23. Auth Server (JWT) 5--10h
24. Autocomplete System 5--10h
66. Browser Extension 5--15h
15. Diff Tool 8--15h (Algorithms heavy)
9. Six Degrees of Kevin Bacon 10--20h (Classic graph problem)
7. Googlebot (Crawler) 10--20h
65. Make 10--20h
[Difficulty: Moderate]
32. Web Server 10--20h
41. Time Sync Daemon (NTP) 10--20h
53. Malware 10--20h
58. Malloc 10--20h
63. Shell 10--20h
19. Quantum Computer Simulation 15--25h (Assuming you know the linear algebra already)
26. Background Noise Remover 15--25h (Math/Signal Processing heavy)
11. Procedural Crosswords 15--25h
39. CDN Caching 15--25h
47. Ray Tracer 15--25h
57. Load Balancer 15--25h
61. CI System 15--25h
62. Random Forest 15--25h
67. Stock Trading Bot 15--25h
56. Lock-Free Data Structures 15--30h (But debugging is painful)
16. Visualize Object-Oriented Code 15--30h (Language parsing is the bottleneck)
5. Container (No Docker) 15--30h (Requires deep Linux systems knowledge)
8. DNS Server 15--30h (Strict RFC compliance required)
70. OpenGL 15--30h
12. Bitcask (KV Store) 20--30h
38. Wikipedia Search 20--30h
50. Amazon Delivery (Vehicle Routing) 20--30h
46. Zip 20--35h (Algorithms heavy)
1. Bittorrent Client 20--40h (Binary parsing and managing async network states)
18. Filesystem (FUSE) 20--40h (Debugging kernel interfaces can be slow)
60. Smart Home 20--40h (Hardware integration eats time)
40. TikTok (Feed) 20--40h (Mostly frontend/UI state complexity)
21. Redis Clone 20--40h
29. Road Network 20--40h
31. Evolutionary Design 20--40h
34. Git 20--40h
59. Netflix (Streaming) 20--40h
69. Automated Journal 20--40h
13. Audio Fingerprinting 25--40h (DSP is sensitive to parameters)
52. Knowledge Graph 25--45h
64. Bitcoin Node 25--45h
14. Dangerous Dave (Game) 30--50h
48. Programming Language 30--50h
[Difficulty: High]
33. Depth Estimation 25--50h (Computer Vision math)
35. GDB (Debugger) 30--50h (Low-level systems programming)
72. Audio Multicast 30--50h (Syncing audio clocks over network is hard)
43. SQL Optimizer 30--50h
36. Neural Networks 30--60h (Debugging gradient calculations is tough)
71. Laser Tag 30--60h (Hardware debugging)
3. Deepfake (Optimal Transport) 30--60h (Math-heavy; debugging matrix operations is difficult)
51. Kafka Broker 30--60h
20. VLC (Video Player) 40--60h (A/V sync drift is very difficult to get right)
28. Google Maps 40--60h
30. Collaborative Editor 40--70h (CRDTs are conceptually dense)
37. Chess 40--70h (Performance optimization is a rabbit hole)
45. VPN 40--70h
27. Dropbox Clone 40--80h (Conflict resolution and sync logic are extremely error-prone)
4. Spreadsheet 40--80h (Cycle detection and UI state management are tricky)
10. RAFT 40--80h (Distributed systems are notoriously hard to debug due to race conditions)
68. Browser Engine 40--80h
73. Decentralized Internet 40--80h
49. Messenger 50--100h
22. Video Editor (Client-side) 50--80h (Browser constraints + heavy compute)
[Difficulty: Very High]
44. Anonymous Voting 40--80h (Cryptography is unforgiving)
6. Geometric Theorem Proving 50--100h+ (Essentially building a symbolic AI engine)
55. TCP/IP Stack 60--100h (TCP state machines are massive)
25. SQLite Clone 60--120h (A database engine combines almost every discipline of CS)
54. Game Boy Advance Emulator 80--150h (Requires extremely precise bit-twiddling and timing)
This list is ridiculous; I was expecting something like A* pathfinding, or even kernel extensions.
This is just AI-generated slop, with entries all over the map and no details or notes.
A far better way is to go through the book series The Architecture of Open Source Applications and pick one that catches your fancy - https://aosabook.org/en/ - there are enough details and notes there from experts to show you how to think about an application, so you have something concrete to start from.
I see comments suspecting this list is AI-generated. That might be true. But ironically, the practice of "building from scratch" is the best antidote to AI dependency.
Writing from Japan, we call this process "Shugyo" (austere training). A master carpenter spends years learning to sharpen tools, not because it's efficient, but to understand the nature of the steel.
Building your own Redis or Git isn't about the result (which AI can give you instantly). It is about the friction. That friction builds a mental model that no LLM can simulate.
Whether this post is marketing or not, the "Shugyo" itself is valid.