This is incredible. I once assembled a collection of 100,000 tracks for research on exploring large music libraries, which was essentially vector search. I was limited to the storage and processing power of a single machine.
If I were to do it today, I could get so much further with hyperscaler products and this dataset.