Hacker News

Backing up Spotify

1678 points by vitplister yesterday at 6:28 PM | 562 comments

Comments

lelouch9099 yesterday at 7:24 PM

How legal is this with regards to copyright laws?

krackers yesterday at 9:31 PM

New multimodal training set just dropped.

walthamstow yesterday at 11:25 PM

Very interesting that a white noise track for babies is the 4th most popular track on Spotify.

schmuckonwheels today at 12:34 AM

I want to time-travel back to 2000 like Old Biff with the sports almanac so I can tell Shawn Fanning to use the "it's for historical preservation" defense.

simmo9000 today at 1:21 PM

We need insane for culture to survive.

markstos today at 3:03 AM

> ≥70% of songs are ones almost no one ever listens to (stream count < 1000).

So much interesting but undiscovered music is out there!

thih9 today at 8:37 AM

This is conspiracy theory territory but I wonder if big tech is sponsoring efforts like this as an easy way to get training data.

romanovcode today at 12:51 PM

`spotdl download "https://open.spotify.com/user/{username}" --user-auth --output '{list-name}/{title} - {artists}.{output-ext}'`

This is literally all you need to back up Spotify.

shmerl today at 5:07 PM

Just buy music DRM-free in the first place.

ikamm yesterday at 9:08 PM

I really don't understand how focusing on source-quality files is supposed to be a "major issue" with the music preservation community. It's bizarre for them to talk about these being barriers to creating a "full archive of all music that humanity has ever produced" and have their answer be scraping Spotify to end up with a music library composed of many AI and bulk-produced songs at 75/160kbps.

littlecranky67 yesterday at 11:09 PM

For some reason, the link does not work for me (Spain). It works perfectly at the same time in Tor Browser.

shomp today at 6:20 AM

If only Spotify paid musicians their fair share

sneak yesterday at 11:04 PM

199GB, only metadata released for now.

Magnet link found here: https://annas-archive.li/torrents/spotify

Are magnet links allowed on HN?

artninja1988 yesterday at 7:28 PM

Wow. Anna is a godsend. Hopefully now we get some really good open source music models

msephton yesterday at 11:25 PM

Is this all regions? I'm assuming so but I can't be sure

zzzeek yesterday at 9:18 PM

great. Spotify just removes things all the time (things I actively listen to and work on for my jazz practices, one day just go "poof" because they didn't want to pay the record company anymore), and they are not as a company deserving of the role of "keeper of all the world's music". They don't give a shit and they'd vastly prefer we all listen to their AI generated royalty free crap and Joe Rogan.

gyrgtyn today at 1:42 AM

Is there a torrent client already that is good at partial downloads? I didn't realize how Popcorn Time worked until I read this thread.

rendaw today at 3:39 AM

Looking at the analysis, I'm totally surprised opera and psytrance are so prolific.

Psy-trance... I thought it was the same as any other electronic genres, but do people get high and just start shoveling psy-trance tracks out or something?

Opera I thought was a very strict discipline, needing rigorous somewhat esoteric training in order to produce the right sounds. How could there be so many opera artists?

I mean, I'm sure there's some misclassification, but chamber music is basically a couple people with any sort of music training on classical instruments so that doesn't surprise me nearly as much... I can easily imagine there being _lots_ of those, and you might come up with a different artist name for each unique set of people you collaborate with.

reactordev yesterday at 9:54 PM

Oh this is going to go over real well in Nashville, TN.

siquick yesterday at 9:26 PM

Is there a way to see the shape of the metadata?

dbacar today at 8:38 AM

Now, anyone with some decent info on signal processing and machine learning can build his/her own Shazam.

dmix yesterday at 10:44 PM

I hope they get the new lossless versions

peterburkimsher today at 12:56 PM

For a fully legal alternative for metadata archiving, I suggest the iTunes EPF (Enterprise Partner Feed). https://performance-partners.apple.com/epf

The best metadata I've found, though, is the MySpace Dragon Hoard: https://archive.org/details/myspace_dragon_hoard_2010

That included the artist location, allowing me to tag songs based on their country. I then created playlists such as "NERAS" (Non-English Rock Artist Sample), where the single most popular song for each artist was chosen, and only when the country of origin was not English-speaking and the genre was Rock. I like listening to music while working, but English lyrics distract me because I understand what they're saying.

After discovering music via the MySpace archive, I've since purchased 73 songs from 35 artists that I'd never heard of before digging into the data. I rebuilt my playlist on Spotify, but got greyed out tracks, and YouTube Music, but got "unavailable video". So I still prefer purchasing tracks via the iTunes Music Store, Qobuz, Bandcamp, and 7digital.

Other data sources such as the MP3.com rescue barge, PureVolume archive, and Anna's Spotify archive lack the country-of-origin metadata, so are of less interest to me. It may be possible to use an LLM to guess the language of each track title, but someone else will have to do that.

Meanwhile, if you're interested in the genre-by-country MySpace data, or have questions about the iTunes EPF, feel free to reach out and we can discuss your research.

827a yesterday at 8:52 PM

Holy crap. This is going to trigger a five-alarm fire at Spotify Engineering. This has got to be among the largest proprietary datasets ever unintentionally publicized by a company.

7ero today at 12:19 PM

free the music

m00dy today at 6:30 AM

Congrats! I’m sure the Spotify lawyers are gonna have some sleepless nights ahead.

snoozebutton today at 1:09 AM

is this not highly illegal?

nutjob2 yesterday at 8:26 PM

I wonder how definitive their collection is and how much ripping Google Music/YouTube would improve on this.

A distributed ripping project to do that would be a fine thing.

verisimi today at 9:04 AM

Yes, but do they have the one that goes like: to-to-to dotodoo? Hmmm? Do they?

zoklet-enjoyer yesterday at 7:34 PM

Wow. Now I just need some hard drives and a way to download that without my ISP doing something about it. That's amazing.

jimmydoe yesterday at 10:39 PM

[flagged]

pcblues today at 10:11 AM

[flagged]

lysace today at 12:07 AM

This reinforces my belief that this effort ("anna's...") is financially backed by Russia/Putin. The HN crowd probably won't see it though.

Think from a geopolitical perspective, not (just) a "copyright shouldn't exist" perspective. They claim "communism" as a motivation; Putin is looking to re-establish the Stalin Soviet Union.

basisword yesterday at 7:43 PM

Am I understanding this wrong? Ripping the metadata I'm fine with. But it sounds like they've ripped every song from Spotify and they're going to release them?

Edit: It seems like they are. Stealing from tens of thousands of artists, big and small, and calling it "preservation" or "archiving" is scummy.

1dry yesterday at 11:02 PM

Yuck. Just to make it easier to train slop machines. The point of art is not to have completionist archives of EVERYthing that’s ever been made! Let it die. Death is the most natural part of life. Art is about the human experience, not “for researchers”.

The point is human connection. Art is a living reflection and record of human experience. Art will persevere - the kinds of folks who prioritize what they like based on popularity were never the supporters artists (contrast with craftspeople trying to make a buck) counted on in the first place. Enjoy your derivative slop - we’ll continue on our imperfect, messy, individual, human artistic lives.

linhns today at 7:54 AM

Unlike books, which are massively overpriced, this will hurt artists a lot as they need the fees paid by Spotify to make ends meet.
