I've tried self-hosting with Navidrome [0] / Plex / Jellyfin, but the thing I miss most is music discovery via radios / Discover Weekly. I've tried replicating it a few times with embedding vectors + vector search, but at best it finds songs in the same (sub)genre, with the tempo / mood being pretty different.
Maybe I just need better data; I've been meaning to try again when that Spotify crawl by Anna's Archive gets released. So far I've just been using MusicBrainz [1] and YouTube. Model-wise I've tried off-the-shelf ones like [2] and [3], and training auto-encoders like VAEs / MAEs [4].
[0] - https://www.navidrome.org/
[1] - https://musicbrainz.org/
[2] - https://github.com/LAION-AI/CLAP
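For reference, the embedding + vector search approach described above boils down to something like the following. This is only a minimal sketch, assuming every track already has a fixed-size embedding from some audio model; the random vectors, the 512-dimensional size, and the helper name `top_k_similar` are placeholders made up for illustration, not part of any real library.

```python
import numpy as np


def top_k_similar(query_vec, track_vecs, k=5):
    """Return (indices, scores) of the k tracks most cosine-similar to query_vec."""
    # L2-normalize so that a plain dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    t = track_vecs / np.linalg.norm(track_vecs, axis=1, keepdims=True)
    scores = t @ q
    # Sort by descending similarity and keep the first k.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Placeholder data: 1000 hypothetical tracks with random 512-dim embeddings.
# In practice these would come from an audio embedding model.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 512))

seed = 42  # index of the track to build a "radio" from
idx, scores = top_k_similar(embeddings[seed], embeddings, k=6)

# The seed track is always its own nearest neighbour, so drop it.
radio = [int(i) for i in idx if i != seed][:5]
print(radio)
```

A brute-force scan like this is fine for a personal library; the ranking only reflects whatever the embedding captures, which is why results can share a genre while differing in tempo or mood.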
This is the main factor that makes me leery of going the full self-host route.
With the recent data leak of Spotify's entire playlist database, you might be able to build something considerably better for music discovery now!