Meta Segment Anything Model Audio

169 points • by megaman821 • last Tuesday at 6:26 PM • 23 comments

Comments

This is incredible! I wouldn't have thought it was possible to cleanly separate tracks like that. I wonder to what extent the model is filling in gaps, akin to Samsung's "ultra zoom" moon.

hbn • yesterday at 8:48 PM

I hope we keep making progress in isolating tracks in music. I love listening to stems of my favorite songs; I find all sorts of neat parts I missed out on. Listening to isolated harmonies is cool too.

kace91 • today at 12:43 AM

Funny that:

- This feature is awesome for sample-based music

- Sample music is not what it was due to difficulties related to legal rights

- This model was probably created by not giving a damn about said rights

ortusdux • yesterday at 8:03 PM

Would be great for the hearing impaired and CAPD sufferers when combined with Meta glasses or the like.

locusofself • yesterday at 10:45 PM

As someone recording myself playing music, I've been meaning to see if any of these tools are good enough yet to not only separate vocals from another instrument (acoustic guitar for example), but do so without any loss of fidelity (or at least not a perceivable one).

The reason I'm interested in this is that recording with multiple microphones (one on guitar, one on the vocal) has its own set of problems with phase relationship and bleed between the microphones, which causes issues when mixing.

Being able to capture a singing guitarist with a single microphone placed in just the right spot, but still being able to process the tracks individually (with EQ, compression, reverb, etc.) could be really helpful.

tasty_freeze • today at 12:15 AM

I use moises frequently for track separation for learning songs. It does pretty dang well. I was shocked that the score of moises is ranked way worse than just about everything else, including lalal.ai, which I also used before buying moises. Perhaps lalal.ai has gotten better since I last tried it.

Escapade5160 • today at 12:52 AM

From my brief testing in the playground, it is not very good. Maybe it needs better prompting than the one-word examples.

cyberax • today at 12:39 AM

Can this be used to nuke the laugh tracks?!?

htrp • last Tuesday at 10:46 PM

Super amazing demo performance, being able to separate out music, voice, and background noises. Do you have to explicitly specify what type of noise to separate?

mwmisner • yesterday at 8:51 PM

Playing with the background, I tried to isolate just the espresso machine and the train sounds in one of their demos and it seemed to fail. Maybe not the desired use case, but I thought it was odd that I could break it so easily on the sample material.

nmstoker • today at 2:03 AM

Would be interesting to leverage the non-spoken/environmental noises to guide what level of detail and style of speech a chatbot replies with, for instance being more casual and gentle, with a touch more detail, if in a quiet home/office environment, but more curt and concise with emphasized diction if the person is traveling, such as in a noisy train concourse. People tend to do that subconsciously, but bots ignorantly wittering on can be annoying and hard to use because they miss the cues.

almosthere • yesterday at 9:31 PM

mSAMA haha, get it

emsign • last Wednesday at 6:36 AM

[flagged]
