Sunday, October 26, 2008

Acoustic fingerprinting and sound origins

Going through the millions of sound and image files on YTMND has had me thinking a lot about the meta-data we could be pulling out of files and displaying for everyone to see. The obvious plan of action would be to grab ID3 and EXIF info out of uploaded files and put it in the database somewhere. One issue is that the majority of uploaded files are edited, chopped, shopped, etc., so most of the data (if there is any) would be useless or only partially relevant.

I spent some time looking into fuzzy image-matching algorithms for the dupecheck system I designed a while back, but there didn't seem to be any open source implementation available that was more than extremely simple proof-of-concept code. It was clear from the research I did that implementing anything similar myself was beyond both my current math education and the time I had available.

Recently I've found the Shazam application on the iPhone to be quite helpful. To summarize, you just hold your phone near a source of music, it records for around 8 seconds and then computes a hash somehow and sends it off to their servers where it tries to find a match. I've found the success rate to be fairly good, maybe 70% of the time (usually the more mainstream stuff).
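I don't know how Shazam actually computes its hash, so the following is only a toy sketch of the general idea of acoustic fingerprinting, not their method. It assumes the clip is already a list of mono samples, finds the loudest of a few hand-picked frequency bands in each short frame, and hashes pairs of consecutive peaks. The function names, the band list, and the frame size are all made up for illustration.

```python
import hashlib
import math


def goertzel_power(frame, freq, sample_rate):
    """Power of a single frequency within a frame (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def fingerprint(samples, sample_rate=8000, frame_size=400,
                bands=(300, 600, 900, 1200, 1500, 1800)):
    """Return a set of short hashes describing a clip.

    Assumption: the bands and frame size are arbitrary illustration values.
    """
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        powers = [goertzel_power(frame, f, sample_rate) for f in bands]
        # Keep only the index of the loudest band, so volume does not matter.
        peaks.append(powers.index(max(powers)))
    hashes = set()
    for a, b in zip(peaks, peaks[1:]):
        # Hash each pair of consecutive peaks.
        hashes.add(hashlib.sha1(f"{a}:{b}".encode()).hexdigest()[:10])
    return hashes


def similarity(fp_a, fp_b):
    """Jaccard overlap between two fingerprints, from 0.0 to 1.0."""
    if not fp_a or not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def tone(freq, seconds, sample_rate=8000, volume=1.0):
    """Generate a sine tone, used here as stand-in test audio."""
    count = int(seconds * sample_rate)
    return [volume * math.sin(2.0 * math.pi * freq * i / sample_rate)
            for i in range(count)]


clip = tone(600, 0.5) + tone(1500, 0.5)
quieter_copy = tone(600, 0.5, volume=0.3) + tone(1500, 0.5, volume=0.3)
different = tone(900, 0.5) + tone(300, 0.5)

print(similarity(fingerprint(clip), fingerprint(quieter_copy)))
print(similarity(fingerprint(clip), fingerprint(different)))
```

A real system would have to survive noise, chopping and re-encoding, which this sketch makes no attempt at; the server side would then be a lookup of these hashes against a big table of known songs.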

There is also an open project called MusicBrainz, which aims to be an open-source CD lookup/identification system. They've created some nice programs that use third-party acoustic fingerprinting, among other methods, to identify MP3 files and to add or correct their meta-data. The most recent system they've started using for acoustic fingerprinting is MusicDNS, which is proprietary but seems fairly young and approachable.

I was thinking about how a system like this could be beneficial to YTMND. Once I am feeling less pressured to get important things done, I'd like to try the SDK for one of these services on some YTMND sound clips. At this point, many users go out of their way to put incorrect info in the origin fields, which I find disappointing. YTMND has been a great way to find new artists and music, so it would be nice to have this information filled out for you automatically.

I would be interested to see if anyone else has played with this sort of technology on non-standard music clips such as "mashups", or on artists like Girl Talk who use multiple clips of other songs in rapid succession. These days it is extremely common for music to use samples from other songs made over the last century. I imagine it would be incredibly hard right now to accurately identify the source of every chunk used in a song or loop, but it would be pretty amazing to have some sort of tree-view that showed songs, which songs they sampled from, and so on.
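The tree-view part, at least, is the easy bit once the sample data exists. Here is a rough sketch of how it might be stored and printed; the Track class, the render_tree function, and the placeholder titles are all hypothetical, and nothing like this exists on YTMND.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """A song or loop, plus the tracks it samples from (hypothetical model)."""
    title: str
    samples: list = field(default_factory=list)


def render_tree(track, depth=0, seen=None):
    """Return indented lines showing a track and everything it samples."""
    seen = set() if seen is None else seen
    line = "  " * depth + track.title
    if id(track) in seen:
        # Shared or circular samples are listed once and not expanded again.
        return [line + " (already shown)"]
    seen.add(id(track))
    lines = [line]
    for source in track.samples:
        lines.extend(render_tree(source, depth + 1, seen))
    return lines


# Placeholder titles only; these are not real sample credits.
drum_break = Track("Drum break")
vocal_hook = Track("Vocal hook", [drum_break])
mashup = Track("Mashup", [vocal_hook, drum_break])

print("\n".join(render_tree(mashup)))
```

The hard part remains filling in the samples list automatically, which is where the fingerprinting would have to do the heavy lifting.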

Anyway, still waiting on that replacement power supply. Thanks to the IRS I don't currently have the monetary freedom to order a third replacement, so I am patiently harassing this company to ship the correct item.