Swarthmore College Department of Computer Science

Talk by Doug Turnbull, University of California, San Diego

Discover the Music You Want: Building a Music Search Engine Using Audio Content and Social Context
Monday, March 3, 2008
3 pm in the main CS lab, Room 240
Cookies at 2:45!

Technology is changing the way in which music is produced, distributed and experienced. An aspiring musician in West Africa with a desktop computer, a microphone, and free audio editing software can record and produce reasonably high-quality music. She can post her songs on any number of musically oriented social networks (e.g., MySpace, Last.fm, eMusic), making them accessible to the public. A music consumer in Swarthmore can then rapidly download her songs over a high-bandwidth Internet connection and store them on a 160-gigabyte personal MP3 player. As a result, millions of songs are now instantly available to billions of people. This 'Age of Music Proliferation' has created the need for novel music search technologies that move beyond the 'query-by-artist-name' or 'browse-by-genre' paradigms.

In this talk, I will present a computer audition system that can both annotate novel audio tracks with semantically meaningful tags (i.e., short text-based tokens) and retrieve relevant tracks from a database of unlabeled audio content given a text-based query. I consider the related tasks of content-based audio annotation and retrieval as one supervised learning problem in which we model the joint probability of acoustic features and tags. The quality of the music annotations produced by the system is comparable with the performance of humans on the same task. In addition, this 'query-by-description' system can retrieve appropriate songs for a large number of musically relevant tags. I also show that the system is general by learning a model that can annotate and retrieve sound effects.
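As a rough illustration of how a single probabilistic model can serve both annotation and retrieval, here is a minimal sketch. It is not the system presented in the talk: it assumes one diagonal Gaussian per tag, a uniform prior over tags, and made-up two-dimensional "acoustic features". All function names, tag names, and data are hypothetical.

```python
import math
from collections import defaultdict

def fit_tag_models(tracks):
    """tracks: list of (frames, tags). Fit one diagonal Gaussian per tag
    (an assumption for this sketch) on the pooled frames of its tracks."""
    pooled = defaultdict(list)
    for frames, tags in tracks:
        for tag in tags:
            pooled[tag].extend(frames)
    models = {}
    for tag, frames in pooled.items():
        dim = len(frames[0])
        mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
        # Small constant keeps the variance strictly positive.
        var = [sum((f[d] - mean[d]) ** 2 for f in frames) / len(frames) + 1e-3
               for d in range(dim)]
        models[tag] = (mean, var)
    return models

def log_likelihood(frames, model):
    """Average per-frame log p(features | tag) under the tag's Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def tag_posteriors(frames, models):
    """P(tag | audio) via Bayes' rule, assuming a uniform prior over tags."""
    scores = {t: log_likelihood(frames, m) for t, m in models.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(weights.values())
    return {t: w / z for t, w in weights.items()}

def annotate(frames, models, k=1):
    """Annotation: return the k most probable tags for an unlabeled track."""
    post = tag_posteriors(frames, models)
    return sorted(post, key=post.get, reverse=True)[:k]

def retrieve(query_tag, library, models):
    """Retrieval: rank unlabeled tracks by the posterior of the query tag."""
    return sorted(library,
                  key=lambda name: tag_posteriors(library[name], models)[query_tag],
                  reverse=True)

# Toy, made-up data: each frame is a 2-D feature vector.
training = [
    ([[0.9, 0.1], [0.8, 0.2]], ["mellow"]),
    ([[0.1, 0.9], [0.2, 0.8]], ["aggressive"]),
]
library = {"track_a": [[0.85, 0.15]], "track_b": [[0.15, 0.85]]}

models = fit_tag_models(training)
print(annotate(library["track_a"], models))
print(retrieve("aggressive", library, models))
```

Both tasks reuse the same per-tag models: annotation ranks tags for a fixed track, while retrieval ranks tracks for a fixed tag. A real system would use richer features and more expressive distributions than this toy setup.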

Lastly, I will discuss four alternative approaches to collecting music tags: conducting surveys, harvesting tags from social networks, mining web documents, and deploying annotation games. These tags provide social context and can be both directly useful for indexing music and indirectly useful as training data for the computer audition system. The combined set of 'social context tags' and 'audio content tags' makes up the backbone of our music search engine.