Antti Laaksonen defends his PhD thesis on Algorithms for Melody Search and Transcription on November 20th, 2015

M.Sc. Antti Laaksonen will defend his doctoral thesis Algorithms for Melody Search and Transcription on Friday the 20th of November 2015 at 12 o'clock in the University of Helsinki Exactum Building, Auditorium CK112 (Gustaf Hällströminkatu 2b). His opponent is Professor Pekka Kilpeläinen (University of Eastern Finland) and the custos is Professor Esko Ukkonen (University of Helsinki). The defence will be held in Finnish.

Algorithms for Melody Search and Transcription

This thesis studies two problems in music information retrieval: searching for a given melody in an audio database, and automatic melody transcription. In both problems, the representation of the melody is symbolic, i.e., the melody consists of onset times and pitches of musical notes.

In the first part of the thesis we present new algorithms for symbolic melody search. First, we present algorithms that work with a matrix representation of the audio data, which corresponds to the discrete Fourier transform. We formulate the melody search problem as a generalization of the classical maximum subarray problem. After this, we discuss algorithms that operate on a geometric representation of the audio data. In this case, the Fourier transform is converted into a set of points in the two-dimensional plane.
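
For readers unfamiliar with the classical maximum subarray problem mentioned above, the following is a minimal sketch of its standard one-dimensional form: given a list of numbers, find the contiguous run with the largest sum. It uses the well-known linear-time scan (Kadane's algorithm). The function name max_subarray and the sample input are illustrative assumptions only; this is not the generalized formulation developed in the thesis, which is not reproduced here.

```python
def max_subarray(values):
    """Return (best_sum, start, end) for the contiguous run with the largest sum.

    Classical one-dimensional problem only; start and end are inclusive indices.
    Illustrative sketch, not the generalization studied in the thesis.
    """
    if not values:
        raise ValueError("values must be non-empty")

    best_sum = current_sum = values[0]
    best_start = best_end = current_start = 0

    for i in range(1, len(values)):
        if current_sum < 0:
            # A negative running sum can only hurt; restart the run at i.
            current_sum = values[i]
            current_start = i
        else:
            # Otherwise extend the current run with values[i].
            current_sum += values[i]

        if current_sum > best_sum:
            best_sum = current_sum
            best_start, best_end = current_start, i

    return best_sum, best_start, best_end


if __name__ == "__main__":
    print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]))
```

The scan keeps only the best sum ending at the current position, which is the simplest instance of the dynamic programming style of algorithm the abstract refers to.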

The main contributions of the first part of the thesis lie in algorithm design. We present new efficient algorithms, most of which are based on dynamic programming optimization, i.e., calculating dynamic programming values more efficiently using appropriate data structures and algorithm design techniques. Finally, we experiment with the algorithms using real-world audio databases and melody queries; the experiments show that the algorithms can be used successfully in practice. Compared to previous melody search systems, the novelty in our approach is that the search can be performed directly in the Fourier transform of the audio data.

The second part of the thesis focuses on automatic melody transcription. As this problem is very difficult in its pure form, we ask whether using certain additional information would facilitate the transcription. We present two melody transcription systems that extract the main melodic line from an audio signal using additional information.

The first transcription system uses, as additional information, an initial transcription created by the human user of the system. It turns out that users without a musical background are able to provide the system with useful information about the melody, so that the transcription quality increases considerably. The second system takes a chord transcription as additional information, and produces a melody transcription that matches both the audio signal and the harmony given in the chord transcription. Our system is a proof of concept that the connection between melody and harmony can be used in automatic melody transcription.

Availability of the dissertation

An electronic version of the doctoral dissertation is available on the e-thesis site of the University of Helsinki at http://urn.fi/URN:ISBN:978-951-51-1702-1.

Printed copies will be available on request from Antti Laaksonen: tel. 02941 51160 or antti.h.s.laaksonen@cs.helsinki.fi.

20.11.2015 - 15:28 Pirjo Moen
04.11.2015 - 17:45 Pirjo Moen