Cohen Courses:Learning Indian Classical Using Sequential Models

Revision as of 13:14, 12 October 2011 by Dkulkarn (talk | contribs)

Team Members

Project Idea

Indian classical music is highly structured when it comes to melody. A composition (generally) stays within the constraints of a raga. Each raga has a specific grammar, which lends emotion to the composition. This structure makes note prediction and raga classification interesting applications of sequential models.

Problem Statement

Pakad

A pakad is a string of notes characteristic of a raga, to which a musician frequently returns while improvising in a performance. A pakad has the potential to illustrate the grammar and aesthetics of a raga. For example, consider raga Bageshree. Its pakad is F G A F D# D C, which can be rendered in various ways:

  • F G A F D# F D C
  • F G A F G D# F D C
  • F G A , D# F D C

The following are valid sequences in Bageshree, but they are not pakads:

  • F G A G F D# D C
  • F A G F D# F D C

Since a pakad establishes the raga, the objective is to identify pakads within a sequence of notes.

Questions from William:

  • Don't you need duration and stress as well as the notes?

> Yes. These are additional features. Since I'm using MIDI files, I do have the stress (velocity) and the duration of the notes (both will be preserved in the annotation). But the baseline doesn't need them.

  • What does the comma mean?

> It means a stop.

  • How do you plan to encode this? as a BIO labeling for notes?

> Right now, I'm planning to use attr/non-attr type labels for each note. I haven't figured out what difference BIO would make.

  • How hard is it to go from MIDI to a sequence of notes (maybe with stress and duration, if you need that)?

> I have the code in place for that. It identifies the note, the duration, and the stress.

Sorry for all the questions - it's partly my unfamiliarity with the domain... --Wcohen 14:47, 11 October 2011 (UTC)
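
On the BIO question, a small sketch may help show the difference between the two labeling schemes. It assumes pakad annotations are given as (start, end) index pairs with an exclusive end; the function names and the span format are hypothetical and not part of the project code. With attr/non-attr labels, two pakads that occur back to back merge into one run, whereas BIO marks where the second one begins.

```python
def binary_labels(n_notes, spans):
    """Label each note attr/non-attr from (start, end) spans; end is exclusive."""
    labels = ["non-attr"] * n_notes
    for start, end in spans:
        for i in range(start, end):
            labels[i] = "attr"
    return labels


def bio_labels(n_notes, spans):
    """Label each note B (begin), I (inside) or O (outside) from the same spans."""
    labels = ["O"] * n_notes
    for start, end in spans:
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
    return labels


# Hypothetical annotation: two adjacent spans in a six-note sequence.
spans = [(0, 3), (3, 5)]
print(binary_labels(6, spans))
print(bio_labels(6, spans))
```

In this toy case the binary labels give one unbroken run of five attr notes, while the BIO labels keep the boundary between the two spans.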

Baseline

In [2], the pakad matching was done using δ-Occurrence with α-Bounded Gaps. This, however, fails for the two sequences displayed above.
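
A minimal sketch of this kind of matching, under stated assumptions: notes are encoded as semitone offsets within one octave, a text note matches a pattern note if they differ by at most delta semitones, and at most alpha text notes may be skipped between consecutive matched notes. The encoding, the function names, and the parameter values are illustrative choices, not taken from [2].

```python
NOTE_TO_SEMITONE = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def encode(notes):
    """Turn a space-separated note string into semitone numbers (assumed encoding)."""
    return [NOTE_TO_SEMITONE[name] for name in notes.split()]


def has_gapped_occurrence(pattern, text, delta=0, alpha=1):
    """Return True if pattern occurs in text with pitch tolerance delta
    and at most alpha skipped text notes between consecutive matches."""
    if not pattern:
        return True
    n = len(text)
    # reachable[i]: the pattern prefix processed so far can end at text index i.
    reachable = [abs(text[i] - pattern[0]) <= delta for i in range(n)]
    for p in pattern[1:]:
        nxt = [False] * n
        for i in range(n):
            if abs(text[i] - p) <= delta:
                lo = max(0, i - alpha - 1)
                nxt[i] = any(reachable[lo:i])
        reachable = nxt
    return any(reachable)


pakad = encode("F G A F D# D C")
non_pakad = encode("F G A G F D# D C")
print(has_gapped_occurrence(pakad, non_pakad, delta=0, alpha=1))
```

With these assumed settings the non-pakad sequence F G A G F D# D C is accepted, because the extra G is simply skipped as a gap. This is one way such a matcher can produce the kind of failure noted above.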

Grand Idea

The grand idea is to view this task as a sequence alignment problem. There has been considerable work on alignment in machine translation. The challenge would be to adapt this work to note sequences.

Question from William: i.e., you would be learning a similarity metric? Or constructing alignments between a MIDI file and some designated prototypes? Please explain in more detail what the inputs and outputs of the system would be. --Wcohen 14:43, 11 October 2011 (UTC)

> I'm constructing alignments between a MIDI file and designated prototypes.
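
To make the inputs and outputs concrete, here is a rough sketch that scores a note sequence against a prototype pakad using a standard local alignment (Smith-Waterman style) dynamic program. This is a stand-in for illustration only: the proposal does not say which alignment method will be used, and the scoring values (match, mismatch, gap) and the function name are assumptions.

```python
def local_alignment_score(seq, proto, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two note lists (linear gap penalty)."""
    cols = len(proto) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(seq) + 1):
        cur = [0] * cols
        for j in range(1, cols):
            step = match if seq[i - 1] == proto[j - 1] else mismatch
            cur[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + step,  # align seq[i-1] with proto[j-1]
                prev[j] + gap,       # extra note in the performance
                cur[j - 1] + gap,    # prototype note left out
            )
            best = max(best, cur[j])
        prev = cur
    return best


prototype = "F G A F D# D C".split()
rendering = "F G A F G D# F D C".split()
print(local_alignment_score(prototype, prototype))
print(local_alignment_score(rendering, prototype))
```

Here the input is a note sequence extracted from a MIDI file plus a prototype, and the output is a similarity score. A fuller version would also return the aligned positions so that the matched notes could be labeled.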

Dataset

MIDI files are available at http://www.cse.iitk.ac.in/users/tvp/music/. These will be manually annotated for pakads.
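
As a rough illustration of turning MIDI data into an annotatable note sequence, the sketch below pairs note-on and note-off events into (start, pitch, velocity, duration) tuples. It assumes the events have already been read from the file into (time, kind, pitch, velocity) tuples with absolute times in ticks; that tuple format and the function name are hypothetical and are not the project's actual code. It relies on the MIDI convention that a note-on with velocity 0 acts as a note-off.

```python
def events_to_notes(events):
    """Pair note-on/note-off events into (start, pitch, velocity, duration) tuples."""
    active = {}  # pitch -> (start_time, velocity) of the note currently sounding
    notes = []
    for time, kind, pitch, velocity in events:
        if kind == "note_on" and velocity > 0:
            active[pitch] = (time, velocity)
        elif kind in ("note_off", "note_on") and pitch in active:
            # Reached for note_off, or for note_on with velocity 0.
            start, vel = active.pop(pitch)
            notes.append((start, pitch, vel, time - start))
    notes.sort()
    return notes


# Hypothetical events: MIDI pitch 65 is F and 67 is G above middle C (60).
events = [
    (0, "note_on", 65, 90),
    (480, "note_off", 65, 0),
    (480, "note_on", 67, 80),
    (960, "note_on", 67, 0),
]
print(events_to_notes(events))
```

Keeping velocity and duration alongside each pitch means the annotation can later expose stress and duration as features, even if the baseline uses only the pitches.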

Questions from William:

  • how long will it take to do the annotation (do you have a clear idea yet)? It seems like this might be a hard annotation task, since you're labeling subsequences of the song rather than just adding labels to a complete song.

> I plan to complete the annotations by this weekend. I'm not doing inter-annotator agreement to start with.

  • What will be your baseline method? I see the related work, but I don't know if that is a difficult thing to re-implement or not. Is there some sort of off-the-shelf learning method that can be used?

> I want to compare it with the existing technique to demonstrate that using sequence alignment makes sense.

--Wcohen 14:42, 11 October 2011 (UTC)

References

1. http://www.slideshare.net/butest/music-and-machine-learning

2. Tansen: A System for Automatic Raga Identification.

3. C. S. Iliopoulos and M. Kurokawa. "String Matching with Gaps for Musical Melodic Recognition." Proc. Prague Stringology Conference, pp. 55-64, 2002.