Cohen Courses:Learning Indian Classical Using Sequential Models

Revision as of 13:14, 12 October 2011 by Dkulkarn (talk | contribs)

Team Members

Project Idea

Indian classical music is highly structured when it comes to melody. A composition (generally) stays within the constraints of a raga, which has a specific grammar that lends the composition its emotions. This structure makes sequential models an interesting fit for note prediction and raga classification.

Problem Statement


A pakad is a string of notes characteristic of a raga, to which a musician frequently returns while improvising in a performance. A pakad has the potential to illustrate the grammar and aesthetics of a raga. For example, consider raga Bageshree. Its pakad is F G A F D# D C. It can be rendered in various ways:

  • F G A F D# F D C
  • F G A F G D# F D C
  • F G A , D# F D C

The following are valid sequences in Bageshree, but they are not pakads:

  • F G A G F D# D C
  • F A G F D# F D C

Since a pakad enforces a raga, the objective is to identify pakads in a sequence of notes.

Questions from William:

  • Don't you need duration and stress as well as the notes?

> Yes. These are additional features. Since I'm using MIDI files, I have the stress (velocity) and the duration of the notes (which will be preserved in the annotation). But the baseline doesn't need them.

  • What does the comma mean?

> It means a stop.

  • How do you plan to encode this? as a BIO labeling for notes?

> Right now, I'm planning to use attr/non-attr labels for each note. I haven't figured out what difference BIO would make.

  • How hard is it to go from MIDI to a sequence of notes (maybe with stress and duration, if you need that)?

> I have the code in place for that. It identifies the note, the duration, and the stress.

Sorry for all the questions - it's partly my unfamiliarity with the domain... --Wcohen 14:47, 11 October 2011 (UTC)
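The two labeling schemes discussed above can be compared on a toy example. The following is only a sketch under stated assumptions: the helper `label_notes`, the span format (half-open index pairs), and the example phrase (two renditions of the Bageshree pakad played back to back) are all hypothetical and not part of the project's code.

```python
from typing import List, Tuple

def label_notes(n_notes: int, spans: List[Tuple[int, int]]):
    """Label each note given pakad spans as half-open (start, end) index pairs.

    Spans are assumed not to overlap. Returns two parallel label lists:
    attr/non-attr labels and BIO labels.
    """
    attr = ["non-attr"] * n_notes
    bio = ["O"] * n_notes
    for start, end in spans:
        for i in range(start, end):
            attr[i] = "attr"
            # B marks the first note of a pakad, I marks the remaining notes
            bio[i] = "B" if i == start else "I"
    return attr, bio

# Hypothetical phrase: two pakad renditions in a row, with no notes between them.
notes = "F G A F D# F D C F G A F G D# F D C".split()
spans = [(0, 8), (8, 17)]  # assumed manual annotation
attr, bio = label_notes(len(notes), spans)

for note, a, b in zip(notes, attr, bio):
    print(f"{note:3} {a:9} {b}")
```

In this example every note receives the label attr, so the boundary between the two renditions is not recoverable from the attr/non-attr labels alone, whereas the BIO labels mark it with a second B. If adjacent pakads never occur in the annotated data, the two schemes carry the same information.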


In [2], the pakad matching was done using δ-Occurrence with α-Bounded Gaps. This, however, fails for the two sequences displayed above.
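To make the gapped-matching baseline concrete, here is a minimal sketch of one common reading of it: each pakad note must match a text note whose pitch differs by at most `delta` semitones, and at most `alpha` text notes may be skipped between consecutive matched notes. The function names, the pitch-class encoding that ignores octaves, and the choice to drop the stop symbol are assumptions made for illustration; this is not the implementation used in [2].

```python
SEMITONE = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
            "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def to_pitches(notes: str):
    """Map note names to semitone numbers; the stop symbol ',' is dropped."""
    return [SEMITONE[tok] for tok in notes.split() if tok != ","]

def gapped_match_ends(pattern, text, delta=0, alpha=1):
    """Return the text indices at which a gapped match of `pattern` ends.

    reach[i] is True when the pattern prefix processed so far can be matched
    with its last note aligned to text[i].
    """
    if not pattern:
        return []
    n = len(text)
    reach = [abs(pattern[0] - x) <= delta for x in text]
    for p in pattern[1:]:
        nxt = [False] * n
        for i in range(n):
            if abs(p - text[i]) <= delta:
                # previous pattern note must sit at most alpha + 1 positions back
                lo = max(0, i - alpha - 1)
                nxt[i] = any(reach[lo:i])
        reach = nxt
    return [i for i, ok in enumerate(reach) if ok]

pakad = to_pitches("F G A F D# D C")
rendition = to_pitches("F G A F D# F D C")   # a rendition of the pakad
non_pakad = to_pitches("F G A G F D# D C")   # valid in Bageshree, not a pakad

print(gapped_match_ends(pakad, rendition, delta=0, alpha=1))
print(gapped_match_ends(pakad, non_pakad, delta=0, alpha=1))
```

Under these assumptions, with `delta=0` and `alpha=1`, the sketch accepts the rendition but also accepts the non-pakad sequence F G A G F D# D C, since skipping the extra G leaves the pakad intact. This is consistent with the limitation noted above, although the exact parameter settings used in [2] are not given here.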

Grand Idea

The grand idea is to view this task as a sequence alignment problem. There has been considerable work on alignment in machine translation; the challenge would be to adapt that work to this task.

Question from William: i.e., would you be learning a similarity metric, or constructing alignments between a MIDI file and some designated prototypes? Please explain in more detail what the inputs and outputs of the system would be. --Wcohen 14:43, 11 October 2011 (UTC)

> I'm constructing alignments between a MIDI file and designated prototypes.
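As an illustration of what aligning a note sequence against a designated prototype could look like, the sketch below uses Smith-Waterman local alignment. This is a standard technique swapped in for illustration only; the proposal does not say which alignment method will be used. The input is a pakad prototype and a note sequence (as would be extracted from a MIDI file), and the output is a score plus the aligned index pairs. The function name `local_align` and the scoring values are arbitrary assumptions.

```python
def local_align(prototype, notes, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment of a prototype against a note sequence.

    Returns (best_score, pairs), where pairs lists the
    (prototype_index, note_index) columns on the best local alignment.
    """
    m, n = len(prototype), len(notes)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if prototype[i - 1] == notes[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align the two notes
                          H[i - 1][j] + gap,     # skip a prototype note
                          H[i][j - 1] + gap)     # skip a performed note
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if prototype[i - 1] == notes[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs

prototype = "F G A F D# D C".split()
# Hypothetical phrase: a pakad rendition preceded by two other notes.
notes = "C D F G A F D# F D C".split()

score, pairs = local_align(prototype, notes)
start, end = pairs[0][1], pairs[-1][1]
print(score, notes[start:end + 1])
```

A threshold on the score would turn the alignment into a pakad detector, and duration or velocity could enter through the scoring function. Note that a purely note-level score like this one would rate the non-pakad sequences listed earlier about as highly as genuine renditions, so the scoring is where most of the adaptation effort would likely go.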


There are midi files available at These will be manually annotated for pakads.

Questions from William:

  • how long will it take to do the annotation (do you have a clear idea yet)? It seems like this might be a hard annotation task, since you're labeling subsequences of the song rather than just adding labels to a complete song.

> I plan to complete the annotations by this weekend. I'm not doing inter-annotator agreement to start with.

  • What will be your baseline method? I see the related work, but I don't know if that is a difficult thing to re-implement or not. Is there some sort of off-the-shelf learning method that can be used?

> I want to compare it with the existing technique to demonstrate that using sequence alignment makes sense.

--Wcohen 14:42, 11 October 2011 (UTC)




3. C. S. Iliopoulos and M. Kurokawa, "String Matching with Gaps for Musical Melodic Recognition," Proc. Prague Stringology Conference, pp. 55-64, 2002.