Candid Code - MusicXML Exploration With Note-Seq

Note-Seq, built by Google, is a Python library for working with musical note sequences, including parsing MusicXML data. I believe it was built to accompany Google’s Magenta project, which is a “research project exploring the role of machine learning in music”.

MusicXML is an open format for “exchanging digital sheet music”.

In our exploration of a single sample MusicXML file, we’ll use note-seq (albeit a slightly forked version) to parse our data and pull out some basic information.

import os
import sys

# Make the local fork (assumed to live in ./note_seq_fork) importable
module_path = os.path.abspath('./note_seq_fork')
if module_path not in sys.path:
    sys.path.append(module_path)

import note_seq_fork.note_seq.musicxml_parser as parser

from PIL import Image
Image.open('./sample.png')

Now let’s take a look at the original data. There’s too much scaffolding to cover the entirety of the XML, but the raw data for a single measure looks like this:

<measure number="2" width="200.19">
  <harmony print-frame="no">
    <root>
      <root-step>C</root-step>
      </root>
    <kind text="7" use-symbols="yes">major-seventh</kind>
    </harmony>
  <note default-x="15.50" default-y="-10.00">
    <pitch>
      <step>D</step>
      <octave>5</octave>
      </pitch>
    <duration>1</duration>
    <voice>1</voice>
    <type>quarter</type>
    <stem>down</stem>
    <notehead color="#0000FF">normal</notehead>
    <lyric number="1" default-x="6.50" default-y="-44.95" relative-y="-30.00">
      <syllabic>single</syllabic>
      <text>Up</text>
    </lyric>
  </note>
</measure>
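To make the structure of that fragment concrete, here is a minimal sketch that reads a trimmed copy of the measure with Python's standard `xml.etree.ElementTree` instead of note-seq. The `summarize_measure` helper and the dictionary layout it returns are hypothetical names chosen for illustration, and the layout attributes (`default-x`, `width`, and so on) are dropped from the sample for brevity.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the measure shown above (layout attributes dropped).
MEASURE_XML = """
<measure number="2">
  <harmony>
    <root><root-step>C</root-step></root>
    <kind text="7">major-seventh</kind>
  </harmony>
  <note>
    <pitch><step>D</step><octave>5</octave></pitch>
    <duration>1</duration>
    <type>quarter</type>
    <lyric number="1"><syllabic>single</syllabic><text>Up</text></lyric>
  </note>
</measure>
"""


def summarize_measure(xml_text):
    """Hypothetical helper: collect chords, notes, and lyrics from one measure."""
    measure = ET.fromstring(xml_text)

    # Each <harmony> element carries a chord root and a chord kind.
    chords = [
        (harmony.findtext("root/root-step"), harmony.findtext("kind"))
        for harmony in measure.iter("harmony")
    ]

    notes = []
    for note in measure.iter("note"):
        step = note.findtext("pitch/step")      # None when the note is a rest
        octave = note.findtext("pitch/octave")
        notes.append({
            "pitch": f"{step}{octave}" if step else None,
            "type": note.findtext("type"),
            "lyric": note.findtext("lyric/text"),  # None when there is no lyric
        })

    return {"number": measure.get("number"), "chords": chords, "notes": notes}


summary = summarize_measure(MEASURE_XML)
print(summary)
```

This only covers the handful of tags in the sample. A real score also involves ties, accidentals, multiple voices, and tempo changes, which is the kind of bookkeeping a dedicated parser such as note-seq takes care of.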

There is a lot of information here, especially if you are new to musical notation. Let’s focus on three pieces of information:

Notes: The specific melody that is played (Measures -> Notes)
Chords: The harmony that supports the melody
Lyrics: The words that are sung

# Parse the sample document. The original was generated from MuseScore Version 3
source = parser.MusicXMLDocument('./sample.musicxml')

# We can fetch a single chord from the list of chords and render it in a human-readable format.
single_chord = source.get_chord_symbols()[0]
single_chord.get_figure_string()

'Cmaj7'

Building an array of chords

# All of the chords
for chord in source.get_chord_symbols():
  print(chord.get_figure_string())

Cmaj7
Cmaj7
Dm7b5
G7(#9)(#5)
Cmaj7
Cmaj7
Dm7b5
G7(#9)(#5)
Cmaj7
Cmaj7
Dm7b5
G7(#9)(#5)
Cmaj7
Cmaj7
Dm7b5
G7(#9)(#5)
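The printed list looks like one short progression played several times. As a rough sketch, the plain-Python snippet below checks that by finding the shortest block of chords that repeats to rebuild the whole list. The chord names are copied from the output above, and `shortest_repeating_unit` is a hypothetical helper, not part of note-seq.

```python
from collections import Counter

# Chord figures copied from the printed output above.
chords = ["Cmaj7", "Cmaj7", "Dm7b5", "G7(#9)(#5)"] * 4


def shortest_repeating_unit(items):
    """Return the shortest prefix that, repeated, reproduces the whole list."""
    total = len(items)
    for size in range(1, total + 1):
        if total % size == 0 and items[:size] * (total // size) == items:
            return items[:size]
    return items


unit = shortest_repeating_unit(chords)
repeats = len(chords) // len(unit)
counts = Counter(chords)

print("progression:", " | ".join(unit))
print("repeats:", repeats)
print("counts:", dict(counts))
```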

# The document hierarchy is source -> parts -> measures -> notes
single_measure = source.parts[0].measures[0]
single_note = single_measure.notes[0]
# Each note also carries the parser state captured when it was read
single_state = single_note.state

Building an array of lyrics and notes

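lyrics = []
notes = []

for part in source.parts:
    for measure in part.measures:
        for note in measure.notes:
            # note.pitch is a (name, MIDI number) tuple; rests have no pitch
            if note.pitch is not None:
                notes.append(note.pitch[1])
            # note.lyric comes from the fork, not from upstream note-seq
            if note.lyric:
                lyrics.append(note.lyric)

print(' '.join(lyrics))
print('notes: ', notes)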

Never Gonna Give You Up
notes:  [62, 72, 72, 72, 74, 74, 74, 74, 76, 76, 76, 76, 77, 77, 77, 77, 72, 72, 72, 72, 74, 74, 74, 74, 76, 76, 76, 76, 77, 77, 77, 77, 72, 72, 72, 72, 74, 74, 74, 74, 76, 76, 76, 76, 77, 77, 77, 77, 72, 72, 72, 72, 74, 74, 74, 74, 76, 76, 76, 76, 77, 77, 77, 77]
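The numbers in the `notes` list are MIDI note numbers. As a small sketch for reading them, the snippet below converts a MIDI number to a note name, assuming the common convention that MIDI 60 is C4 (middle C) and spelling every accidental as a sharp. `midi_to_name` is a hypothetical helper written for this post, not a note-seq function.

```python
# Pitch classes within one octave, spelled with sharps only.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(midi_number):
    """Convert a MIDI note number to a name, assuming MIDI 60 == C4."""
    octave = midi_number // 12 - 1
    return f"{NOTE_NAMES[midi_number % 12]}{octave}"


# The distinct MIDI numbers that appear in the output above.
distinct_pitches = [62, 72, 74, 76, 77]
names = [midi_to_name(number) for number in distinct_pitches]
print(names)
```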

Congrats! Now you know how to parse MusicXML using Magenta’s Note-Seq. If you don’t need to parse lyrics, you can simply use the original note-seq repository. Otherwise, feel free to use my fork.