Outline of the Standard MIDI File Structure
A standard MIDI file is composed of "chunks". It starts with a header
chunk and is followed by one or more track chunks. The header chunk
contains data that pertains to the overall file. Each track chunk
defines a logical track.
SMF = <header_chunk> + <track_chunk> [+ <track_chunk> ...]
A chunk always has three components, similar to Microsoft RIFF files
(the only difference is that SMF files are big-endian, while RIFF
files are usually little-endian). The three parts to each chunk
are:
- The chunk ID string, which is four characters long. For
example, header chunk IDs are "MThd", and track chunk IDs are
"MTrk".
- Next is a four-byte unsigned value that specifies the number
of bytes in the data section of the chunk (part 3).
- Finally comes the data section of the chunk. The size of the
data is specified in the length field which follows the
chunk ID (part 2).
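The three-part chunk layout described above can be walked with a short loop. The following is a minimal Python sketch, not a full parser; the function name `iter_chunks` and the hand-built `example` bytes are hypothetical and serve only as illustration.

```python
import struct

def iter_chunks(data: bytes):
    """Yield (chunk_id, payload) pairs from the raw bytes of an SMF."""
    pos = 0
    while pos + 8 <= len(data):
        # Part 1: 4-character ID. Part 2: 4-byte big-endian unsigned length.
        chunk_id, length = struct.unpack_from(">4sI", data, pos)
        pos += 8
        # Part 3: the data section, exactly `length` bytes long.
        payload = data[pos:pos + length]
        if len(payload) < length:
            raise ValueError(f"chunk {chunk_id!r} is truncated")
        yield chunk_id, payload
        pos += length

# Hand-built illustration: one header chunk and one very short track chunk.
example = (
    b"MThd" + struct.pack(">I", 6) + struct.pack(">HHH", 0, 1, 96)
    + b"MTrk" + struct.pack(">I", 4) + b"\x00\xff\x2f\x00"
)
chunks = list(iter_chunks(example))
```

Because every chunk states its own length, a reader can skip chunk types it does not recognise without understanding their contents.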
Header Chunk
The header chunk consists of a literal string denoting the header, a
length indicator, the format of the MIDI file, the number of tracks in
the file, and a timing value specifying delta time units. Numbers
larger than one byte are placed most significant byte first.
header_chunk = "MThd" + <header_length> + <format> + <n> + <division>
- "MThd" 4 bytes
- the literal string MThd, or in hexadecimal notation: 0x4d546864.
These four characters at the start of the MIDI file indicate that
this is a MIDI file.
- <header_length> 4 bytes
- length of the header chunk data (always 6 bytes: the combined
size of the next three fields, which make up the header data).
- <format> 2 bytes
- 0 = single track file format
1 = multiple track file format
2 = multiple song file format (i.e., a series of type 0 files)
- <n> 2 bytes
- number of track chunks that follow the header chunk
- <division> 2 bytes
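- unit of time for delta timing. If the value is positive (high bit
clear), it represents the units per beat, where a beat is a quarter
note. For example, +96 would mean 96 ticks per beat. If the value is
negative (high bit set), delta times are in SMPTE compatible units:
the high byte holds the negative of the number of frames per second,
and the low byte holds the number of ticks per frame.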
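As an illustration of the header layout, here is a minimal Python sketch that unpacks the 14 bytes of a header chunk. The function name `parse_header` and the returned dictionary keys are hypothetical. The SMPTE branch assumes the byte layout from the SMF specification (high byte is the negated frame rate, low byte is ticks per frame).

```python
import struct

def parse_header(chunk: bytes) -> dict:
    """Decode an 'MThd' chunk: ID, length, format, track count, division."""
    # ">": big-endian. 4s = ID, I = length, H = format, H = n, h = signed division.
    chunk_id, length, fmt, ntracks, division = struct.unpack(">4sIHHh", chunk[:14])
    if chunk_id != b"MThd" or length != 6:
        raise ValueError("not a standard MIDI header chunk")
    info = {"format": fmt, "ntracks": ntracks}
    if division >= 0:
        info["ticks_per_beat"] = division
    else:
        # Assumption (from the SMF specification): the high byte is the
        # two's-complement negative frame rate, the low byte is ticks per frame.
        raw = division & 0xFFFF
        info["frames_per_second"] = 256 - (raw >> 8)
        info["ticks_per_frame"] = raw & 0xFF
    return info

beat_header = b"MThd" + struct.pack(">IHHh", 6, 1, 2, 96)
smpte_header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, 0xE728)
beat_info = parse_header(beat_header)
smpte_info = parse_header(smpte_header)
```

Reading the division field as a signed 16-bit integer makes the positive/negative test described above a plain comparison.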
Track Chunk
A track chunk consists of a literal identifier string, a length
indicator specifying the size of the track, and actual event data
making up the track.
track_chunk = "MTrk" + <length> + <track_event> [+ <track_event> ...]
- "MTrk" 4 bytes
- the literal string MTrk. This marks the beginning of a track.
- <length> 4 bytes
- the number of bytes in the track chunk following this number.
- <track_event>
- a sequenced track event.
Track Event
A track event consists of a delta time since the last event, and one
of three types of events.
track_event = <v_time> + (<midi_event> | <meta_event> | <sysex_event>)
- <v_time>
- a variable length value specifying the elapsed time (delta time)
from the previous event to this event.
- <midi_event>
- any MIDI channel message such as note-on or note-off. Running
status is used in the same manner as it is used between MIDI
devices.
- <meta_event>
- an SMF meta event.
- <sysex_event>
- an SMF system exclusive event.
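A rough Python sketch of how a reader might split the data section of a track into events, including running status, is given below. The names `read_vlv` and `iter_track_events` are hypothetical. Two points are assumptions drawn from the SMF specification rather than from this outline: sysex events carry a variable length byte count after the status byte, and meta and sysex events cancel running status.

```python
def read_vlv(data: bytes, pos: int):
    """Read a variable length value; return (value, next_position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:          # high bit clear marks the last byte
            return value, pos

def iter_track_events(track: bytes):
    """Yield (delta_time, kind, payload) for each event in MTrk data."""
    pos = 0
    running_status = None
    while pos < len(track):
        delta, pos = read_vlv(track, pos)
        status = track[pos]
        if status == 0xFF:                       # meta event
            meta_type = track[pos + 1]
            length, pos = read_vlv(track, pos + 2)
            yield delta, "meta", (meta_type, track[pos:pos + length])
            pos += length
            running_status = None                # assumption: cancels running status
        elif status in (0xF0, 0xF7):             # system exclusive event
            length, pos = read_vlv(track, pos + 1)   # assumption: length field present
            yield delta, "sysex", (status, track[pos:pos + length])
            pos += length
            running_status = None
        elif status >= 0xF0:
            raise ValueError(f"unexpected status byte 0x{status:02X}")
        else:                                    # MIDI channel message
            if status & 0x80:
                running_status = status          # new status byte
                pos += 1
            elif running_status is None:
                raise ValueError("data byte with no running status")
            # Program change (0xC0) and channel pressure (0xD0) take one
            # data byte; the other channel messages take two.
            n = 1 if (running_status & 0xF0) in (0xC0, 0xD0) else 2
            yield delta, "midi", (running_status, track[pos:pos + n])
            pos += n

# Note-on, then a second note-on sent with running status, then end of track.
demo_track = b"\x00\x90\x3c\x40" + b"\x60\x3c\x00" + b"\x00\xff\x2f\x00"
events = list(iter_track_events(demo_track))
```

The second event in `demo_track` omits its status byte, so the sketch reuses the previous status (0x90), which is how running status works between MIDI devices.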
Meta Event
Meta events are non-MIDI data of various sorts consisting of a fixed
prefix, a type indicator, a length field, and the actual event data.
meta_event = 0xFF + <meta_type> + <v_length> + <event_data_bytes>
- <meta_type> 1 byte
- meta event types:
Type | Event                  | Type | Event
0x00 | Sequence number        | 0x20 | MIDI channel prefix assignment
0x01 | Text event             | 0x2F | End of track
0x02 | Copyright notice       | 0x51 | Tempo setting
0x03 | Sequence or track name | 0x54 | SMPTE offset
0x04 | Instrument name        | 0x58 | Time signature
0x05 | Lyric text             | 0x59 | Key signature
0x06 | Marker text            | 0x7F | Sequencer specific event
0x07 | Cue point              |      |
- <v_length>
- length of meta event data expressed as a variable length value.
- <event_data_bytes>
- the actual event data.
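To make the meta event layout concrete, the following Python sketch reads one meta event and looks up its type name. The names `META_NAMES` and `parse_meta_event` are hypothetical. The tempo interpretation at the end (a 3-byte count of microseconds per quarter note) is an assumption taken from the SMF specification and is not stated in this outline.

```python
META_NAMES = {
    0x00: "Sequence number", 0x01: "Text event", 0x02: "Copyright notice",
    0x03: "Sequence or track name", 0x04: "Instrument name",
    0x05: "Lyric text", 0x06: "Marker text", 0x07: "Cue point",
    0x20: "MIDI channel prefix assignment", 0x2F: "End of track",
    0x51: "Tempo setting", 0x54: "SMPTE offset", 0x58: "Time signature",
    0x59: "Key signature", 0x7F: "Sequencer specific event",
}

def parse_meta_event(data: bytes, pos: int = 0):
    """Return (type_name, event_data_bytes, next_position) for one meta event."""
    if data[pos] != 0xFF:
        raise ValueError("meta events start with 0xFF")
    meta_type = data[pos + 1]
    pos += 2
    # <v_length>: 7 bits per byte, high bit set on all but the last byte.
    length = 0
    while True:
        byte = data[pos]
        pos += 1
        length = (length << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    payload = data[pos:pos + length]
    return META_NAMES.get(meta_type, "Unknown"), payload, pos + length

name, payload, end = parse_meta_event(b"\xff\x51\x03\x07\xa1\x20")
# Assumption: tempo data is microseconds per quarter note, big-endian.
beats_per_minute = 60_000_000 / int.from_bytes(payload, "big")
```

Since the length is explicit, a reader can step over meta event types it does not know by skipping `length` bytes.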
System Exclusive Event
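A system exclusive event can take one of two forms:
sysex_event = 0xF0 + <v_length> + <data_bytes> + 0xF7
or
sysex_event = 0xF7 + <v_length> + <data_bytes> + 0xF7
In both forms, <v_length> is a variable length value giving the number
of bytes that follow it. In the first case, the resultant MIDI data
stream would include the 0xF0. In the second case the 0xF0 is omitted.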
Variable Length Values
Several different values in SMF events are expressed as variable
length quantities (e.g. delta time values). A variable length value
uses a minimum number of bytes to hold the value, and in most
circumstances this leads to some degree of data compression.
A variable length value uses the low order 7 bits of a byte to
represent the value or part of the value. The high order bit is an
"escape" or "continuation" bit. All but the last byte of a variable
length value have the high order bit set. The last byte has the high
order bit cleared. The bytes always appear most significant byte
first.
Here are some examples:
Variable length     Real value
0x7F                127 (0x7F)
0x81 0x7F           255 (0xFF)
0x82 0x80 0x00      32768 (0x8000)
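The encoding rule and the examples above can be reproduced with a pair of small Python functions. This is a minimal sketch; the names `encode_vlv` and `decode_vlv` are hypothetical.

```python
def encode_vlv(value: int) -> bytes:
    """Encode a non-negative integer as a variable length value."""
    if value < 0:
        raise ValueError("variable length values are unsigned")
    out = [value & 0x7F]                 # last byte: high order bit cleared
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)    # earlier bytes: high order bit set
        value >>= 7
    return bytes(reversed(out))          # most significant byte first

def decode_vlv(data: bytes, pos: int = 0):
    """Decode a variable length value; return (value, next_position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:              # high order bit cleared: last byte
            return value, pos

encoded = [encode_vlv(n) for n in (127, 255, 32768)]
decoded = [decode_vlv(b)[0] for b in encoded]
```

Values below 128 fit in a single byte, which is why small delta times cost only one byte each.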