Someone over at stackoverflow recommended using the Python library Construct for parsing binary files. Python already provides the struct module for parsing binary string data, but this gives a flat view of the data and does not reflect structure. Construct promises a more intuitive way of making a parser. Always happy to discover a new, shiny hammer, I decided to try it out on a real life, albeit “old skool”, problem.
Back in the days when I taught myself programming Turbo Pascal, I was very fascinated by the demo scene. An integral part of the scene, were the MOD, or Module, files. MOD is a file format for sequenced music, much like MIDI files, except MOD files also contain sampled instruments. The file format is more or less specified here. I’ll describe the relevant parts as we move along.
The files mod.py and utils.py can be found at GitHub.
To better understand how the parser is built, it’s best to start with the most high level construct first. The overall structure of a MOD file is as follows:
EDIT: This is from the file mod.py.
90 91 | mod = Struct("mod", String("title", 20, padchar="\x00"), |
The String construct denotes a fixed length string, in this case 20 characters, named title. The string is padded with zeroes. Construct also provides PascalString for length prefixed string and CString for null terminated strings.
92 | StrictRepeater(31, sample_info), |
The StrictRepeater is used to specify that the sample_info sub construct is to be repeated an exact number of times. A MOD file has 31 samples, but if less samples are needed, the sample_info contains a length of zero for the unused ones.
93 94 | UBInt8("num_positions"), Padding(1), |
UBInt8 specifies an unsigned 8 bit integer in big endian format. The value num_positions is the number of patterns to play.
95 | StrictRepeater(128, ULInt8("pattern_table")), |
The pattern table is a list of bytes that describe the order in which the patterns are to be played. I mistakenly used ULInt8 for these bytes, but with a byte length of 8 bits it doesn’t matter if the little endian or big endian interpretation is used.
96 97 98 | OneOf(String("signature", 4), signature_vs_channels.keys()), Value("num_channels", lambda ctx: signature_vs_channels[ctx.signature]), |
The OneOf construct is used to specify that the sub construct must be equal to exactly one of a list of candidate values. The signature field tells us how many audio channels the MOD file has. I put the mapping from signatures to number of channels into a dictionary, so I gave the keys as the list of possible values for the OneOf construct. The Value construct can be used to compute a value from the previously parsed constructs. Value takes a function with a single parameter, the parsing context, applies the function and assigns the resulting value to the given name. Here, the value for num_channels is simply looked up in the signature vs number of channels dictionary.
99 100 | MetaRepeater(lambda ctx: int(max(ctx.pattern_table)) + 1, patterns), |
The MetaRepeater differs from the StrictRepeater in that the number of repetitions is given as a function of the parsing context. A MOD file can contain up to 128 patterns. The number of actual patterns is the maximum value from the pattern table plus one, so the number of repetitions is simply given as a function that calculates this value.
101 102 103 | MetaRepeater(lambda ctx: len(ctx.sample_info), Bytes("samples", sample_length)) ) |
The samples’ data is stored at the end of the file. The number of samples is given as the length of sample_info, but will 31 with the current parser. The sample data is stored as an array of bytes. This is where some ugliness creeps in. For each sample to be read, its length must be looked up in the sample_info array, but in Construct itself, there is no way to access the current index in the Repeater constructs. The function sample_length, defined in utils.py, takes care of looking up the length and keeping track of which sample that is read.
6 7 8 9 10 11 12 | signature_vs_channels = {"M.K.": 4, "M!K!": 4, "6CHN": 6, "8CHN": 8, "12CH": 12, "28CH": 28, "FLT4": 4} |
A simple mapping between signatures and number of channels.
14 15 16 | sample_info = Struct("sample_info", String("name", 22, padchar="\x00"), WordsToBytesAdapter(UBInt16("length")), |
The sample_info sub structure is repeated once for each sample. Nothing new here except the WordsToBytesAdapter, which is defined in utils.py. The length value stored in the file is the number of 16 bit words in the sample. The WordsToBytesAdapter simply doubles the length to give the number of bytes instead.
17 18 19 | Embed(BitStruct(None, Padding(4), BitField("finetune", 4, signed=True))), |
One of the strengths of Construct is the ability to easily handle variable length bit fields. Normally, the parser operates on bytes. To operate on bits, a BitStruct is needed. In this case the BitStruct is enclosed in an Embed construct. This means that the members of the BitStruct will be included in the enclosing struct.
The finetune field is the lower 4 bits of a byte value, interpreted as a signed 4 bit integer, so we first insert a Padding struct to get rid of the upper 4 bits.
20 21 22 | UBInt8("volume"), WordsToBytesAdapter(UBInt16("repeat_offset")), WordsToBytesAdapter(UBInt16("repeat_length"))) |
Samples loop forever if the repeat_length field is positive. When the end of a sample is reached, playback is continued from the repeat_offset. Once repeat_offset + repeat_length is reached, playback is once again continued from the repeat_offset.
85 86 87 88 | patterns = Struct("patterns", StrictRepeater(64, MetaRepeater(lambda ctx: ctx._.num_channels, channel_data))) |
Patterns consist of 64 divisions, each with 4 bytes of data per channel.
75 76 77 78 | channel_data = BitStruct("channel_data", Nibble("sample_hi"), BitField("period", 12), Nibble("sample_lo"), |
For some reason the sample number is split into two nibbles, separated by the period, or pitch, so play the sample at. I have used the Value construct to join the nibbles into a byte below. This still leaves sample_hi and sample_lo hanging around and there is, as far as I can tell, no way to get rid of them.
79 80 | major_effect,
Embed(efx_and_params), |
The effect to be applied for this particular channel and division is a 4 bit integer in the range 0-15. An effect has two 4 bit parameters, x and y, that follow. However, an effect value of 14 is an extended effect. The x then specifies which extended effect, and the y is its parameter.
81 82 83 | Value("sample", lambda ctx: ctx.sample_hi << 4 | ctx.sample_lo) ) |
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 | major_effect = Enum(Nibble("major_effect"), Arpeggio = 0, SlideUp = 1, SlideDown = 2, SlideToNote = 3, Vibrato = 4, ContinueSlideToNotePlusVolumeSlide = 5, ContinueVibratoPlusVolumeSlide = 6, Tremolo = 7, SetSampleOffset = 9, VolumeSlide = 10, PositionJump = 11, SetVolume = 12, PatternBreak = 13, extended_effect = 14, #This is never seen outside parser SetSpeed = 15) extended_effect = Enum(Nibble("extended_effect"), ToggleFilter = 0, FineslideUp = 1, FineslideDown = 2, ToggleGlissando = 3, SetVibratoWaveform = 4, SetFinetuneValue = 5, LoopPattern = 6, SetTremoloWaveform = 7, RetriggerSample = 9, FineVolumeSlideUp = 10, FineVolumeSlideDown = 11, CutSample = 12, DelaySample = 13, DelayPattern = 14, InvertLoop = 15) |
Enums are used to map from integers to symbols.
70 71 72 73 | efx_and_params = IfThenElse(None, lambda ctx: ctx.major_effect == "extended_effect", efx_params, fx_params) |
The IfThenElse construct allows for conditional parsing. Here we check if the effect is really an extended effect. If it is, the sub construct efx_params is tried, otherwise fx_params.
65 66 67 68 | fx_params = Struct(None, Nibble("x"), Nibble("y"), Alias("effect", "major_effect")) |
An effect has two parameters, x and y. The Alias construct creates an additional name for a symbol.
59 60 61 62 63 | efx_params = Struct(None, extended_effect, Nibble("x"), Value("y", lambda ctx: None), Alias("effect", "extended_effect")) |
As mentioned, if the effect is an extended effect, its first parameter, or nibble, tells us which extended effect. The second parameter is the one and only extended effect parameter. Since the extended effect doesn’t have a second paramter, we simply give it a value of None using the Value construct. This time, the alias effect is assigned the value contained in extended_effect.
To clarify how the channel data will look when parsed, we can do a test with GLiPy:

We can try it out on a file as well. I chose the file demomod3.mod. The following screenshot shows how to access parts of the parsed file:

Printing the module object looks like this, except I have edited the output to a reasonable length:
Container:
title = 'demomodul#3'
sample_info = [
Container:
name = "by laxity/kefrens '9"
length = 9444
finetune = 0
volume = 64
repeat_offset = 0
repeat_length = 2
Container:
name = ''
length = 4780
finetune = 0
volume = 60
repeat_offset = 458
repeat_length = 3862
<29 more>
]
num_positions = 40
pattern_table = [
2
1
3
4
14
0
7
8
<120 more>
]
signature = 'M.K.'
num_channels = 4
patterns = [
Container:
channel_data = [
[
Container:
sample_hi = 1
period = 226
sample_lo = 7
major_effect = 'SetSpeed'
x = 0
y = 6
effect = 'SetSpeed'
sample = 23
Container:
sample_hi = 0
period = 254
sample_lo = 11
major_effect = 'Arpeggio'
x = 0
y = 0
effect = 'Arpeggio'
sample = 11
Container:
sample_hi = 0
period = 254
sample_lo = 1
major_effect = 'Arpeggio'
x = 0
y = 0
effect = 'Arpeggio'
sample = 1
Container:
sample_hi = 0
period = 0
sample_lo = 0
major_effect = 'Arpeggio'
x = 0
y = 0
effect = 'Arpeggio'
sample = 0
]
<63 more>
]
Container:
channel_data = [
[
Container:
sample_hi = 0
period = 0
sample_lo = 0
major_effect = 'Arpeggio'
x = 0
y = 0
effect = 'Arpeggio'
sample = 0
Container:
sample_hi = 0
period = 0
sample_lo = 0
major_effect = 'Arpeggio'
x = 0
y = 0
effect = 'Arpeggio'
sample = 0
Container:
sample_hi = 0
period = 254
sample_lo = 1
major_effect = 'Arpeggio'
x = 0
y = 0
effect = 'Arpeggio'
sample = 1
Container:
sample_hi = 0
period = 0
sample_lo = 0
major_effect = 'extended_effect'
extended_effect = 'FineVolumeSlideDown'
x = 1
y = None
effect = 'FineVolumeSlideDown'
sample = 0
]
<63 more>
]
<32 more>
]
samples = [
"\x00\x00\x00\x00\x00\x00\x02\r\x16\x1b\x1b\x1c\x1c\x1c\x1c\x1c\x1c\x1c\x1b\x1b\x17\x16\x16\x13\r\x03\x00\x00\xed..."
<30 more>
]
Now, that was quite easy! Try it out for yourself.
