1. Let's see the basics
  2. Lossy compression
  3. Removing repetitive parts and the last pause
  4. Storing notes on half bytes
  5. Packing the already encoded data
  6. Final tweaks
  7. Download page and source code

How can we compress the musical data for tiny intros?

Music data compression lesson #1 - Hungarian Toons

We will examine how far we can go with handmade compression (compared to general compression algorithms).

Let's see the basics

You can play a musical note A which is 440Hz like this:

A4 = 105000000/88/440

 MOV AX,A4              ; counter (word)
beep:
 OUT 42H,AL             ; lowbyte of the counter
 MOV AL,AH
 OUT 42H,AL             ; highbyte of the counter
 SALC                   ; if pause (no carry) then speaker turn off
 OUT 61H,AL             ; else turn on

So one note needs one word data.

Lossy compression

Our ears not so sensitive to the correct note frequenties. So we can approximate them with byte only data and a Multiplier.

M = 59
A4 = 105000000/88/440/M
; or with less numeric error: A4 = (10500000000/88/44000+M/2)/M

 MOV AL,A4              ; data (byte)
 MOV AH,M               ; Multiplier
 MUL AH                 ; counter = data * Multiplier
beep:

At cost of 3 bytes extra code we were able to halve the data.

This method has much shorter decoder code than using a table of words for the sound-frequenties, and also we don't need to store the table itself.

Our example has 256 bytes data:

Music:
; part A
DB  C4,D4,C4,00,  00,00,00,00
DB  A3,A3,A3,00,  00,00,00,00
DB  C4,C4,00,00,  H3,H3,00,00
DB  A3,A3,00,00,  G3,G3,00,00
DB  A3,C4,A3,C4,  00,00,00,00
DB  F3,F3,F3,F3,  00,00,00,00
DB  G3,G3,G3,G3,  00,00,00,00
DB  C3,C3,C3,C3,  00,00,00,00

; part B
DB  F3,F3,00,00,  G3,G3,00,00
DB  A3,A3,00,00,  H3,H3,00,00
DB  C4,C4,C4,C4,  00,00,00,00
DB  C4,C4,00,00,  D4,D4,00,00
DB  C4,D4,C4,D4,  C4,D4,C4,D4
DB  C4,D4,C4,D4,  C4,D4,00,00
DB  C4,C4,C4,C4,  00,00,00,00
DB  00,00,00,00,  00,00,00,00

; part A
DB  C4,D4,C4,00,  00,00,00,00
DB  A3,A3,A3,00,  00,00,00,00
DB  C4,C4,00,00,  H3,H3,00,00
DB  A3,A3,00,00,  G3,G3,00,00
DB  A3,C4,A3,C4,  00,00,00,00
DB  F3,F3,F3,F3,  00,00,00,00
DB  G3,G3,G3,G3,  00,00,00,00
DB  C3,C3,C3,C3,  00,00,00,00

; part C
DB  F3,F3,00,00,  G3,G3,00,00
DB  A3,A3,00,00,  H3,H3,00,00
DB  A3,A3,A3,A3,  00,00,00,00
DB  G3,G3,G3,G3,  00,00,00,00
DB  F3,F3,F3,F3,  F3,F3,F3,F3
DB  F3,F3,00,00,  00,G3,F3,E3
DB  F3,F3,F3,F3,  00,00,00,00
DB  00,00,00,00,  00,00,00,00

M = 59
D4 = (10500000000/88/29366+M/2)/M
C4 = (10500000000/88/26163+M/2)/M
H3 = (10500000000/88/23347+M/2)/M
A3 = (10500000000/88/22000+M/2)/M
G3 = (10500000000/88/19600+M/2)/M
F3 = (10500000000/88/17461+M/2)/M
E3 = (10500000000/88/16392+M/2)/M
C3 = (10500000000/88/13081+M/2)/M

Look at P3.ASM

Removing repetitive parts and the last pause

Reading all the bytes looks something like this:

 MOV AL,[BX+Music]      ; data (byte)
 INC BL
 MOV AH,M               ; Multiplier
 MUL AH                 ; counter = data * Multiplier
beep:

After removing repetitive parts and the last pause of course the code a bit longer, but we can omit 104 bytes of data!

 MOV AL,BL              ; get timer counter
 INC BL
 MOV SI,Music
 CMP AL,208+12          ; 2nd variaton?
 JC .1
 MOV SI,Music+36
.1:
 AND AX,127             ; put zero in AH
 SUB AL,12              ; last pause?
 JC mute
 XCHG BP,AX
 MOV AL,[BP+SI]         ; get data
 MOV AH,M               ; Multiplier
mute:
 MUL AH                 ; counter = data * Multiplier
beep:
Music:
; part A
DB  C4,D4,C4,00,  00,00,00,00
DB  A3,A3,A3,00,  00,00,00,00
DB  C4,C4,00,00,  H3,H3,00,00
DB  A3,A3,00,00,  G3,G3,00,00
DB  A3,C4,A3,C4,  00,00,00,00
DB  F3,F3,F3,F3,  00,00,00,00
DB  G3,G3,G3,G3,  00,00,00,00
DB  C3,C3,C3,C3,  00,00,00,00
; part B
DB  F3,F3,00,00,  G3,G3,00,00
DB  A3,A3,00,00,  H3,H3,00,00
DB  C4,C4,C4,C4,  00,00,00,00
DB  C4,C4,00,00,  D4,D4,00,00
DB  C4,D4,C4,D4,  C4,D4,C4,D4
DB  C4,D4,C4,D4,  C4,D4,00,00
DB  C4,C4,C4,C4;,  00,00,00,00
;DB 00,00,00,00,  00,00,00,00
; part A
;DB C4,D4,C4,00,  00,00,00,00
;DB A3,A3,A3,00,  00,00,00,00
;DB C4,C4,00,00,  H3,H3,00,00
;DB A3,A3,00,00,  G3,G3,00,00
;DB A3,C4,A3,C4,  00,00,00,00
;DB F3,F3,F3,F3,  00,00,00,00
;DB G3,G3,G3,G3,  00,00,00,00
;DB C3,C3,C3,C3,  00,00,00,00
; part C
;DB F3,F3,00,00,  G3,G3,00,00
;DB A3,A3,00,00,  H3,H3,00,00
DB  A3,A3,A3,A3,  00,00,00,00
DB  G3,G3,G3,G3,  00,00,00,00
DB  F3,F3,F3,F3,  F3,F3,F3,F3
DB  F3,F3,00,00,  00,G3,F3,E3
DB  F3,F3,F3,F3,;  00,00,00,00
;DB 00,00,00,00,  00,00,00,00

+4 bytes code vs -24 bytes for pause
+14 bytes code vs -80 bytes for repeats

So now 152 bytes data left... P4.ASM

Storing notes on half bytes

With tunning the M (multipler) we are able to reduce the distance between the data values. We'd like to fit all the data into the range of 0...15 values.

But this range can start easily with other value than zero, then we just need a constant adder value. Here it is 16 (hex 0x10).

And if only one value is out of range, and we don't use all the values between 0 and 15 (or between 16 and 31) then we can use a marker value (like 2 or 18) and we can swap easily this value to an other one (like hex 0x27).

Here is the decoder code:

 ...
 XCHG BP,AX
 SHR BP,1
 MOV AL,[BP+SI]         ; get data
 JNC .2
 SHR AL,4
.2:
 AND AL,15
 JZ mute
 ADD AL,16              ; adder value
 CMP AL,18              ; marker value
 JNE .3
 MOV AL,C3              ; extra value: 0x27
.3:
 MOV AH,M               ; Multiplier
mute:

+13 bytes code for decoding half bytes
+6 bytes code for substituting one value
-76 bytes data

Music:
DD 0x00000313;DB  03,01,03,00,	00,00,00,00
DD 0x00000777;DB  07,07,07,00,	00,00,00,00
DD 0x00660033;DB  03,03,00,00,	06,06,00,00
DD 0x00AA0077;DB  07,07,00,00,	10,10,00,00
DD 0x00003737;DB  07,03,07,03,	00,00,00,00
DD 0x0000DDDD;DB  13,13,13,13,	00,00,00,00
DD 0x0000AAAA;DB  10,10,10,10,	00,00,00,00
DD 0x00002222;DB  02,02,02,02,	00,00,00,00
DD 0x00AA00DD;DB  13,13,00,00,	10,10,00,00
DD 0x00660077;DB  07,07,00,00,	06,06,00,00
DD 0x00003333;DB  03,03,03,03,	00,00,00,00
DD 0x00110033;DB  03,03,00,00,	01,01,00,00
DD 0x13131313;DB  03,01,03,01,	03,01,03,01
DD 0x00131313;DB  03,01,03,01,	03,01,00,00
DD 0x77773333;DB  03,03,03,03,	07,07,07,07
DD 0xAAAA0000;DB  00,00,00,00,	10,10,10,10
DD 0xDDDD0000;DB  00,00,00,00,	13,13,13,13
DD 0x00DDDDDD;DB  13,13,13,13,	13,13,00,00
DD 0xDDDDFDA0;DB  00,10,13,15,	13,13,13,13

We have left 76 bytes data... P5.ASM

Packing the already encoded data

Because of the structure of the musical notes and due to its nature we have a good chance to shrink further the music data values.

We are trying to omit every second byte (every second pair of notes) or replacing every second byte by only one bit. Then we will save 7 bits on every 2 bytes!

In general this song has long note, short note + pause, and arpeggio with two notes.

For a long note the second byte repates the first byte. For a short note the second byte is zero (pause). For an arpeggio the second byte also repeats the first byte. So we can decide on only one bit that the second byte repeats the first one or it is zero.

At the very beginning of the data and also at the end this rule doesn't work. But most of the data is nicely compressable by this rule!

Here is the unpacker code:

 MOV EDX,011111110111001100001111111100001B
 MOV SI,packed
 MOV DI,Music
 MOVSW                    ; not packed bytes (3 words)
 MOVSW
 MOVSW
 MOV CL,33
.1:
 SHR EDX,1
 SALC
 AND AL,[SI]
 MOVSB
 STOSB
 LOOP .1
 MOVSW                    ; not packed bytes (2 words)
 MOVSW

And the data is 43 bytes long... A12.ASM

packed: 		; notes data, 0 -> pause

DD 0x00000313;DB  03,01,03,00,	00,00,00,00
;DD 0x00000777;DB  07,07,07,00,  00,00,00,00
DW 0x0777
DB 0x00

;DD 0x00660033;DB  03,03,00,00,  06,06,00,00
;DD 0x00AA0077;DB  07,07,00,00,  10,10,00,00
DD 0xAA776633

;DD 0x00003737;DB  07,03,07,03,  00,00,00,00
;DD 0x0000DDDD;DB  13,13,13,13,  00,00,00,00
DD 0x00DD0037

;DD 0x0000AAAA;DB  10,10,10,10,  00,00,00,00
;DD 0x00002222;DB  02,02,02,02,  00,00,00,00
DD 0x002200AA

;DD 0x00AA00DD;DB  13,13,00,00,  10,10,00,00
;DD 0x00660077;DB  07,07,00,00,  06,06,00,00
DD 0x6677AADD

;DD 0x00003333;DB  03,03,03,03,  00,00,00,00
;DD 0x00110033;DB  03,03,00,00,  01,01,00,00
DD 0x11330033

;DD 0x13131313;DB  03,01,03,01,  03,01,03,01
;DD 0x00131313;DB  03,01,03,01,  03,01,00,00
DD 0x13131313

;DD 0x77773333;DB  03,03,03,03,  07,07,07,07
DW 0x7733

;DD 0xAAAA0000;DB  00,00,00,00,  10,10,10,10
;DD 0xDDDD0000;DB  00,00,00,00,  13,13,13,13
DD 0xDD00AA00

;DD 0x00DDDDDD;DB  13,13,13,13,  13,13,00,00
DW 0xDDDD
DD 0xDDDDFDA0;DB  00,10,13,15,	13,13,13,13

Final tweaks

The unpacked bytes not necessary to copy.

 MOV EDX,11111111011100110000111111110000B
 MOV SI,packed
 MOV DI,unpack
.0:
 SALC
 AND AL,[SI]
 MOVSB
 STOSB
 SHR EDX,1
 JNZ .0
 STOSW
 MOVSW
 STOSB
 STOSB

The final data consits of 34 bytes packed data + 6 bytes unpacked data + 32 bits.

packed:
DB 0x00
DD 0xAA776633
DD 0x00DD0037
DD 0x002200AA
DD 0x6677AADD
DD 0x11330033
DD 0x13131313
DW 0x7733
DD 0xDD00AA00
DB 0xDD
DW 0xFDA0

Music:
DD 0x00000313;DB  03,01,03,00,	00,00,00,00
DW 0x0777

unpack:

And the icing on the cake the first music data value can be reused as the number of the video mode :)

 LODSB ;MOV AL,13H
 INT 10H

Which will save the last byte for us!

Look at A20.ASM


If you liked this writeup, then leave a comment at the download page :)