Sizecoding Blog by TomCat

While we are waiting for the Demobit 256b compo, each day I will share just one thought about one of my tiny intros...

First I draw the path of the scroll with 7 x 35 elements. After that there is nothing left to do but rotate the colors in the color palette:
 MOV    CH,3            ; 256*3 color components
 REP    OUTSB           ; changing the whole color palette
 MOV    CX,3*7*35       ; rotating the colors in the buffer
 DEC    SI
 MOV    DL,[SI-7*3]     ; by 7 colors forward
 MOV    [SI],DL
 LOOP   .2
 ...                    ; then we put one new column into the buffer
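The rotation can be modelled in a few lines of Python. This is only a sketch of the idea, not the intro's code: it assumes the path is kept as a flat list of 7 x 35 colors and that the new column enters at the start of the buffer; `scroll_step` and its parameters are names made up for the illustration.

```python
COLS, ROWS = 35, 7              # scroll path: 35 columns of 7 elements

def scroll_step(palette, new_column):
    """Move every color 7 entries forward and put a new column in front.

    palette    -- list of COLS*ROWS (r, g, b) tuples along the scroll path
    new_column -- list of ROWS (r, g, b) tuples entering the path
    """
    if len(palette) != COLS * ROWS or len(new_column) != ROWS:
        raise ValueError("unexpected buffer size")
    # The last column falls off the end, everything else shifts by one column.
    return list(new_column) + palette[:-ROWS]
```

Calling `scroll_step` once per frame and sending the result to the DAC makes the text appear to move although no pixel on the screen is redrawn.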
I store characters in 5 bits. I have fewer than 32 different characters: 25 lowercase letters (no Z in the text), the period, the space, the double space and the newline character. After skipping the first 8 letters, the 15 characters of the title are converted to uppercase, and so is every first and every fifth character of the paragraphs:
 MOV    CL,8            ; offset of the title
 MOV    DH,00100001B    ; flags for capital letters
 LOOP   skip            ; no converting
 INC    CX              ; CX = 1 -> LOOP doesn't skip anymore
 SHL    DI,1            ; DI:0FFFEH -> 15 capital letters
 JB     convert
 SHR    DH,1            ; every first and fifth chars
 JC     convert
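As an illustration of the 5-bit storage, here is a small Python sketch. The order of the symbols in `ALPHABET` is an assumption (the real code table is not shown in the post), and `pack`/`unpack` are hypothetical helper names.

```python
# Hypothetical 29-symbol alphabet: 25 letters (no 'z'), period, space,
# double space and newline. The real table of the intro may differ.
ALPHABET = list("abcdefghijklmnopqrstuvwxy") + [".", " ", "  ", "\n"]

def pack(symbols):
    """Pack a list of alphabet symbols into bytes, 5 bits per symbol."""
    bits = 0
    for s in symbols:
        bits = (bits << 5) | ALPHABET.index(s)
    nbits = 5 * len(symbols)
    pad = -nbits % 8                      # pad to a whole number of bytes
    return (bits << pad).to_bytes((nbits + pad) // 8, "big")

def unpack(data, count):
    """Read `count` 5-bit symbols back from the packed bytes."""
    bits = int.from_bytes(data, "big")
    total = 8 * len(data)
    return [ALPHABET[(bits >> (total - 5 * (i + 1))) & 31] for i in range(count)]
```

Twelve symbols fit into 8 bytes this way instead of 12, which is where the saving comes from.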
I browsed the Mandelbrot set as deep as the FPU's 80-bit floating-point precision allows, searching for nice and interesting forms and structures. The intro zooms in and out at these coordinates:
+0.34390699597256746411, -0.70062002023500613567
-1.25764648790205013639, +0.11831488894193964434
-0.93789936639955584496, +0.31736094066985742756
-0.72568075954437072372, -0.27254962894836931575
-0.15713601278801156424, -1.10494558202452419770
+0.34390699597256746411, +0.70062002023500613567
-1.25764648790205013639, -0.11831488894193964434
-0.93789936639955584496, -0.31736094066985742756
-0.72568075954437072372, +0.27254962894836931575
-0.15713601278801156424, +1.10494558202452419770
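The last five coordinates in the list are the complex conjugates of the first five, and the Mandelbrot set is symmetric about the real axis, so each pair shows the same structure mirrored. The Python sketch below checks this with a plain escape-time loop. It is an illustration only: Python floats have 64-bit precision, so the trailing digits of the 80-bit constants are rounded away, and `escape_time` and `all_targets` are hypothetical names.

```python
# First five zoom targets from the list; the other five are their conjugates.
TARGETS = [
    complex(+0.34390699597256746411, -0.70062002023500613567),
    complex(-1.25764648790205013639, +0.11831488894193964434),
    complex(-0.93789936639955584496, +0.31736094066985742756),
    complex(-0.72568075954437072372, -0.27254962894836931575),
    complex(-0.15713601278801156424, -1.10494558202452419770),
]

def escape_time(c, max_iter=500):
    """Iterations of z = z*z + c until |z| > 2, or max_iter if it never escapes."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter

def all_targets():
    """The ten zoom targets: five points plus their mirror images."""
    return TARGETS + [c.conjugate() for c in TARGETS]
```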
Storing the bits row by row would be the logical choice, but I store them column by column, from the upper-right corner to the lower-left corner, because this way I can save 3 bits at the start of the data and 18 bits at the end.
    MOV    SI,26*14-16          ; number of bits
    MOV    DL,14                ; columns of the image
.3: MOV    CL,26                ; rows of the image
.4: BT     [BP-text+bits],SI    ; the bit means color #0 or #255
    DEC    SI                   ; counting the bits left
    LOOPNZ .4                   ; repeat until bottom of scr or end of bits
    SUB    DI,320*7*26+8
    DEC    DX                   ; go to next column
    JNZ    .3
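Why the scan order matters can be shown with a toy bitmap in Python. The 3 x 4 image below is invented for the example (it is not the 26 x 14 image of the intro), and the function names are hypothetical; the point is only that the number of zero bits that can be trimmed from both ends depends on the order in which the bits are stored.

```python
IMG = [                          # made-up 3x4 bitmap, 1 = pixel set
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
]

def row_major(img):
    """Bits stored line by line, left to right."""
    return [px for row in img for px in row]

def column_major_from_upper_right(img):
    """Bits stored column by column, starting with the rightmost column."""
    rows, cols = len(img), len(img[0])
    return [img[y][x] for x in range(cols - 1, -1, -1) for y in range(rows)]

def trim_savings(bits):
    """Zero bits that can be dropped at the start and at the end of the stream."""
    return bits.index(1), bits[::-1].index(1)
```

For this toy image the row order allows trimming 0 + 2 bits and the column order 3 + 0 bits; which order wins depends entirely on the picture.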
A photo with 8 shades of gray looked poor, but 16 shades weren't needed. I used 12 shades with a small blur, which allowed me to store 54 circles (and ellipses). To find the shapes, I used a formula that sums the differences between the original photo and the current image. With brute force I tried every possible object and kept the 2D shape that gave the lowest result, then did the same again for the next shape.
 ADC    AL,12H          ; AL = previous+shift+correction
 MOV    AH,BH           ; AH = new color
 AAD    1               ; AL = new+prev color, AH = 0
 SHR    AL,1            ; horizontal blur
 STOSB                  ; put pixel
 LOOP  .1               ; next pixel
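The search described above is a greedy brute-force fit. The following Python sketch shows the principle on a tiny canvas. It is a simplification under stated assumptions: only filled circles, no blur, a plain sum of absolute differences as the error, and hypothetical names (`draw_circle`, `error`, `greedy_fit`); the actual tool used for the intro is not shown in the post.

```python
def draw_circle(canvas, shape):
    """Return a copy of canvas with a filled circle (cx, cy, r, shade) drawn."""
    cx, cy, r, shade = shape
    return [[shade if (x - cx) ** 2 + (y - cy) ** 2 <= r * r else px
             for x, px in enumerate(row)]
            for y, row in enumerate(canvas)]

def error(a, b):
    """Sum of absolute pixel differences between two images."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def greedy_fit(target, candidates, rounds):
    """Add, one at a time, the candidate that gives the lowest error."""
    canvas = [[0] * len(target[0]) for _ in target]
    chosen = []
    for _ in range(rounds):
        best = min(candidates,
                   key=lambda s: error(target, draw_circle(canvas, s)))
        canvas = draw_circle(canvas, best)
        chosen.append(best)
    return chosen, canvas
```

Each round tries every candidate on top of the image built so far, so the cost grows with the number of candidates times the number of pixels, which is acceptable for an offline tool.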
The screen consists of cells modulo 256. When the radius of the spheres is less than 256, it is easy to make 3D textures with cheap 16-bit instructions:
 CMP    AX,633          ; AX:X2+Y2
 JNA    white
 CMP    AX,1392
 JNA    black
 CWD                    ; AX:Y
 XOR    DX,AX           ; DX = abs(Y)
 CMP    DL,11
 JNA    black
 JS     white
I couldn't settle on the right tempo, so in the end I left many variations in the intro. The music gets faster and faster, because I'm speeding up the BIOS system timer:
 MOV    AX,[GS:046CH]   ; get the BIOS timer counter
 SUB    DI,AX           ; negate the counter to count down
 SHR    AX,8+5          ; the tempo will change after 2^13 ticks
 INC    AX              ; at least one -> freq divisor will be at least 257
 OUT    40H,AL          ; IRQ0 speedup (faster BIOS timer tick)
;OUT    40H,AL          ; lower and higher bytes will be the same (in 2 pass)
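For a feeling of the numbers involved: the PC timer chip divides an input clock of about 1.193182 MHz by a 16-bit divisor, and the BIOS default divisor of 65536 gives the familiar 18.2 ticks per second. When the same byte ends up in both the low and the high byte of the divisor, the divisor is 257 times that byte. The Python sketch below computes the resulting IRQ0 rates for the byte values 1 to 8 that the snippet can produce; `tick_rate` is a hypothetical helper.

```python
PIT_HZ = 1_193_182              # input clock of the 8253/8254 timer chip

def tick_rate(byte):
    """IRQ0 rate in Hz when `byte` is used as both divisor bytes."""
    if not 1 <= byte <= 255:
        raise ValueError("byte must be 1..255")
    divisor = byte * 256 + byte  # = 257 * byte
    return PIT_HZ / divisor

# Rates for the values 1..8 produced by SHR AX,8+5 followed by INC AX
RATES = {n: tick_rate(n) for n in range(1, 9)}
```

With the byte equal to 1 the timer ticks roughly 4643 times per second, about 255 times faster than the BIOS default.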
There are no separate parts and transitions between them. The whole intro is only one effect. From the beginning to the end, the red constant is decreased from 9 to 0 (and there is a pause at value 4).
void mainImage(out vec4 o,vec2 u)
{
    vec3 R = iResolution,
         p = R-R; p.z = 4.;
    while (R.z++<64.)
        p +=  vec3((u+u-R.xy)/R.x,.5)
            * (length(vec2(o.a=length(p.xz)-4.,p.y))-4.);
    o = vec4 ( 1&int(7.*(atan(p.y,o.a)-atan(p.z,p.x)-iTime)) );
}
The first and the third effect run the same routine, but with more or fewer bars. To rotate the bars I use a triangle wave function instead of sine and cosine.
.1: MOV    AX,BX       ; AL: time = [0...255], CH: 0 
    SUB    AL,CH       ; shift triangle wave by PI/2
    XOR    AL,AH       ; AL = triangle wave = [0...127]
    SUB    AL,64       ; AL = [-64...63]
    IMUL   DH          ; DH: dy, DL: dx
    XCHG   BP,AX       ; BP = b*dy
    XCHG   DH,DL
    XOR    CH,64       ; 64 -> PI/2 (256 -> 2PI)
    JNZ    .1          ; loop 2x
    ADD    AX,BP       ; AX = c = a*dx + b*dy
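The same arithmetic reads like this in Python. It is a sketch that follows the snippet as far as it is shown: a triangle wave with a period of 256 steps stands in for sine and cosine, and the second coefficient is taken a quarter period (64 steps) apart. The function names are invented for the illustration.

```python
def tri(t):
    """Triangle wave with period 256 and range [-64, 63]."""
    v = t & 255
    if v & 128:
        v ^= 255                # fold the second half back: 128..255 -> 127..0
    return v - 64

def rotate_coeff(t, dx, dy):
    """c = a*dx + b*dy, where a and b are a quarter period apart."""
    a = tri(t - 64)             # second pass of the loop (CH = 64)
    b = tri(t)                  # first pass of the loop (CH = 0)
    return a * dx + b * dy
```

The result is not a true rotation because a triangle wave only approximates a sinusoid, but for rotating bars the distortion is hard to notice and the code is much shorter.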
The spheres are ordered by their distance from the eye, so the first hit gives the closest object. The hit order for reflections is not correct, but who cares?
.2: DEC    DX           ; AX:16,  DX:16
.3: XCHG   AX,DX
    STOSW               ; X = DX = [15...-15]
    XCHG   AX,DX
    XCHG   AX,CX
    STOSW               ; Y = CX = [24...1]
    XCHG   AX,CX
    STOSW               ; Z = AX = 16
    NEG    DX
    JS     .3
    JNZ    .2
    LOOP   .1           ; 31x24 spheres
I play arpeggios alternating between two kinds of chords, and I vary the tempo of the arpeggios. Thanks to this, 2x3 notes are enough for a whole song :)
 SHR    EBP,CL          ; EBP: time counter, CL: tempo of arpeggio
 AND    BP,3            ; BP = index of the note in a chord
 TEST   SI,8192*2       ; depends on time counter (SI)
 JNZ    @F              ; alternate between chord1 and chord2
 MOV    AH,[BP+NOTESA1] ; get the note
NOTESA1: DB 128,144,192,0
NOTESA2: DB 128,152,192,0
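The note selection can be sketched in Python as follows. The two tables are taken from the snippet; everything else is an assumption made for the illustration: one shared time counter, the second table being used when the tested bit is set, and the hypothetical name `note_at`.

```python
NOTES_A1 = [128, 144, 192, 0]   # chord 1, last entry taken as a rest
NOTES_A2 = [128, 152, 192, 0]   # chord 2

def note_at(time, tempo_shift):
    """Pick the arpeggio note for a time counter value.

    tempo_shift -- how many bits the counter is shifted right; a larger
                   shift steps through the chord more slowly
    """
    index = (time >> tempo_shift) & 3
    chord = NOTES_A2 if time & (8192 * 2) else NOTES_A1
    return chord[index]
```

Changing only `tempo_shift` gives a different rhythm from the same six notes.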
The standard VGA 256-color palette has some structure. Each time you increase the color index by 72, you get a darker color. I do this where the background pattern is behind a moving sphere.
 CS LODSW               ; get the mirrored offset
 MOV    AL,[BX+DI]      ; get the background color
 TEST   BX,BX           ; sphere hit test 
 JZ     .3
 ADD    AL,72           ; make it darker
 STOSB                  ; put pixel
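The default VGA palette is usually described as 16 EGA colors, 16 grays and then three blocks of 72 colors (indices 32 to 247) with high, medium and low intensity, which is why a step of 72 lands on a darker version of the same hue. A minimal Python sketch of that index arithmetic, with the block limits stated as assumptions and `darker` as a hypothetical name:

```python
BLOCK = 72                      # colors per intensity level
FIRST, LAST = 32, 247           # assumed range of the three 72-color blocks

def darker(index):
    """Same hue one intensity level down, like ADD AL,72 on a palette index."""
    if not FIRST <= index <= LAST - BLOCK:
        raise ValueError("no darker counterpart 72 entries ahead")
    return index + BLOCK
```

Applying `darker` twice to an index from the first block gives the darkest version, which matches the "72 and 72" steps mentioned above.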
Only one big gradient is drawn on the screen from left to right, and the color palette is continuously updated along triangle waves. Each component of a color has a different phase, and this difference is given by the bytebeat sample.
 MOV    BX,SI           ; BX = 256
 MOV    AL,[BX]         ; phase of component
 SUB    AL,CL           ; next angle
 XOR    AL,AH           ; triangle wave [0...127]
 SHR    AL,1            ; AL = [0...63]
 OUT    DX,AL           ; set color component
 INC    BX
 JPO    rgb             ; loop 3x
Drawing in a HiRes TrueColor video mode needs fast code, so symmetry is our friend again. The left side of a line is mirrored to the right, and for more speed every line is doubled:
 MOV    [SI+BX],AL      ; horizontal mirror
 MOV    [SI+BP+RESX*4/2-1],AL
 REP    MOVSD           ; twice as fast as REP MOVSW
 POP    SI
 DEC    BP
 JPE    twice           ; duplicate scanline
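In Python the two shortcuts look like this. The sketch assumes an even height and a per-pixel callback; `build_frame` and `pixel` are hypothetical names, not something taken from the intro.

```python
def build_frame(width, height, pixel):
    """Compute only the left half of every second line.

    pixel -- function (x, y) -> color, called for a quarter of the pixels
    """
    frame = []
    for y in range(0, height, 2):
        left = [pixel(x, y) for x in range(width // 2)]
        line = left + left[::-1]        # horizontal mirror
        frame.append(line)
        frame.append(list(line))        # duplicated scanline
    return frame
```

Only a quarter of the pixels go through the expensive per-pixel code, the rest are plain copies.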
The first lyric video in 256 bytes :) The music data, including the lyric markers, is compressed with a special method: 1 => fill with zeros (silence); positive values => copy 1 byte, then fill with the next byte (text marker and note); negative values => copy already decompressed data (repeated patterns).
 JS     copy
 JZ     fill
I store 3 different properties in a single byte: shape (0-2), size (0-6), color (0-12). If you have only three different kinds of shapes, you don't need to spend 2 bits on this data (one and a half bits are enough :) and the AAM instruction is more powerful than AND or SHR:
 AAM    21
 MOV    CH,AH           ; CH = color
 AAM    7               ; AL = radius, AH = shape
 DB 10*21+7*1+6         ; color(*21) + (7*)shape + radius
 DB  1*21+7*1+4
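AAM with an immediate operand divides AL by that value and leaves the quotient in AH and the remainder in AL, so two AAM instructions decode a mixed-radix number. A Python sketch of the packing and unpacking, with `aam`, `pack_shape` and `unpack_shape` as hypothetical helper names:

```python
def aam(al, base):
    """Model of x86 AAM imm8: returns (AH, AL) = (AL // base, AL % base)."""
    return divmod(al, base)

def pack_shape(color, shape, radius):
    """Encode color*21 + shape*7 + radius; the result must fit into a byte."""
    value = color * 21 + shape * 7 + radius
    if not (0 <= shape <= 2 and 0 <= radius <= 6 and 0 <= value <= 255):
        raise ValueError("value does not fit the one-byte encoding")
    return value

def unpack_shape(byte):
    """Decode with two divisions, as AAM 21 followed by AAM 7 does."""
    color, rest = aam(byte, 21)     # AH = color, AL = shape*7 + radius
    shape, radius = aam(rest, 7)    # AH = shape, AL = radius
    return color, shape, radius
```

For the first data byte of the snippet, `unpack_shape(10*21 + 7*1 + 6)` gives color 10, shape 1 and radius 6.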
This was my first intro in which every sound and every pixel depends on the time. So it runs at the same speed on every PC or DOSBox config, but a slower PC draws fewer frames. The sound sample and the pixel color each have their own math formula. This kind of sound generation is called bytebeat:
t = time
sound = (1500/(y=t&8191)&1)*35
 + (x=t*"6689"[t>>15&3]/24&127)*y/40000
 + ((t>>7^t>>9|t>>13|x)&63)
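To try the formula outside of the intro, it can be transcribed to Python. This is a sketch with two assumptions that the formula itself does not state: all divisions are integer divisions, and the first term is taken as 0 when `y` is 0 (where the formula would divide by zero). The name `sample` is made up for the example.

```python
def sample(t):
    """One bytebeat sample for the time counter t (a non-negative integer)."""
    y = t & 8191
    # First term: a short pulse at the start of every 8192-sample period
    term1 = ((1500 // y) & 1) * 35 if y else 0
    # Second term: a note chosen by the characters of "6689", scaled by y
    x = (t * ord("6689"[(t >> 15) & 3]) // 24) & 127
    term2 = x * y // 40000
    # Third term: bitwise mix of shifted time counters and x
    term3 = ((t >> 7) ^ (t >> 9) | (t >> 13) | x) & 63
    return term1 + term2 + term3

# One second of sound at an assumed rate of 16000 samples per second
ONE_SECOND = bytes(sample(t) for t in range(16000))
```

The three terms stay below 36, 27 and 64 respectively, so the sum always fits into one byte without clipping.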
Because there is not enough free DOS memory, I use the 4th byte of every pixel in the 32-bit video mode, so my Z buffer is stored in the video memory.
 CMP    [ES:DI+3],DL    ; Z buffer test
 JA     pixelok
 STOSB                  ; write RGB color
 INC    BX
 JPO    bgr             ; loop 3x
 XCHG   AX,DX           ; AL = Z coord
 STOSB                  ; write Z buffer
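A Python model of this trick, using a `bytearray` in place of the video memory. The depth convention is an assumption read from the snippet (a pixel is drawn when the stored value is above the new Z, so smaller means nearer and 255 means empty); `plot` is a hypothetical helper.

```python
def plot(framebuffer, offset, bgr, z):
    """Write a 32-bit pixel if it is nearer than the one already stored.

    framebuffer -- bytearray with 4 bytes per pixel: B, G, R, depth
    offset      -- byte offset of the pixel
    """
    if framebuffer[offset + 3] > z:          # like CMP [ES:DI+3],DL / JA
        framebuffer[offset:offset + 3] = bytes(bgr)
        framebuffer[offset + 3] = z          # depth lives in the unused byte
        return True
    return False
```

No separate depth array is needed, at the price of reading back from video memory for every test.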
Every visual effect in the intro is symmetric about the center of the screen, so I draw the pixels from the top-left corner and from the bottom-right corner at the same time.
 MOV    BH,320*200/256
 DEC    BX
 MOV    [ES:BX],AL      ; mirroring for speedup
 CMP    DI,BX           ; check halfscreen
 JC     nextpixel
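Point symmetry about the screen center means that the pixel at linear offset i has the same color as the pixel at offset size - 1 - i. A short Python sketch of the idea, with `render_symmetric` and `pixel` as hypothetical names:

```python
def render_symmetric(width, height, pixel):
    """Fill a screen that is symmetric about its center.

    pixel -- function (x, y) -> color, called for the first half only
    """
    size = width * height
    screen = [0] * size
    for i in range((size + 1) // 2):
        color = pixel(i % width, i // width)
        screen[i] = color                 # counting up from the top-left
        screen[size - 1 - i] = color      # counting down from the bottom-right
    return screen
```

The loop stops when the two positions meet, which corresponds to the half-screen check in the snippet.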