Changes

Programming:Integer Multiplication

1,719 bytes added, 04:25, 4 April 2008
Added new 8bit*bit routine
ret
</pre>
 
== Very fast 8bit * 8bit Unsigned with only 1K of tables ==
 
Here's a new routine I've developed which uses the formula ab = ((a + b)<sup>2</sup> - (a - b)<sup>2</sup>) / 4. It's based on a routine for the 6502 by Stephen Judd in a C= Hacking article. Because of differences between the way the 6502 does register indexing it was quite difficult to actually get this working, but it's a great compromise between speed and space since it only uses 1K of tables (as opposed to the 16K or 8K table routines above), and can still manage to do the job in a maximum of 28 microseconds.
 
<br>Firstly, once again, we need some code to generate the tables. These tables contain values for
 
<br>x<sup>2</sup>/4 for 9 bit values of x, with the LSB when bit 8 is zero first followed by the MSB.
<pre>.gen_sq4
xor a
ld de,umul_tab_lo + #1ff
ld (de),a
dec d
ld (de),a
ld h,d
ld l,e
inc e
ld c,e
ld b,2
 
.sq4_lp
ld a,b
cp 2
ld a,e
rra
add (hl)
ld (de),a
inc h
inc d
ld a,(hl)
adc c
ld (de),a
dec d
ld h,d
inc l
inc e
jr nz,sq4_lp
inc d
inc d
djnz sq4_lp
ret
 
align #100
.umul_tab ds #400</pre>
Now for the actual multiply routine:
 
'''Input:''' A = ''Multiplier'', L = ''Multiplicand''
 
'''Output:''' DE = ''Product''
<pre>ld h,umul_tab / #100 ; 2
ld b,h ; 3
add l ; 4
ld c,a ; 5
jr nc,@noovf ; 7
inc b ; 8
inc b ; 9
.@noovf
sub l ; 10
sub l ; 11
jr nc,@noneg ; 13
neg ; 15
.@noneg
ld l,a ; 16
ld a,(bc) ; 18
sub (hl) ; 20
ld e,a ; 21
inc b ; 22
inc h ; 23
ld a,(bc) ; 25
sbc (hl) ; 27
ld d,a ; 28</pre>
This code could easily be converted to a macro as it's only 24 bytes. I've tried to optimise it further but with no luck!
 
[[User:Executioner|Executioner]] 06:25, 4 April 2008 (CEST)
== 16bit * 16bit Unsigned ==
'''Output:''' A,HL = ''Product''
 <pre> ld ix,0
ld hl,0
.mul24b1:
db #dd:ld d,h
ex de,hl
ret</pre>
== Web links ==
151
edits