</pre>
== FasterFastest, accurate 8bit * 8bit Unsigned  with 16 KB tables ==
'''Input:''' L = Multiplier, C = Multiplicand
</pre>
16K is a lot of memory to use for tables== Fast, but I'm working on a way to reduce this to 8K while maintaining similar performance (around 26μs). The idea is rather than using accurate 8bit * 8bit Unsigned  with 8 KB tables for the low and high nibbles, use tables for alternate bits. So there would be 16 tables for the values #00, #01, #04, #05, #10, #11, #14, #15, #40, #41, #44, #45, #50, #51, #54, #55. The values can be shifted by 1 (rather than 4) to give values for the alternate bits using a simple ADD HL,HL.==
This routine should be easy to adapt to a 16bit * 8bit unsigned multiplication or even a 16bit * 16bit'''Input:''' L = Multiplier, using simple additions.C = Multiplicand
[[User'''Output:Executioner|Executioner]] 03:58, 8 May 2007 (CEST)''' DE = Product
'''CPC Cycles:''' 100 = 25 usec
'''Size:''' 19 bytes of code + 8 KB tables
 
16K is a lot of memory to use for tables, but I'm working on a way to reduce this to 8K while maintaining similar performance (around 26μs). The idea is rather than using tables for the low and high nibbles, use tables for alternate bits. So there would be 16 tables for the values #00, #01, #04, #05, #10, #11, #14, #15, #40, #41, #44, #45, #50, #51, #54, #55. The values can be shifted by 1 (rather than 4) to give values for the alternate bits using a simple ADD HL,HL.
 
This routine should be easy to adapt to a 16bit * 8bit unsigned multiplication or even a 16bit * 16bit, using simple additions.
 
[[User:Executioner|Executioner]] 03:58, 8 May 2007 (CEST)
Ok, here is the new 8K routine, first the routine to build the tables:
'''Output''': DE = Product
'''CPC Cycles''': 104-112 (108 on average) = 26-28 (27)usec
'''Size''': 24 bytes of code and 1024 bytes for the tables
[[User:Executioner|Executioner]] 06:25, 4 April 2008 (CEST)
 
 
'''Output''': A:E = Product
 
'''CPC Cycles''': 136-172 (154 on average) = 26-28 (27) usec
 
'''Size''': 44 bytes of code and 512 bytes for the tables
 
Fast 8 bit unsigned multiplication with 16 bit result. It uses formula
<pre>
  x*y = ((x+y)/2)^2 - ((x-y)/2)^2, if x+y is even
      = ((x+y-1)/2)^2 - ((x-y-1)/2)^2 + y, if x+y is odd and x>=y
</pre>
 
<pre>    cp b
    jr nc,l1
 
    ld e,a
    ld a,b
    ld b,e
 
l1  ld c,a
    sub b
    rra
    ld d,a
    ld a,c
    add a,b
    rra
    ld l,a
    ld h,high(sqrlo)
    ld a,(hl)
    ld e,l
    ld l,d
    jr nc,l2
 
    sub (hl)   ;odd
    ld l,e
    ld e,a
    inc h      ;loads high(sqrhi)
    ld a,(hl)
    ld l,d
    sbc a,(hl)
    ld d,a
 
    ld a,e
    add a,b
    ld e,a
    ld a,d
    adc a,0
    ret
 
l2  sub (hl)   ;even
    ld l,e
    ld e,a
    inc h
    ld a,(hl)
    ld l,d
    sbc a,(hl)
    ret
 
sqrlo ;low(x*x)   should be at the page border
    db 0,1,4,9,$10,$19,$24,$31,$40,$51,$64,$79,$90,$a9,$c4,$e1
    db 0,$21,$44,$69,$90,$b9,$e4,$11,$40,$71,$a4,$d9,$10,$49,$84,$c1
    db 0,$41,$84,$c9,$10,$59,$a4,$f1,$40,$91,$e4,$39,$90,$e9,$44,$a1
    db 0,$61,$c4,$29,$90,$f9,$64,$d1,$40,$b1,$24,$99,$10,$89,4,$81
    db 0,$81,4,$89,$10,$99,$24,$b1,$40,$d1,$64,$f9,$90,$29,$c4,$61
    db 0,$a1,$44,$e9,$90,$39,$e4,$91,$40,$f1,$a4,$59,$10,$c9,$84,$41
    db 0,$c1,$84,$49,$10,$d9,$a4,$71,$40,$11,$e4,$b9,$90,$69,$44,$21
    db 0,$e1,$c4,$a9,$90,$79,$64,$51,$40,$31,$24,$19,$10,9,4,$1
    db 0,1,4,9,$10,$19,$24,$31,$40,$51,$64,$79,$90,$a9,$c4,$e1
    db 0,$21,$44,$69,$90,$b9,$e4,$11,$40,$71,$a4,$d9,$10,$49,$84,$c1
    db 0,$41,$84,$c9,$10,$59,$a4,$f1,$40,$91,$e4,$39,$90,$e9,$44,$a1
    db 0,$61,$c4,$29,$90,$f9,$64,$d1,$40,$b1,$24,$99,$10,$89,4,$81
    db 0,$81,4,$89,$10,$99,$24,$b1,$40,$d1,$64,$f9,$90,$29,$c4,$61
    db 0,$a1,$44,$e9,$90,$39,$e4,$91,$40,$f1,$a4,$59,$10,$c9,$84,$41
    db 0,$c1,$84,$49,$10,$d9,$a4,$71,$40,$11,$e4,$b9,$90,$69,$44,$21
    db 0,$e1,$c4,$a9,$90,$79,$64,$51,$40,$31,$24,$19,$10,9,4,$1
sqrhi ;high(x*x)
    db 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    db 1,1,1,1,1,1,1,2,2,2,2,2,3,3,3,3
    db 4,4,4,4,5,5,5,5,6,6,6,7,7,7,8,8
    db 9,9,9,$a,$a,$a,$b,$b,$c,$c,$d,$d,$e,$e,$f,$f
    db $10,$10,$11,$11,$12,$12,$13,$13,$14,$14,$15,$15,$16,$17,$17,$18
    db $19,$19,$1a,$1a,$1b,$1c,$1c,$1d,$1e,$1e,$1f,$20,$21,$21,$22,$23
    db $24,$24,$25,$26,$27,$27,$28,$29,$2a,$2b,$2b,$2c,$2d,$2e,$2f,$30
    db $31,$31,$32,$33,$34,$35,$36,$37,$38,$39,$3a,$3b,$3c,$3d,$3e,$3f
    db $40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$4a,$4b,$4c,$4d,$4e,$4f
    db $51,$52,$53,$54,$55,$56,$57,$59,$5a,$5b,$5c,$5d,$5f,$60,$61,$62
    db $64,$65,$66,$67,$69,$6a,$6b,$6c,$6e,$6f,$70,$72,$73,$74,$76,$77
    db $79,$7a,$7b,$7d,$7e,$7f,$81,$82,$84,$85,$87,$88,$8a,$8b,$8d,$8e
    db $90,$91,$93,$94,$96,$97,$99,$9a,$9c,$9d,$9f,$a0,$a2,$a4,$a5,$a7
    db $a9,$aa,$ac,$ad,$af,$b1,$b2,$b4,$b6,$b7,$b9,$bb,$bd,$be,$c0,$c2
    db $c4,$c5,$c7,$c9,$cb,$cc,$ce,$d0,$d2,$d4,$d5,$d7,$d9,$db,$dd,$df
    db $e1,$e2,$e4,$e6,$e8,$ea,$ec,$ee,$f0,$f2,$f4,$f6,$f8,$fa,$fc,$fe
</pre>
 
[[User:Litwr|Litwr]] 10:25, 24 October 2015 (CEST)
== 16bit * 16bit Unsigned ==