Difference between revisions of "Programming:Fast Square Root"

From CPCWiki - THE Amstrad CPC encyclopedia!
Jump to: navigation, search
(Added optimised version (much much faster!))
(Faster Again)
 
(2 intermediate revisions by the same user not shown)
Line 66: Line 66:
  
 
<pre>
 
<pre>
CP 8 * 8
+
SqrtA: CP 8 * 8
 
JR NC,ge8
 
JR NC,ge8
 
CP 4 * 4
 
CP 4 * 4
Line 74: Line 74:
 
OR A
 
OR A
 
RET Z
 
RET Z
INC A
+
LD A,1
 
RET
 
RET
 
ge2: CP 3 * 3
 
ge2: CP 3 * 3
LD A, 3
+
SBC A
SBC 0
+
ADD 3
 
RET
 
RET
 
ge4: CP 6 * 6
 
ge4: CP 6 * 6
 
JR NC,ge6
 
JR NC,ge6
 
CP 5 * 5
 
CP 5 * 5
LD A, 5
+
SBC A
SBC 0
+
ADD 5
 
RET
 
RET
 
ge6: CP 7 * 7
 
ge6: CP 7 * 7
LD A, 7
+
SBC A
SBC 0
+
ADD 7
 
RET
 
RET
 
ge8: CP 12 * 12
 
ge8: CP 12 * 12
Line 95: Line 95:
 
JR NC,ge10
 
JR NC,ge10
 
CP 9 * 9
 
CP 9 * 9
LD A,9
+
SBC A
SBC 0
+
ADD 9
 
RET
 
RET
 
ge10: CP 11 * 11
 
ge10: CP 11 * 11
LD A,11
+
SBC A
SBC 0
+
ADD 11
 
RET
 
RET
 
ge12: CP 14 * 14
 
ge12: CP 14 * 14
 
JR NC,ge14
 
JR NC,ge14
 
CP 13 * 13
 
CP 13 * 13
LD A,13
+
SBC A
SBC 0
+
ADD 13
 
RET
 
RET
 
ge14: CP 15 * 15
 
ge14: CP 15 * 15
LD A,15
+
SBC A
SBC 0
+
ADD 15
 
RET
 
RET
 
</pre>
 
</pre>
 +
 +
The average time for this routine, including CALL and RET is 26.9us.
 +
 
[[Category:Programming]]
 
[[Category:Programming]]

Latest revision as of 20:03, 18 September 2007

Input: A = The number you want to find the square root of

Output: A = Square root of the number supplied in A

Destroys: A,B,DE

Call: CALL SqrtA

;this routine written 10 - 28 - 2003
;ported from the 68k version (by Frank Yaul) by konrad meyer
;this routine is not as memory efficient as it could be; this is
;done to save cycles

;note: this routine is using the fastest available method
;and will come within 1 below the #
;note: this routine is inaccurate in #s below 4

SqrtA:
	LD	(Asqr),A
	SRL	A
	JR	DataOver
Asqr:
	.DB	0
Bsqr:
	.DB	0
Csqr:
	.DB	0
DataOver:
	LD	(Bsqr),A
	LD	B,A
	LD	(Csqr),A
iterate:
	LD	A,(Asqr)
	LD	B,(Bsqr)
	LD	D,A
	LD	E,B
divideDbyEreturnA:
	RL	D
	RLA
	SUB	E
	JR	nc,$+3
	ADD	A,E
	LD	E,A
	LD	A,D
	CPL
	LD	B,(Bsqr)
	ADD	A,B
	SRL	A
	LD	(Bsqr),A
	LD	A,(Csqr)
	DEC	A
	LD	B,(Bsqr)
	CP	B
	JR	z,done
	LD	(Csqr),B
	JR	iterate
done:
	LD	A,(Bsqr)
	RET

Faster Again

Since there are only 15 possible results for the square root of an 8 bit number, it's really just a series of comparisons, optimised to use a binary comparison. This routine also leaves all other registers intact:

SqrtA:	CP 8 * 8
	JR NC,ge8
	CP 4 * 4
	JR NC,ge4
	CP 2 * 2
	JR NC,ge2
	OR A
	RET Z
	LD A,1
	RET
ge2:	CP 3 * 3
	SBC A
	ADD 3
	RET
ge4:	CP 6 * 6
	JR NC,ge6
	CP 5 * 5
	SBC A
	ADD 5
	RET
ge6:	CP 7 * 7
	SBC A
	ADD 7
	RET
ge8:	CP 12 * 12
	JR NC,ge12
	CP 10 * 10
	JR NC,ge10
	CP 9 * 9
	SBC A
	ADD 9
	RET
ge10:	CP 11 * 11
	SBC A
	ADD 11
	RET
ge12:	CP 14 * 14
	JR NC,ge14
	CP 13 * 13
	SBC A
	ADD 13
	RET
ge14:	CP 15 * 15
	SBC A
	ADD 15
	RET

The average time for this routine, including CALL and RET is 26.9us.