* To set A to zero, XOR A is one byte smaller and one M-cycle faster than LD A,0. Unlike LD A,0, it also modifies the flags.
* To check whether A is zero, AND A or OR A is one byte smaller and one M-cycle faster than CP 0.
* An unrolled LDI loop is faster than LDIR. The same applies to the other Z80 block instructions.
* If a table is aligned to a 256-byte boundary, its contents can be accessed by placing the index in a register such as L and the high byte of the table address in H. This is faster than loading the full unaligned 16-bit address and adding a 16-bit index to it, and it makes accessing tables of 256 bytes or less very convenient.
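
The last two tips can be sketched as follows. This is a minimal illustration, not a definitive implementation: the labels (`Table`, `Source`, `Dest` and the routine names) are hypothetical, and the `ALIGN` and `DEFS` directives and the `>>` operator are assumptions about the assembler, since syntax varies between tools. The T-state counts in the comments are for a plain Z80 without wait states.

```asm
; Sketch only. ALIGN, DEFS and the >> operator are assumed to be supported
; by the assembler; all labels are hypothetical.

; Aligned table lookup. In: A = index (0..255). Out: A = Table[index].
LookupAligned:
        LD   H, Table >> 8      ; high byte of the table address      (7)
        LD   L, A               ; index becomes the low byte          (4)
        LD   A, (HL)            ; fetch the entry                     (7)
        RET                     ; lookup body: 18 T-states

; General lookup that does not rely on alignment, for comparison.
LookupUnaligned:
        LD   HL, Table          ; full 16-bit table address           (10)
        LD   E, A               ; build a 16-bit index in DE          (4)
        LD   D, 0               ;                                     (7)
        ADD  HL, DE             ; address = table + index             (11)
        LD   A, (HL)            ; fetch the entry                     (7)
        RET                     ; lookup body: 39 T-states

; Copy 4 bytes with unrolled LDI instead of LD BC,4 / LDIR.
; Each LDI takes 16 T-states; LDIR takes 21 per byte while it repeats.
; Note that each LDI still decrements BC.
Copy4:
        LD   HL, Source
        LD   DE, Dest
        LDI
        LDI
        LDI
        LDI
        RET

        ALIGN 256               ; low byte of Table's address is 00h
Table:  DEFS 256                ; 256 one-byte entries
Source: DEFS 4
Dest:   DEFS 4
```

The aligned lookup only works because the low byte of `Table` is zero, so the index can be used directly as the low byte of the address. The unrolled copy trades code size for speed, so it suits small, fixed-size copies best.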