In r008.5.14, an M1 that occurs at the same time as IO_ACK is ignored (treated as not M1) by the WAIT_n generator.
In r008.5.14, MEM_WR has an OSD menu option to switch between "quick" and "slow" modes. "Slow" mode inserts ONE WAIT_n when a MEM_WR is detected. This switch exists because some games run in "slow" mode and others in "quick" mode. In fact, several instructions perform a MEM_WR, and adding ONE WAIT_n to each of them results in different synchronization cases. If this is about the GA reading pixels, perhaps the M1 signal is not the only one that is truly synchronized: the MEM_RD and MEM_WR accesses may be synchronized too, at another offset.
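The two rules above can be sketched as a small Python model. This is only an illustration of the behaviour described here, not the core's actual HDL: the names `BusCycle`, `m1_seen_by_wait_gen` and `mem_wr_extra_waits` are hypothetical, and the model assumes each signal is simply active or inactive for a whole bus cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusCycle:
    """Hypothetical snapshot of the signals during one Z80 bus cycle."""
    m1: bool = False      # M1 active
    io_ack: bool = False  # IO_ACK active
    mem_wr: bool = False  # MEM_WR active


def m1_seen_by_wait_gen(cycle: BusCycle) -> bool:
    # Rule 1: an M1 that coincides with IO_ACK is treated as "not M1".
    return cycle.m1 and not cycle.io_ack


def mem_wr_extra_waits(cycle: BusCycle, slow_mode: bool) -> int:
    # Rule 2: in "slow" mode each detected MEM_WR gets ONE extra WAIT_n;
    # in "quick" mode none is added.
    return 1 if (cycle.mem_wr and slow_mode) else 0


def total_extra_waits(cycles, slow_mode: bool) -> int:
    # Sum of the extra waits over all bus cycles of one instruction.
    return sum(mem_wr_extra_waits(c, slow_mode) for c in cycles)


# Two made-up instructions: one with a single write, one with two writes.
one_write = [BusCycle(m1=True), BusCycle(mem_wr=True)]
two_writes = [BusCycle(m1=True), BusCycle(mem_wr=True), BusCycle(mem_wr=True)]
```

In this model an instruction with two writes picks up two extra waits in "slow" mode while an instruction with one write picks up only one, which is why the switch shifts different instructions by different amounts.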
[http://www.cpcwiki.eu/forum/emulators/cpc-z80-timing/ CPC Z80 timing]