Last modified on 4 September 2024, at 17:03

6502

Revision as of 17:03, 4 September 2024 by Phi2x (Talk | contribs) (Oddities)

The 6502 CPU

The MOS Technology 6502 is an 8-bit microprocessor designed by Chuck Peddle for MOS Technology in 1975. When it was introduced it was the least expensive full featured CPU on the market by far, at about 1/6th the price, or less, of competing designs from larger companies such as Motorola and Intel. It was nevertheless faster than most of them, and, along with the Zilog Z80, sparked off a series of computer projects that would eventually result in the home computer revolution of the 1980s. The 6502 design was originally second-sourced by Rockwell and Synertek and later licensed to a number of companies; it is still made for embedded systems.

Originally the CPC was destined to be designed around the 6502 processor. But when Amstrad approached Locomotive Software to develop a Basic for it with a very tight deadline, Locomotive PLC, who already had a Z80 Basic in the works, urged and convinced Amstrad to switch to the Z80.


Description

The 6502 microprocessor is an 8-bit CPU with an 8-bit ALU and a 16-bit address bus capable of direct access to 64KB of memory space. Like the Z80, the 6502 is also a little-endian CPU, meaning it stores 16-bit values with the least significant byte first, followed by the most significant byte. The 6502 has 151 instructions, which are composed of 56 distinct opcodes across various addressing modes.

Although it lacks the raw processing power of processors like the Intel 80x86 or the Motorola 68000 series, the 6502 was known for its efficiency and affordability, making it a popular choice for embedded systems and early home computers. Its simple design contributed to lower manufacturing costs and simplified integration.

The 6502 chip is made up of 4528 transistors (3510 enhancement transistors and 1018 depletion pullup transistors). It comes in a 40-pin DIP package. It has been produced by various manufacturers and used in a wide range of applications, from early gaming consoles like the Atari VCS and Nintendo Entertainment System to personal computers like the Apple II and Commodore 64.


Registers

Register Size Description Notes
A (Accumulator) 8-bit Main register for arithmetic, logic, and data transfer Most operations use this register
X (Index Register X) 8-bit Used for indexing memory and loop counters Can be used for addressing modes like Indexed Indirect, Zero Page Indexed, and Absolute Indexed
Y (Index Register Y) 8-bit Used for indexing memory and loop counters Often used in Absolute and Zero Page Indexed addressing
P (Processor Status) 8-bit
  • bit7 - NF - Negative Flag
  • bit6 - VF - Overflow Flag
  • bit5 - Unused (always set to 1)
  • bit4 - BF - Break Command
  • bit3 - DF - Decimal Mode Flag
  • bit2 - IF - Interrupt Disable Flag
  • bit1 - ZF - Zero Flag
  • bit0 - CF - Carry Flag
Flags are affected by most operations
S (Stack Pointer) 8-bit Points to the current location in the stack Stack is located in page 1 ($0100-$01FF), 8-bit S is offset to this base
PC (Program Counter) 16-bit Points to the next instruction to be executed Automatically increments as instructions are executed


Decimal Mode

On NMOS, when Decimal Mode is on, the ADC and SBC instructions update NF, VF and ZF based on the binary result before the decimal correction is applied. Only CF is updated correctly. On CMOS, all the flags are updated correctly, at the cost of 1 additional cycle.

On NMOS, DF is not defined after RESET. On CMOS, DF is automatically cleared on RESET.

On NMOS, DF is unchanged when entering an interrupt of any kind. This can cause unexpected bugs in the interrupt handler if Decimal Mode is on when an interrupt occurs. On CMOS, DF is automatically cleared on interrupt. Upon returning from an interrupt, the processor restores the status register from the stack, including DF.


Memory Access

The address space that the 6502 uses is split into pages. There are 256 pages and each page is 256 bytes in size, ranging from page 0 to page 255.

In order to make up for the lack of registers, the 6502 includes a zero page addressing mode ($0000-$00FF) that uses only 1 address byte in the instruction instead of the 2 that are needed to address the full 64 KB of memory. This provides fast access to the first 256 bytes of RAM by using shorter instructions.

The stack is permanently located in page 1 ($0100-$01FF) and managed by the 8-bit stack pointer (S), with an initial value of $FF. It grows downward as data is pushed onto the stack. The stack has a 256-byte limit, and overflow occurs if not managed properly.

Instructions PHA and PHP push the accumulator and processor status onto the stack, while PLA and PLP pull them back. Subroutine calls with JSR store the return address on the stack, and RTS retrieves it to continue execution. Similarly, interrupts (BRK) push the program counter and status, while RTI restores them.

All I/O operations are memory-mapped. There are no port-based I/O instructions.

BRK / IRQ / NMI / RESET

On a RESET, the CPU loads the vector from $FFFC/$FFFD into the program counter and continues fetching instructions from there.

On an NMI, the CPU pushes the low byte and the high byte of the program counter as well as the processor status onto the stack, disables interrupts and loads the vector from $FFFA/$FFFB into the program counter and continues fetching instructions from there.

On an IRQ, the CPU does the same as in the NMI case, but uses the vector at $FFFE/$FFFF.

On a BRK instruction, the CPU does the same as in the IRQ case, but sets bit #4 (B flag) in the copy of the status register that is saved on the stack.

Half Cycles

The 6502 divides each clock cycle into two phases (ϕ1 and ϕ2):

  • During the ϕ1 half-cycle, no bus access occurs. This phase is dedicated to internal CPU operations.
  • During the ϕ2 half-cycle, the CPU accesses the external bus for memory reads/writes or I/O operations.

The use of half-cycles ensures that memory and I/O devices have predictable timing windows when the CPU will access the bus, while still allowing the CPU to perform internal operations in parallel.

Unlike most microprocessors, the 6502 does not make memory accesses on an "as needed" basis. It always does a fetch or store on every single clock cycle. When there isn't anything to be fetched or stored, a "garbage" fetch or store occurs. This is mainly of importance with the memory-mapped I/O devices:

  • On NMOS, when adding a carry to the MSB of an address, a fetch occurs at a garbage address. On CMOS, the last byte of the instruction is refetched.
  • On NMOS, when doing a fetch-modify-store instruction (INC, DEC, ASL, LSR, ROL, ROR), garbage is stored into the location during the "modify" cycle... followed by the "real" store cycle which stores the correct data. On CMOS, a second fetch is performed instead of a garbage store.


NMOS 6502 Instruction set

Standard instructions

Cycles are shown in parenthesis for each opcode. p=1 if page is crossed. t=1 if branch is taken.

Mnemonic Addressing Modes Flags Operation Description
No arg A #$nn $nnnn $nnnn,X $nnnn,Y ($nnnn) $nn $nn,X $nn,Y ($nn,X) ($nn),Y rel N V - B D I Z C
ADC 69 (2) 6D (4) 7D (4+p) 79 (4+p) 65 (3) 75 (4) 61 (6) 71 (5+p) N V - - - - Z C A + M + CF → A, CF Add Memory to Accumulator with Carry
AND 29 (2) 2D (4) 3D (4+p) 39 (4+p) 25 (3) 35 (4) 21 (6) 31 (5+p) N - - - - - Z - A ∧ M → A "AND" Memory with Accumulator
ASL 0A (2) 0E (6) 1E (7) 06 (5) 16 (6) N - - - - - Z C CF ← /M7...M0/ ← 0 Arithmetic Shift Left
BCC 90 (2+t+p) - - - - - - - - Branch on CF = 0 Branch on Carry Clear
BCS B0 (2+t+p) - - - - - - - - Branch on CF = 1 Branch on Carry Set
BEQ F0 (2+t+p) - - - - - - - - Branch on ZF = 1 Branch on Result Zero
BIT 2C (4) 24 (3) N V - - - - Z - A ∧ M, M7 → NF, M6 → VF Test Bits in Memory with Accumulator
BMI 30 (2+t+p) - - - - - - - - Branch on NF = 1 Branch on Result Minus
BNE D0 (2+t+p) - - - - - - - - Branch on ZF = 0 Branch on Result Not Zero
BPL 10 (2+t+p) - - - - - - - - Branch on NF = 0 Branch on Result Plus
BRK 00 (7) - - - - - 1 - - PC + 2↓, [FFFE] → PCL, [FFFF] → PCH Break Command
BVC 50 (2+t+p) - - - - - - - - Branch on VF = 0 Branch on Overflow Clear
BVS 70 (2+t+p) - - - - - - - - Branch on VF = 1 Branch on Overflow Set
CLC 18 (2) - - - - - - - 0 0 → CF Clear Carry Flag
CLD D8 (2) - - - - 0 - - - 0 → DF Clear Decimal Mode
CLI 58 (2) - - - - - 0 - - 0 → IF Clear Interrupt Disable
CLV B8 (2) - 0 - - - - - - 0 → VF Clear Overflow Flag
CMP C9 (2) CD (4) DD (4+p) D9 (4+p) C5 (3) D5 (4) C1 (6) D1 (5+p) N - - - - - Z C A - M Compare Memory and Accumulator
CPX E0 (2) EC (4) E4 (3) N - - - - - Z C X - M Compare Index Register X To Memory
CPY C0 (2) CC (4) C4 (3) N - - - - - Z C Y - M Compare Index Register Y To Memory
DEC CE (6) DE (7) C6 (5) D6 (6) N - - - - - Z - M - 1 → M Decrement Memory By One
DEX CA (2) N - - - - - Z - X - 1 → X Decrement Index Register X By One
DEY 88 (2) N - - - - - Z - Y - 1 → Y Decrement Index Register Y By One
EOR 49 (2) 4D (4) 5D (4+p) 59 (4+p) 45 (3) 55 (4) 41 (6) 51 (5+p) N - - - - - Z - A ⊻ M → A "Exclusive OR" Memory with Accumulator
INC EE (6) FE (7) E6 (5) F6 (6) N - - - - - Z - M + 1 → M Increment Memory By One
INX E8 (2) N - - - - - Z - X + 1 → X Increment Index Register X By One
INY C8 (2) N - - - - - Z - Y + 1 → Y Increment Index Register Y By One
JMP 4C (3) 6C (5) - - - - - - - - [PC + 1] → PCL, [PC + 2] → PCH JMP Indirect
JSR 20 (6) - - - - - - - - PC + 2↓, [PC + 1] → PCL, [PC + 2] → PCH Jump To Subroutine
LDA A9 (2) AD (4) BD (4+p) B9 (4+p) A5 (3) B5 (4) A1 (6) B1 (5+p) N - - - - - Z - M → A Load Accumulator with Memory
LDX A2 (2) AE (4) BE (4+p) A6 (3) B6 (4) N - - - - - Z - M → X Load Index Register X From Memory
LDY A0 (2) AC (4) BC (4+p) A4 (3) B4 (4) N - - - - - Z - M → Y Load Index Register Y From Memory
LSR 4A (2) 4E (6) 5E (7) 46 (5) 56 (6) 0 - - - - - Z C 0 → /M7...M0/ → CF Logical Shift Right
NOP EA (2) - - - - - - - - No operation No Operation
ORA 09 (2) 0D (4) 1D (4+p) 19 (4+p) 05 (3) 15 (4) 01 (6) 11 (5+p) N - - - - - Z - A ∨ M → A "OR" Memory with Accumulator
PHA 48 (3) - - - - - - - - A↓ Push Accumulator On Stack
PHP 08 (3) - - - 1 - - - - P↓ Push Processor Status on Stack
PLA 68 (4) N - - - - - Z - (S)↑ → A Pull Accumulator From Stack
PLP 28 (4) N V - B D I Z C (S)↑ → P Pull Processor Status From Stack
ROL 2A (2) 2E (6) 3E (7) 26 (5) 36 (6) N - - - - - Z C CF ← /M7...M0/ ← CF Rotate One Bit Left (Memory or Accumulator)
ROR 6A (2) 6E (6) 7E (7) 66 (5) 76 (6) N - - - - - Z C CF → /M7...M0/ → CF Rotate One Bit Right (Memory or Accumulator)
RTI 40 (6) N V - B D I Z C (S)↑ → P, (S)↑ → PCL, (S)↑ → PCH Return From Interrupt
RTS 60 (6) - - - - - - - - (S)↑ → PCL, (S)↑ → PCH, PC + 1 → PC Return From Subroutine
SBC E9 (2) ED (4) FD (4+p) F9 (4+p) E5 (3) F5 (4) E1 (6) F1 (5+p) N V - - - - Z C A - M - (1 - CF) → A Subtract Memory from Accumulator with Borrow
SEC 38 (2) - - - - - - - 1 1 → CF Set Carry Flag
SED F8 (2) - - - - 1 - - - 1 → DF Set Decimal Mode
SEI 78 (2) - - - - - 1 - - 1 → IF Set Interrupt Disable
STA 8D (4) 9D (5) 99 (5) 85 (3) 95 (4) 81 (6) 91 (6) - - - - - - - - A → M Store Accumulator in Memory
STX 8E (4) 86 (3) 96 (4) - - - - - - - - X → M Store Index X in Memory
STY 8C (4) 84 (3) 94 (4) - - - - - - - - Y → M Store Index Y in Memory
TAX AA (2) N - - - - - Z - A → X Transfer Accumulator to Index X
TAY A8 (2) N - - - - - Z - A → Y Transfer Accumulator to Index Y
TSX BA (2) N - - - - - Z - S → X Transfer Stack Pointer to Index X
TXA 8A (2) N - - - - - Z - X → A Transfer Index X to Accumulator
TXS 9A (2) - - - - - - - - X → S Transfer Index X to Stack Pointer
TYA 98 (2) N - - - - - Z - Y → A Transfer Index Y to Accumulator

Illegal instructions

The opcodes in bold are unstable. Only 2 of those 7 opcodes ($8B, $AB) are actually unstable in the sense that they may produce a truly unpredictable result. The other 5 opcodes actually produce predictable results – but the conditions under which they do that and the produced results are a bit unexpected.

Mnemonic Addressing Modes Flags Operation Description
No arg #$nn $nnnn $nnnn,X $nnnn,Y $nn $nn,X $nn,Y ($nn,X) ($nn),Y N V - B D I Z C
ANC (ANC2) 0B, 2B (2) N - - - - - Z C A ∧ M → A, NF → CF "AND" Memory with Accumulator then Move Negative Flag to Carry Flag
ARR 6B (2) N V - - - - Z C (A ∧ M) / 2 → A "AND" Accumulator then Rotate Right
ASR (ALR) 4B (2) 0 - - - - - Z C (A ∧ M) / 2 → A "AND" then Logical Shift Right
DCP (DCM) CF (6) DF (7) DB (7) C7 (5) D7 (6) C3 (8) D3 (8) N - - - - - Z C M - 1 → M, A - M Decrement Memory By One then Compare with Accumulator
ISC (ISB, INS) EF (6) FF (7) FB (7) E7 (5) F7 (6) E3 (8) F3 (8) N V - - - - Z C M + 1 → M, A - M → A Increment Memory By One then SBC then Subtract Memory from Accumulator with Borrow
JAM (KIL, HLT) 02, 12, 22,

32, 42, 52,

62, 72, 92,

B2, D2, F2 (X)

- - - - - - - - Stop execution Halt the CPU
LAS (LAR) BB (4+p) N - - - - - Z - M ∧ S → A, X, S "AND" Memory with Stack Pointer
LAX (LXA) AB (2) AF (4) BF (4+p) A7 (3) B7 (4) A3 (6) B3 (5+p) N - - - - - Z - M → A, X Load Accumulator and Index Register X From Memory
NOP (DOP, TOP) 1A, 3A, 5A,

7A, DA, FA (2)

80, 82, 89,

C2, E2 (2)

0C (4) 1C, 3C, 5C,

7C, DC, FC (4+p)

04, 44, 64 (3) 14, 34, 54,

74, D4, F4 (4)

- - - - - - - - No operation No Operation
RLA 2F (6) 3F (7) 3B (7) 27 (5) 37 (6) 23 (8) 33 (8) N - - - - - Z C CF ← /M7...M0/ ← CF, A ∧ M → A Rotate Left then "AND" with Accumulator
RRA 6F (6) 7F (7) 7B (7) 67 (5) 77 (6) 63 (8) 73 (8) N V - - - - Z C CF → /M7...M0/ → CF, A + M + CF → A Rotate Right and Add Memory to Accumulator
SAX (AXS, AAX) 8F (4) 87 (3) 97 (4) 83 (6) - - - - - - - - A ∧ X → M Store Accumulator "AND" Index Register X in Memory
SBC (USBC) EB (2) N V - - - - Z C A - M - ~CF → A Subtract Memory from Accumulator with Borrow
SBX (AXS, SAX) CB (2) N - - - - - Z C (A ∧ X) - M → X Subtract Memory from Accumulator "AND" Index Register X
SHA (AHX, AXA) 9F (5) 93 (6) - - - - - - - - A ∧ X ∧ V → M Store Accumulator "AND" Index Register X "AND" Value
SHS (TAS, XAS) 9B (5) - - - - - - - - A ∧ X → S, S ∧ (H + 1) → M Transfer Accumulator "AND" Index Register X to Stack Pointer then Store Stack Pointer "AND" Hi-Byte In Memory
SHX (SXA, XAS) 9E (5) - - - - - - - - X ∧ (H + 1) → M Store Index Register X "AND" Value
SHY (SYA, SAY) 9C (5) - - - - - - - - Y ∧ (H + 1) → M Store Index Register Y "AND" Value
SLO (ASO) 0F (6) 1F (7) 1B (7) 07 (5) 17 (6) 03 (8) 13 (8) N - - - - - Z C M * 2 → M, A ∨ M → A Arithmetic Shift Left then "OR" Memory with Accumulator
SRE (LSE) 4F (6) 5F (7) 5B (7) 47 (5) 57 (6) 43 (8) 53 (8) N - - - - - Z C M / 2 → M, A ⊻ M → A Logical Shift Right then "Exclusive OR" Memory with Accumulator
XAA (ANE) 8B (2) N - - - - - Z - (A ∨ V) ∧ X ∧ M → A Non-deterministic Operation of Accumulator, Index Register X, Memory and Bus Contents


Opcode matrix

Addressing modes: A - accumulator, # - immediate, zpg - zero page, abs - absolute, ind - indirect, rel - relative. Uncolored cells are illegal opcodes.
High nibble Low nibble
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 BRK ORA (ind,X) JAM SLO (ind,X) NOP zpg ORA zpg ASL zpg SLO zpg PHP ORA # ASL A ANC # NOP abs ORA abs ASL abs SLO abs
1 BPL rel ORA (ind),Y JAM SLO (ind),Y NOP zpg,X ORA zpg,X ASL zpg,X SLO zpg,X CLC ORA abs,Y NOP SLO abs,Y NOP abs,X ORA abs,X ASL abs,X SLO abs,X
2 JSR abs AND (ind,X) JAM RLA (ind,X) BIT zpg AND zpg ROL zpg RLA zpg PLP AND # ROL A ANC # BIT abs AND abs ROL abs RLA abs
3 BMI rel AND (ind),Y JAM RLA (ind),Y NOP zpg,X AND zpg,X ROL zpg,X RLA zpg,X SEC AND abs,Y NOP RLA abs,Y NOP abs,X AND abs,X ROL abs,X RLA abs,X
4 RTI EOR (ind,X) JAM SRE (ind,X) NOP zpg EOR zpg LSR zpg SRE zpg PHA EOR # LSR A ALR # JMP abs EOR abs LSR abs SRE abs
5 BVC rel EOR (ind),Y JAM SRE (ind),Y NOP zpg,X EOR zpg,X LSR zpg,X SRE zpg,X CLI EOR abs,Y NOP SRE abs,Y NOP abs,X EOR abs,X LSR abs,X SRE abs,X
6 RTS ADC (ind,X) JAM RRA (ind,X) NOP zpg ADC zpg ROR zpg RRA zpg PLA ADC # ROR A ARR # JMP (ind) ADC abs ROR abs RRA abs
7 BVS rel ADC (ind),Y JAM RRA (ind),Y NOP zpg,X ADC zpg,X ROR zpg,X RRA zpg,X SEI ADC abs,Y NOP RRA abs,Y NOP abs,X ADC abs,X ROR abs,X RRA abs,X
8 NOP # STA (ind,X) NOP # SAX (ind,X) STY zpg STA zpg STX zpg SAX zpg DEY NOP # TXA ANE # STY abs STA abs STX abs SAX abs
9 BCC rel STA (ind),Y JAM SHA (ind),Y STY zpg,X STA zpg,X STX zpg,Y SAX zpg,Y TYA STA abs,Y TXS TAS abs,Y SHY abs,X STA abs,X SHX abs,Y SHA abs,Y
A LDY # LDA (ind,X) LDX # LAX (ind,X) LDY zpg LDA zpg LDX zpg LAX zpg TAY LDA # TAX LXA # LDY abs LDA abs LDX abs LAX abs
B BCS rel LDA (ind),Y JAM LAX (ind),Y LDY zpg,X LDA zpg,X LDX zpg,Y LAX zpg,Y CLV LDA abs,Y TSX LAS abs,Y LDY abs,X LDA abs,X LDX abs,Y LAX abs,Y
C CPY # CMP (ind,X) NOP # DCP (ind,X) CPY zpg CMP zpg DEC zpg DCP zpg INY CMP # DEX AXS # CPY abs CMP abs DEC abs DCP abs
D BNE rel CMP (ind),Y JAM DCP (ind),Y NOP zpg,X CMP zpg,X DEC zpg,X DCP zpg,X CLD CMP abs,Y NOP DCP abs,Y NOP abs,X CMP abs,X DEC abs,X DCP abs,X
E CPX # SBC (ind,X) NOP # ISC (ind,X) CPX zpg SBC zpg INC zpg ISC zpg INX SBC # NOP SBC # CPX abs SBC abs INC abs ISC abs
F BEQ rel SBC (ind),Y JAM ISC (ind),Y NOP zpg,X SBC zpg,X INC zpg,X ISC zpg,X SED SBC abs,Y NOP ISC abs,Y NOP abs,X SBC abs,X INC abs,X ISC abs,X


Block Diagram

Mcs6502-block-diagram.svg


Oddities

  • On NMOS, when the indirect address crosses a page boundary, the 6502 does not add the carry to calculate the address of the high byte. For example, JMP ($19FF) will use the contents of $19FF and $1900 for the JMP address. On CMOS, this issue was fixed, at the cost of 1 additional cycle. In our example, JMP ($19FF) will use the contents of $19FF and $2000 for the JMP address.
  • On NMOS, the simultaneous assertion of a hardware interrupt line and execution of BRK was not accounted for in the design — the BRK instruction will be ignored in such a case. On CMOS, this situation is correctly handled by executing BRK and then servicing the interrupt.
  • Some instructions, particularly those involving branches or indexed addressing modes, incur an extra cycle if the processor has to cross a memory page boundary. This is problematic for time-sensitive code.
  • Conditional jumps are only 8-bit relative. And unconditional jumps are only 16-bit absolute.
  • LDX absolute,Y and LDY absolute,X instructions exist but not the corresponding instructions STX absolute,Y and STY absolute,X.
  • The CLV (Clear Overflow Flag) instruction exist but not the SEV (Set Overflow Flag) instruction.
  • The ROR instruction didn't exist in the very earliest (pre-1977) chips.
  • The NOP instruction takes 2 full-cycles. This is the minimum amount of cycles an instruction can take.
  • The 65C02 fixed multiple bugs of the original NMOS 6502, but also removed access to all illegal instructions. And it added some extra instructions. In fact, there are multiple implementations of the 65C02 (WDC 65C02, WDC 65C02S, Rockwell R65C02, CSG 65CE02), each with its own variant of the instruction set.


Links