In this chapter, we’re going to give our Game Boy CPU the ability to make decisions and reuse code. We’ll implement the crucial control flow instructions: JP (Jump), JR (Jump Relative), CALL, and RET (Return), along with their conditional variants. These instructions are fundamental to how programs execute, allowing them to branch, loop, and call subroutines.
By the end of this milestone, your emulator will be able to follow more complex program paths, enabling it to execute actual Game Boy program logic beyond simple linear instruction sequences. This is a significant step towards running real Game Boy ROMs, as it unlocks the ability for programs to react to different states and organize their code efficiently.
Project Overview
Our overarching goal is to build a functional Game Boy emulator in F# from first principles. This involves meticulously replicating the behavior of the Game Boy’s hardware components, including its custom 8-bit CPU (SM83), Memory Management Unit (MMU), Picture Processing Unit (PPU), and more. Each chapter incrementally adds a critical piece of this complex system.
Tech Stack
For this project, we are leveraging:
- F# (latest stable release via .NET SDK): A functional-first language on the .NET platform, chosen for its strong type system, conciseness, and excellent tooling. Its immutability-first approach helps manage complex state in an emulator.
- .NET SDK (latest stable release): Provides the runtime, libraries, and build tools for F# applications. As of 2026-05-05, this would typically be .NET 9 or .NET 10.
- SDL.NET (or similar): While not directly implemented in this chapter, a cross-platform graphics library like SDL.NET will eventually be used for rendering the Game Boy’s display output.
Milestones and Build Plan
This chapter focuses on enabling the CPU to alter its execution path. Our plan includes:
- Update CPU State: Introduce the Stack Pointer (SP) register to our CpuState record.
- Implement Stack Operations: Create pushWord and popWord functions in the MMU to handle 16-bit data on the stack, respecting the Game Boy’s little-endian architecture.
- Integrate Control Flow Opcodes: Extend the executeInstruction function with match cases for the JP, JR, CALL, and RET instructions, including their conditional variants.
- Verify Execution: Test stack operations with unit tests and validate CPU control flow by observing CPU state during the execution of simple test ROMs.
Architecture
Implementing control flow requires a deeper interaction between the CPU and memory, specifically concerning the stack. The stack is a region of memory used by the CPU to temporarily store return addresses for CALL instructions and other local data.
The Stack and Stack Pointer
The Game Boy’s CPU (a custom Z80 variant, often called SM83) has a 16-bit Stack Pointer (SP) register. The stack grows downwards from higher memory addresses to lower ones. When a value is “pushed” onto the stack, the SP is decremented, and the value is written to the new SP address. When a value is “popped,” the value is read from SP, and SP is incremented. This Last-In, First-Out (LIFO) structure is crucial for managing subroutine calls.
Stack Behavior Summary:
- PUSH value: SP decrements by 2 (for a 16-bit value), then value is written to [SP]. Due to little-endian ordering, the low byte goes to SP and the high byte to SP+1.
- POP value: value is read from [SP] (low byte at SP, high byte at SP+1), then SP increments by 2.
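The push and pop rules above can be sketched in a few lines. This is a minimal illustration that models memory as a bare mutable byte array instead of the chapter’s MmuState; the names push16 and pop16 are hypothetical and exist only for this example.

```fsharp
// Illustrative only: memory is a plain 64 KiB array, not the chapter's MmuState.
let memory : byte[] = Array.zeroCreate 0x10000

/// Push: SP moves down by 2, the low byte lands at SP and the high byte at SP + 1.
let push16 (sp: uint16) (value: uint16) : uint16 =
    let newSp = sp - 2us
    memory.[int newSp] <- byte (value &&& 0xFFus)
    memory.[int (newSp + 1us)] <- byte (value >>> 8)
    newSp

/// Pop: read the low byte at SP and the high byte at SP + 1, then SP moves up by 2.
let pop16 (sp: uint16) : uint16 * uint16 =
    let low = uint16 memory.[int sp]
    let high = uint16 memory.[int (sp + 1us)]
    ((high <<< 8) ||| low, sp + 2us)

let spAfterPush = push16 0xFFFEus 0x1234us
let popped, spAfterPop = pop16 spAfterPush
printfn "SP after push: 0x%04X, bytes: %02X %02X" spAfterPush memory.[0xFFFC] memory.[0xFFFD]
printfn "popped: 0x%04X, SP after pop: 0x%04X" popped spAfterPop
```

Pushing 0x1234 with SP = 0xFFFE leaves 0x34 at 0xFFFC and 0x12 at 0xFFFD, and the pop restores both the value and the original SP.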
Control Flow Instruction Types
We’re implementing three primary categories of control flow instructions:
- Jumps (JP, JR): These instructions alter the Program Counter (PC) to a new address, either unconditionally or based on CPU flags.
  - JP nn: Jump to an absolute 16-bit address nn.
  - JR e: Jump relative by a signed 8-bit offset e, measured from the address of the instruction that follows the JR.
  - Conditional Jumps: JP Z, nn (jump if Zero flag is set), JP NZ, nn (jump if Zero flag is not set), JP C, nn (jump if Carry flag is set), JP NC, nn (jump if Carry flag is not set). Similar conditional variants exist for JR.
- Calls (CALL): These instructions are used to call subroutines. They save the current execution context (the address of the next instruction) before jumping.
  - CALL nn: Push the address of the next instruction (after the CALL itself) onto the stack, then jump to absolute 16-bit address nn.
  - Conditional Calls: CALL Z, nn, CALL NZ, nn, CALL C, nn, CALL NC, nn.
- Returns (RET): These instructions return from subroutines, restoring the program flow to where the CALL originated.
  - RET: Pop a 16-bit address from the stack, and jump to that address.
  - Conditional Returns: RET Z, RET NZ, RET C, RET NC.
CPU Decision Flow for Conditional Jumps
The CPU’s logic for handling conditional control flow is straightforward: fetch, check condition, then act.
This diagram illustrates how the CPU evaluates a conditional jump or call. If the condition is met, the PC is updated to the target address. Otherwise, the PC simply advances past the instruction.
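The decision flow described above can be sketched as a small function. This is an illustration under stated assumptions: the Flags record, the Condition type, and nextPcForJp are hypothetical names for this example, and they differ from the checkZ and checkC helpers used later in the chapter.

```fsharp
// Hypothetical stand-in for the chapter's CpuFlags record.
type Flags = { Z: bool; N: bool; H: bool; C: bool }

// The four branch conditions, plus the unconditional case.
type Condition =
    | Always
    | IfNotZero   // NZ
    | IfZero      // Z
    | IfNoCarry   // NC
    | IfCarry     // C

/// Step 2 of the flow: check the condition against the current flags.
let isMet (flags: Flags) (cond: Condition) : bool =
    match cond with
    | Always -> true
    | IfNotZero -> not flags.Z
    | IfZero -> flags.Z
    | IfNoCarry -> not flags.C
    | IfCarry -> flags.C

/// Step 3: for a 3-byte JP located at `pc`, either jump to `target`
/// or fall through to the instruction after the operand bytes.
let nextPcForJp (flags: Flags) (cond: Condition) (pc: uint16) (target: uint16) : uint16 =
    if isMet flags cond then target else pc + 3us

let flags = { Z = true; N = false; H = false; C = false }
printfn "0x%04X" (nextPcForJp flags IfZero 0x0150us 0x0200us)     // condition met
printfn "0x%04X" (nextPcForJp flags IfNotZero 0x0150us 0x0200us)  // condition not met
```

With the Zero flag set, the first call yields the target 0x0200 and the second falls through to 0x0153.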
Step-by-Step Implementation
We’ll start by updating our CpuState to include the SP, then implement the necessary stack operations in Mmu.fs, and finally integrate the new opcodes into Cpu.fs.
1. Update CpuState
First, open Cpu.fs and add the SP register to our CpuState record. We’ll also update the initialCpuState to reflect a typical Game Boy boot-up value for SP.
// src/GbSharp.Core/Cpu.fs
module GbSharp.Core.Cpu
// ... (existing types)
type CpuState =
    {
        A: byte
        F: byte
        B: byte
        C: byte
        D: byte
        E: byte
        H: byte
        L: byte
        PC: uint16
        SP: uint16 // Add the Stack Pointer here
        IME: bool // Interrupt Master Enable
        Halted: bool
        Stopped: bool
        Flags: CpuFlags
    }
    // 16-bit views of the register pairs. Records are immutable, so these are read-only.
    member this.BC = (uint16 this.B <<< 8) ||| uint16 this.C
    member this.DE = (uint16 this.D <<< 8) ||| uint16 this.E
    member this.HL = (uint16 this.H <<< 8) ||| uint16 this.L
    // A property setter cannot return a new record, so updates are exposed as methods that return a copy.
    member this.WithBC (value: uint16) = { this with B = byte (value >>> 8); C = byte (value &&& 0xFFus) }
    member this.WithDE (value: uint16) = { this with D = byte (value >>> 8); E = byte (value &&& 0xFFus) }
    member this.WithHL (value: uint16) = { this with H = byte (value >>> 8); L = byte (value &&& 0xFFus) }
// ... (rest of the module)
let initialCpuState =
    {
        A = 0x01uy // Typically 0x01 after boot ROM
        F = 0xB0uy // Typically 0xB0 after boot ROM
        B = 0x00uy
        C = 0x13uy
        D = 0x00uy
        E = 0xD8uy
        H = 0x01uy
        L = 0x4Duy
        PC = 0x0100us // Cartridge entry point; use 0x0000 instead when executing a boot ROM
        SP = 0xFFFEus // Initial stack pointer
        IME = false
        Halted = false
        Stopped = false
        Flags = { Z = true; N = false; H = true; C = true } // Corresponds to F = 0xB0
    }
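We initialize SP to 0xFFFE, the value the Game Boy’s boot ROM leaves in the stack pointer. This address is the top of the high RAM area (HRAM, 0xFF80 to 0xFFFE), so the first 16-bit push writes to 0xFFFC and 0xFFFD. Programs are free to move the stack elsewhere, for example into work RAM, by loading a new value into SP.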
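Because the initial state stores the flags twice, once as the raw F byte and once as the Flags record, the two need to agree. The sketch below shows one way to convert between them using the SM83 flag layout (Z in bit 7, N in bit 6, H in bit 5, C in bit 4). The names FlagBits, flagsOfByte, and byteOfFlags are hypothetical; the chapter’s CpuFlags record is assumed to have the same four fields.

```fsharp
// Hypothetical stand-in for the chapter's CpuFlags record.
type FlagBits = { Z: bool; N: bool; H: bool; C: bool }

/// Decode F: Z = bit 7, N = bit 6, H = bit 5, C = bit 4.
let flagsOfByte (f: byte) : FlagBits =
    { Z = (f &&& 0x80uy) <> 0uy
      N = (f &&& 0x40uy) <> 0uy
      H = (f &&& 0x20uy) <> 0uy
      C = (f &&& 0x10uy) <> 0uy }

/// Returns `mask` when the flag is set, otherwise zero.
let bit (isSet: bool) (mask: byte) : byte = if isSet then mask else 0uy

/// Encode the record back into F; the low nibble stays zero.
let byteOfFlags (flags: FlagBits) : byte =
    bit flags.Z 0x80uy ||| bit flags.N 0x40uy ||| bit flags.H 0x20uy ||| bit flags.C 0x10uy

let bootFlags = flagsOfByte 0xB0uy
printfn "Z=%b N=%b H=%b C=%b" bootFlags.Z bootFlags.N bootFlags.H bootFlags.C
printfn "F=0x%02X" (byteOfFlags bootFlags)
```

Decoding 0xB0 gives Z, H, and C set with N clear, which matches the record used in initialCpuState.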
2. Implement Stack Operations in Mmu.fs
The Game Boy’s CPU is little-endian, meaning the least significant byte (LSB) of a 16-bit value is stored at the lower memory address, and the most significant byte (MSB) at the higher address. We need to account for this when pushing and popping 16-bit values to/from memory.
Open Mmu.fs and add the following functions:
// src/GbSharp.Core/Mmu.fs
module GbSharp.Core.Mmu
// Note: Mmu.fs must not open GbSharp.Core.Cpu. Cpu.fs opens this module, and F# does not allow circular file dependencies.
// ... (existing types and functions)
/// Writes a 16-bit word to memory, little-endian.
/// The low byte (LSB) is written to 'addr', and the high byte (MSB) to 'addr + 1'.
let writeWord (mmu: MmuState) (addr: uint16) (value: uint16) : MmuState =
    mmu
    |> writeByte addr (byte (value &&& 0xFFus)) // Write low byte to addr
    |> writeByte (addr + 1us) (byte (value >>> 8)) // Write high byte to addr + 1
/// Reads a 16-bit word from memory, little-endian.
/// The low byte (LSB) is read from 'addr', and the high byte (MSB) from 'addr + 1'.
let readWord (mmu: MmuState) (addr: uint16) : uint16 =
    let lowByte = readByte mmu addr
    let highByte = readByte mmu (addr + 1us)
    (uint16 highByte <<< 8) ||| (uint16 lowByte)
/// Pushes a 16-bit word onto the stack.
/// SP is decremented by 2, then the value is written to the new SP address.
let pushWord (mmu: MmuState) (sp: uint16) (value: uint16) : MmuState * uint16 =
    let newSp = sp - 2us // Stack grows downwards
    let newMmu = writeWord mmu newSp value
    (newMmu, newSp)
/// Pops a 16-bit word from the stack.
/// The value is read from the current SP address, then SP is incremented by 2.
let popWord (mmu: MmuState) (sp: uint16) : uint16 * uint16 * MmuState =
    let value = readWord mmu sp
    let newSp = sp + 2us // SP moves back up as the stack shrinks
    (value, newSp, mmu) // Memory is unchanged by a pop; mmu is returned for symmetry with pushWord
Key Idea: The pushWord and popWord functions are crucial for managing the call stack. They abstract away the little-endian byte order and the SP manipulation, making CALL and RET logic cleaner and less error-prone in the CPU emulation.
3. Implement Control Flow Opcodes in Cpu.fs
Now, we’ll extend the executeInstruction function in Cpu.fs to handle the new opcodes. This will involve reading operands, checking flags, and updating PC and SP. We’ll also add helper functions for conditional checks to keep the match cases clean.
// src/GbSharp.Core/Cpu.fs
module GbSharp.Core.Cpu
open GbSharp.Core.Mmu
// ... (existing types and functions)
/// Helper to check if Zero flag condition is met (condition = true for Z, false for NZ)
let checkZ (flags: CpuFlags) (condition: bool) =
    if condition then flags.Z else not flags.Z
/// Helper to check if Carry flag condition is met (condition = true for C, false for NC)
let checkC (flags: CpuFlags) (condition: bool) =
    if condition then flags.C else not flags.C
let executeInstruction (cpu: CpuState) (mmu: MmuState) : CpuState * MmuState * int =
    let opcode = Mmu.readByte mmu cpu.PC
    let getOperand8 () = Mmu.readByte mmu (cpu.PC + 1us)
    let getOperand16 () = Mmu.readWord mmu (cpu.PC + 1us)
    let mutable cycles = 0
    let newCpu, newMmu =
        match opcode with
        // ... (existing opcodes)
        // Jumps (JP nn)
        | 0xC3uy -> // JP nn (unconditional)
            let addr = getOperand16 ()
            cycles <- 16
            { cpu with PC = addr }, mmu // PC directly set to new address
        | 0xC2uy -> // JP NZ, nn (Jump if Zero flag NOT set)
            let addr = getOperand16 ()
            cycles <- if checkZ cpu.Flags false then 16 else 12
            if checkZ cpu.Flags false then
                { cpu with PC = addr }, mmu
            else
                { cpu with PC = cpu.PC + 3us }, mmu // Skip operand
        | 0xCAuy -> // JP Z, nn (Jump if Zero flag SET)
            let addr = getOperand16 ()
            cycles <- if checkZ cpu.Flags true then 16 else 12
            if checkZ cpu.Flags true then
                { cpu with PC = addr }, mmu
            else
                { cpu with PC = cpu.PC + 3us }, mmu
        | 0xD2uy -> // JP NC, nn (Jump if Carry flag NOT set)
            let addr = getOperand16 ()
            cycles <- if checkC cpu.Flags false then 16 else 12
            if checkC cpu.Flags false then
                { cpu with PC = addr }, mmu
            else
                { cpu with PC = cpu.PC + 3us }, mmu
        | 0xDAuy -> // JP C, nn (Jump if Carry flag SET)
            let addr = getOperand16 ()
            cycles <- if checkC cpu.Flags true then 16 else 12
            if checkC cpu.Flags true then
                { cpu with PC = addr }, mmu
            else
                { cpu with PC = cpu.PC + 3us }, mmu
        | 0xE9uy -> // JP HL (written JP (HL) in older documentation): jump to the address held in HL
            cycles <- 4
            { cpu with PC = cpu.HL }, mmu
        // Relative Jumps (JR e)
        | 0x18uy -> // JR e (unconditional)
            let offset = int8 (getOperand8 ()) // Signed 8-bit offset
            cycles <- 12
            // PC points to opcode. Instruction is 2 bytes (opcode + operand).
            // So, new PC = (current PC + 2) + offset.
            { cpu with PC = uint16 (int cpu.PC + 2 + int offset) }, mmu
        | 0x20uy -> // JR NZ, e
            let offset = int8 (getOperand8 ())
            cycles <- if checkZ cpu.Flags false then 12 else 8
            if checkZ cpu.Flags false then
                { cpu with PC = uint16 (int cpu.PC + 2 + int offset) }, mmu
            else
                { cpu with PC = cpu.PC + 2us }, mmu // Skip operand
        | 0x28uy -> // JR Z, e
            let offset = int8 (getOperand8 ())
            cycles <- if checkZ cpu.Flags true then 12 else 8
            if checkZ cpu.Flags true then
                { cpu with PC = uint16 (int cpu.PC + 2 + int offset) }, mmu
            else
                { cpu with PC = cpu.PC + 2us }, mmu
        | 0x30uy -> // JR NC, e
            let offset = int8 (getOperand8 ())
            cycles <- if checkC cpu.Flags false then 12 else 8
            if checkC cpu.Flags false then
                { cpu with PC = uint16 (int cpu.PC + 2 + int offset) }, mmu
            else
                { cpu with PC = cpu.PC + 2us }, mmu
        | 0x38uy -> // JR C, e
            let offset = int8 (getOperand8 ())
            cycles <- if checkC cpu.Flags true then 12 else 8
            if checkC cpu.Flags true then
                { cpu with PC = uint16 (int cpu.PC + 2 + int offset) }, mmu
            else
                { cpu with PC = cpu.PC + 2us }, mmu
        // Calls (CALL nn)
        | 0xCDuy -> // CALL nn (unconditional)
            let addr = getOperand16 ()
            cycles <- 24
            // Return address is PC of instruction AFTER CALL (opcode + 2 operand bytes = 3 bytes total)
            let returnAddr = cpu.PC + 3us
            let mmuAfterPush, newSp = pushWord mmu cpu.SP returnAddr
            { cpu with PC = addr; SP = newSp }, mmuAfterPush
        | 0xC4uy -> // CALL NZ, nn
            let addr = getOperand16 ()
            cycles <- if checkZ cpu.Flags false then 24 else 12
            if checkZ cpu.Flags false then
                let returnAddr = cpu.PC + 3us
                let mmuAfterPush, newSp = pushWord mmu cpu.SP returnAddr
                { cpu with PC = addr; SP = newSp }, mmuAfterPush
            else
                { cpu with PC = cpu.PC + 3us }, mmu // Skip operand
        | 0xCCuy -> // CALL Z, nn
            let addr = getOperand16 ()
            cycles <- if checkZ cpu.Flags true then 24 else 12
            if checkZ cpu.Flags true then
                let returnAddr = cpu.PC + 3us
                let mmuAfterPush, newSp = pushWord mmu cpu.SP returnAddr
                { cpu with PC = addr; SP = newSp }, mmuAfterPush
            else
                { cpu with PC = cpu.PC + 3us }, mmu
        | 0xD4uy -> // CALL NC, nn
            let addr = getOperand16 ()
            cycles <- if checkC cpu.Flags false then 24 else 12
            if checkC cpu.Flags false then
                let returnAddr = cpu.PC + 3us
                let mmuAfterPush, newSp = pushWord mmu cpu.SP returnAddr
                { cpu with PC = addr; SP = newSp }, mmuAfterPush
            else
                { cpu with PC = cpu.PC + 3us }, mmu
        | 0xDCuy -> // CALL C, nn
            let addr = getOperand16 ()
            cycles <- if checkC cpu.Flags true then 24 else 12
            if checkC cpu.Flags true then
                let returnAddr = cpu.PC + 3us
                let mmuAfterPush, newSp = pushWord mmu cpu.SP returnAddr
                { cpu with PC = addr; SP = newSp }, mmuAfterPush
            else
                { cpu with PC = cpu.PC + 3us }, mmu
        // Returns (RET)
        | 0xC9uy -> // RET (unconditional)
            cycles <- 16
            let returnAddr, newSp, mmuAfterPop = popWord mmu cpu.SP
            { cpu with PC = returnAddr; SP = newSp }, mmuAfterPop
        | 0xC0uy -> // RET NZ
            cycles <- if checkZ cpu.Flags false then 20 else 8
            if checkZ cpu.Flags false then
                let returnAddr, newSp, mmuAfterPop = popWord mmu cpu.SP
                { cpu with PC = returnAddr; SP = newSp }, mmuAfterPop
            else
                { cpu with PC = cpu.PC + 1us }, mmu // Skip opcode
        | 0xC8uy -> // RET Z
            cycles <- if checkZ cpu.Flags true then 20 else 8
            if checkZ cpu.Flags true then
                let returnAddr, newSp, mmuAfterPop = popWord mmu cpu.SP
                { cpu with PC = returnAddr; SP = newSp }, mmuAfterPop
            else
                { cpu with PC = cpu.PC + 1us }, mmu
        | 0xD0uy -> // RET NC
            cycles <- if checkC cpu.Flags false then 20 else 8
            if checkC cpu.Flags false then
                let returnAddr, newSp, mmuAfterPop = popWord mmu cpu.SP
                { cpu with PC = returnAddr; SP = newSp }, mmuAfterPop
            else
                { cpu with PC = cpu.PC + 1us }, mmu
        | 0xD8uy -> // RET C
            cycles <- if checkC cpu.Flags true then 20 else 8
            if checkC cpu.Flags true then
                let returnAddr, newSp, mmuAfterPop = popWord mmu cpu.SP
                { cpu with PC = returnAddr; SP = newSp }, mmuAfterPop
            else
                { cpu with PC = cpu.PC + 1us }, mmu
        // ... (other opcodes)
        | _ ->
            // Fallback for unimplemented opcodes
            failwithf "Unimplemented opcode: 0x%02X at PC: 0x%04X" opcode cpu.PC
    // Control flow opcodes set PC themselves on both the taken and not-taken paths, so they must never be advanced again here.
    // Comparing newCpu.PC with cpu.PC is not enough to detect them: a jump to its own address (for example JR -2, a common idle loop) leaves PC unchanged on purpose.
    let setsPcItself =
        match opcode with
        | 0x18uy | 0x20uy | 0x28uy | 0x30uy | 0x38uy // JR
        | 0xC2uy | 0xC3uy | 0xCAuy | 0xD2uy | 0xDAuy | 0xE9uy // JP
        | 0xC4uy | 0xCCuy | 0xCDuy | 0xD4uy | 0xDCuy // CALL
        | 0xC0uy | 0xC8uy | 0xC9uy | 0xD0uy | 0xD8uy -> true // RET
        | _ -> false
    let finalCpu =
        if setsPcItself || cpu.Halted || cpu.Stopped || newCpu.PC <> cpu.PC then
            newCpu
        else
            // Fallback for 1-byte opcodes from earlier chapters that leave PC untouched
            { newCpu with PC = newCpu.PC + 1us }
    (finalCpu, newMmu, cycles)
Important: The PC calculation for relative jumps (JR e) is critical. The offset e is signed and added to PC + 2 (the current PC plus the length of the JR e instruction itself). For CALL nn, the returnAddr pushed to the stack is PC + 3 (the current PC plus the length of the CALL nn instruction). Pay close attention to these offsets; off-by-one errors are a very common source of bugs in emulators.
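The arithmetic behind these two rules can be checked in isolation. The following is a minimal sketch, not the emulator’s code: jrTarget and callReturnAddress are hypothetical helper names, and pc is assumed to be the address of the opcode byte.

```fsharp
/// Target of a JR located at `pc` whose operand byte is `operand`.
/// The operand is reinterpreted as a signed value and added to the address
/// of the following instruction (pc + 2); the result wraps to 16 bits.
let jrTarget (pc: uint16) (operand: byte) : uint16 =
    let offset = int (sbyte operand)   // 0xFE becomes -2, 0x05 stays 5
    uint16 (int pc + 2 + offset)

/// Return address pushed by a 3-byte CALL located at `pc`.
let callReturnAddress (pc: uint16) : uint16 = pc + 3us

printfn "0x%04X" (jrTarget 0x0150us 0x05uy)      // forward jump
printfn "0x%04X" (jrTarget 0x0150us 0xFEuy)      // offset -2: jumps to itself
printfn "0x%04X" (jrTarget 0x0000us 0xFCuy)      // offset -4: wraps below 0x0000
printfn "0x%04X" (callReturnAddress 0x0150us)    // address after the CALL
```

The four lines print 0x0157, 0x0150, 0xFFFE, and 0x0153. The second case is worth noting: an offset of -2 makes a JR jump to its own address, so an unchanged PC does not always mean that an instruction forgot to update it.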
Quick Note: The cycle counts are the T-cycle (clock cycle) values listed in common Game Boy CPU documentation such as Pan Docs, with the higher value applying when a conditional branch is taken. Instruction-level counts like these are a reasonable baseline for execution cost; finer-grained timing will need attention as we build out more of the emulator’s timing system.
Testing & Verification
With control flow implemented, we can start verifying that our CPU can correctly alter its execution path. This involves both isolated unit tests for stack operations and observing the CPU’s behavior with simple test ROMs.
Unit Tests for Stack Operations
Let’s add some basic unit tests for pushWord and popWord in MmuTests.fs to ensure our little-endian handling and SP manipulation are correct.
// tests/GbSharp.Core.Tests/MmuTests.fs
module GbSharp.Core.Tests.MmuTests
open Xunit
open GbSharp.Core
open GbSharp.Core.Mmu
// ... (existing tests)
[<Fact>]
let ``pushWord and popWord should correctly handle 16-bit values and SP`` () =
    let initialMmu = Mmu.create () // Assuming Mmu.create initializes memory
    let initialSp = 0xFFFEus
    let testValue = 0x1234us
    // Push word
    let mmuAfterPush, spAfterPush = Mmu.pushWord initialMmu initialSp testValue
    Assert.Equal(0xFFFCus, spAfterPush) // SP should decrement by 2 (0xFFFE -> 0xFFFC)
    // Verify little-endian storage: LSB at SP, MSB at SP+1
    Assert.Equal(0x34uy, Mmu.readByte mmuAfterPush 0xFFFCus) // Low byte (0x34) at 0xFFFC
    Assert.Equal(0x12uy, Mmu.readByte mmuAfterPush 0xFFFDus) // High byte (0x12) at 0xFFFD
    // Pop word
    let poppedValue, spAfterPop, _ = Mmu.popWord mmuAfterPush spAfterPush
    Assert.Equal(testValue, poppedValue) // Verify the correct value was popped
    Assert.Equal(initialSp, spAfterPop) // SP should return to its original value (0xFFFC -> 0xFFFE)
Verification with Simple Test ROMs
While full Game Boy ROMs will require more components (especially the PPU), we can use small, custom-written ROMs (or early stages of well-known test suites like Blargg’s CPU instruction tests) to verify these opcodes.
Consider a simple Game Boy assembly program:
; Example Assembly (RGBDS syntax)
; Filename: control_flow_test.asm
SECTION "Test Code",ROM0[$100] ; Code starts at $0100, the cartridge entry point
    LD SP, $FFFE ; Initialize stack pointer
    LD HL, $C050 ; Arbitrary value in HL (available for a JP HL test)
    LD A, $01
    OR A ; Clear the Z flag so the conditional jumps below are deterministic (assumes OR A is already implemented)
Start:
    CALL Subroutine1 ; Call a subroutine
    LD A, $AA ; Executes after Subroutine1 returns
    JR NZ, SkipJump ; Taken, because Z is clear
    JP $0000 ; Not reached while Z is clear
SkipJump:
    LD B, $BB ; Should execute
    JP Start ; Loop back to Start (or halt here for a real test)
Subroutine1:
    LD C, $CC ; Set C register (LD does not change flags)
    JR Z, SkipRet ; Not taken, because Z is clear
    RET ; Return from subroutine
SkipRet:
    LD D, $DD ; Not reached while Z is clear
    RET
To test this:
- Assemble the ROM: Use an assembler like RGBDS (RGBASM) to compile this .asm file into a .gb ROM. Because the code runs through the cartridge header area ($0104 to $014F), the result is suitable for this emulator-only test rather than for real hardware.
  rgbasm -o control_flow_test.o control_flow_test.asm
  rgblink -o control_flow_test.gb control_flow_test.o
- Load into Emulator: Modify your Program.fs (or main emulation loop) to load control_flow_test.gb.
- Log CPU State: Implement detailed logging of cpu.PC, cpu.SP, and relevant registers (A, B, C, D, E, H, L, F) at each instruction step.
- Trace Execution: Manually trace the PC and SP values in your logs:
  - Observe SP initializing to 0xFFFE. LD A, $01 followed by OR A leaves the Z flag clear, regardless of the post-boot value of F.
  - CALL Subroutine1 (at 0x0109) should push 0x010C (the address of LD A, $AA) onto the stack, and PC should jump to Subroutine1’s address (0x0118). SP should decrement to 0xFFFC.
  - Inside Subroutine1, LD C, $CC executes.
  - JR Z, SkipRet should not jump. LD instructions never modify flags, so Z is still clear from the earlier OR A.
  - RET should pop 0x010C from the stack, restoring PC to 0x010C. SP should increment back to 0xFFFE.
  - LD A, $AA should then execute.
  - JR NZ, SkipJump should jump to SkipJump (0x0113), because Z is still clear.
  - LD B, $BB should execute.
  - JP Start should set PC back to 0x0109 (Start).
This manual inspection, possibly augmented with a debugger or detailed logging, is invaluable for understanding the subtle interactions of hardware components.
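A per-instruction trace line makes this kind of inspection much easier. The sketch below shows one possible format; TraceSnapshot and formatTrace are hypothetical names, and in the emulator the values would be read from CpuState rather than from a separate record. The register values in the example are illustrative only.

```fsharp
// Hypothetical snapshot type; in the emulator these values would come from CpuState.
type TraceSnapshot =
    { PC: uint16; SP: uint16; A: byte; F: byte; B: byte; C: byte; D: byte; E: byte; H: byte; L: byte }

/// Formats one line per executed instruction so that runs can be compared line by line.
let formatTrace (s: TraceSnapshot) (opcode: byte) : string =
    sprintf "PC=%04X SP=%04X op=%02X A=%02X F=%02X BC=%02X%02X DE=%02X%02X HL=%02X%02X"
        s.PC s.SP opcode s.A s.F s.B s.C s.D s.E s.H s.L

// Illustrative values: a CALL (opcode 0xCD) about to execute at 0x0150.
let line =
    formatTrace
        { PC = 0x0150us; SP = 0xFFFEus; A = 0x01uy; F = 0x00uy
          B = 0x00uy; C = 0x13uy; D = 0x00uy; E = 0xD8uy; H = 0xC0uy; L = 0x50uy }
        0xCDuy
printfn "%s" line
```

This prints PC=0150 SP=FFFE op=CD A=01 F=00 BC=0013 DE=00D8 HL=C050. Keeping the format fixed-width makes it practical to diff two runs and find the first instruction where they diverge.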
Real-world insight: Emulator development heavily relies on detailed logging of CPU state (PC, SP, registers, flags) and memory dumps. This “printf debugging” approach, especially in early stages, is often more effective than traditional debuggers for understanding the subtle, cycle-accurate behavior of emulated hardware.
Production Considerations
The correctness and performance of control flow instructions are paramount. Any error here will quickly lead to a crashed or misbehaving emulator, making the system unstable and unpredictable.
- Correctness is King: A single byte off in a PC calculation, an incorrect signed offset for JR, or an endianness error in stack operations will cause the CPU to execute garbage data or jump to incorrect locations. This leads to immediate crashes or subtle, hard-to-debug issues later in development. Thoroughly test PC updates and pushWord/popWord against official documentation.
- Performance: Jumps, calls, and returns are frequent operations in any program. Our current F# implementation, while functional and clear, might involve some overhead due to immutable state updates and pattern matching. For a Game Boy, which runs at approximately 4.19 MHz, the CPU emulation loop needs to be extremely fast. We’ll revisit performance optimizations later, but for now, focus on absolute correctness.
- Stack Overflow/Underflow: While Game Boy software typically manages the stack within its limits, a robust emulator might include checks to detect SP going out of the expected RAM range (e.g., 0xC000-0xDFFF for WRAM, or 0xFF80-0xFFFE for HRAM). This could indicate a bug in the emulated program or, more critically, in the emulator itself. Detecting this early can prevent memory corruption.
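One way to sketch such a check is shown below. It is a heuristic for logging warnings only, not a rule enforced by the hardware, and the name isPlausibleStackPointer is hypothetical. The upper bound of the WRAM range is taken as 0xE000 here on the assumption that a program may set SP one past the end of WRAM, since the first push writes below SP.

```fsharp
/// Heuristic: SP is considered plausible when it lies in WRAM (0xC000-0xDFFF,
/// plus 0xE000 for an empty stack placed at the top of WRAM) or in HRAM (0xFF80-0xFFFE).
/// Intended for warnings in a debug log; the hardware itself imposes no such limit.
let isPlausibleStackPointer (sp: uint16) : bool =
    (sp >= 0xC000us && sp <= 0xE000us) || (sp >= 0xFF80us && sp <= 0xFFFEus)

for sp in [ 0xFFFEus; 0xDFF0us; 0x0100us; 0x8000us ] do
    let verdict = if isPlausibleStackPointer sp then "ok" else "suspicious"
    printfn "SP=0x%04X %s" sp verdict
```

A check like this could run after every CALL, RET, or SP load while debugging and be removed or disabled once the CPU core is trusted.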
Optimization / Pro tip: For extreme performance-critical sections, especially the main CPU loop, some functional emulators might resort to highly optimized, potentially more imperative, internal structures or leverage F#’s mutable features carefully. However, prioritize correctness and clarity first; optimize only when profiling reveals a bottleneck.
Common Issues & Solutions
- Incorrect PC after Jump/Call:
  - Issue: The CPU jumps to the wrong address, or returns to an incorrect instruction, leading to unexpected program flow or crashes.
  - Cause: Miscalculation of operand addresses (PC + 1, PC + 2), or incorrect handling of signed offsets for JR. For CALL, pushing PC instead of PC + instruction_length (the address after the CALL instruction).
  - Solution: Double-check the Game Boy CPU documentation (e.g., Pan Docs, GBDEV Wiki) for each instruction’s operand length and how PC is modified. Log PC before and after each instruction to trace the flow.
  - What can go wrong: For JR e, the e is a signed 8-bit value. Using the unsigned byte directly, without casting to int8 before arithmetic, skips sign extension, so negative offsets turn into large forward jumps to wildly wrong addresses. Always cast the byte from readByte to int8 before performing arithmetic with it for relative offsets.
- Stack Corruption:
  - Issue: CALLs push incorrect return addresses, or RETs pop garbage values, leading to program crashes or jumps to random memory locations.
  - Cause: Incorrect pushWord/popWord implementation, particularly regarding the little-endian byte order or SP increment/decrement logic.
  - Solution: Thoroughly test pushWord and popWord in isolation (as shown in unit tests). Print SP and the stack memory region (e.g., 0xFFF0 to 0xFFFF) before and after calls/returns to verify values are pushed and popped correctly.
  - Real-world insight: The stack is a common vector for security exploits in real systems. While our emulator isn’t directly exposed to external threats, a corrupted stack in emulation can mimic such issues, making it a valuable learning experience in debugging memory-related problems.
- Conditional Jumps/Calls Not Working:
  - Issue: Instructions like JP Z, nn always jump, or never jump, regardless of the Zero flag’s state, leading to incorrect program logic.
  - Cause: Incorrectly checking the CpuFlags record, or the flags themselves are not being set correctly by previous arithmetic/logic instructions (from previous chapters).
  - Solution: Ensure the checkZ and checkC helper functions are correct. Verify that arithmetic and logic instructions are correctly updating the Z, N, H, C flags in CpuState.Flags. Debug by inspecting the Flags state immediately before a conditional instruction.
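For the stack inspection suggested above, a small dump helper is often enough. The sketch below is one possible shape: dumpRegion is a hypothetical name, and its read parameter stands in for a partially applied readByte so that the example stays self-contained. A Map is used as fake memory purely for demonstration.

```fsharp
/// Renders the bytes in [first, last] as hex and brackets the byte SP points at.
/// `read` is a hypothetical accessor standing in for `readByte mmu`.
let dumpRegion (read: uint16 -> byte) (first: uint16) (last: uint16) (sp: uint16) : string =
    [ int first .. int last ]   // iterate as int so that 0xFFFF does not overflow
    |> List.map (fun addr ->
        let b = read (uint16 addr)
        if uint16 addr = sp then sprintf "[%02X]" b else sprintf "%02X" b)
    |> String.concat " "

// Fake memory holding a pushed return address 0x010C at 0xFFFC (low byte first).
let fakeMemory = Map.ofList [ (0xFFFCus, 0x0Cuy); (0xFFFDus, 0x01uy) ]
let read (addr: uint16) : byte = Map.tryFind addr fakeMemory |> Option.defaultValue 0uy

printfn "%s" (dumpRegion read 0xFFFAus 0xFFFFus 0xFFFCus)
```

This prints 00 00 [0C] 01 00 00, which shows at a glance that the low byte of the return address sits at SP and the high byte directly above it.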
Check Your Understanding
- Why is the stack pointer initialized to 0xFFFE in the Game Boy, and what does this imply about the direction the stack grows?
- Explain the critical difference between JP nn and JR e instructions, both in terms of addressing and common use cases.
- What is the significance of pushing PC + 3 as the return address for a CALL nn instruction, rather than just PC?
Mini Task
Modify a simple Game Boy assembly program (or create a new one if you’re comfortable with assembly) to include a CALL to a subroutine, and then a RET. Add an LD A, $00 followed by OR A before a JP Z, $XXXX, and another LD A, $01 followed by OR A before a JP NZ, $YYYY. The OR A is needed because LD does not change any flags, so the load alone would not set or clear Z. Load this ROM into your emulator and trace the PC and SP values, along with the Zero flag, to confirm correct conditional execution.
Scenario
You’ve implemented all control flow instructions, but when running a test ROM, the emulator consistently crashes with an “Unimplemented opcode” error after a specific CALL instruction, even though the opcode itself is implemented. You suspect stack corruption. How would you approach debugging this, focusing on the stack, PC, and memory dumps? What specific data points would you log or inspect?
TL;DR
- CPU control flow instructions (JP, JR, CALL, RET) enable complex program execution paths.
- The Stack Pointer (SP) and memory stack are fundamental for CALL/RET operations, managing return addresses.
- Correctly handling 16-bit values (little-endian) and SP manipulation in pushWord/popWord is crucial for stack integrity.
- Precise PC calculation, especially for relative jumps and call return addresses, is a common source of emulator bugs.
Core Flow
- Extend CpuState: Add the SP register to track the stack’s current top.
- Implement Stack Primitives: Create pushWord and popWord in Mmu.fs to safely interact with memory, handling little-endian byte order and SP updates.
- Integrate Opcodes: Add match cases for all JP, JR, CALL, and RET variants into the executeInstruction loop in Cpu.fs.
- Verify Logic: Ensure PC and SP are updated correctly, paying close attention to instruction lengths and conditional flag checks.
- Test and Debug: Use unit tests for stack operations and detailed logging with simple test ROMs to trace execution flow.
Key Takeaway
Mastering CPU control flow is the gateway to executing complex program logic. The interplay between the Program Counter, Stack Pointer, and memory stack is a fundamental pattern in computer architecture, and understanding its precise, cycle-accurate implementation is crucial for any low-level system development.
References
- Pan Docs: The Ultimate Game Boy Technical Manual. Essential for Game Boy hardware details, including CPU, MMU, and PPU specifications. https://gbdev.io/pandocs/
- F# Language Reference: Microsoft Learn. Official F# documentation, providing syntax, language features, and best practices. https://learn.microsoft.com/en-us/dotnet/fsharp/
- .NET SDK: Microsoft Learn. Official .NET documentation, covering the runtime, libraries, and tools for F# development. https://learn.microsoft.com/en-us/dotnet/
- Game Boy CPU Opcodes: GBDEV Wiki. Detailed opcode information for the SM83 CPU, including instruction formats, effects on flags, and cycle counts. https://gbdev.io/wiki/index.php?title=CPU_Instruction_Set
This page is AI-assisted and reviewed. It references official documentation and recognized resources where relevant.