Introduction

So far, our Game Boy emulator can execute CPU instructions and manage memory, but it’s a silent black box. This chapter changes that. We’re about to bring the Game Boy to life by tackling the Picture Processing Unit (PPU), the hardware responsible for all the visuals. This is a significant milestone, as seeing actual graphics from a ROM is incredibly rewarding and validates much of our prior work.

In this first part of PPU emulation, we’ll focus on understanding the Game Boy’s display architecture, specifically how Video RAM (VRAM) stores graphical data, how tiles are defined, and how the PPU renders the background layer. By the end of this chapter, our emulator will be able to display static backgrounds from simple Game Boy ROMs, making the project visually verifiable.

The PPU is one of the most complex components of the Game Boy, demanding precise timing and careful interpretation of hardware specifications. We’ll break it down into manageable pieces, starting with the fundamentals of background rendering.

Planning & Design

The Game Boy’s PPU is responsible for drawing a 160x144 pixel display. It operates by drawing one horizontal line (a “scanline”) at a time, moving from top to bottom. This process involves fetching tile data from VRAM, assembling it into a background layer, and then drawing it to the screen.

Game Boy Display Architecture Overview

The Game Boy’s LCD has a fixed resolution of 160 pixels wide by 144 pixels high. The PPU cycles through different modes as it draws the screen:

  • Mode 2 (OAM Scan): The PPU spends 80 T-cycles (clock ticks, often called "dots"; there are 4 per machine cycle) scanning Object Attribute Memory (OAM) to find sprites that appear on the current scanline.
  • Mode 3 (Drawing Pixels): The PPU spends 172-289 T-cycles (variable) fetching tile data, background map data, and sprite data, then drawing the pixels for the current scanline. This is the most computationally intensive phase.
  • Mode 0 (HBlank): After drawing a scanline, the PPU enters a Horizontal Blanking period for the remainder of the 456-cycle scanline, between 87 and 204 T-cycles depending on how long Mode 3 took. This is a good time for the CPU to access VRAM or OAM without contention.
  • Mode 1 (VBlank): Once all 144 visible scanlines are drawn, the PPU enters a Vertical Blanking period. This lasts for 10 scanlines (lines 144-153) and takes 4560 T-cycles in total (10 × 456). This is the ideal time to update the screen buffer.
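
The mode timings above add up to a fixed frame length. The following is a small sketch of that arithmetic, not part of the emulator itself. The clock frequency of 4,194,304 Hz is an assumption (the nominal DMG clock, not stated above), and all names are illustrative.

```fsharp
// Illustrative constants; all cycle counts are T-cycles ("dots").
let dotsPerScanline = 456      // Mode 2 + Mode 3 + Mode 0
let visibleScanlines = 144
let vblankScanlines = 10
let clockHz = 4194304.0        // assumption: nominal DMG clock frequency

// A frame is every scanline, visible or not, at 456 dots each.
let dotsPerFrame = dotsPerScanline * (visibleScanlines + vblankScanlines)
let vblankDots = dotsPerScanline * vblankScanlines
let framesPerSecond = clockHz / float dotsPerFrame

printfn "dots per frame: %d" dotsPerFrame
printfn "VBlank dots:    %d" vblankDots
printfn "frame rate:     %.2f Hz" framesPerSecond
```

Under these assumptions a frame is 70224 dots long, which gives a refresh rate just under 60 Hz. That is the budget the emulator loop has to hit to run at full speed.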

PPU Memory and Registers

The PPU interacts heavily with specific memory regions and I/O registers:

  • Video RAM (VRAM): 0x8000-0x9FFF

    • 0x8000-0x97FF: Tile Data. This area stores the actual pixel patterns for 8x8 tiles. Each tile uses 16 bytes (2 bytes per row, 8 rows). Each pixel has 2 bits, allowing 4 colors.
    • 0x9800-0x9BFF: Tile Map 0. A 32x32 grid of tile indices for the background layer.
    • 0x9C00-0x9FFF: Tile Map 1. Another 32x32 grid of tile indices. The LCDC register determines which map is active.
  • PPU I/O Registers:

    • 0xFF40 (LCDC): LCD Control. Crucial for enabling/disabling PPU features (LCD, background, sprites, tile map selection, tile data selection).
    • 0xFF41 (STAT): LCD Status. Contains the current PPU mode, LYC compare flag, and interrupt enable bits.
    • 0xFF42 (SCY): Scroll Y. Vertical scroll offset for the background.
    • 0xFF43 (SCX): Scroll X. Horizontal scroll offset for the background.
    • 0xFF44 (LY): LCDC Y-Coordinate. The current scanline being drawn (0-153).
    • 0xFF45 (LYC): LY Compare. An interrupt can be triggered when LY equals LYC.
    • 0xFF47 (BGP): Background Palette. Defines the 4 colors used for the background.
    • 0xFF48 (OBP0), 0xFF49 (OBP1): Object Palettes. Define colors for sprites (covered in Part 2).
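
To make the tile format concrete, here is a minimal sketch of decoding one 8-pixel row from its two bytes. The function name decodeTileRow and the sample bytes are illustrative and do not come from the emulator code. The sketch assumes the usual layout where the first byte carries the low bit of each pixel, the second byte carries the high bit, and bit 7 is the leftmost pixel.

```fsharp
// Decodes one tile row (2 bytes) into eight 2-bit color indices.
// lowBits supplies bit 0 of each pixel, highBits supplies bit 1.
let decodeTileRow (lowBits: byte) (highBits: byte) : byte[] =
    Array.init 8 (fun x ->
        let shift = 7 - x                       // bit 7 is the leftmost pixel
        let lo = (lowBits >>> shift) &&& 1uy
        let hi = (highBits >>> shift) &&& 1uy
        (hi <<< 1) ||| lo)

// Sample row: 0x3C = 00111100 (low bits), 0x7E = 01111110 (high bits)
let row = decodeTileRow 0x3Cuy 0x7Euy
// Pixel values, left to right: 0, 2, 3, 3, 3, 3, 2, 0
printfn "%A" row
```

A full tile is eight such rows stored back to back, which is where the 16 bytes per tile come from.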

Modeling the PPU State

We’ll introduce a new PpuState record to encapsulate all PPU-related registers and internal state. The emulator loop advances this state by the number of cycles each CPU instruction consumes.

// In Ppu.fs
type PpuMode =
    | HBlank = 0
    | VBlank = 1
    | OAMScan = 2
    | DrawingPixels = 3

type PpuState = {
    mutable Lcdc : byte // LCD Control (0xFF40)
    mutable Stat : byte // LCD Status (0xFF41)
    mutable Scy : byte  // Scroll Y (0xFF42)
    mutable Scx : byte  // Scroll X (0xFF43)
    mutable Ly : byte   // LCDC Y-Coordinate (0xFF44)
    mutable Lyc : byte  // LY Compare (0xFF45)
    mutable Bgp : byte  // Background Palette (0xFF47)
    mutable Obp0 : byte // Object Palette 0 (0xFF48)
    mutable Obp1 : byte // Object Palette 1 (0xFF49)

    mutable CyclesThisScanline : int
    mutable CurrentFrameBuffer : byte [] // 160 * 144 * 4 bytes for RGBA
    mutable FrameReady : bool
}

PPU Rendering Flow

The core PPU logic will reside in an updatePpu function, which the main emulator loop calls after each CPU instruction (or block of instructions) with the number of cycles that elapsed.

flowchart TD
    Start[PPU Update] --> Check_LCD_Enable{LCD Enabled}
    Check_LCD_Enable -->|No| Reset_PPU[Reset PPU State]
    Check_LCD_Enable -->|Yes| Process_Cycles_Mode[Process Cycles and Mode]
    Process_Cycles_Mode --> Update_LY_Interrupt[Update LY and Interrupt]
    Update_LY_Interrupt --> Render_Scanline[Render Scanline]
    Render_Scanline --> End[End PPU Update]
    Reset_PPU --> End

PPU Update Cycle:

  1. Accumulate Cycles: The updatePpu function receives the number of CPU cycles executed.
  2. Mode Progression: Based on accumulated cycles, the PPU transitions between modes (OAM Scan, Drawing Pixels, HBlank).
  3. Scanline Increment: When a scanline completes (after HBlank), the LY register is incremented.
  4. VBlank: When LY reaches 144, the PPU enters VBlank, and a VBlank interrupt is triggered. The FrameReady flag is set.
  5. Render Scanline (Mode 3): During the “Drawing Pixels” mode, we’ll implement the logic to fetch tile data and draw pixels to our internal frame buffer.
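
The mode progression in the steps above can be sketched as a pure function of the scanline number and the position within the scanline. This is only an illustration under the chapter's simplification of a fixed 172-cycle Mode 3; the ScanMode type and modeAt function are hypothetical names, separate from the PpuMode enum used in the emulator.

```fsharp
// Hypothetical type for illustration only.
type ScanMode =
    | OamScan
    | Drawing
    | HBlank
    | VBlank

// Returns the mode for scanline `ly` (0-153) at cycle `dot` (0-455),
// assuming Mode 3 always takes its minimum of 172 cycles.
let modeAt (ly: int) (dot: int) : ScanMode =
    if ly >= 144 then VBlank            // lines 144-153 are entirely VBlank
    elif dot < 80 then OamScan          // cycles 0-79
    elif dot < 80 + 172 then Drawing    // cycles 80-251
    else HBlank                         // cycles 252-455

printfn "%A" (modeAt 0 100)   // Drawing
```

The stateful updatePpu function reaches the same result incrementally by subtracting each mode's length from an accumulated cycle counter.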

Graphics Library: SDL2-CS

For rendering, we’ll use SDL2-CS, which is a C# binding for the SDL2 library, fully compatible with F# and .NET. SDL (Simple DirectMedia Layer) is a cross-platform development library designed to provide low-level access to audio, keyboard, mouse, joystick, and graphics hardware via OpenGL and Direct3D.

⚡ Quick Note: SDL2-CS is a NuGet package that wraps the native SDL2 library. You’ll need the native SDL2 runtime binaries installed on your system or bundled with your application. For Windows, this typically means SDL2.dll in your executable directory. For macOS and Linux, it’s usually libSDL2.dylib or libSDL2.so.

Step-by-Step Implementation

We’ll start by defining the PPU state, then integrate it into our MMU for register access, and finally implement the core updatePpu and renderScanline logic.

1. Update Memory.fs for PPU Registers

First, we need to ensure our MMU can read and write to the PPU’s I/O registers. We’ll add specific handling for the PPU’s memory-mapped registers.

File: src/GameBoy/Memory.fs

module GameBoy.Memory

open System

// ... (existing MemoryState type)

type MemoryState = {
    // ... (existing fields)
    mutable Vram : byte [] // 0x8000-0x9FFF
    mutable Oam : byte []  // 0xFE00-0xFE9F
    mutable Lcdc : byte // 0xFF40
    mutable Stat : byte // 0xFF41
    mutable Scy : byte  // 0xFF42
    mutable Scx : byte  // 0xFF43
    mutable Ly : byte   // 0xFF44
    mutable Lyc : byte  // 0xFF45
    mutable Bgp : byte  // 0xFF47
    mutable Obp0 : byte // 0xFF48
    mutable Obp1 : byte // 0xFF49
    mutable Dma : byte  // 0xFF46 (DMA Transfer)
    mutable If : byte   // 0xFF0F (Interrupt Flag)
    mutable Ie : byte   // 0xFFFF (Interrupt Enable)
}

// ... (existing `createMemoryState` function)
let createMemoryState (romData: byte []) =
    let state = {
        // ... (existing initializations)
        Vram = Array.zeroCreate (0x9FFF - 0x8000 + 1)
        Oam = Array.zeroCreate (0xFE9F - 0xFE00 + 1)
        Lcdc = 0x91uy // Default boot ROM value
        Stat = 0x00uy
        Scy = 0x00uy
        Scx = 0x00uy
        Ly = 0x00uy
        Lyc = 0x00uy
        Bgp = 0xFCuy // Default palette
        Obp0 = 0xFFuy
        Obp1 = 0xFFuy
        Dma = 0x00uy
        If = 0xE1uy // Default boot ROM value
        Ie = 0x00uy
    }
    // ... (existing ROM loading)
    state

// ... (existing `readByte` function)
let readByte (addr: uint16) (state: MemoryState) =
    match addr with
    // ... (existing ranges)
    | _ when addr >= 0x8000us && addr <= 0x9FFFus -> state.Vram.[int (addr - 0x8000us)] // VRAM
    | _ when addr >= 0xFE00us && addr <= 0xFE9Fus -> state.Oam.[int (addr - 0xFE00us)] // OAM
    | 0xFF40us -> state.Lcdc
    | 0xFF41us -> state.Stat
    | 0xFF42us -> state.Scy
    | 0xFF43us -> state.Scx
    | 0xFF44us -> state.Ly
    | 0xFF45us -> state.Lyc
    | 0xFF46us -> state.Dma // DMA register
    | 0xFF47us -> state.Bgp
    | 0xFF48us -> state.Obp0
    | 0xFF49us -> state.Obp1
    | 0xFF0Fus -> state.If
    | 0xFFFFus -> state.Ie
    | _ ->
        // ... (existing default read)
        state.Hram.[int (addr - 0xFF80us)] // HRAM

Explanation:

  • We’ve added Vram and Oam arrays to MemoryState to represent these memory regions.
  • New fields (Lcdc, Stat, Scy, etc.) are added to MemoryState to hold the values of the PPU’s I/O registers. These are mutable because the PPU (and CPU) will modify them.
  • createMemoryState now initializes these PPU registers with typical boot ROM values.
  • The readByte function is updated with match cases to correctly return the values from these new fields when their respective addresses are queried.

File: src/GameBoy/Memory.fs (continued, writeByte)

let writeByte (addr: uint16) (value: byte) (state: MemoryState) =
    match addr with
    // ... (existing ranges)
    | _ when addr >= 0x8000us && addr <= 0x9FFFus -> state.Vram.[int (addr - 0x8000us)] <- value // VRAM
    | _ when addr >= 0xFE00us && addr <= 0xFE9Fus -> state.Oam.[int (addr - 0xFE00us)] <- value // OAM
    | 0xFF40us -> state.Lcdc <- value
    | 0xFF41us -> state.Stat <- value // STAT register. Only bits 3-6 are writable by CPU.
    | 0xFF42us -> state.Scy <- value
    | 0xFF43us -> state.Scx <- value
    | 0xFF44us -> () // LY is read-only for the CPU; only the PPU updates it, so CPU writes are ignored.
    | 0xFF45us -> state.Lyc <- value
    | 0xFF46us -> // DMA Transfer
        state.Dma <- value
        // 🧠 Important: DMA transfer is complex. For now, we'll just store the value.
        // A full DMA implementation would copy 160 bytes from (value * 0x100) to OAM over 160 machine cycles.
    | 0xFF47us -> state.Bgp <- value
    | 0xFF48us -> state.Obp0 <- value
    | 0xFF49us -> state.Obp1 <- value
    | 0xFF0Fus -> state.If <- value // Interrupt Flag
    | 0xFFFFus -> state.Ie <- value // Interrupt Enable
    | _ ->
        // ... (existing default write)
        state.Hram.[int (addr - 0xFF80us)] <- value // HRAM

Explanation:

  • The writeByte function is similarly updated to allow the CPU to write to these PPU registers.
  • LY write: 0xFF44 (LY) is read-only from the CPU’s point of view, so writeByte ignores writes to it. Only the PPU changes LY.
  • STAT write: Only bits 3-6 of STAT (the interrupt enable bits) are writable by the CPU. For simplicity, we’ll allow full writes for now, but a production emulator would mask these.
  • DMA: Direct Memory Access (DMA) to OAM is a critical feature for sprites. When the CPU writes to 0xFF46, 160 bytes starting at (value * 0x100) are copied to OAM. The transfer takes 160 machine cycles, during which the CPU should only access HRAM. We’ll implement this more fully in a later chapter.
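
The two simplifications above (unmasked STAT writes and the stubbed DMA) can be sketched in isolation. This is a minimal illustration, not the implementation a later chapter will use. The names maskStatWrite and oamDmaCopy are hypothetical, the memory is passed in as a plain read function rather than a MemoryState, and DMA timing and bus restrictions are not modelled.

```fsharp
// Merges a CPU write into STAT: bits 3-6 come from the written value,
// while bits 0-2 (mode and LYC=LY flag) and bit 7 keep their current value.
let maskStatWrite (current: byte) (written: byte) : byte =
    (current &&& 0x87uy) ||| (written &&& 0x78uy)

// Copies 160 bytes from (value * 0x100) into OAM.
// `read` is a caller-supplied memory read function (an assumption for this sketch).
let oamDmaCopy (read: uint16 -> byte) (oam: byte[]) (value: byte) =
    let source = uint16 value <<< 8
    for i in 0 .. 159 do
        oam.[i] <- read (source + uint16 i)

// Usage with a fake 64 KiB memory where each byte holds the low 8 bits of its address
let fakeMemory = Array.init 0x10000 (fun i -> byte i)
let oam : byte[] = Array.zeroCreate 160
oamDmaCopy (fun addr -> fakeMemory.[int addr]) oam 0xC1uy

let stat = maskStatWrite 0x03uy 0xFFuy   // mode bits stay 0b11, enable bits 3-6 are set
printfn "STAT = 0x%02X, OAM[5] = 0x%02X" stat oam.[5]
```

Keeping the mask in one helper means the PPU remains the only writer of the mode bits, whichever side stores the register.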

2. Create Ppu.fs

Now, let’s create the core PPU logic.

File: src/GameBoy/Ppu.fs

module GameBoy.Ppu

open GameBoy.Memory
open GameBoy.Cpu // Needed for interrupt flags
open System

// PPU Constants
let SCREEN_WIDTH = 160
let SCREEN_HEIGHT = 144
let VBLANK_SCANLINES = 10 // Scanlines 144-153
let TOTAL_SCANLINES = SCREEN_HEIGHT + VBLANK_SCANLINES // 154
let CPU_CYCLES_PER_SCANLINE = 456 // 456 T-cycles per scanline (OAM Scan + Drawing Pixels + HBlank)

// PPU Modes
type PpuMode =
    | HBlank = 0 // Mode 0
    | VBlank = 1 // Mode 1
    | OAMScan = 2 // Mode 2
    | DrawingPixels = 3 // Mode 3

// PPU State
type PpuState = {
    mutable Lcdc : byte // LCD Control (0xFF40)
    mutable Stat : byte // LCD Status (0xFF41)
    mutable Scy : byte  // Scroll Y (0xFF42)
    mutable Scx : byte  // Scroll X (0xFF43)
    mutable Ly : byte   // LCDC Y-Coordinate (0xFF44)
    mutable Lyc : byte  // LY Compare (0xFF45)
    mutable Bgp : byte  // Background Palette (0xFF47)
    mutable Obp0 : byte // Object Palette 0 (0xFF48)
    mutable Obp1 : byte // Object Palette 1 (0xFF49)

    mutable CyclesThisScanline : int
    mutable CurrentFrameBuffer : byte [] // 160 * 144 * 4 bytes for RGBA
    mutable FrameReady : bool
}

let createPpuState () = {
    Lcdc = 0x91uy
    Stat = 0x00uy
    Ly = 0x00uy
    Lyc = 0x00uy
    Scy = 0x00uy
    Scx = 0x00uy
    Bgp = 0xFCuy // Default palette 0b11111100: color 0 = white, colors 1-3 = black
    Obp0 = 0xFFuy
    Obp1 = 0xFFuy
    CyclesThisScanline = 0
    CurrentFrameBuffer = Array.zeroCreate (SCREEN_WIDTH * SCREEN_HEIGHT * 4) // RGBA
    FrameReady = false
}

// Helper to get pixel color from a 2-bit value and palette
let getPaletteColor (paletteRegister: byte) (pixelValue: byte) =
    let colorIndex = (paletteRegister >>> (int pixelValue * 2)) &&& 0x03uy
    match colorIndex with
    | 0x00uy -> (0xFFuy, 0xFFuy, 0xFFuy, 0xFFuy) // White
    | 0x01uy -> (0xAAuy, 0xAAuy, 0xAAuy, 0xFFuy) // Light Gray
    | 0x02uy -> (0x55uy, 0x55uy, 0x55uy, 0xFFuy) // Dark Gray
    | _      -> (0x00uy, 0x00uy, 0x00uy, 0xFFuy) // Black

// Function to read a tile's pixel data from VRAM
// tileIndex: 0-255
// tileDataAddress: base address for tile data (0x8000 or 0x8800)
let readTilePixel (mmu: MemoryState) (tileIndex: byte) (tileDataAddress: uint16) (yInTile: int) (xInTile: int) =
    let baseAddr =
        if tileDataAddress = 0x8000us then
            // Unsigned addressing (0-255)
            uint16 (0x8000 + (int tileIndex * 16))
        else
            // Signed addressing: the index is a signed byte relative to 0x9000.
            // If tileIndex is 0-127, it maps to 0x9000-0x97FF.
            // If tileIndex is 128-255 (interpreted as -128 to -1), it maps to 0x8800-0x8FFF.
            uint16 (0x9000 + (int (sbyte tileIndex) * 16))

    let tileLineAddr = baseAddr + uint16 (yInTile * 2)
    let byte1 = Memory.readByte tileLineAddr mmu
    let byte2 = Memory.readByte (tileLineAddr + 1us) mmu

    let bit1 = (byte1 >>> (7 - xInTile)) &&& 0x01uy
    let bit2 = (byte2 >>> (7 - xInTile)) &&& 0x01uy
    (bit2 <<< 1) ||| bit1 // Combine into 2-bit pixel value

// Renders a single scanline to the frame buffer
let renderScanline (ppu: PpuState) (mmu: MemoryState) =
    let bgDisplayEnable = (ppu.Lcdc &&& 0x01uy) = 0x01uy // LCDC Bit 0
    if not bgDisplayEnable then
        // If background display is disabled, clear the scanline to white
        for x = 0 to SCREEN_WIDTH - 1 do
            let pixelIndex = ((int ppu.Ly * SCREEN_WIDTH) + x) * 4
            ppu.CurrentFrameBuffer.[pixelIndex] <- 0xFFuy // R
            ppu.CurrentFrameBuffer.[pixelIndex + 1] <- 0xFFuy // G
            ppu.CurrentFrameBuffer.[pixelIndex + 2] <- 0xFFuy // B
            ppu.CurrentFrameBuffer.[pixelIndex + 3] <- 0xFFuy // A
        ()
    else
        let bgTileMapAddress =
            if (ppu.Lcdc &&& 0x08uy) = 0x08uy then 0x9C00us // LCDC Bit 3: 1=9C00-9FFF, 0=9800-9BFF
            else 0x9800us

        let bgTileDataAddress =
            if (ppu.Lcdc &&& 0x10uy) = 0x10uy then 0x8000us // LCDC Bit 4: 1=8000-8FFF, 0=8800-97FF
            else 0x8800us

        let scrollY = int ppu.Scy
        let scrollX = int ppu.Scx
        let currentLy = int ppu.Ly

        let yInMap = (currentLy + scrollY) % 256 // Wrapped Y coordinate in the 256x256 background map
        let yInTile = yInMap % 8 // Y coordinate within the 8x8 tile

        for x = 0 to SCREEN_WIDTH - 1 do
            let xInMap = (x + scrollX) % 256 // Wrapped X coordinate in the 256x256 background map
            let xInTile = xInMap % 8 // X coordinate within the 8x8 tile

            let tileMapX = xInMap / 8
            let tileMapY = yInMap / 8

            let tileMapOffset = uint16 (tileMapY * 32 + tileMapX)
            let tileIndex = Memory.readByte (bgTileMapAddress + tileMapOffset) mmu

            let pixelValue = readTilePixel mmu tileIndex bgTileDataAddress yInTile xInTile
            let (r, g, b, a) = getPaletteColor ppu.Bgp pixelValue

            let pixelIndex = ((currentLy * SCREEN_WIDTH) + x) * 4
            ppu.CurrentFrameBuffer.[pixelIndex] <- r
            ppu.CurrentFrameBuffer.[pixelIndex + 1] <- g
            ppu.CurrentFrameBuffer.[pixelIndex + 2] <- b
            ppu.CurrentFrameBuffer.[pixelIndex + 3] <- a

// STAT layout: mode in bits 0-1, LYC=LY flag in bit 2, interrupt enables in bits 3-6.
let private setMode (ppu: PpuState) (mode: PpuMode) =
    ppu.Stat <- (ppu.Stat &&& 0xFCuy) ||| byte mode

// Requests an LCD STAT interrupt if the given STAT enable bit is set
let private requestStatInterrupt (ppu: PpuState) (mmu: MemoryState) (enableBit: byte) =
    if (ppu.Stat &&& enableBit) <> 0uy then
        mmu.If <- mmu.If ||| byte Cpu.InterruptType.LcdStat

// Updates the LYC=LY flag (STAT bit 2) and requests the LYC interrupt (enable: STAT bit 6)
let private compareLyc (ppu: PpuState) (mmu: MemoryState) =
    if ppu.Ly = ppu.Lyc then
        ppu.Stat <- ppu.Stat ||| 0x04uy
        requestStatInterrupt ppu mmu 0x40uy
    else
        ppu.Stat <- ppu.Stat &&& 0xFBuy

// Main PPU update function, called by the emulator loop after each CPU instruction
let updatePpu (cycles: int) (ppu: PpuState) (mmu: MemoryState) =
    // 🧠 Important: the CPU writes PPU registers through the MMU, so copy them in first.
    ppu.Lcdc <- mmu.Lcdc
    ppu.Scy <- mmu.Scy
    ppu.Scx <- mmu.Scx
    ppu.Lyc <- mmu.Lyc
    ppu.Bgp <- mmu.Bgp
    // Only the interrupt-enable bits (3-6) of STAT are CPU-writable; the PPU owns bits 0-2.
    ppu.Stat <- (ppu.Stat &&& 0x07uy) ||| (mmu.Stat &&& 0x78uy)

    let lcdEnable = (ppu.Lcdc &&& 0x80uy) = 0x80uy // LCDC Bit 7
    if not lcdEnable then
        // LCD off: reset the PPU state
        ppu.CyclesThisScanline <- 0
        ppu.Ly <- 0x00uy
        setMode ppu PpuMode.HBlank
        ppu.FrameReady <- false
    else
        ppu.CyclesThisScanline <- ppu.CyclesThisScanline + cycles
        // enum<_> expects an int32, so widen the two mode bits first
        let currentMode = int (ppu.Stat &&& 0x03uy) |> enum<PpuMode>
        let hblankCycles = CPU_CYCLES_PER_SCANLINE - 80 - 172 // Remaining cycles for HBlank

        // Enum cases must be qualified in patterns; a bare name would bind a new variable.
        match currentMode with
        | PpuMode.OAMScan ->
            if ppu.CyclesThisScanline >= 80 then
                ppu.CyclesThisScanline <- ppu.CyclesThisScanline - 80
                setMode ppu PpuMode.DrawingPixels
        | PpuMode.DrawingPixels ->
            if ppu.CyclesThisScanline >= 172 then // Minimum Mode 3 length; the real length varies
                ppu.CyclesThisScanline <- ppu.CyclesThisScanline - 172
                renderScanline ppu mmu // Render the completed scanline
                setMode ppu PpuMode.HBlank
                requestStatInterrupt ppu mmu 0x08uy // HBlank interrupt (STAT bit 3)
        | PpuMode.HBlank ->
            if ppu.CyclesThisScanline >= hblankCycles then
                ppu.CyclesThisScanline <- ppu.CyclesThisScanline - hblankCycles // Keep leftover cycles
                ppu.Ly <- ppu.Ly + 1uy // Increment scanline
                compareLyc ppu mmu
                if ppu.Ly >= byte SCREEN_HEIGHT then
                    // Enter VBlank
                    setMode ppu PpuMode.VBlank
                    mmu.If <- mmu.If ||| byte Cpu.InterruptType.VBlank // VBlank interrupt
                    requestStatInterrupt ppu mmu 0x10uy // STAT interrupt for Mode 1 (STAT bit 4)
                    ppu.FrameReady <- true // A full frame is ready
                else
                    // Back to OAM Scan for next scanline
                    setMode ppu PpuMode.OAMScan
                    requestStatInterrupt ppu mmu 0x20uy // OAM interrupt (STAT bit 5)
        | PpuMode.VBlank ->
            if ppu.CyclesThisScanline >= CPU_CYCLES_PER_SCANLINE then
                ppu.CyclesThisScanline <- ppu.CyclesThisScanline - CPU_CYCLES_PER_SCANLINE
                ppu.Ly <- ppu.Ly + 1uy
                if ppu.Ly >= byte TOTAL_SCANLINES then
                    // End of VBlank, reset to scanline 0, OAM Scan mode
                    ppu.Ly <- 0x00uy
                    setMode ppu PpuMode.OAMScan
                    requestStatInterrupt ppu mmu 0x20uy // OAM interrupt (STAT bit 5)
                compareLyc ppu mmu // Compare after the wrap so that LYC = 0 matches line 0
        | _ -> () // Unreachable: the mode is masked to two bits

    // Update LY and STAT registers in MMU for CPU visibility
    mmu.Ly <- ppu.Ly
    mmu.Stat <- ppu.Stat

Explanation:

  • PpuState and createPpuState: Defines the PPU’s internal mutable state and provides an initialization function. We use a byte array for CurrentFrameBuffer to store RGBA pixel data.
  • getPaletteColor: A helper function that takes a palette register value (BGP for background) and a 2-bit pixel value, then returns the corresponding RGBA color tuple. This simplifies color mapping.
  • readTilePixel: This function is crucial. It takes the MemoryState (to access VRAM), a tileIndex, the tileDataAddress (either 0x8000 or 0x8800 depending on LCDC bit 4), and the (xInTile, yInTile) coordinates. It then fetches the two bytes that define the pixel row, extracts the 2-bit pixel value, and returns it. The tileDataAddress logic handles the two different tile data addressing modes.
  • renderScanline: This is the heart of background rendering.
    • It checks LCDC bit 0 to see if the background display is enabled. If not, it clears the current scanline to white.
    • It determines which tile map (0x9800 or 0x9C00) and tile data region (0x8000 or 0x8800) to use based on LCDC bits 3 and 4, respectively.
    • It calculates the (xInMap, yInMap) coordinates, taking SCX and SCY (scroll registers) into account, and wrapping around the 256x256 background map.
    • It then determines the tileMapX and tileMapY (which 8x8 tile in the map) and xInTile, yInTile (pixel within that tile).
    • It reads the tileIndex from the active tile map.
    • It calls readTilePixel to get the 2-bit pixel value.
    • It uses getPaletteColor to convert the pixel value into an RGBA color.
    • Finally, it writes the RGBA bytes into the ppu.CurrentFrameBuffer at the correct (x, y) position.
  • updatePpu: This function simulates the PPU’s internal clock and mode transitions.
    • It accumulates CPU cycles.
    • It checks LCDC bit 7 to see if the LCD is enabled. If not, it resets the PPU state.
    • It progresses through OAMScan, DrawingPixels, HBlank, and VBlank modes based on elapsed cycles.
    • When DrawingPixels completes, it calls renderScanline.
    • When HBlank completes, it increments ppu.Ly.
    • When ppu.Ly reaches SCREEN_HEIGHT (144), it enters VBlank and sets ppu.FrameReady to true.
    • It also handles triggering VBlank and LCDStat interrupts by setting bits in mmu.If.
    • Crucially, it updates mmu.Ly and mmu.Stat so the CPU can read the current PPU state.
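
The address arithmetic described above is easy to get wrong, so here is a standalone sketch of the two calculations: picking a tile's data address from LCDC bit 4, and mapping a screen pixel to a tile map offset with scroll wrapping. The function names tileDataAddress and bgLookup are illustrative and take plain values rather than the PpuState record.

```fsharp
// Address of the first byte of a background tile's data.
// LCDC bit 4 set:   unsigned index, base 0x8000.
// LCDC bit 4 clear: signed index, base 0x9000 (so 128-255 land in 0x8800-0x8FFF).
let tileDataAddress (lcdc: byte) (tileIndex: byte) : int =
    if (lcdc &&& 0x10uy) <> 0uy then
        0x8000 + int tileIndex * 16
    else
        0x9000 + int (sbyte tileIndex) * 16

// Maps a screen pixel to (tile map offset, x in tile, y in tile),
// wrapping SCX/SCY around the 256x256 background map.
let bgLookup (scx: byte) (scy: byte) (screenX: int) (ly: int) =
    let xInMap = (screenX + int scx) &&& 0xFF
    let yInMap = (ly + int scy) &&& 0xFF
    let mapOffset = (yInMap / 8) * 32 + (xInMap / 8)
    mapOffset, xInMap % 8, yInMap % 8

printfn "0x%04X" (tileDataAddress 0x91uy 0x80uy)   // unsigned mode
printfn "0x%04X" (tileDataAddress 0x81uy 0x80uy)   // signed mode
printfn "%A" (bgLookup 250uy 0uy 10 0)             // wraps past the right edge of the map
```

In signed mode, index 0x80 is the lowest address (0x8800) and index 0x7F is the highest (0x97F0), which is why the base is 0x9000 rather than 0x8800.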

3. Integrate PPU into Emulator.fs

Now we need to wire the PPU into our main emulator loop and set up SDL2-CS for display.

File: src/GameBoy/Emulator.fs

module GameBoy.Emulator

open GameBoy.Cpu
open GameBoy.Memory
open GameBoy.Ppu
open SDL2
open System

// Define the main emulator state
type EmulatorState = {
    mutable Cpu : CpuState
    mutable Mmu : MemoryState
    mutable Ppu : PpuState
    mutable MasterInterruptEnable : bool
    mutable TotalCycles : int // Widen to int64 (with matching 0L and int64 conversions) for long sessions
}

let createEmulatorState (bootRomData: byte []) (gameRomData: byte []) = {
    Cpu = createCpuState()
    Mmu = createMemoryState gameRomData // Pass game ROM data
    Ppu = createPpuState()
    MasterInterruptEnable = false
    TotalCycles = 0
}

// ⚡ Quick Note: For simplicity, we'll load the boot ROM into memory here.
// In a real scenario, the boot ROM might be loaded into a separate ROM chip region.
let loadBootRom (bootRomData: byte []) (mmu: MemoryState) =
    for i = 0 to bootRomData.Length - 1 do
        mmu.Rom0.[i] <- bootRomData.[i]

// Main emulation loop function
let runEmulator (emulatorState: EmulatorState) (bootRomData: byte []) =
    // Initialize SDL
    if SDL.SDL_Init(SDL.SDL_INIT_VIDEO) < 0 then
        failwithf "Could not initialize SDL: %s" (SDL.SDL_GetError())

    let window = SDL.SDL_CreateWindow("F# Game Boy Emulator",
                                      SDL.SDL_WINDOWPOS_CENTERED, SDL.SDL_WINDOWPOS_CENTERED,
                                      SCREEN_WIDTH, SCREEN_HEIGHT,
                                      SDL.SDL_WindowFlags.SDL_WINDOW_SHOWN)
    if window = IntPtr.Zero then
        failwithf "Could not create window: %s" (SDL.SDL_GetError())

    let renderer = SDL.SDL_CreateRenderer(window, -1, SDL.SDL_RendererFlags.SDL_RENDERER_ACCELERATED)
    if renderer = IntPtr.Zero then
        failwithf "Could not create renderer: %s" (SDL.SDL_GetError())

    let texture = SDL.SDL_CreateTexture(renderer,
                                        SDL.SDL_PIXELFORMAT_RGBA32,
                                        int SDL.SDL_TextureAccess.SDL_TEXTUREACCESS_STREAMING,
                                        SCREEN_WIDTH, SCREEN_HEIGHT)
    if texture = IntPtr.Zero then
        failwithf "Could not create texture: %s" (SDL.SDL_GetError())

    // Load boot ROM if provided
    if bootRomData.Length > 0 then
        loadBootRom bootRomData emulatorState.Mmu

    let mutable running = true
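    let mutable sdlEvent = SDL.SDL_Event() // "event" is a reserved word in F#, so use another name

    while running do
        // Handle events (e.g., window close)
        while SDL.SDL_PollEvent(&sdlEvent) = 1 do
            match sdlEvent.``type`` with // "type" is an F# keyword and must be escaped
            | SDL.SDL_EventType.SDL_QUIT -> running <- false
            | _ -> ()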

        // Execute CPU instruction
        let cycles = Cpu.executeInstruction emulatorState.Cpu emulatorState.Mmu emulatorState.MasterInterruptEnable

        // Update PPU
        Ppu.updatePpu cycles emulatorState.Ppu emulatorState.Mmu

        // Update timers, sound, etc. (future chapters)

        // Handle interrupts
        Cpu.handleInterrupts emulatorState.Cpu emulatorState.Mmu &emulatorState.MasterInterruptEnable

        // Render if frame is ready
        if emulatorState.Ppu.FrameReady then
            let pitch = SCREEN_WIDTH * 4 // Bytes per row (RGBA)
            // SDL needs a raw pointer, so pin the managed buffer while the texture is updated
            let handle =
                System.Runtime.InteropServices.GCHandle.Alloc(
                    emulatorState.Ppu.CurrentFrameBuffer,
                    System.Runtime.InteropServices.GCHandleType.Pinned)
            try
                SDL.SDL_UpdateTexture(texture, IntPtr.Zero, handle.AddrOfPinnedObject(), pitch) |> ignore
            finally
                handle.Free()

            SDL.SDL_RenderClear(renderer) |> ignore
            SDL.SDL_RenderCopy(renderer, texture, IntPtr.Zero, IntPtr.Zero) |> ignore
            SDL.SDL_RenderPresent(renderer)

            emulatorState.Ppu.FrameReady <- false // Reset flag

        emulatorState.TotalCycles <- emulatorState.TotalCycles + cycles

    // Clean up SDL
    SDL.SDL_DestroyTexture(texture)
    SDL.SDL_DestroyRenderer(renderer)
    SDL.SDL_DestroyWindow(window)
    SDL.SDL_Quit()

Explanation:

  • EmulatorState: We add a Ppu field to hold the PpuState.
  • createEmulatorState: Initializes the PpuState when the emulator is created.
  • SDL Initialization: The runEmulator function now includes boilerplate for initializing SDL, creating a window, a renderer, and a texture.
    • SDL.SDL_Init(SDL.SDL_INIT_VIDEO): Initializes the video subsystem.
    • SDL.SDL_CreateWindow: Creates the display window.
    • SDL.SDL_CreateRenderer: Creates a 2D rendering context for the window.
    • SDL.SDL_CreateTexture: Creates a texture that we can update with our PPU’s pixel data. We specify SDL_PIXELFORMAT_RGBA32 to match our byte[] frame buffer.
  • updatePpu Call: After executing CPU instructions, we call Ppu.updatePpu with the cycles consumed. This drives the PPU’s internal clock.
  • Rendering CurrentFrameBuffer:
    • When ppu.FrameReady is true (indicating a full frame has been rendered by the PPU), we use SDL.SDL_UpdateTexture to copy our ppu.CurrentFrameBuffer data to the SDL texture.
    • SDL.SDL_RenderClear, SDL.SDL_RenderCopy, and SDL.SDL_RenderPresent then clear the renderer, copy the texture to it, and display it on the screen.
    • ppu.FrameReady is reset to false until the next frame is complete.
  • SDL Cleanup: Proper cleanup of SDL resources is added.
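
Handing a managed byte[] to a native library means giving it a pointer that the garbage collector will not move. Below is a minimal sketch of one way to do that with GCHandle, independent of SDL. The helper name withPinned is hypothetical, and Marshal.ReadByte stands in for the native call so the example runs on its own.

```fsharp
open System
open System.Runtime.InteropServices

// Runs `action` with a stable pointer to `buffer`, then unpins it even if `action` throws.
let withPinned (buffer: byte[]) (action: nativeint -> 'a) : 'a =
    let handle = GCHandle.Alloc(buffer, GCHandleType.Pinned)
    try
        action (handle.AddrOfPinnedObject())
    finally
        handle.Free()

// A frame-buffer-sized array whose bytes repeat 0, 1, 2, 3 (one fake RGBA pixel)
let frame = Array.init (160 * 144 * 4) (fun i -> byte (i % 4))

// Stand-in for a native call: read the first pixel back through the raw pointer
let firstPixel =
    withPinned frame (fun ptr -> [ for i in 0 .. 3 -> Marshal.ReadByte(ptr, i) ])

printfn "%A" firstPixel
```

In the emulator, the action passed to withPinned would be the texture upload, assuming the SDL2-CS overload of SDL_UpdateTexture that takes IntPtr arguments for the rectangle and pixel data.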

4. Project File (.fsproj) Updates

To use SDL2-CS, you need to add the NuGet package.

File: src/GameBoy/GameBoy.fsproj

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <WarnOn>FS3390;NU1605</WarnOn>
    <RootNamespace>GameBoy</RootNamespace>
    <GenerateProgramFile>false</GenerateProgramFile>
  </PropertyGroup>

  <ItemGroup>
    <!-- Memory.fs comes first: Cpu.fs and Ppu.fs both reference MemoryState -->
    <Compile Include="Memory.fs" />
    <Compile Include="Cpu.fs" />
    <Compile Include="Ppu.fs" />
    <Compile Include="Emulator.fs" />
    <Compile Include="Program.fs" />
  </ItemGroup>

  <ItemGroup>
    <PackageReference Include="SDL2-CS" Version="2.28.0" />
  </ItemGroup>

</Project>

Explanation:

  • We’ve added a <PackageReference> for SDL2-CS version 2.28.0, the latest stable version as of 2026-05-05. Confirm the exact package ID and version on NuGet before pinning, as they may have changed.
  • Ensure Ppu.fs is listed before Emulator.fs in the ItemGroup to maintain correct compilation order.

Installation of Native SDL2 Libraries: For SDL2-CS to work, you need the native SDL2 runtime libraries.

  • Windows: Download SDL2.dll from the official SDL website (https://libsdl.org/download-2.0.php) and place it in your project’s bin/Debug/net8.0 (or bin/Release/net8.0) directory alongside your executable.
  • macOS: Install via Homebrew: brew install sdl2.
  • Linux: Install via package manager: sudo apt-get install libsdl2-2.0-0 (Debian/Ubuntu) or sudo yum install SDL2 (Fedora/RHEL).

Testing & Verification

With the PPU’s background rendering implemented, we can now see visual output.

  1. Prepare a ROM:

    • Find a simple Game Boy ROM that displays a static background, such as a “hello world” test ROM or the official Game Boy boot ROM (if you’ve implemented its loading). Blargg’s CPU test ROMs often have simple backgrounds.
    • Let’s assume you have a test.gb file.
  2. Update Program.fs to load the ROM:

    File: src/GameBoy/Program.fs

    module GameBoy.Program
    
    open GameBoy.Emulator
    open System.IO
    
    [<EntryPoint>]
    let main argv =
        let bootRomPath = "bootrom.bin" // Path to your boot ROM (optional)
        let gameRomPath = "test.gb"    // Path to your Game Boy ROM
    
        let bootRomData =
            if File.Exists(bootRomPath) then
                File.ReadAllBytes(bootRomPath)
            else
                printfn "Boot ROM not found at %s. Continuing without it." bootRomPath
                Array.empty<byte>
    
        let gameRomData =
            if File.Exists(gameRomPath) then
                File.ReadAllBytes(gameRomPath)
            else
                failwithf "Game ROM not found at %s" gameRomPath
    
        let emulator = createEmulatorState bootRomData gameRomData
    
        printfn "Starting Game Boy emulator..."
        runEmulator emulator bootRomData // Pass boot ROM data to runEmulator
        printfn "Emulator stopped."
    
        0 // Return 0 for success
    
  3. Run the Emulator: Navigate to your src/GameBoy directory in the terminal and run:

    dotnet run
    

Expected Behavior:

  • A new window titled “F# Game Boy Emulator” should appear.
  • If you’re running a simple test ROM, you should see static background graphics. For instance, the Game Boy boot ROM will show the Nintendo logo (though the scrolling is not yet implemented, so it might appear static or partially rendered).
  • If you run a game ROM, you might see the initial background of the game, possibly without sprites or correct scrolling.

Quick Debugging Checks:

  • Blank Window:
    • Check if SDL2.dll (or equivalent) is in the correct directory.
    • Ensure LCDC bit 7 is set to enable the LCD.
    • Verify renderScanline is actually being called.
    • Check for SDL errors in the console.
  • Scrambled Graphics:
    • Incorrect readTilePixel logic, especially the tileDataAddress and bit manipulation.
    • Wrong BGP palette mapping.
    • Incorrect bgTileMapAddress or bgTileDataAddress selection based on LCDC.
  • No Background:
    • Check if LCDC bit 0 (background enable) is set.
    • Ensure VRAM is being loaded correctly by the ROM and accessed by the PPU.

Production Considerations

Performance

The PPU’s pixel-by-pixel rendering is a performance-critical area. Our current renderScanline function does a lot of work for each pixel.

  • Caching: Tile data (the 8x8 pixel patterns) rarely changes. We could pre-render each 8x8 tile into a small pixel buffer once it’s written to VRAM and then just copy from these cached tile buffers during renderScanline.
  • Batching: Rather than issuing one SDL call per pixel, fill a whole scanline of the frame buffer in a single pass and upload the buffer once per frame. If you do draw points directly, SDL_RenderDrawPoints accepts an array of points in one call.
  • JIT Optimization: F# and .NET’s JIT compiler are very good, but tight loops like pixel rendering can still be bottlenecks. Profile your application to identify hot spots.
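
The caching idea above can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the TileCache type, its MarkDirty and GetTile members, and the choice to pass VRAM as a plain byte array indexed from 0x8000 are all assumptions made for the example.

```fsharp
// Hypothetical tile cache; names and layout are illustrative only.
type TileCache() =
    // 384 tiles fit in 0x8000-0x97FF (0x1800 bytes, 16 bytes per tile).
    let decoded : byte[][] = Array.init 384 (fun _ -> Array.zeroCreate 64)
    let dirty : bool[] = Array.create 384 true

    /// Call from the VRAM write path with an offset relative to 0x8000.
    member _.MarkDirty(vramOffset: int) =
        if vramOffset >= 0 && vramOffset < 0x1800 then
            dirty.[vramOffset / 16] <- true

    /// Returns 64 colour indices (0-3) in row-major order,
    /// re-decoding the tile only if it was written since the last call.
    member _.GetTile(vram: byte[], tileIndex: int) : byte[] =
        if dirty.[tileIndex] then
            let baseOffset = tileIndex * 16
            for row in 0 .. 7 do
                let lo = vram.[baseOffset + row * 2]      // low bit plane
                let hi = vram.[baseOffset + row * 2 + 1]  // high bit plane
                for col in 0 .. 7 do
                    let bit = 7 - col                     // leftmost pixel is bit 7
                    let loBit = (lo >>> bit) &&& 1uy
                    let hiBit = (hi >>> bit) &&& 1uy
                    decoded.[tileIndex].[row * 8 + col] <- (hiBit <<< 1) ||| loBit
            dirty.[tileIndex] <- false
        decoded.[tileIndex]
```

The cache stores colour indices rather than final colours, so a BGP change does not invalidate it. The tile index here is absolute (0-383); a tile number read in the signed 0x8800 addressing mode would need to be mapped to that range first.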

Synchronization

Accurate timing between the CPU and PPU is paramount.

  • Cycle Counting: We’re currently passing CPU cycles directly to updatePpu. This is a good start. Ensure that all CPU instructions correctly report their cycle counts.
  • PPU Cycle Accuracy: The PPU’s mode transitions and LY increments need to happen at precise cycle counts. Slight inaccuracies here can lead to visual glitches, screen tearing, or incorrect interrupt timings. The cycle counts used in updatePpu are approximations; consult Game Boy PPU documentation for exact timings per mode.
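
As a rough sketch of the cycle bookkeeping described above, the following derives the mode from a position within the frame. It assumes a fixed 172-cycle Mode 3 (the minimum of the range quoted earlier), which makes each scanline 80 + 172 + 204 = 456 cycles, and 154 lines per frame (LY 0 to 153). The PpuTiming record and the modeOf and step functions are hypothetical and do not mirror the chapter's PpuState or updatePpu.

```fsharp
// Hypothetical timing model; the chapter's PpuState and updatePpu may differ.
type PpuMode =
    | HBlank = 0
    | VBlank = 1
    | OamScan = 2
    | Drawing = 3

type PpuTiming = { Dot: int; Ly: int }

let dotsPerLine = 456     // 80 (Mode 2) + 172 (Mode 3, minimum) + 204 (Mode 0)
let linesPerFrame = 154   // 144 visible lines + 10 VBlank lines

/// Mode implied by the current line and position within it,
/// assuming a fixed 172-cycle Mode 3.
let modeOf (t: PpuTiming) =
    if t.Ly >= 144 then PpuMode.VBlank
    elif t.Dot < 80 then PpuMode.OamScan
    elif t.Dot < 80 + 172 then PpuMode.Drawing
    else PpuMode.HBlank

/// Advances by `cycles`, carrying leftover cycles across line boundaries
/// so that no time is lost when an instruction straddles two lines.
let step (cycles: int) (t: PpuTiming) =
    let total = t.Dot + cycles
    { Dot = total % dotsPerLine
      Ly = (t.Ly + total / dotsPerLine) % linesPerFrame }
```

The point of the sketch is the carry: resetting the counter to zero at each mode change, instead of keeping the remainder, is a common source of slow drift. A real implementation also has to act on every transition it passes (render the line, raise interrupts), so it would call step with small increments or loop over the boundaries it crosses.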

Maintainability

The PPU code will grow.

  • Modularity: Keep functions small and focused (e.g., readTilePixel, renderBackgroundPixel, renderSpritePixel).
  • Clear Register Mapping: Use named constants or discriminated unions for register bits where appropriate, so that expressions such as Lcdc &&& 0x08uy (the background tile map select bit) are easier to read.
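
One way to name the LCDC bits is a small module of literals, sketched below. The bit positions follow the LCDC register layout; the module name Lcdc and the helpers isSet, bgTileMapBase, and bgTileDataBase are illustrative assumptions, not names from the chapter's code.

```fsharp
// Hypothetical helper module; only the bit positions come from the hardware.
module Lcdc =
    [<Literal>]
    let BgEnable = 0x01uy          // bit 0: background enable
    [<Literal>]
    let BgTileMapSelect = 0x08uy   // bit 3: 0 = 0x9800, 1 = 0x9C00
    [<Literal>]
    let TileDataSelect = 0x10uy    // bit 4: 0 = 0x8800 (signed), 1 = 0x8000 (unsigned)
    [<Literal>]
    let LcdEnable = 0x80uy         // bit 7: LCD on/off

    /// True when every bit in `mask` is set in `lcdc`.
    let isSet (mask: byte) (lcdc: byte) = (lcdc &&& mask) = mask

    /// Start address of the background tile map chosen by bit 3.
    let bgTileMapBase (lcdc: byte) : uint16 =
        if isSet BgTileMapSelect lcdc then 0x9C00us else 0x9800us

    /// Start address of the tile data region chosen by bit 4.
    let bgTileDataBase (lcdc: byte) : uint16 =
        if isSet TileDataSelect lcdc then 0x8000us else 0x8800us
```

With this in place, a check like Lcdc.isSet Lcdc.LcdEnable lcdc states its intent directly, and the address selection lives in one spot instead of being repeated wherever the register is read.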

Common Issues & Solutions

  1. Issue: Blank screen or erratic display behavior.

    • Cause: The LCD is likely disabled, or PPU timing is completely off.
    • Solution:
      • Verify LCDC bit 7 (0x80uy) is set. The boot ROM usually sets this.
      • Log the ppu.Ly and ppu.Stat values. LY should increment from 0 to 153. STAT mode bits (0-1) should cycle through 2, 3, 0, then 1 for VBlank.
      • Ensure updatePpu is called with the correct number of CPU cycles.
  2. Issue: Background is displayed, but colors are wrong or patterns are distorted.

    • Cause: Incorrect palette mapping or misinterpretation of tile data bits.
    • Solution:
      • Double-check getPaletteColor logic. BGP register bits define the colors.
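      • Review readTilePixel bit manipulation. The Game Boy uses two bytes per tile row: the first byte holds the low bit of each of the row’s eight pixels, and the second byte holds the high bit. Bit 7 of each byte belongs to the leftmost pixel and bit 0 to the rightmost. Ensure the combination (bit2 <<< 1) ||| bit1 takes bit2 from the second byte and bit1 from the first.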
      • Verify bgTileDataAddress and bgTileMapAddress selection based on LCDC bits.
  3. Issue: SDL2-CS fails to initialize or window doesn’t appear.

    • Cause: Native SDL2 library not found or installed incorrectly.
    • Solution:
      • Windows: Ensure SDL2.dll is in the executable’s directory (bin/Debug/net8.0).
      • macOS: Run brew install sdl2.
      • Linux: Run sudo apt-get install libsdl2-2.0-0 or equivalent for your distribution.
      • Check the error message from SDL_GetError().
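
For the second issue above (wrong colors or distorted patterns), it can help to test the bit logic in isolation against a known row. The sketch below is a stand-alone check under assumed names: pixelColorIndex and bgpShade are hypothetical helpers, not the chapter's readTilePixel or getPaletteColor.

```fsharp
/// Colour index (0-3) of pixel `x` (0 = leftmost) in one tile row.
/// `lo` is the first byte of the row, `hi` is the second.
let pixelColorIndex (lo: byte) (hi: byte) (x: int) : int =
    let bit = 7 - x                          // leftmost pixel lives in bit 7
    let loBit = int ((lo >>> bit) &&& 1uy)
    let hiBit = int ((hi >>> bit) &&& 1uy)
    (hiBit <<< 1) ||| loBit

/// Shade (0 = lightest, 3 = darkest) that BGP assigns to a colour index.
/// BGP packs four 2-bit entries, with index 0 in bits 1-0.
let bgpShade (bgp: byte) (colorIndex: int) : int =
    int ((bgp >>> (colorIndex * 2)) &&& 0x03uy)

// Row bytes 0x3C (low plane) and 0x7E (high plane).
let row = [ for x in 0 .. 7 -> pixelColorIndex 0x3Cuy 0x7Euy x ]
// row = [0; 2; 3; 3; 3; 3; 2; 0]
```

If your own decoder produces a mirrored row, the bit order is reversed; if 1s and 2s are swapped, the two bytes are being read in the wrong order. With BGP set to 0xE4, bgpShade maps each index to itself, which makes it a convenient value for testing.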

Summary & Next Step

In this chapter, we’ve laid the critical groundwork for the Game Boy’s visual output. We’ve:

  • Integrated PPU registers into our MMU.
  • Designed and implemented the core PpuState and its update logic.
  • Developed functions to read tile data and render the background layer pixel by pixel.
  • Set up SDL2-CS to display our PPU’s frame buffer on screen.

By now, you should be able to load a simple ROM and see its background displayed, albeit without sprites or complex scrolling. This is a huge step towards a fully functional emulator!

Next, we’ll dive into the more dynamic aspects of the PPU: rendering sprites, handling LCD control register details, and implementing more accurate PPU timing and interrupts.


🧠 Check Your Understanding

  • What are the four main PPU modes, and what is the primary activity in each?
  • How does the LCDC register influence which tile map and tile data region are used for background rendering?
  • Explain the purpose of the SCY and SCX registers in background rendering.

⚡ Mini Task

  • Modify the getPaletteColor function to use a custom set of colors (e.g., shades of green for a classic Game Boy feel) instead of grayscale.
  • Add logging to print the LY register value every time it increments, and observe its cycle from 0 to 153.

🚀 Scenario

You’re running a complex Game Boy game, and the background appears to “tear” horizontally, with parts of the image shifted. Upon closer inspection, you notice the tearing occurs at seemingly random scanlines. What are the most likely causes for this issue, and where would you begin debugging in your PPU implementation? Consider CPU-PPU synchronization and interrupt handling.


📌 TL;DR

  • The Game Boy PPU draws a 160x144 display by rendering scanlines in modes: OAM Scan, Drawing Pixels, HBlank, and VBlank.
  • VRAM (0x8000-0x9FFF) stores tile data (pixel patterns) and tile maps (indices to tiles for background).
  • PPU I/O registers like LCDC, STAT, SCY, SCX, LY, LYC, and BGP control display features and status.
  • Background rendering involves reading SCY/SCX, determining tile map/data regions from LCDC, fetching tile indices, and then translating 2-bit pixel values to colors using BGP.
  • SDL2-CS is used for displaying the PPU’s CurrentFrameBuffer to a window.

🧠 Core Flow

  1. CPU executes instructions, returning cycle count.
  2. Ppu.updatePpu accumulates cycles, advances PPU mode, and increments LY.
  3. During “Drawing Pixels” mode, renderScanline fetches tile data from VRAM via MMU, applies scrolling from SCX/SCY, and draws pixels to CurrentFrameBuffer.
  4. Upon entering VBlank (when LY reaches 144), ppu.FrameReady is set to true.
  5. In Emulator.runEmulator, if ppu.FrameReady is true, the CurrentFrameBuffer is copied to an SDL texture and rendered to the screen.

🚀 Key Takeaway

Emulating graphics hardware like the PPU demands meticulous attention to hardware specifications and accurate timing. The iterative process of accumulating CPU cycles, transitioning PPU modes, and carefully mapping memory-backed registers to visual output is fundamental to bringing an emulator to life.

