Introduction
So far, our Game Boy emulator can execute CPU instructions and manage memory, but it’s a silent black box. This chapter changes that. We’re about to bring the Game Boy to life by tackling the Picture Processing Unit (PPU), the hardware responsible for all the visuals. This is a significant milestone, as seeing actual graphics from a ROM is incredibly rewarding and validates much of our prior work.
In this first part of PPU emulation, we’ll focus on understanding the Game Boy’s display architecture, specifically how Video RAM (VRAM) stores graphical data, how tiles are defined, and how the PPU renders the background layer. By the end of this chapter, our emulator will be able to display static backgrounds from simple Game Boy ROMs, making the project visually verifiable.
The PPU is one of the most complex components of the Game Boy, demanding precise timing and careful interpretation of hardware specifications. We’ll break it down into manageable pieces, starting with the fundamentals of background rendering.
Planning & Design
The Game Boy’s PPU is responsible for drawing a 160x144 pixel display. It operates by drawing one horizontal line (a “scanline”) at a time, moving from top to bottom. This process involves fetching tile data from VRAM, assembling it into a background layer, and then drawing it to the screen.
Game Boy Display Architecture Overview
The Game Boy’s LCD has a fixed resolution of 160 pixels wide by 144 pixels high. The PPU cycles through different modes as it draws the screen. The cycle counts below are clock cycles of the 4.194304 MHz system clock (T-cycles, often called dots), so the CPU core must report T-cycles rather than machine cycles:
- Mode 2 (OAM Scan): The PPU spends 80 cycles scanning Object Attribute Memory (OAM) to find sprites that appear on the current scanline.
- Mode 3 (Drawing Pixels): The PPU spends between 172 and 289 cycles (variable) fetching tile data, background map data, and sprite data, then drawing the pixels for the current scanline. This is the most computationally intensive phase.
- Mode 0 (HBlank): After drawing a scanline, the PPU enters a Horizontal Blanking period for the rest of the 456-cycle scanline, which is between 87 and 204 cycles depending on how long Mode 3 took. This is a good time for the CPU to access VRAM or OAM without contention.
- Mode 1 (VBlank): Once all 144 visible scanlines are drawn, the PPU enters a Vertical Blanking period. This lasts for 10 scanlines (lines 144-153) and takes 4,560 cycles in total. Because the PPU does not touch VRAM or OAM during VBlank, this is the ideal time for a game to update graphics data and for our emulator to present the finished frame.
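The per-mode numbers above add up to a fixed frame length, which is worth checking once by hand. The following is a minimal sketch of that arithmetic; the constant and function names are illustrative and are not part of the chapter's code.

```fsharp
// Timing sketch. All names here are hypothetical helpers for illustration.
let oamScanCycles = 80        // Mode 2
let drawingCyclesMin = 172    // Mode 3, shortest case
let cyclesPerScanline = 456   // Modes 2 + 3 + 0 always sum to this
let visibleScanlines = 144
let vblankScanlines = 10

// HBlank absorbs whatever part of the scanline Mode 3 did not use.
let hblankCycles (drawingCycles: int) =
    cyclesPerScanline - oamScanCycles - drawingCycles

let vblankCycles = cyclesPerScanline * vblankScanlines
let cyclesPerFrame = cyclesPerScanline * (visibleScanlines + vblankScanlines)

// The system clock runs at 4,194,304 Hz.
let framesPerSecond = 4194304.0 / float cyclesPerFrame

printfn "Longest HBlank: %d" (hblankCycles drawingCyclesMin) // 204
printfn "VBlank cycles: %d" vblankCycles                     // 4560
printfn "Cycles per frame: %d" cyclesPerFrame                // 70224
printfn "Frames per second: %.2f" framesPerSecond            // roughly 59.73
```

A frame budget of 70,224 cycles is a convenient sanity check when logging: if LY wraps back to 0 after a noticeably different number of accumulated cycles, the mode state machine is losing or double-counting time.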
PPU Memory and Registers
The PPU interacts heavily with specific memory regions and I/O registers:
Video RAM (VRAM), 0x8000-0x9FFF:
- 0x8000-0x97FF: Tile Data. This area stores the actual pixel patterns for 8x8 tiles. Each tile uses 16 bytes (2 bytes per row, 8 rows). Each pixel has 2 bits, allowing 4 colors.
- 0x9800-0x9BFF: Tile Map 0. A 32x32 grid of tile indices for the background layer.
- 0x9C00-0x9FFF: Tile Map 1. Another 32x32 grid of tile indices. The LCDC register determines which map is active.
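To make the "2 bytes per row, 2 bits per pixel" layout concrete, here is a small sketch that decodes one tile row. The function name decodeTileRow is an assumption made for this example; the bit layout (first byte carries the low bit, second byte the high bit, bit 7 is the leftmost pixel) follows the usual Game Boy tile format.

```fsharp
// Decode one 8-pixel tile row from its two bytes (2 bits per pixel).
// lowByte holds bit 0 of every pixel, highByte holds bit 1.
// Bit 7 of each byte belongs to the leftmost pixel.
let decodeTileRow (lowByte: byte) (highByte: byte) : byte[] =
    Array.init 8 (fun x ->
        let shift = 7 - x
        let lo = (lowByte >>> shift) &&& 1uy
        let hi = (highByte >>> shift) &&& 1uy
        (hi <<< 1) ||| lo)

// 0x3C = 00111100, 0x7E = 01111110
let row = decodeTileRow 0x3Cuy 0x7Euy
printfn "%A" row // colour IDs 0 2 3 3 3 3 2 0
```

The result is a colour ID per pixel (0-3), not a final colour. The ID still has to pass through the BGP palette register before it becomes a shade on screen.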
PPU I/O Registers:
- 0xFF40 (LCDC): LCD Control. Crucial for enabling/disabling PPU features (LCD, background, sprites, tile map selection, tile data selection).
- 0xFF41 (STAT): LCD Status. Contains the current PPU mode, LYC compare flag, and interrupt enable bits.
- 0xFF42 (SCY): Scroll Y. Vertical scroll offset for the background.
- 0xFF43 (SCX): Scroll X. Horizontal scroll offset for the background.
- 0xFF44 (LY): LCDC Y-Coordinate. The current scanline being drawn (0-153).
- 0xFF45 (LYC): LY Compare. An interrupt can be triggered when LY equals LYC.
- 0xFF47 (BGP): Background Palette. Defines the 4 colors used for the background.
- 0xFF48 (OBP0), 0xFF49 (OBP1): Object Palettes. Define colors for sprites (covered in Part 2).
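The BGP register packs four 2-bit shade selections into one byte, one per colour ID. As a rough illustration of how that lookup works, the sketch below uses two hypothetical helpers, shadeFor and describe, which are not part of the emulator's code.

```fsharp
// Which shade (0 = lightest .. 3 = darkest) a 2-bit colour ID maps to
// under a given palette register value.
let shadeFor (palette: byte) (colorId: int) : int =
    int ((palette >>> (colorId * 2)) &&& 0x03uy)

// Shades for colour IDs 0, 1, 2, 3 in that order.
let describe (palette: byte) : int list =
    [ for id in 0 .. 3 -> shadeFor palette id ]

printfn "%A" (describe 0xE4uy) // [0; 1; 2; 3]  identity mapping
printfn "%A" (describe 0xFCuy) // [0; 3; 3; 3]  colour 0 lightest, the rest darkest
```

Note that 0xFC, the value this chapter uses as the post-boot default, maps every non-zero colour ID to the darkest shade. That is why the Nintendo logo appears as solid black on white.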
Modeling the PPU State
We’ll introduce a new PpuState record to encapsulate all PPU-related registers and internal state. The emulator loop advances this state by the number of cycles each CPU instruction consumes.
```fsharp
// In Ppu.fs
type PpuMode =
    | HBlank = 0
    | VBlank = 1
    | OAMScan = 2
    | DrawingPixels = 3

type PpuState = {
    mutable Lcdc : byte                  // LCD Control (0xFF40)
    mutable Stat : byte                  // LCD Status (0xFF41)
    mutable Scy : byte                   // Scroll Y (0xFF42)
    mutable Scx : byte                   // Scroll X (0xFF43)
    mutable Ly : byte                    // LCDC Y-Coordinate (0xFF44)
    mutable Lyc : byte                   // LY Compare (0xFF45)
    mutable Bgp : byte                   // Background Palette (0xFF47)
    mutable Obp0 : byte                  // Object Palette 0 (0xFF48)
    mutable Obp1 : byte                  // Object Palette 1 (0xFF49)
    mutable CyclesThisScanline : int
    mutable CurrentFrameBuffer : byte [] // 160 * 144 * 4 bytes for RGBA
    mutable FrameReady : bool
}
```
PPU Rendering Flow
The core PPU logic will reside in an updatePpu function, which will be called by the main emulator loop for every CPU cycle (or a block of cycles).
PPU Update Cycle:
- Accumulate Cycles: The updatePpu function receives the number of CPU cycles executed.
- Mode Progression: Based on accumulated cycles, the PPU transitions between modes (OAM Scan, Drawing Pixels, HBlank).
- Scanline Increment: When a scanline completes (after HBlank), the LY register is incremented.
- VBlank: When LY reaches 144, the PPU enters VBlank, and a VBlank interrupt is triggered. The FrameReady flag is set.
- Render Scanline (Mode 3): During the “Drawing Pixels” mode, we’ll implement the logic to fetch tile data and draw pixels to our internal frame buffer.
Graphics Library: SDL2-CS
For rendering, we’ll use SDL2-CS, which is a C# binding for the SDL2 library, fully compatible with F# and .NET. SDL (Simple DirectMedia Layer) is a cross-platform development library designed to provide low-level access to audio, keyboard, mouse, joystick, and graphics hardware via OpenGL and Direct3D.
Quick Note: SDL2-CS is a NuGet package that wraps the native SDL2 library. You’ll need the native SDL2 runtime binaries installed on your system or bundled with your application. For Windows, this typically means SDL2.dll in your executable directory. For macOS and Linux, it’s usually libSDL2.dylib or libSDL2.so.
Step-by-Step Implementation
We’ll start by defining the PPU state, then integrate it into our MMU for register access, and finally implement the core updatePpu and renderScanline logic.
1. Update Memory.fs for PPU Registers
First, we need to ensure our MMU can read and write to the PPU’s I/O registers. We’ll add specific handling for the PPU’s memory-mapped registers.
File: src/GameBoy/Memory.fs
```fsharp
module GameBoy.Memory

open System

// ... (existing MemoryState type)
type MemoryState = {
    // ... (existing fields)
    mutable Vram : byte [] // 0x8000-0x9FFF
    mutable Oam : byte []  // 0xFE00-0xFE9F
    mutable Lcdc : byte    // 0xFF40
    mutable Stat : byte    // 0xFF41
    mutable Scy : byte     // 0xFF42
    mutable Scx : byte     // 0xFF43
    mutable Ly : byte      // 0xFF44
    mutable Lyc : byte     // 0xFF45
    mutable Bgp : byte     // 0xFF47
    mutable Obp0 : byte    // 0xFF48
    mutable Obp1 : byte    // 0xFF49
    mutable Dma : byte     // 0xFF46 (DMA Transfer)
    mutable If : byte      // 0xFF0F (Interrupt Flag)
    mutable Ie : byte      // 0xFFFF (Interrupt Enable)
}

// ... (existing `createMemoryState` function)
let createMemoryState (romData: byte []) =
    let state = {
        // ... (existing initializations)
        Vram = Array.zeroCreate (0x9FFF - 0x8000 + 1)
        Oam = Array.zeroCreate (0xFE9F - 0xFE00 + 1)
        Lcdc = 0x91uy // Default boot ROM value
        Stat = 0x00uy
        Scy = 0x00uy
        Scx = 0x00uy
        Ly = 0x00uy
        Lyc = 0x00uy
        Bgp = 0xFCuy  // Default palette
        Obp0 = 0xFFuy
        Obp1 = 0xFFuy
        Dma = 0x00uy
        If = 0xE1uy   // Default boot ROM value
        Ie = 0x00uy
    }
    // ... (existing ROM loading)
    state

// ... (existing `readByte` function)
let readByte (addr: uint16) (state: MemoryState) =
    match addr with
    // ... (existing ranges)
    | _ when addr >= 0x8000us && addr <= 0x9FFFus -> state.Vram.[int (addr - 0x8000us)] // VRAM
    | _ when addr >= 0xFE00us && addr <= 0xFE9Fus -> state.Oam.[int (addr - 0xFE00us)]  // OAM
    | 0xFF40us -> state.Lcdc
    | 0xFF41us -> state.Stat
    | 0xFF42us -> state.Scy
    | 0xFF43us -> state.Scx
    | 0xFF44us -> state.Ly
    | 0xFF45us -> state.Lyc
    | 0xFF46us -> state.Dma // DMA register
    | 0xFF47us -> state.Bgp
    | 0xFF48us -> state.Obp0
    | 0xFF49us -> state.Obp1
    | 0xFF0Fus -> state.If
    | 0xFFFFus -> state.Ie
    | _ ->
        // ... (existing default read)
        state.Hram.[int (addr - 0xFF80us)] // HRAM
```
Explanation:
- We’ve added Vram and Oam arrays to MemoryState to represent these memory regions.
- New fields (Lcdc, Stat, Scy, etc.) are added to MemoryState to hold the values of the PPU’s I/O registers. These are mutable because the PPU (and CPU) will modify them.
- createMemoryState now initializes these PPU registers with typical boot ROM values.
- The readByte function is updated with match cases to correctly return the values from these new fields when their respective addresses are queried.
File: src/GameBoy/Memory.fs (continued, writeByte)
```fsharp
let writeByte (addr: uint16) (value: byte) (state: MemoryState) =
    match addr with
    // ... (existing ranges)
    | _ when addr >= 0x8000us && addr <= 0x9FFFus -> state.Vram.[int (addr - 0x8000us)] <- value // VRAM
    | _ when addr >= 0xFE00us && addr <= 0xFE9Fus -> state.Oam.[int (addr - 0xFE00us)] <- value  // OAM
    | 0xFF40us -> state.Lcdc <- value
    | 0xFF41us -> state.Stat <- value // STAT register. On hardware only bits 3-6 are writable by the CPU.
    | 0xFF42us -> state.Scy <- value
    | 0xFF43us -> state.Scx <- value
    | 0xFF44us -> () // LY is read-only for the CPU; the PPU owns this value, so the write is ignored.
    | 0xFF45us -> state.Lyc <- value
    | 0xFF46us ->
        // DMA transfer
        state.Dma <- value
        // Important: DMA transfer is complex. For now, we'll just store the value.
        // A full DMA implementation would take 160 machine cycles and copy 160 bytes
        // from (value * 0x100) to OAM.
    | 0xFF47us -> state.Bgp <- value
    | 0xFF48us -> state.Obp0 <- value
    | 0xFF49us -> state.Obp1 <- value
    | 0xFF0Fus -> state.If <- value // Interrupt Flag
    | 0xFFFFus -> state.Ie <- value // Interrupt Enable
    | _ ->
        // ... (existing default write)
        state.Hram.[int (addr - 0xFF80us)] <- value // HRAM
```
Explanation:
- The writeByte function is similarly updated to allow the CPU to write to these PPU registers.
- LY write: On hardware, 0xFF44 (LY) is read-only for the CPU. The PPU owns this value, and updatePpu copies it back into the MMU after every step, so a CPU write has no lasting effect.
- STAT write: Only specific bits of STAT are writable by the CPU. For simplicity, we’ll allow full writes for now, but a production emulator would mask these.
- DMA: Direct Memory Access (DMA) to OAM is a critical feature for sprites. When the CPU writes to 0xFF46, a 160-byte block of memory is transferred to OAM. This takes 160 machine cycles, during which the CPU can only access HRAM. We’ll implement this more fully in a later chapter.
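The two simplifications mentioned above (unmasked STAT writes and the missing DMA copy) can each be sketched in a few lines. The helpers below, maskStatWrite and oamDmaCopy, are hypothetical names and are not wired into the chapter's MMU. The read parameter stands in for whatever byte-read function your MMU exposes, and the 160-machine-cycle bus restriction is deliberately not modelled.

```fsharp
// Keep the PPU-owned bits (0-2: mode and LYC flag) and accept only
// bits 3-6 (the interrupt-enable bits) from the value the CPU wrote.
let maskStatWrite (current: byte) (written: byte) : byte =
    (current &&& 0x07uy) ||| (written &&& 0x78uy)

// Copy the 160-byte block starting at (value * 0x100) into OAM.
// `read` is an assumed stand-in for the MMU's byte-read function.
let oamDmaCopy (read: uint16 -> byte) (oam: byte[]) (value: byte) =
    let source = uint16 value <<< 8
    for i in 0 .. 159 do
        oam.[i] <- read (source + uint16 i)

// Usage with a fake memory whose contents equal the low byte of the address.
let oam = Array.zeroCreate<byte> 160
oamDmaCopy (fun addr -> byte (addr &&& 0x00FFus)) oam 0xC1uy
printfn "0x%02X" (maskStatWrite 0x03uy 0xFFuy) // 0x7B
printfn "%d" oam.[5]                           // 5
```

Passing the read function in as a parameter keeps the copy logic testable without constructing a full MemoryState.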
2. Create Ppu.fs
Now, let’s create the core PPU logic.
File: src/GameBoy/Ppu.fs
```fsharp
module GameBoy.Ppu

open GameBoy.Memory
open GameBoy.Cpu // Needed for interrupt flags
open System

// PPU Constants
let SCREEN_WIDTH = 160
let SCREEN_HEIGHT = 144
let VBLANK_SCANLINES = 10 // Scanlines 144-153
let TOTAL_SCANLINES = SCREEN_HEIGHT + VBLANK_SCANLINES // 154
let CPU_CYCLES_PER_SCANLINE = 456 // Clock cycles per scanline (OAM scan + drawing + HBlank)

// PPU Modes
type PpuMode =
    | HBlank = 0        // Mode 0
    | VBlank = 1        // Mode 1
    | OAMScan = 2       // Mode 2
    | DrawingPixels = 3 // Mode 3

// PPU State
type PpuState = {
    mutable Lcdc : byte                  // LCD Control (0xFF40)
    mutable Stat : byte                  // LCD Status (0xFF41)
    mutable Scy : byte                   // Scroll Y (0xFF42)
    mutable Scx : byte                   // Scroll X (0xFF43)
    mutable Ly : byte                    // LCDC Y-Coordinate (0xFF44)
    mutable Lyc : byte                   // LY Compare (0xFF45)
    mutable Bgp : byte                   // Background Palette (0xFF47)
    mutable Obp0 : byte                  // Object Palette 0 (0xFF48)
    mutable Obp1 : byte                  // Object Palette 1 (0xFF49)
    mutable CyclesThisScanline : int
    mutable CurrentFrameBuffer : byte [] // 160 * 144 * 4 bytes for RGBA
    mutable FrameReady : bool
}

let createPpuState () = {
    Lcdc = 0x91uy
    Stat = 0x00uy
    Ly = 0x00uy
    Lyc = 0x00uy
    Scy = 0x00uy
    Scx = 0x00uy
    Bgp = 0xFCuy // Default palette: color 0 is white, colors 1-3 are black
    Obp0 = 0xFFuy
    Obp1 = 0xFFuy
    CyclesThisScanline = 0
    CurrentFrameBuffer = Array.zeroCreate (SCREEN_WIDTH * SCREEN_HEIGHT * 4) // RGBA
    FrameReady = false
}

// Helper to get pixel color from a 2-bit value and palette
let getPaletteColor (paletteRegister: byte) (pixelValue: byte) =
    let colorIndex = (paletteRegister >>> (int pixelValue * 2)) &&& 0x03uy
    match colorIndex with
    | 0x00uy -> (0xFFuy, 0xFFuy, 0xFFuy, 0xFFuy) // White
    | 0x01uy -> (0xAAuy, 0xAAuy, 0xAAuy, 0xFFuy) // Light Gray
    | 0x02uy -> (0x55uy, 0x55uy, 0x55uy, 0xFFuy) // Dark Gray
    | _ -> (0x00uy, 0x00uy, 0x00uy, 0xFFuy)      // Black

// Function to read a tile's pixel data from VRAM
// tileIndex: 0-255
// tileDataAddress: base address for tile data (0x8000 or 0x8800)
let readTilePixel (mmu: MemoryState) (tileIndex: byte) (tileDataAddress: uint16) (yInTile: int) (xInTile: int) =
    let baseAddr =
        if tileDataAddress = 0x8000us then
            // Unsigned addressing: indices 0-255 map to 0x8000-0x8FFF
            uint16 (0x8000 + (int tileIndex * 16))
        else
            // Signed addressing relative to 0x9000:
            // indices 0-127 map to 0x9000-0x97FF,
            // indices 128-255 (read as -128 to -1) map to 0x8800-0x8FFF.
            uint16 (0x9000 + (int (sbyte tileIndex) * 16))
    let tileLineAddr = baseAddr + uint16 (yInTile * 2)
    let byte1 = Memory.readByte tileLineAddr mmu         // Low bit of each pixel
    let byte2 = Memory.readByte (tileLineAddr + 1us) mmu // High bit of each pixel
    let bit1 = (byte1 >>> (7 - xInTile)) &&& 0x01uy
    let bit2 = (byte2 >>> (7 - xInTile)) &&& 0x01uy
    (bit2 <<< 1) ||| bit1 // Combine into 2-bit pixel value

// Renders a single scanline to the frame buffer
let renderScanline (ppu: PpuState) (mmu: MemoryState) =
    let bgDisplayEnable = (ppu.Lcdc &&& 0x01uy) = 0x01uy // LCDC Bit 0
    if not bgDisplayEnable then
        // If background display is disabled, clear the scanline to white
        for x = 0 to SCREEN_WIDTH - 1 do
            let pixelIndex = ((int ppu.Ly * SCREEN_WIDTH) + x) * 4
            ppu.CurrentFrameBuffer.[pixelIndex] <- 0xFFuy     // R
            ppu.CurrentFrameBuffer.[pixelIndex + 1] <- 0xFFuy // G
            ppu.CurrentFrameBuffer.[pixelIndex + 2] <- 0xFFuy // B
            ppu.CurrentFrameBuffer.[pixelIndex + 3] <- 0xFFuy // A
    else
        let bgTileMapAddress =
            if (ppu.Lcdc &&& 0x08uy) = 0x08uy then 0x9C00us // LCDC Bit 3: 1=9C00-9FFF, 0=9800-9BFF
            else 0x9800us
        let bgTileDataAddress =
            if (ppu.Lcdc &&& 0x10uy) = 0x10uy then 0x8000us // LCDC Bit 4: 1=8000-8FFF, 0=8800-97FF
            else 0x8800us
        let scrollY = int ppu.Scy
        let scrollX = int ppu.Scx
        let currentLy = int ppu.Ly
        let yInMap = (currentLy + scrollY) % 256 // Wrapped Y coordinate in the 256x256 background map
        let yInTile = yInMap % 8                 // Y coordinate within the 8x8 tile
        for x = 0 to SCREEN_WIDTH - 1 do
            let xInMap = (x + scrollX) % 256 // Wrapped X coordinate in the 256x256 background map
            let xInTile = xInMap % 8         // X coordinate within the 8x8 tile
            let tileMapX = xInMap / 8
            let tileMapY = yInMap / 8
            let tileMapOffset = uint16 (tileMapY * 32 + tileMapX)
            let tileIndex = Memory.readByte (bgTileMapAddress + tileMapOffset) mmu
            let pixelValue = readTilePixel mmu tileIndex bgTileDataAddress yInTile xInTile
            let (r, g, b, a) = getPaletteColor ppu.Bgp pixelValue
            let pixelIndex = ((currentLy * SCREEN_WIDTH) + x) * 4
            ppu.CurrentFrameBuffer.[pixelIndex] <- r
            ppu.CurrentFrameBuffer.[pixelIndex + 1] <- g
            ppu.CurrentFrameBuffer.[pixelIndex + 2] <- b
            ppu.CurrentFrameBuffer.[pixelIndex + 3] <- a

// Replace the mode bits (STAT bits 0-1), leaving the other bits untouched
let setMode (ppu: PpuState) (mode: PpuMode) =
    ppu.Stat <- (ppu.Stat &&& 0xFCuy) ||| byte (int mode)

// Main PPU update function, called by the emulator loop after each CPU instruction
let updatePpu (cycles: int) (ppu: PpuState) (mmu: MemoryState) =
    // Pick up register values the CPU has written through the MMU since the last update.
    ppu.Lcdc <- mmu.Lcdc
    ppu.Scy <- mmu.Scy
    ppu.Scx <- mmu.Scx
    ppu.Lyc <- mmu.Lyc
    ppu.Bgp <- mmu.Bgp
    ppu.Obp0 <- mmu.Obp0
    ppu.Obp1 <- mmu.Obp1
    // STAT: the CPU owns the interrupt-enable bits (3-6); the PPU owns bits 0-2.
    ppu.Stat <- (mmu.Stat &&& 0x78uy) ||| (ppu.Stat &&& 0x07uy)

    ppu.CyclesThisScanline <- ppu.CyclesThisScanline + cycles
    // Important: Check LCD enable/disable. If LCD is off, PPU state should be reset.
    let lcdEnable = (ppu.Lcdc &&& 0x80uy) = 0x80uy // LCDC Bit 7
    if not lcdEnable then
        ppu.CyclesThisScanline <- 0
        ppu.Ly <- 0x00uy
        setMode ppu PpuMode.HBlank // Reset PPU mode to HBlank if LCD is off
        ppu.FrameReady <- false
    else
        // Get current PPU mode from STAT register (enum cases must be fully qualified in patterns)
        let currentMode = enum<PpuMode> (int (ppu.Stat &&& 0x03uy))
        match currentMode with
        | PpuMode.OAMScan ->
            if ppu.CyclesThisScanline >= 80 then
                ppu.CyclesThisScanline <- ppu.CyclesThisScanline - 80
                setMode ppu PpuMode.DrawingPixels
        | PpuMode.DrawingPixels ->
            if ppu.CyclesThisScanline >= 172 then // Minimum cycles for DrawingPixels
                ppu.CyclesThisScanline <- ppu.CyclesThisScanline - 172 // Use minimum for simplicity, actual is variable
                renderScanline ppu mmu // Render the completed scanline
                setMode ppu PpuMode.HBlank
                // Check and trigger HBlank interrupt if enabled (STAT bit 3)
                if (ppu.Stat &&& 0x08uy) = 0x08uy then
                    mmu.If <- mmu.If ||| byte Cpu.InterruptType.LcdStat
        | PpuMode.HBlank ->
            let hblankCycles = CPU_CYCLES_PER_SCANLINE - 80 - 172 // Remaining cycles for HBlank
            if ppu.CyclesThisScanline >= hblankCycles then
                // Subtract rather than zero so leftover cycles carry into the next scanline
                ppu.CyclesThisScanline <- ppu.CyclesThisScanline - hblankCycles
                ppu.Ly <- ppu.Ly + 1uy // Increment scanline
                // Check and trigger LYC=LY interrupt if enabled (STAT bit 6)
                if ppu.Ly = ppu.Lyc && (ppu.Stat &&& 0x40uy) = 0x40uy then
                    mmu.If <- mmu.If ||| byte Cpu.InterruptType.LcdStat
                if ppu.Ly >= byte SCREEN_HEIGHT then
                    // Enter VBlank
                    setMode ppu PpuMode.VBlank
                    mmu.If <- mmu.If ||| byte Cpu.InterruptType.VBlank // Trigger VBlank interrupt
                    // Also raise the STAT interrupt if its VBlank source is enabled (STAT bit 4)
                    if (ppu.Stat &&& 0x10uy) = 0x10uy then
                        mmu.If <- mmu.If ||| byte Cpu.InterruptType.LcdStat
                    ppu.FrameReady <- true // A full frame is ready
                else
                    // Back to OAM Scan for next scanline
                    setMode ppu PpuMode.OAMScan
                    // Check and trigger OAM interrupt if enabled (STAT bit 5)
                    if (ppu.Stat &&& 0x20uy) = 0x20uy then
                        mmu.If <- mmu.If ||| byte Cpu.InterruptType.LcdStat
        | PpuMode.VBlank ->
            if ppu.CyclesThisScanline >= CPU_CYCLES_PER_SCANLINE then
                ppu.CyclesThisScanline <- ppu.CyclesThisScanline - CPU_CYCLES_PER_SCANLINE
                ppu.Ly <- ppu.Ly + 1uy
                let frameDone = ppu.Ly >= byte TOTAL_SCANLINES
                // End of VBlank: wrap to scanline 0 before the LYC comparison so LYC = 0 can match
                if frameDone then ppu.Ly <- 0x00uy
                // Check and trigger LYC=LY interrupt if enabled (STAT bit 6)
                if ppu.Ly = ppu.Lyc && (ppu.Stat &&& 0x40uy) = 0x40uy then
                    mmu.If <- mmu.If ||| byte Cpu.InterruptType.LcdStat
                if frameDone then
                    setMode ppu PpuMode.OAMScan
                    // Check and trigger OAM interrupt if enabled (STAT bit 5)
                    if (ppu.Stat &&& 0x20uy) = 0x20uy then
                        mmu.If <- mmu.If ||| byte Cpu.InterruptType.LcdStat
        | _ -> () // Unreachable: the mode is masked to 0-3

    // Update LY and STAT registers in MMU for CPU visibility
    mmu.Ly <- ppu.Ly
    mmu.Stat <- ppu.Stat
```
Explanation:
- PpuState and createPpuState: Defines the PPU’s internal mutable state and provides an initialization function. We use a byte array for CurrentFrameBuffer to store RGBA pixel data.
- getPaletteColor: A helper function that takes a palette register value (BGP for background) and a 2-bit pixel value, then returns the corresponding RGBA color tuple. This simplifies color mapping.
- readTilePixel: This function is crucial. It takes the MemoryState (to access VRAM), a tileIndex, the tileDataAddress (either 0x8000 or 0x8800 depending on LCDC bit 4), and the (xInTile, yInTile) coordinates. It then fetches the two bytes that define the pixel row, extracts the 2-bit pixel value, and returns it. The tileDataAddress logic handles the two different tile data addressing modes.
- renderScanline: This is the heart of background rendering.
  - It checks LCDC bit 0 to see if the background display is enabled. If not, it clears the current scanline to white.
  - It determines which tile map (0x9800 or 0x9C00) and tile data region (0x8000 or 0x8800) to use based on LCDC bits 3 and 4, respectively.
  - It calculates the (xInMap, yInMap) coordinates, taking SCX and SCY (scroll registers) into account, and wrapping around the 256x256 background map.
  - It then determines the tileMapX and tileMapY (which 8x8 tile in the map) and xInTile, yInTile (pixel within that tile).
  - It reads the tileIndex from the active tile map.
  - It calls readTilePixel to get the 2-bit pixel value.
  - It uses getPaletteColor to convert the pixel value into an RGBA color.
  - Finally, it writes the RGBA bytes into the ppu.CurrentFrameBuffer at the correct (x, y) position.
- updatePpu: This function simulates the PPU’s internal clock and mode transitions.
  - It accumulates CPU cycles.
  - It checks LCDC bit 7 to see if the LCD is enabled. If not, it resets the PPU state.
  - It progresses through OAMScan, DrawingPixels, HBlank, and VBlank modes based on elapsed cycles.
  - When DrawingPixels completes, it calls renderScanline.
  - When HBlank completes, it increments ppu.Ly.
  - When ppu.Ly reaches SCREEN_HEIGHT (144), it enters VBlank and sets ppu.FrameReady to true.
  - It also handles triggering VBlank and LCDStat interrupts by setting bits in mmu.If.
  - Crucially, it updates mmu.Ly and mmu.Stat so the CPU can read the current PPU state.
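Two pieces of arithmetic in the rendering path are easy to get subtly wrong: the signed tile addressing selected by LCDC bit 4, and the scroll wrap-around. The sketch below isolates both as pure functions so they can be checked by hand. The names tileAddress and locate are illustrative only, and the signed mode is assumed to be relative to 0x9000, as described in Pan Docs.

```fsharp
// Address of a tile's first byte under the two LCDC bit 4 modes.
// Unsigned mode: index 0-255 counted up from 0x8000.
// Signed mode: index read as -128..127, relative to 0x9000.
let tileAddress (unsignedMode: bool) (tileIndex: byte) : int =
    if unsignedMode then 0x8000 + int tileIndex * 16
    else 0x9000 + int (sbyte tileIndex) * 16

// Map a screen pixel to (tile map offset, xInTile, yInTile),
// applying SCX/SCY with wrap-around on the 256x256 background.
let locate (scx: int) (scy: int) (x: int) (ly: int) =
    let xInMap = (x + scx) % 256
    let yInMap = (ly + scy) % 256
    ((yInMap / 8) * 32 + xInMap / 8, xInMap % 8, yInMap % 8)

printfn "0x%04X" (tileAddress true 0x80uy)  // 0x8800
printfn "0x%04X" (tileAddress false 0x00uy) // 0x9000
printfn "0x%04X" (tileAddress false 0x80uy) // 0x8800
printfn "0x%04X" (tileAddress false 0xFFuy) // 0x8FF0
printfn "%A" (locate 250 0 10 0)            // (0, 4, 0)
```

Indices 128-255 resolve to the same addresses in both modes, which is why the 0x8800-0x8FFF block is often described as shared between the two addressing schemes.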
3. Integrate PPU into Emulator.fs
Now we need to wire the PPU into our main emulator loop and set up SDL2-CS for display.
File: src/GameBoy/Emulator.fs
```fsharp
module GameBoy.Emulator

open GameBoy.Cpu
open GameBoy.Memory
open GameBoy.Ppu
open SDL2
open System
open System.Runtime.InteropServices

// Define the main emulator state
type EmulatorState = {
    mutable Cpu : CpuState
    mutable Mmu : MemoryState
    mutable Ppu : PpuState
    mutable MasterInterruptEnable : bool
    mutable TotalCycles : int64 // 64-bit so the counter does not overflow after a few minutes
}

let createEmulatorState (bootRomData: byte []) (gameRomData: byte []) = {
    Cpu = createCpuState()
    Mmu = createMemoryState gameRomData // Pass game ROM data
    Ppu = createPpuState()
    MasterInterruptEnable = false
    TotalCycles = 0L
}

// Quick Note: For simplicity, we'll load the boot ROM into memory here.
// In a real scenario, the boot ROM might be loaded into a separate ROM chip region.
let loadBootRom (bootRomData: byte []) (mmu: MemoryState) =
    for i = 0 to bootRomData.Length - 1 do
        mmu.Rom0.[i] <- bootRomData.[i]

// Main emulation loop function
let runEmulator (emulatorState: EmulatorState) (bootRomData: byte []) =
    // Initialize SDL
    if SDL.SDL_Init(SDL.SDL_INIT_VIDEO) < 0 then
        failwithf "Could not initialize SDL: %s" (SDL.SDL_GetError())
    // SDL2-CS represents native handles as IntPtr, so failures are reported as IntPtr.Zero
    let window =
        SDL.SDL_CreateWindow("F# Game Boy Emulator",
                             SDL.SDL_WINDOWPOS_CENTERED, SDL.SDL_WINDOWPOS_CENTERED,
                             SCREEN_WIDTH, SCREEN_HEIGHT,
                             SDL.SDL_WindowFlags.SDL_WINDOW_SHOWN)
    if window = IntPtr.Zero then
        failwithf "Could not create window: %s" (SDL.SDL_GetError())
    let renderer = SDL.SDL_CreateRenderer(window, -1, SDL.SDL_RendererFlags.SDL_RENDERER_ACCELERATED)
    if renderer = IntPtr.Zero then
        failwithf "Could not create renderer: %s" (SDL.SDL_GetError())
    // If your SDL2-CS version does not expose the RGBA32 alias, SDL_PIXELFORMAT_ABGR8888
    // describes the same R, G, B, A byte order on little-endian machines.
    let texture =
        SDL.SDL_CreateTexture(renderer,
                              SDL.SDL_PIXELFORMAT_RGBA32,
                              int SDL.SDL_TextureAccess.SDL_TEXTUREACCESS_STREAMING,
                              SCREEN_WIDTH, SCREEN_HEIGHT)
    if texture = IntPtr.Zero then
        failwithf "Could not create texture: %s" (SDL.SDL_GetError())
    // Load boot ROM if provided
    if bootRomData.Length > 0 then
        loadBootRom bootRomData emulatorState.Mmu
    // SDL_UpdateTexture expects a raw pointer, so pin the managed frame buffer while the loop runs
    let frameHandle = GCHandle.Alloc(emulatorState.Ppu.CurrentFrameBuffer, GCHandleType.Pinned)
    let framePtr = frameHandle.AddrOfPinnedObject()
    let pitch = SCREEN_WIDTH * 4 // Bytes per row (RGBA)
    let mutable running = true
    // `event` is a reserved word in F#, and the `type` field needs double-backtick quoting
    let mutable sdlEvent = Unchecked.defaultof<SDL.SDL_Event>
    while running do
        // Handle events (e.g., window close)
        while SDL.SDL_PollEvent(&sdlEvent) = 1 do
            match sdlEvent.``type`` with
            | SDL.SDL_EventType.SDL_QUIT -> running <- false
            | _ -> ()
        // Execute CPU instruction
        let cycles = Cpu.executeInstruction emulatorState.Cpu emulatorState.Mmu emulatorState.MasterInterruptEnable
        // Update PPU
        Ppu.updatePpu cycles emulatorState.Ppu emulatorState.Mmu
        // Update timers, sound, etc. (future chapters)
        // Handle interrupts
        Cpu.handleInterrupts emulatorState.Cpu emulatorState.Mmu &emulatorState.MasterInterruptEnable
        // Render if frame is ready
        if emulatorState.Ppu.FrameReady then
            SDL.SDL_UpdateTexture(texture, IntPtr.Zero, framePtr, pitch) |> ignore
            SDL.SDL_RenderClear(renderer) |> ignore
            SDL.SDL_RenderCopy(renderer, texture, IntPtr.Zero, IntPtr.Zero) |> ignore
            SDL.SDL_RenderPresent(renderer)
            emulatorState.Ppu.FrameReady <- false // Reset flag
        emulatorState.TotalCycles <- emulatorState.TotalCycles + int64 cycles
    // Clean up the pinned buffer and SDL resources
    frameHandle.Free()
    SDL.SDL_DestroyTexture(texture)
    SDL.SDL_DestroyRenderer(renderer)
    SDL.SDL_DestroyWindow(window)
    SDL.SDL_Quit()
```
Explanation:
- EmulatorState: We add a Ppu field to hold the PpuState.
- createEmulatorState: Initializes the PpuState when the emulator is created.
- SDL Initialization: The runEmulator function now includes boilerplate for initializing SDL, creating a window, a renderer, and a texture.
  - SDL.SDL_Init(SDL.SDL_INIT_VIDEO): Initializes the video subsystem.
  - SDL.SDL_CreateWindow: Creates the display window.
  - SDL.SDL_CreateRenderer: Creates a 2D rendering context for the window.
  - SDL.SDL_CreateTexture: Creates a texture that we can update with our PPU’s pixel data. We specify SDL_PIXELFORMAT_RGBA32 to match our byte[] frame buffer.
- updatePpu Call: After executing CPU instructions, we call Ppu.updatePpu with the cycles consumed. This drives the PPU’s internal clock.
- Rendering CurrentFrameBuffer:
  - When ppu.FrameReady is true (indicating a full frame has been rendered by the PPU), we use SDL.SDL_UpdateTexture to copy our ppu.CurrentFrameBuffer data to the SDL texture.
  - SDL.SDL_RenderClear, SDL.SDL_RenderCopy, and SDL.SDL_RenderPresent then clear the renderer, copy the texture to it, and display it on the screen.
  - ppu.FrameReady is reset to false until the next frame is complete.
- SDL Cleanup: Proper cleanup of SDL resources is added.
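SDL's texture upload works on a raw pointer to pixel data, so a managed byte[] generally has to be pinned before its address can be handed to native code. The following is a small, SDL-free sketch of that pattern using GCHandle. The helper name withPinned is an assumption made for this example, and Marshal.ReadByte stands in for the native call.

```fsharp
open System
open System.Runtime.InteropServices

// Pin a managed buffer, run `action` with its address, and always
// release the handle afterwards, even if `action` throws.
let withPinned (buffer: byte[]) (action: nativeint -> 'a) : 'a =
    let handle = GCHandle.Alloc(buffer, GCHandleType.Pinned)
    try
        action (handle.AddrOfPinnedObject())
    finally
        handle.Free()

// A frame-buffer-sized array: 160 x 144 pixels, 4 bytes each.
let frame = Array.zeroCreate<byte> (160 * 144 * 4)
frame.[1] <- 0xAAuy

// Read the byte back through the raw pointer, as native code would.
let seen = withPinned frame (fun ptr -> Marshal.ReadByte(ptr, 1))
printfn "0x%02X" seen // 0xAA
```

Pinning per frame keeps the pinned window short. Pinning once for the lifetime of the render loop avoids the repeated allocation of handles. Either is reasonable for a buffer this small.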
4. Project File (.fsproj) Updates
To use SDL2-CS, you need to add the NuGet package.
File: src/GameBoy/GameBoy.fsproj
```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <WarnOn>FS3390;NU1605</WarnOn>
    <RootNamespace>GameBoy</RootNamespace>
    <GenerateProgramFile>false</GenerateProgramFile>
  </PropertyGroup>
  <ItemGroup>
    <Compile Include="Cpu.fs" />
    <Compile Include="Memory.fs" />
    <Compile Include="Ppu.fs" />
    <Compile Include="Emulator.fs" />
    <Compile Include="Program.fs" />
  </ItemGroup>
  <ItemGroup>
    <PackageReference Include="SDL2-CS" Version="2.28.0" />
  </ItemGroup>
</Project>
```
Explanation:
- We’ve added a <PackageReference> for SDL2-CS version 2.28.0, listed here as the latest stable version as of 2026-05-05. Confirm the exact package ID and version on NuGet before pinning it.
- Ensure Ppu.fs is listed before Emulator.fs in the ItemGroup to maintain correct compilation order.
Installation of Native SDL2 Libraries:
For SDL2-CS to work, you need the native SDL2 runtime libraries.
- Windows: Download SDL2.dll from the official SDL website (https://libsdl.org/download-2.0.php) and place it in your project’s bin/Debug/net8.0 (or bin/Release/net8.0) directory alongside your executable.
- macOS: Install via Homebrew: brew install sdl2.
- Linux: Install via package manager: sudo apt-get install libsdl2-2.0-0 (Debian/Ubuntu) or sudo yum install SDL2 (Fedora/RHEL).
Testing & Verification
With the PPU’s background rendering implemented, we can now see visual output.
Prepare a ROM:
- Find a simple Game Boy ROM that displays a static background, such as a “hello world” test ROM or the official Game Boy boot ROM (if you’ve implemented its loading). Blargg’s CPU test ROMs often have simple backgrounds.
- Let’s assume you have a test.gb file.
Update Program.fs to load the ROM:
File: src/GameBoy/Program.fs
```fsharp
module GameBoy.Program

open GameBoy.Emulator
open System.IO

[<EntryPoint>]
let main argv =
    let bootRomPath = "bootrom.bin" // Path to your boot ROM (optional)
    let gameRomPath = "test.gb"     // Path to your Game Boy ROM
    let bootRomData =
        if File.Exists(bootRomPath) then
            File.ReadAllBytes(bootRomPath)
        else
            printfn "Boot ROM not found at %s. Continuing without it." bootRomPath
            Array.empty<byte>
    let gameRomData =
        if File.Exists(gameRomPath) then
            File.ReadAllBytes(gameRomPath)
        else
            failwithf "Game ROM not found at %s" gameRomPath
    let emulator = createEmulatorState bootRomData gameRomData
    printfn "Starting Game Boy emulator..."
    runEmulator emulator bootRomData // Pass boot ROM data to runEmulator
    printfn "Emulator stopped."
    0 // Return 0 for success
```
Run the Emulator: Navigate to your src/GameBoy directory in the terminal and run:
```
dotnet run
```
Expected Behavior:
- A new window titled “F# Game Boy Emulator” should appear.
- If you’re running a simple test ROM, you should see static background graphics. For instance, the Game Boy boot ROM will show the Nintendo logo (though the scrolling is not yet implemented, so it might appear static or partially rendered).
- If you run a game ROM, you might see the initial background of the game, possibly without sprites or correct scrolling.
Quick Debugging Checks:
- Blank Window:
  - Check if SDL2.dll (or equivalent) is in the correct directory.
  - Ensure LCDC bit 7 is set to enable the LCD.
  - Verify renderScanline is actually being called.
  - Check for SDL errors in the console.
- Scrambled Graphics:
  - Incorrect readTilePixel logic, especially the tileDataAddress and bit manipulation.
  - Wrong BGP palette mapping.
  - Incorrect bgTileMapAddress or bgTileDataAddress selection based on LCDC.
- No Background:
  - Check if LCDC bit 0 (background enable) is set.
  - Ensure VRAM is being loaded correctly by the ROM and accessed by the PPU.
Production Considerations
Performance
The PPU’s pixel-by-pixel rendering is a performance-critical area. Our current renderScanline function does a lot of work for each pixel.
- Caching: Tile data (the 8x8 pixel patterns) rarely changes. We could pre-render each 8x8 tile into a small pixel buffer once it’s written to VRAM and then just copy from these cached tile buffers during renderScanline.
- Batching: Instead of setting each pixel individually, we could potentially use SDL’s SDL_RenderDrawPoints or similar functions if we pre-calculate a scanline’s worth of pixels.
- JIT Optimization: F# and .NET’s JIT compiler are very good, but tight loops like pixel rendering can still be bottlenecks. Profile your application to identify hot spots.
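The caching idea can be sketched as a small class that decodes a tile only when its bytes have changed. This is one possible shape and not the chapter's implementation: the TileCache type, its member names, and the choice to store colour IDs rather than final RGBA values are all assumptions made for illustration.

```fsharp
// Hypothetical cache: 384 tiles (0x8000-0x97FF), each pre-decoded
// to 64 colour IDs. Palette lookup still happens at render time.
type TileCache() =
    let pixels = Array.init 384 (fun _ -> Array.zeroCreate<byte> 64)
    let dirty = Array.create 384 true

    // Call from the VRAM write path; vramOffset is (addr - 0x8000).
    member _.Invalidate(vramOffset: int) =
        if vramOffset < 0x1800 then dirty.[vramOffset / 16] <- true

    // Returns the decoded tile, re-decoding only if it was invalidated.
    member _.Get(vram: byte[], tile: int) : byte[] =
        if dirty.[tile] then
            for y in 0 .. 7 do
                let lo = vram.[tile * 16 + y * 2]
                let hi = vram.[tile * 16 + y * 2 + 1]
                for x in 0 .. 7 do
                    let bit0 = (lo >>> (7 - x)) &&& 1uy
                    let bit1 = (hi >>> (7 - x)) &&& 1uy
                    pixels.[tile].[y * 8 + x] <- (bit1 <<< 1) ||| bit0
            dirty.[tile] <- false
        pixels.[tile]

// Usage: write one tile row into a fake VRAM, invalidate, then fetch.
let vram = Array.zeroCreate<byte> 0x2000
let cache = TileCache()
vram.[0] <- 0x3Cuy
vram.[1] <- 0x7Euy
cache.Invalidate 0
let tile0 = cache.Get(vram, 0)
printfn "%A" tile0.[0..7] // colour IDs 0 2 3 3 3 3 2 0
```

Caching colour IDs instead of RGBA keeps the cache valid when a game changes BGP, which many games do for fade effects. Whether this is faster than direct decoding should be confirmed with a profiler.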
Synchronization
Accurate timing between the CPU and PPU is paramount.
- Cycle Counting: We’re currently passing CPU cycles directly to updatePpu. This is a good start. Ensure that all CPU instructions correctly report their cycle counts.
- PPU Cycle Accuracy: The PPU’s mode transitions and LY increments need to happen at precise cycle counts. Slight inaccuracies here can lead to visual glitches, screen tearing, or incorrect interrupt timings. The cycle counts used in updatePpu are approximations; consult Game Boy PPU documentation for exact timings per mode.
Maintainability
The PPU code will grow.
- Modularity: Keep functions small and focused (e.g., readTilePixel, renderBackgroundPixel, renderSpritePixel).
- Clear Register Mapping: Use named constants or discriminated unions for register bits where appropriate to make Lcdc &&& 0x08uy more readable.
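One way to name the LCDC bits is a flags enum plus a small test helper. The sketch below is only a suggestion: the LcdcFlags type and the hasFlag function are hypothetical names, while the bit assignments follow the standard LCDC layout.

```fsharp
open System

// Named LCDC bits (0xFF40). The member names are illustrative.
[<Flags>]
type LcdcFlags =
    | BgEnable      = 0x01 // Bit 0: background enable
    | ObjEnable     = 0x02 // Bit 1: sprite enable
    | ObjSize       = 0x04 // Bit 2: sprite size (8x8 or 8x16)
    | BgTileMap     = 0x08 // Bit 3: background tile map select
    | BgTileData    = 0x10 // Bit 4: tile data addressing mode
    | WindowEnable  = 0x20 // Bit 5: window enable
    | WindowTileMap = 0x40 // Bit 6: window tile map select
    | LcdEnable     = 0x80 // Bit 7: LCD enable

// True when the given bit is set in the raw register value.
let hasFlag (flag: LcdcFlags) (lcdc: byte) : bool =
    (int lcdc &&& int flag) <> 0

let lcdc = 0x91uy // post-boot value used in this chapter
printfn "%b" (hasFlag LcdcFlags.LcdEnable lcdc)  // true
printfn "%b" (hasFlag LcdcFlags.BgTileMap lcdc)  // false
printfn "%b" (hasFlag LcdcFlags.BgTileData lcdc) // true
```

With this in place, a check such as hasFlag LcdcFlags.BgTileMap ppu.Lcdc documents itself at the call site, at the cost of one extra type to maintain.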
Common Issues & Solutions
Issue: Blank screen or erratic display behavior.
- Cause: The LCD is likely disabled, or PPU timing is completely off.
- Solution:
  - Verify LCDC bit 7 (0x80uy) is set. The boot ROM usually sets this.
  - Log the ppu.Ly and ppu.Stat values. LY should increment from 0 to 153. STAT mode bits (0-1) should cycle through 2, 3, 0, then 1 for VBlank.
  - Ensure updatePpu is called with the correct number of CPU cycles.
Issue: Background is displayed, but colors are wrong or patterns are distorted.
- Cause: Incorrect palette mapping or misinterpretation of tile data bits.
- Solution:
  - Double-check getPaletteColor logic. Each pair of bits in the BGP register selects the shade for one of the four color IDs.
  - Review readTilePixel bit manipulation. The Game Boy uses two bytes per tile row: the first byte holds the low bit of each of the 8 pixels and the second byte holds the high bit, with bit 7 belonging to the leftmost pixel. Ensure the combining (bit2 <<< 1) ||| bit1 is correct.
  - Verify bgTileDataAddress and bgTileMapAddress selection based on LCDC bits.
Issue: SDL2-CS fails to initialize or window doesn’t appear.
- Cause: Native SDL2 library not found or installed incorrectly.
- Solution:
  - Windows: Ensure SDL2.dll is in the executable’s directory (bin/Debug/net8.0).
  - macOS: Run brew install sdl2.
  - Linux: Run sudo apt-get install libsdl2-2.0-0 or equivalent for your distribution.
  - Check the error message from SDL_GetError().
Summary & Next Step
In this chapter, we’ve laid the critical groundwork for the Game Boy’s visual output. We’ve:
- Integrated PPU registers into our MMU.
- Designed and implemented the core PpuState and its update logic.
- Developed functions to read tile data and render the background layer pixel by pixel.
- Set up SDL2-CS to display our PPU’s frame buffer on screen.
By now, you should be able to load a simple ROM and see its background displayed, albeit without sprites or complex scrolling. This is a huge step towards a fully functional emulator!
Next, we’ll dive into the more dynamic aspects of the PPU: rendering sprites, handling LCD control register details, and implementing more accurate PPU timing and interrupts.
Check Your Understanding
- What are the four main PPU modes, and what is the primary activity in each?
- How does the LCDC register influence which tile map and tile data region are used for background rendering?
- Explain the purpose of the SCY and SCX registers in background rendering.
Mini Task
- Modify the getPaletteColor function to use a custom set of colors (e.g., shades of green for a classic Game Boy feel) instead of grayscale.
- Add logging to print the LY register value every time it increments, and observe its cycle from 0 to 153.
Scenario
You’re running a complex Game Boy game, and the background appears to “tear” horizontally, with parts of the image shifted. Upon closer inspection, you notice the tearing occurs at seemingly random scanlines. What are the most likely causes for this issue, and where would you begin debugging in your PPU implementation? Consider CPU-PPU synchronization and interrupt handling.
TL;DR
- The Game Boy PPU draws a 160x144 display by rendering scanlines in modes: OAM Scan, Drawing Pixels, HBlank, and VBlank.
- VRAM (0x8000-0x9FFF) stores tile data (pixel patterns) and tile maps (indices to tiles for background).
- PPU I/O registers like LCDC, STAT, SCY, SCX, LY, LYC, and BGP control display features and status.
- Background rendering involves reading SCY/SCX, determining tile map/data regions from LCDC, fetching tile indices, and then translating 2-bit pixel values to colors using BGP.
- SDL2-CS is used for displaying the PPU’s CurrentFrameBuffer to a window.
Core Flow
- CPU executes instructions, returning cycle count.
- Ppu.updatePpu accumulates cycles, advances PPU mode, and increments LY.
- During “Drawing Pixels” mode, renderScanline fetches tile data from VRAM via MMU, applies scrolling from SCX/SCY, and draws pixels to CurrentFrameBuffer.
- Upon entering VBlank (after LY reaches 144), ppu.FrameReady is set to true.
- In Emulator.runEmulator, if ppu.FrameReady is true, the CurrentFrameBuffer is copied to an SDL texture and rendered to the screen.
Key Takeaway
Emulating graphics hardware like the PPU demands meticulous attention to hardware specifications and accurate timing. The iterative process of accumulating CPU cycles, transitioning PPU modes, and carefully mapping memory-backed registers to visual output is fundamental to bringing an emulator to life.
References
- Pan Docs - The Ultimate Game Boy Technical Manual
- F# Language Reference
- SDL2-CS NuGet Package
- SDL Documentation
This page is AI-assisted and reviewed. It references official documentation and recognized resources where relevant.