Inferensys

Page Table

A page table is a data structure used by an operating system's virtual memory system to store the mapping between virtual addresses used by a process and physical addresses in RAM.
HIERARCHICAL MEMORY STRUCTURES

What is a Page Table?

A page table is a core data structure in a computer's virtual memory system. The operating system builds and maintains it, and the hardware memory management unit (MMU) reads it to translate addresses.

A page table is a data structure used by a virtual memory system to store the mapping between virtual addresses used by a process and physical addresses in RAM. It enables the illusion of a large, contiguous address space for each program while the operating system manages the actual, fragmented physical memory. Each entry in the table translates a virtual page to a physical page frame, and includes metadata like permission bits for memory protection.

Every memory access requires a translation, so lookup performance is critical. To avoid walking the page table each time, a hardware cache called a Translation Lookaside Buffer (TLB) stores recent translations, and the table itself is consulted only on a TLB miss. In modern systems, page tables are often hierarchical (multi-level) to manage large address spaces efficiently. This structure is a foundational concept for memory isolation between processes and is analogous to the indexing systems used in agentic memory architectures for context retrieval.

VIRTUAL MEMORY

Key Components of a Page Table

A page table is the core data structure used by an operating system's virtual memory manager to translate virtual addresses used by a process into physical addresses in RAM. It is a per-process structure, enabling memory isolation and the illusion of a large, contiguous address space.

01

Page Table Entry (PTE)

A Page Table Entry (PTE) is the fundamental unit within a page table, storing the mapping for a single virtual page. Its structure is hardware-defined but typically includes:

  • Physical Frame Number (PFN): The index of the corresponding page frame in RAM. The frame's starting physical address is the PFN multiplied by the page size.
  • Present/Absent Bit: Indicates if the page is currently loaded in physical memory (present) or has been swapped out to disk (absent, triggering a page fault).
  • Read/Write Bit: Controls write permissions.
  • User/Supervisor Bit: Determines if the page is accessible from user-mode or only kernel-mode.
  • Accessed Bit: Set by hardware when the page is read or written; used by page replacement algorithms.
  • Dirty Bit: Set by hardware when the page is written to; indicates the page must be written back to disk if evicted.
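To make the fields above concrete, here is a minimal sketch of packing and unpacking a PTE as an integer. The bit positions follow the classic x86 convention (present in bit 0, read/write in bit 1, user/supervisor in bit 2, accessed in bit 5, dirty in bit 6, frame number from bit 12 upward), but the layout is simplified and the helper names `make_pte` and `decode_pte` are hypothetical.

```python
# Simplified PTE layout for illustration; real hardware defines more bits.
PRESENT = 1 << 0    # page is resident in physical memory
WRITABLE = 1 << 1   # writes are permitted
USER = 1 << 2       # accessible from user mode
ACCESSED = 1 << 5   # set when the page is read or written
DIRTY = 1 << 6      # set when the page is written
PAGE_SHIFT = 12     # assumes 4 KiB pages


def make_pte(pfn, flags):
    """Pack a physical frame number and flag bits into one integer."""
    return (pfn << PAGE_SHIFT) | flags


def decode_pte(pte):
    """Unpack the fields of a PTE into a dictionary."""
    return {
        "pfn": pte >> PAGE_SHIFT,
        "present": bool(pte & PRESENT),
        "writable": bool(pte & WRITABLE),
        "user": bool(pte & USER),
        "accessed": bool(pte & ACCESSED),
        "dirty": bool(pte & DIRTY),
    }


pte = make_pte(0x1A2B, PRESENT | WRITABLE | USER)
info = decode_pte(pte)

# The frame's starting address is the PFN scaled by the page size.
frame_base = info["pfn"] << PAGE_SHIFT
```

Keeping the flags in the low bits works because frames are page-aligned, so the low 12 bits of a frame's address are always zero and are free to hold metadata.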
02

Multi-Level Page Tables

Multi-Level Page Tables (e.g., two-level, four-level, five-level) are a hierarchical design used to manage the page table's own memory footprint efficiently. Instead of one massive linear table, the virtual address is split into indexes for each level.

  • How it works: A top-level Page Directory contains pointers to intermediate Page Middle Directories, which finally point to Page Tables containing the actual PTEs. This creates a tree structure.
  • Key Benefit: It saves memory because lower-level tables are allocated only for regions of the address space that are actually in use. A vast unused region costs a single not-present entry in an upper-level table, and no lower-level tables are created for it.
  • Trade-off: Adds a small latency overhead for address translation, as multiple memory accesses may be needed to walk the hierarchy (mitigated by the TLB).
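The tree walk described above can be sketched in a few lines. This example assumes a four-level layout with 9 index bits per level and a 12-bit page offset (the split used for 48-bit virtual addresses on x86-64), and it models each table as a Python dictionary so that unallocated subtrees simply do not exist. The function names are hypothetical.

```python
LEVELS = 4           # assumed: four-level hierarchy
BITS_PER_LEVEL = 9   # assumed: 512 entries per table
PAGE_SHIFT = 12      # assumed: 4 KiB pages


def split(vaddr):
    """Split a virtual address into per-level indexes (top first) and offset."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    indexes = []
    for level in range(LEVELS - 1, -1, -1):
        shift = PAGE_SHIFT + level * BITS_PER_LEVEL
        indexes.append((vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1))
    return indexes, offset


def map_page(root, vaddr, pfn):
    """Install a mapping, allocating intermediate tables only as needed."""
    indexes, _ = split(vaddr)
    node = root
    for idx in indexes[:-1]:
        node = node.setdefault(idx, {})
    node[indexes[-1]] = pfn


def walk(root, vaddr):
    """Return the physical address, or None if any level is unmapped."""
    indexes, offset = split(vaddr)
    node = root
    for idx in indexes[:-1]:
        node = node.get(idx)
        if node is None:
            return None
    pfn = node.get(indexes[-1])
    if pfn is None:
        return None
    return (pfn << PAGE_SHIFT) | offset


def count_tables(node, depth=1):
    """Count the tables that were actually allocated in the tree."""
    if depth == LEVELS:
        return 1
    return 1 + sum(count_tables(child, depth + 1) for child in node.values())


root = {}
map_page(root, 0x7FFF00001000, 0x42)
paddr = walk(root, 0x7FFF00001ABC)
tables = count_tables(root)
```

With a single page mapped, only one table per level exists, which is the memory saving the hierarchy is designed to provide.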
03

Inverted Page Table (IPT)

An Inverted Page Table (IPT) is an alternative page table design where the table is indexed by the physical frame number, not the virtual page number. There is one entry per physical page frame in the entire system, not per virtual page per process.

  • Structure: Each IPT entry stores the Process ID (PID) and the Virtual Page Number (VPN) of the process currently occupying that physical frame.
  • Translation Process: To find a mapping, the memory management unit (MMU) must perform a hash table lookup on the (PID, VPN) pair to locate the corresponding physical frame. This is slower than a direct lookup in a traditional page table.
  • Use Case: Historically associated with designs in the PowerPC, PA-RISC, and IA-64 families, where the virtual address space is very large relative to physical memory and per-process tables can be wasteful. The table's size is fixed system-wide and proportional to physical RAM, not to the virtual address space.
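As a rough illustration of the structure, the sketch below keeps one entry per physical frame and finds a translation by searching for the matching (PID, VPN) pair. It deliberately uses a plain linear search to show why an unassisted inverted table is slow; the class and method names are hypothetical.

```python
class InvertedPageTable:
    """One entry per physical frame; entry i describes frame i."""

    def __init__(self, num_frames):
        # None marks a free frame; otherwise the entry is (pid, vpn).
        self.frames = [None] * num_frames

    def map(self, pid, vpn):
        """Place (pid, vpn) in the first free frame and return its number."""
        for pfn, entry in enumerate(self.frames):
            if entry is None:
                self.frames[pfn] = (pid, vpn)
                return pfn
        raise MemoryError("no free physical frame")

    def lookup(self, pid, vpn):
        """Search every frame for the pair; cost grows with physical memory."""
        for pfn, entry in enumerate(self.frames):
            if entry == (pid, vpn):
                return pfn
        return None


ipt = InvertedPageTable(num_frames=4)
frame_a = ipt.map(pid=1, vpn=0x10)
frame_b = ipt.map(pid=2, vpn=0x10)  # same VPN, different process
```

The table never grows beyond `num_frames` entries regardless of how many processes exist, and the PID in each entry is what keeps identical virtual page numbers from different processes apart.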
04

Hashed Page Table

A Hashed Page Table is a common implementation technique for Inverted Page Tables and large sparse address spaces. It uses a hash function to map a virtual address (often combined with a Process ID) to a chain of entries in a hash table.

  • Collision Handling: Uses chaining; each hash bucket contains a linked list of entries for virtual addresses that hash to the same value.
  • Performance: Lookup time depends on the hash function quality and chain length. In practice, chains are kept very short.
  • Advantage over linear search: Provides O(1) average-case lookup time for translations, making the search of a large inverted table feasible. It is a core component of address translation on PowerPC, whose MMU searches a hashed page table held in memory.
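The chaining scheme above can be sketched as follows. This is an illustrative model, not any particular architecture's format: it uses Python's built-in `hash` on the (PID, VPN) pair, and the class name and bucket count are assumptions.

```python
class HashedPageTable:
    """Hash table keyed by (pid, vpn); collisions are resolved by chaining."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, pid, vpn):
        """Select the bucket (chain) that this key hashes to."""
        return self.buckets[hash((pid, vpn)) % len(self.buckets)]

    def insert(self, pid, vpn, pfn):
        """Add or update the mapping for (pid, vpn)."""
        chain = self._chain(pid, vpn)
        for i, (p, v, _) in enumerate(chain):
            if (p, v) == (pid, vpn):
                chain[i] = (pid, vpn, pfn)
                return
        chain.append((pid, vpn, pfn))

    def lookup(self, pid, vpn):
        """Walk one chain only; return the frame number or None."""
        for p, v, pfn in self._chain(pid, vpn):
            if (p, v) == (pid, vpn):
                return pfn
        return None


# A single bucket forces every key to collide, exercising the chain logic.
hpt = HashedPageTable(num_buckets=1)
hpt.insert(pid=1, vpn=0x10, pfn=5)
hpt.insert(pid=2, vpn=0x10, pfn=9)
hpt.insert(pid=1, vpn=0x10, pfn=6)  # remap replaces the earlier entry
```

With a realistic bucket count, each lookup touches only the short chain for one bucket instead of scanning the whole table.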
05

Page Table Base Register (PTBR)

The Page Table Base Register (PTBR), also called CR3 on x86/x86-64 architectures, is a privileged CPU register that holds the physical address of the root of the page table hierarchy (e.g., the Page Directory Pointer Table or PML4) for the currently executing process.

  • Context Switch Critical: During a process context switch, the operating system kernel must load the PTBR with the physical address of the new process's page table root. This act instantly changes the entire virtual-to-physical address map for the CPU.
  • Memory Isolation: Because each process has its own page table and its root address in the PTBR, processes are isolated—they cannot access each other's memory unless explicitly shared via identical mappings.
  • Kernel Mapping: The upper portion of the virtual address space (kernel space) is often mapped identically in all process page tables, with protections preventing user-mode access. This allows the kernel to be always "in context."
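The effect of reloading the PTBR can be shown with a toy model. In this sketch the `ptbr` attribute holds a reference to a Python dictionary rather than a physical address, each page table is a flat VPN-to-PFN map, and the `Process` and `CPU` classes are hypothetical stand-ins for illustration.

```python
PAGE_SIZE = 4096  # assumes 4 KiB pages


class Process:
    def __init__(self, pid, page_table):
        self.pid = pid
        self.page_table = page_table  # flat map: vpn -> pfn


class CPU:
    def __init__(self):
        self.ptbr = None  # stands in for the page table base register

    def context_switch(self, process):
        """Point the CPU at the new process's page table root."""
        self.ptbr = process.page_table

    def translate(self, vaddr):
        """Translate using whichever table the PTBR currently selects."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        pfn = self.ptbr.get(vpn)
        if pfn is None:
            raise LookupError(f"no mapping for virtual page {vpn:#x}")
        return pfn * PAGE_SIZE + offset


proc_a = Process(pid=1, page_table={0x10: 0x100})
proc_b = Process(pid=2, page_table={0x10: 0x200})

cpu = CPU()
cpu.context_switch(proc_a)
addr_a = cpu.translate(0x10008)

cpu.context_switch(proc_b)
addr_b = cpu.translate(0x10008)
```

The same virtual address resolves to two different physical addresses, and neither process has any entry that reaches the other's frame, which is the isolation property described above.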
06

Translation Lookaside Buffer (TLB)

The Translation Lookaside Buffer (TLB) is a hardware cache that stores recent virtual-to-physical address translations. It is not part of the page table itself, but it is the critical accelerator for page table lookups.

  • Operation: On a memory access, the MMU first checks the TLB for a cached translation (a TLB hit). If found, the physical address is obtained in ~1 cycle. If not found (a TLB miss), the costly page table walk in memory is initiated.
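  • Flushing: TLBs are per-CPU, and stale entries must be invalidated when the OS modifies a page table entry (e.g., swapping a page out) or, on hardware without address-space tags, when a context switch changes the PTBR. Instructions like INVLPG (x86) invalidate a single entry. Tagging schemes such as x86 PCIDs or ARM ASIDs let entries from different processes coexist, which avoids a full flush on every switch.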
  • Architectures: Modern CPUs have multiple, hierarchical TLBs (L1, L2) and may support large pages (e.g., 2MB, 1GB) to reduce TLB pressure for big, contiguous memory regions.
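A small simulation can illustrate hits, misses, eviction, and why large pages reduce TLB pressure. This sketch models the TLB as a least-recently-used cache built on `collections.OrderedDict`; real TLBs use hardware-specific replacement policies, and the capacity, entry count, and names here are assumptions chosen for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumes 4 KiB base pages


class TLB:
    """Tiny LRU cache of vpn -> pfn translations."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        self.misses += 1
        return None

    def insert(self, vpn, pfn):
        self.entries[vpn] = pfn
        self.entries.move_to_end(vpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

    def flush(self):
        self.entries.clear()


def translate(tlb, page_table, vaddr):
    """Check the TLB first; fall back to the page table on a miss."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    pfn = tlb.lookup(vpn)
    if pfn is None:
        pfn = page_table[vpn]  # stands in for the page table walk
        tlb.insert(vpn, pfn)
    return pfn * PAGE_SIZE + offset


def tlb_reach(num_entries, page_size):
    """Bytes of memory covered when every TLB entry is in use."""
    return num_entries * page_size


page_table = {0: 10, 1: 11, 2: 12}
tlb = TLB(capacity=2)
results = [translate(tlb, page_table, a)
           for a in (0x0000, 0x0004, 0x1000, 0x2000, 0x0000)]

reach_small = tlb_reach(64, 4 * 1024)         # 64 entries of 4 KiB pages
reach_large = tlb_reach(64, 2 * 1024 * 1024)  # 64 entries of 2 MiB pages
```

In this access pattern only the second access hits, because the two-entry cache evicts page 0 before it is touched again. The reach calculation shows the same number of entries covering 512 times more memory with 2 MiB pages than with 4 KiB pages.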
MEMORY HIERARCHY

How Page Table Address Translation Works

A page table is the core data structure used by a virtual memory system to map a process's virtual address space to physical memory frames.

A page table is a per-process data structure, maintained by the operating system and read by the hardware Memory Management Unit (MMU), that stores mappings from virtual pages to physical frames. When a CPU issues a memory request using a virtual address, the MMU performs a page table walk: it uses the virtual page number as an index into the table to find the corresponding physical frame number, which is then combined with the page offset to form the complete physical address. This translation is fundamental to providing each process with an isolated, contiguous virtual address space.

To accelerate this process, a small, hardware-associative cache called a Translation Lookaside Buffer (TLB) stores recent translations, avoiding the slower main memory access of a full table walk on a TLB hit. On a TLB miss, the system performs the walk, potentially requiring multiple memory accesses in multi-level (hierarchical) page tables. If the required page is not resident in physical memory (a page fault), the OS loads it from disk via swapping. This entire mechanism enables memory protection, isolation, and the efficient use of physical RAM through demand paging.
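The page fault path can be sketched as a small demand pager. This model loads a page only when it is first touched and, when no frame is free, evicts the oldest resident page (a FIFO policy chosen here purely for simplicity; real systems use more refined replacement algorithms). The `DemandPager` class is hypothetical and omits the TLB and the actual disk I/O.

```python
PAGE_SIZE = 4096  # assumes 4 KiB pages


class DemandPager:
    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        self.page_table = {}  # vpn -> pfn, resident pages only
        self.faults = 0

    def access(self, vaddr):
        """Translate an address, faulting the page in if it is not resident."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            self._handle_fault(vpn)
        return self.page_table[vpn] * PAGE_SIZE + offset

    def _handle_fault(self, vpn):
        """Find a frame for the page, evicting the oldest page if needed."""
        self.faults += 1
        if not self.free_frames:
            # Dicts preserve insertion order, so the first key is the oldest.
            victim = next(iter(self.page_table))
            self.free_frames.append(self.page_table.pop(victim))
        self.page_table[vpn] = self.free_frames.pop(0)


pager = DemandPager(num_frames=2)
first = pager.access(0x0010)   # page 0 is not resident: fault
second = pager.access(0x0020)  # page 0 is now resident: no fault
third = pager.access(0x1000)   # page 1: fault, takes the last free frame
fourth = pager.access(0x2000)  # page 2: fault, evicts page 0
fifth = pager.access(0x0010)   # page 0 again: fault, evicts page 1
```

The second access to page 0 is served without a fault, while the final access faults again because the page was evicted in the meantime, so it lands in a different physical frame than before.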

HIERARCHICAL MEMORY STRUCTURES

Frequently Asked Questions

A page table is a core data structure in virtual memory systems, enabling the efficient and secure mapping of virtual addresses used by software to physical addresses in hardware RAM. This FAQ addresses its role in modern computing and agentic architectures.

A page table is a data structure used by an operating system's virtual memory system to store the mapping between virtual addresses used by a process and physical addresses in RAM. It works by dividing memory into fixed-size blocks called pages (e.g., 4KB). When a process accesses a memory address, the Memory Management Unit (MMU) uses the page table to translate the virtual page number into a physical frame number. This translation allows multiple processes to have their own contiguous virtual address spaces while the OS manages the fragmented physical memory, providing isolation, security, and the illusion of abundant memory.

Key Mechanism:

  • Virtual Address: Generated by the CPU, contains a Virtual Page Number (VPN) and an offset.
  • Page Table Lookup: The MMU uses the VPN as an index into the process's page table to find the corresponding Page Table Entry (PTE).
  • PTE Contents: Contains the Physical Frame Number (PFN), along with control bits (present, readable, writable, dirty, accessed).
  • Address Formation: The PFN is combined with the offset from the virtual address to produce the final physical address for the RAM access.
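The four steps above reduce to a shift, a mask, and an array index. The sketch below assumes 4 KiB pages and a single-level table stored as a Python list indexed by VPN; the table size and the mapping from page 0x12 to frame 0x7 are made-up values for illustration.

```python
PAGE_SIZE = 4096                            # assumes 4 KiB pages
OFFSET_BITS = PAGE_SIZE.bit_length() - 1    # 12 bits of offset


def translate(page_table, vaddr):
    """Translate a virtual address using a flat, VPN-indexed table."""
    vpn = vaddr >> OFFSET_BITS              # upper bits select the page
    offset = vaddr & (PAGE_SIZE - 1)        # lower bits pass through unchanged
    pfn = page_table[vpn]                   # the lookup step
    if pfn is None:
        raise LookupError(f"virtual page {vpn:#x} is not mapped")
    return (pfn << OFFSET_BITS) | offset    # frame number plus offset


page_table = [None] * 32   # hypothetical 32-page address space
page_table[0x12] = 0x7     # virtual page 0x12 lives in frame 0x7

paddr = translate(page_table, 0x12345)
```

For the virtual address 0x12345, the VPN is 0x12 and the offset is 0x345, so with frame 0x7 the physical address is 0x7345. Only the page number changes during translation.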
Prasad Kumkar

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.