A cache hierarchy is a layered arrangement of small, fast static RAM (SRAM) caches (L1, L2, L3) built into a processor to reduce the average time and energy needed to access data in main memory. Each successive level is larger, slower, and shared among more processor cores: L1 is typically private to a core, L2 is private or shared by a small cluster of cores, and L3 is usually shared by all cores on the chip. The hierarchy exploits temporal locality (recently accessed data is likely to be accessed again) and spatial locality (data near a recent access is likely to be accessed soon). Dedicated cache-controller hardware orchestrates data movement between these levels (the memory management unit, or MMU, handles virtual-to-physical address translation rather than cache fills), with the goal of keeping the most frequently used data in the fastest, closest cache (L1).
