Memory compression operates transparently within the memory hierarchy, typically between the CPU and main RAM. It applies fast, lossless algorithms such as LZ4 or Zstandard to memory pages, compressing them before they are stored in physical memory and decompressing them on demand when they are read back. This process, often managed by the operating system kernel or hypervisor, increases the effective capacity of RAM without adding physical hardware, delaying the need for costly swapping to disk. The core trade-off is higher CPU utilization for compression and decompression in exchange for fewer swap operations and the I/O latency they incur.
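
The compress-on-store, decompress-on-read cycle and its effect on effective capacity can be illustrated with a minimal user-space sketch. This is not how any particular kernel or hypervisor implements it. It assumes a 4096-byte page size and uses Python's standard `zlib` module as a stand-in for LZ4 or Zstandard, which are not in the standard library. The names `PAGE_SIZE`, `compress_page`, `decompress_page`, and `effective_capacity_ratio` are hypothetical and chosen only for illustration.

```python
import os
import zlib

PAGE_SIZE = 4096  # bytes; assumed page size for this sketch


def compress_page(page: bytes) -> bytes:
    """Compress one page, keeping the raw page if compression does not shrink it."""
    if len(page) != PAGE_SIZE:
        raise ValueError("page must be exactly PAGE_SIZE bytes")
    # zlib level 1 favors speed; it stands in for LZ4/Zstandard here.
    packed = zlib.compress(page, level=1)
    # Compressed pages are always shorter than PAGE_SIZE, so the stored
    # length alone tells us later whether the page was compressed.
    return packed if len(packed) < PAGE_SIZE else page


def decompress_page(stored: bytes) -> bytes:
    """Recover the original page from its stored form."""
    if len(stored) == PAGE_SIZE:
        return stored  # page was kept uncompressed
    return zlib.decompress(stored)


def effective_capacity_ratio(pages: list[bytes]) -> float:
    """Logical bytes held divided by bytes actually occupied after compression."""
    occupied = sum(len(compress_page(p)) for p in pages)
    return (len(pages) * PAGE_SIZE) / occupied


# Three illustrative pages: all zeros, repetitive text, and random bytes.
zero_page = bytes(PAGE_SIZE)
text_page = (b"memory compression " * 256)[:PAGE_SIZE]
random_page = os.urandom(PAGE_SIZE)  # expected to be essentially incompressible

pages = [zero_page, text_page, random_page]
for p in pages:
    # Lossless: every page must survive the round trip unchanged.
    assert decompress_page(compress_page(p)) == p

print(f"effective capacity ratio: {effective_capacity_ratio(pages):.2f}x")
```

The ratio depends entirely on the data: zero-filled and repetitive pages shrink to a small fraction of their size, while random or already-compressed data gains nothing and is kept as-is. The CPU time spent inside `compress_page` and `decompress_page` is the cost side of the trade-off described above.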
