Data skew is an imbalance in the distribution of data or computational workload across the partitions, shards, or nodes of a distributed system. Skew creates hotspots, where a few nodes handle a disproportionate share of the load while others sit underused. Because a parallel operation finishes only when its slowest partition does, a single hotspot degrades parallel processing performance and increases latency for operations such as retrieval or inference. In agentic memory and context management, skew can arise in vector database partitions or knowledge graph shards, turning specific memory stores into bottlenecks during retrieval-augmented generation or multi-agent coordination.
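
One common way to quantify the imbalance described above is the ratio of the busiest partition's load to the mean load across partitions. The sketch below is illustrative only: the function names (`partition_of`, `partition_loads`, `skew_ratio`) and the choice of CRC32 modulo the partition count as the partitioning scheme are assumptions made for the example, not part of any particular vector database or graph store.

```python
import zlib
from collections import Counter


def partition_of(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a deterministic hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_loads(keys, num_partitions: int) -> list[int]:
    """Count how many records land on each partition."""
    counts = Counter(partition_of(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def skew_ratio(loads) -> float:
    """Return max load divided by mean load.

    1.0 means perfectly balanced; larger values mean a hotter hotspot.
    Returns 1.0 when there is no load at all.
    """
    total = sum(loads)
    if total == 0:
        return 1.0
    return max(loads) / (total / len(loads))


if __name__ == "__main__":
    # Hypothetical workload: one "hot" key accounts for 70% of all records.
    keys = ["hot-entity"] * 70 + [f"entity-{i}" for i in range(30)]
    loads = partition_loads(keys, num_partitions=4)
    print("records per partition:", loads)
    print("skew ratio:", round(skew_ratio(loads), 2))
```

For example, loads of `[700, 100, 100, 100]` give a mean of 250 and a skew ratio of 2.8, meaning the hottest partition does 2.8 times the average work. Because every record with the same key hashes to the same partition, a single hot key is enough to produce a hotspot no matter how many partitions exist.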
