Data sharding is a database architecture pattern that horizontally partitions a large dataset into smaller, independent, and more manageable subsets called shards, which are distributed across multiple database servers or nodes. This horizontal partitioning is typically based on a shard key, such as a user ID or geographic region, to distribute the load and enable parallel processing. The primary goals are to improve scalability by allowing the system to handle more data and transactions than a single server can manage, and to enhance performance by reducing query latency and contention.
