Modern applications rely on distributed databases to handle massive amounts of data and scale seamlessly across multiple nodes. While sharding helps distribute the load, it also introduces a major challenge — cross-shard joins and data movement, which can significantly impact performance.
When a query requires joining tables stored on different shards, the database must move data across nodes, leading to: