What Is AWS Redshift Data Sharing?

As a data engineer, most of my time will be spent constructing data pipelines from source systems to data lakes, databases, and warehouses. In the cloud world, the databases/warehouses are usually isolated in a private subnet in a VPC, and sharing the data will be a challenge. One of the pain points is to have this data distributed to several teams in the organization. Data can be shared by exporting into files, but this increases the concerns of security, data duplication, and maintenance of these export pipelines. 

I was delighted to find that we have a utility in AWS Redshift that will let you share the data between two Redshift clusters without building any ETL infrastructure. AWS Redshift data sharing allows you to securely share live, read-only data between different Redshift clusters within or across AWS accounts and regions. It eliminates the need for data duplication and helps multiple stakeholders access the same dataset, allowing different departments, teams, or external partners to collaborate and derive insights from shared data. By sharing specific databases, schemas, tables, or views from the Producer Cluster to one or more Consumer Clusters, organizations can significantly reduce the complexity of their data pipelines.

Leave a Reply

Your email address will not be published. Required fields are marked *