AWS Glue is a powerful serverless data integration service that simplifies data discovery, preparation, and transformation. However, as with any tool, real-world use reveals quirks and corner cases that the documentation does not clearly identify.

In this article, let’s talk about some key challenges I ran into while building data pipelines with Glue crawlers, covering CSV files, schema evolution, partitioning, and crawler update settings.
