Iceberg template

Post Syndicated from Tomohiro Tanaka original

Nowadays, many customers have built their data lakes as the core of their data analytic systems. In a typical use case of data lakes, many concurrent queries run to retrieve consistent snapshots of business insights by aggregating query results. A large volume of data constantly comes from different data sources into the data lakes. There is also a common demand to reflect the changes occurring in the data sources into the data lakes. This means that not only inserts but also updates and deletes need to be replicated into the data lakes.

Apache Iceberg provides ACID transactions on your data lakes, which allows concurrent queries to add or delete records in isolation from any existing queries, with read consistency for those queries. Iceberg is an open table format designed for large analytic workloads on huge datasets. You can perform ACID transactions against your data lakes by using simple SQL expressions. It also enables time travel, rollback, hidden partitioning, and schema evolution changes, such as adding, dropping, renaming, updating, and reordering columns.

AWS Glue is one of the key elements to building data lakes. It extracts data from multiple sources and ingests your data to your data lake built on Amazon Simple Storage Service (Amazon S3) using both batch and streaming jobs.
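To make the Iceberg features mentioned in this post more concrete (replicating inserts, updates, and deletes, time travel, rollback, and schema evolution), here is a minimal sketch of the kind of Spark SQL statements involved, assembled in Python. The catalog name glue_catalog, the table demo.orders, the change view order_changes, the columns order_id, status, and op, and the helper names are all hypothetical and chosen only for illustration. Running the statements assumes a SparkSession that is already configured with an Iceberg catalog and the Iceberg SQL extensions, which is not shown here.

```python
from typing import Any, Dict


def iceberg_statements(
    table: str = "glue_catalog.demo.orders",  # hypothetical catalog.db.table
    changes: str = "order_changes",           # hypothetical view of captured changes
    snapshot_id: int = 1,                     # placeholder snapshot ID
) -> Dict[str, str]:
    """Build example Spark SQL statements for an Iceberg table."""
    catalog, db_table = table.split(".", 1)
    return {
        # Replicate inserts, updates, and deletes in one atomic commit.
        # The op column ('D' for delete) is an assumed change marker.
        "upsert": "\n".join([
            f"MERGE INTO {table} t",
            f"USING {changes} s",
            "ON t.order_id = s.order_id",
            "WHEN MATCHED AND s.op = 'D' THEN DELETE",
            "WHEN MATCHED THEN UPDATE SET t.status = s.status",
            "WHEN NOT MATCHED AND s.op != 'D' THEN",
            "  INSERT (order_id, status) VALUES (s.order_id, s.status)",
        ]),
        # Time travel: read the table as of an earlier snapshot.
        "time_travel": f"SELECT * FROM {table} VERSION AS OF {snapshot_id}",
        # Rollback: make an earlier snapshot the current table state.
        "rollback": (
            f"CALL {catalog}.system.rollback_to_snapshot('{db_table}', {snapshot_id})"
        ),
        # Schema evolution: rename a column without rewriting data files.
        "rename_column": f"ALTER TABLE {table} RENAME COLUMN status TO order_status",
    }


def run_statement(spark: Any, statement: str) -> Any:
    """Execute one statement on an Iceberg-enabled SparkSession (assumed)."""
    return spark.sql(statement)
```

With a suitably configured session, something like run_statement(spark, iceberg_statements()["upsert"]) would apply the captured changes as a single transaction, so concurrent readers see the table either before or after the merge, never in between. Snapshot IDs for time travel and rollback would come from the table's own history rather than the placeholder used here.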