[Iceberg] Iceberg Guide Book Summary | CHAPTER 3. Lifecycle of Write and Read Queries


noohhee 2025. 3. 4. 21:03

 

This content is taken from the Apache Iceberg guidebook.

 

 

CHAPTER 3

Lifecycle of Write and Read Queries

 

 

 

Writing Queries in Apache Iceberg

 

 

Create the Table

Send the query to the engine

Write the metadata file

Update the catalog file to commit changes
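The three steps above can be sketched with a toy in-memory model. This is only an illustration, not how any real engine or catalog is implemented: the `catalog` and `storage` dicts, the `create_table` helper, the file naming, and the example schema are all assumptions made for this sketch.

```python
import json
import uuid


def create_table(catalog, storage, name, schema, partition_spec):
    """Toy CREATE TABLE: write a metadata file, then point the catalog at it."""
    if name in catalog:
        raise ValueError(f"table {name} already exists")
    metadata = {
        "table-uuid": str(uuid.uuid4()),
        "schema": schema,
        "partition-spec": partition_spec,
        "current-snapshot-id": None,  # no data has been written yet
        "snapshots": [],
    }
    # Hypothetical naming; real metadata file names differ by catalog.
    metadata_path = f"{name}/metadata/v1.metadata.json"
    storage[metadata_path] = json.dumps(metadata)  # write the metadata file
    catalog[name] = metadata_path  # commit: the catalog now points at v1
    return metadata_path


# Hypothetical table name, schema, and partition field.
catalog, storage = {}, {}
path = create_table(
    catalog, storage, "db.orders",
    schema={"order_id": "bigint", "amount": "double"},
    partition_spec=["order_day"],
)
print(catalog["db.orders"])
```

Note that a freshly created table has a metadata file but no snapshot, manifest list, or manifest file, because no data exists yet.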

 

The INSERT Query

 

Send the query to the engine

Check the catalog

Write the datafiles and metadata files

Update the catalog file to commit changes
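As a rough sketch of the insert steps above, the toy model below writes a data file, a manifest file, a manifest list, and a new metadata file, and only then moves the catalog pointer. Everything here is an assumption for illustration: the dict-based `catalog` and `storage`, the `insert_rows` helper, the file names, and the use of JSON in place of Parquet and Avro.

```python
import json


def insert_rows(catalog, storage, name, rows):
    """Toy INSERT: write data and metadata files, then swap the catalog pointer."""
    base_path = catalog[name]  # check the catalog for the current metadata file
    metadata = json.loads(storage[base_path])
    snapshot_id = len(metadata["snapshots"]) + 1

    # Write the data file (JSON stands in for Parquet here).
    data_path = f"{name}/data/data-{snapshot_id}.parquet"
    storage[data_path] = json.dumps(rows)

    # Write a manifest file that tracks the new data file.
    manifest_path = f"{name}/metadata/manifest-{snapshot_id}.avro"
    storage[manifest_path] = json.dumps(
        [{"data-file": data_path, "record-count": len(rows)}]
    )

    # Write a manifest list: manifests of the previous snapshot plus the new one.
    previous = []
    if metadata["snapshots"]:
        previous = json.loads(storage[metadata["snapshots"][-1]["manifest-list"]])
    list_path = f"{name}/metadata/snap-{snapshot_id}.avro"
    storage[list_path] = json.dumps(previous + [manifest_path])

    # Write a new metadata file that records the new snapshot.
    metadata["snapshots"].append(
        {"snapshot-id": snapshot_id, "manifest-list": list_path}
    )
    metadata["current-snapshot-id"] = snapshot_id
    new_path = f"{name}/metadata/v{snapshot_id + 1}.metadata.json"
    storage[new_path] = json.dumps(metadata)

    # Commit: succeed only if no other writer moved the pointer meanwhile.
    # (Always true in this single-threaded toy; shown to mark where the check sits.)
    if catalog[name] != base_path:
        raise RuntimeError("concurrent commit detected, retry needed")
    catalog[name] = new_path
    return snapshot_id


# Hypothetical starting state: an empty table with one metadata file.
v1 = "db.orders/metadata/v1.metadata.json"
storage = {v1: json.dumps({"current-snapshot-id": None, "snapshots": []})}
catalog = {"db.orders": v1}
first = insert_rows(catalog, storage, "db.orders", [{"order_id": 1, "amount": 10.0}])
second = insert_rows(catalog, storage, "db.orders", [{"order_id": 2, "amount": 20.0}])
print(catalog["db.orders"])
```

Until the last step runs, readers still see the old metadata file, which is why the catalog update is what actually commits the write.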

 

 

 

The MERGE Query

Send the query to the engine

Check the catalog

Write datafiles and metadata files

Update the catalog file to commit changes
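The "write datafiles" step of a merge is the part that differs most from a plain insert. A minimal sketch, assuming a copy-on-write strategy (Iceberg also supports merge-on-read, which is not shown here): any data file containing a matched key is rewritten in full, untouched files are carried over, and unmatched source rows land in a new file. The `merge_copy_on_write` helper, the file paths, and the sample rows are hypothetical.

```python
def merge_copy_on_write(data_files, updates, key):
    """Toy MERGE: return the new set of data files; the input is left untouched.

    data_files maps a file path to its list of row dicts.
    """
    updates_by_key = {row[key]: row for row in updates}
    matched = set()
    new_files = {}
    for path, rows in data_files.items():
        hits = {row[key] for row in rows} & set(updates_by_key)
        if not hits:
            new_files[path] = rows  # no match: the file is reused as-is
            continue
        matched |= hits
        # A match: rewrite the whole file with the updated rows swapped in.
        rewritten = [updates_by_key.get(row[key], row) for row in rows]
        new_files[path.replace(".parquet", "-rewritten.parquet")] = rewritten
    # Source rows that matched nothing are inserted via a new data file.
    inserts = [row for k, row in updates_by_key.items() if k not in matched]
    if inserts:
        new_files["data/new-rows.parquet"] = inserts
    return new_files


# Hypothetical table contents and staged changes.
data_files = {
    "data/a.parquet": [{"order_id": 1, "amount": 10}, {"order_id": 2, "amount": 20}],
    "data/b.parquet": [{"order_id": 3, "amount": 30}],
}
updates = [{"order_id": 2, "amount": 25}, {"order_id": 4, "amount": 40}]
result = merge_copy_on_write(data_files, updates, key="order_id")
print(sorted(result))
```

The old files are never modified in place, so the previous snapshot stays readable after the merge commits.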

 

 

 

 

 

Reading Queries in Apache Iceberg

 

The SELECT Query

Send the query to the engine

Check the catalog

Get information from the metadata file

Get information from the manifest list

Get information from the manifest file
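The read path above can be sketched as a chain of lookups: catalog, metadata file, manifest list, manifest files, and finally the data files to scan. The structures below are simplified stand-ins and every name (`plan_scan`, the `lower`/`upper` partition bounds, the file paths, the dates) is an assumption made for this sketch.

```python
def plan_scan(catalog, storage, name, partition_value):
    """Toy SELECT planning: return the data files that may hold matching rows."""
    metadata = storage[catalog[name]]  # catalog -> current metadata file
    current_id = metadata["current-snapshot-id"]
    snapshot = next(
        s for s in metadata["snapshots"] if s["snapshot-id"] == current_id
    )
    files = []
    for entry in storage[snapshot["manifest-list"]]:  # metadata -> manifest list
        # Skip whole manifests whose partition range cannot match.
        if not (entry["lower"] <= partition_value <= entry["upper"]):
            continue
        for data_file in storage[entry["manifest-path"]]:  # list -> manifest file
            if data_file["partition"] == partition_value:
                files.append(data_file["path"])
    return files


# Hypothetical table state with two snapshots and three data files.
m1 = {"manifest-path": "m1.avro", "lower": "2023-03-06", "upper": "2023-03-06"}
m2 = {"manifest-path": "m2.avro", "lower": "2023-03-07", "upper": "2023-03-08"}
storage = {
    "v3.metadata.json": {
        "current-snapshot-id": 2,
        "snapshots": [
            {"snapshot-id": 1, "manifest-list": "snap-1.avro"},
            {"snapshot-id": 2, "manifest-list": "snap-2.avro"},
        ],
    },
    "snap-1.avro": [m1],
    "snap-2.avro": [m1, m2],
    "m1.avro": [{"path": "data/a.parquet", "partition": "2023-03-06"}],
    "m2.avro": [
        {"path": "data/b.parquet", "partition": "2023-03-07"},
        {"path": "data/c.parquet", "partition": "2023-03-08"},
    ],
}
catalog = {"db.orders": "v3.metadata.json"}
to_read = plan_scan(catalog, storage, "db.orders", "2023-03-07")
print(to_read)
```

The point of the layered metadata is that the engine can rule out manifests and data files before opening any of them.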

 

 

 

The Time-travel Query

 

 

Apache Iceberg provides two ways to run time-travel queries: using a timestamp and using a snapshot ID.

 

To analyze the history of our orders table, we will query the history metadata table:

 

 

# Spark SQL

SELECT * FROM catalog.db.orders.history;

 

 

 

The timestamp or snapshot ID we will target is the second one listed in the history table. This is the query that we will run:

 

# Spark SQL

SELECT * FROM orders
TIMESTAMP AS OF '2023-03-07 20:45:08.914';

 

 

Send the query to the engine

Check the catalog

Get information from the metadata file

Get information from the manifest list

Get information from the manifest file
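The only difference from a regular SELECT is which snapshot the engine picks from the metadata file. A minimal sketch of both selection modes follows. The helper names and the shape of the `history` rows are assumptions; the first and third timestamps and all snapshot IDs are made up for illustration, and only the second timestamp comes from the query above.

```python
from datetime import datetime


def snapshot_as_of(history, timestamp):
    """Pick the snapshot that was current at the given time (latest at or before it)."""
    eligible = [s for s in history if s["made_current_at"] <= timestamp]
    if not eligible:
        raise ValueError("no snapshot at or before the given timestamp")
    return max(eligible, key=lambda s: s["made_current_at"])


def snapshot_by_id(history, snapshot_id):
    """Pick a snapshot directly by its ID."""
    for s in history:
        if s["snapshot_id"] == snapshot_id:
            return s
    raise ValueError(f"unknown snapshot id {snapshot_id}")


# Hypothetical history: snapshot IDs 1-3 and the first/third timestamps are invented.
history = [
    {"snapshot_id": 1, "made_current_at": datetime(2023, 3, 6, 12, 0, 0)},
    {"snapshot_id": 2, "made_current_at": datetime(2023, 3, 7, 20, 45, 8, 914000)},
    {"snapshot_id": 3, "made_current_at": datetime(2023, 3, 8, 12, 0, 0)},
]

target = datetime.strptime("2023-03-07 20:45:08.914", "%Y-%m-%d %H:%M:%S.%f")
by_time = snapshot_as_of(history, target)
by_id = snapshot_by_id(history, 2)
print(by_time["snapshot_id"], by_id["snapshot_id"])
```

Once the snapshot is chosen, the engine follows that snapshot's manifest list and manifest files exactly as in the regular read path.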

 

 

 

 

 

 

 

 

 

 

 

 

 
