[Iceberg] Iceberg Guide Book Summary | CHAPTER 3. Lifecycle of Write and Read Queries
noohhee 2025. 3. 4. 21:03
The following content is summarized from the Apache Iceberg guidebook.
CHAPTER 3
Lifecycle of Write and Read Queries

Writing Queries in Apache Iceberg

Create the Table
1. Send the query to the engine
2. Write the metadata file
3. Update the catalog file to commit changes
Insert the Query
1. Send the query to the engine
2. Check the catalog
3. Write the datafiles and metadata files
4. Update the catalog file to commit changes
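A simple insert against that table might look like the sketch below; the row values are made up for illustration.

# Spark SQL
-- illustrative values only
INSERT INTO catalog.db.orders
VALUES (123, 456, 36.17, CAST('2023-03-07 08:10:23' AS TIMESTAMP));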
Merge Query
1. Send the query to the engine
2. Check the catalog
3. Write datafiles and metadata files
4. Update the catalog file to commit changes
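A merge typically upserts rows from a staging source into the target table. A minimal sketch, assuming a hypothetical orders_staging table with the same schema:

# Spark SQL
-- orders_staging is a hypothetical source table for this example
MERGE INTO catalog.db.orders AS o
USING catalog.db.orders_staging AS s
ON o.order_id = s.order_id
WHEN MATCHED THEN
    UPDATE SET o.order_amount = s.order_amount
WHEN NOT MATCHED THEN
    INSERT *;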
Reading Queries in Apache Iceberg
The SELECT Query
1. Send the query to the engine
2. Check the catalog
3. Get information from the metadata file
4. Get information from the manifest list
5. Get information from the manifest file
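A plain read that walks this chain (catalog, then metadata file, manifest list, and manifest files down to the datafiles) can be as simple as the following; the filter column is illustrative.

# Spark SQL
SELECT *
FROM catalog.db.orders
WHERE order_amount > 100.00;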
The Time-travel Query
Apache Iceberg provides two ways to run time-travel queries: using a timestamp and using a snapshot ID.
To analyze our orders table's history, we will query the history metadata table:
# Spark SQL
SELECT * FROM catalog.db.orders.history;

From that history output, the timestamp (or snapshot ID) we will be targeting is the second one. This is the query that we will run:
# Spark SQL
SELECT * FROM orders
TIMESTAMP AS OF '2023-03-07 20:45:08.914'
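The same read can target a specific snapshot ID instead of a timestamp. A sketch of that form is below; the snapshot ID is a placeholder, and a real value would come from the history metadata table shown above.

# Spark SQL
-- placeholder snapshot ID for illustration
SELECT * FROM orders
VERSION AS OF 7373354983344961549;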
1. Send the query to the engine
2. Check the catalog
3. Get information from the metadata file
4. Get information from the manifest list
5. Get information from the manifest file