지구정복: All posts (492)

When creating a managed table with Spark 3, the following error occurs:

spark-sql> CREATE TABLE spark_catalog.db1_test.example_hive1 (
         >   id INT,
         >   name STRING,
         >   age INT
         > )
         > STORED AS parquet;

Error: Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Table db1_test.example_hive1 failed strict managed table checks due to the following reas..
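A common workaround, sketched below under the assumption that the metastore enforces Hive's strict managed table checks (which require managed tables to be transactional, something Spark cannot create): declare the table as external instead. The LOCATION path is an illustrative assumption, and external.table.purge='true' restores drop-also-deletes-data semantics.

```sql
-- Hedged workaround sketch: with strict managed table checks on, a Hive
-- managed table must be ACID, which Spark cannot create. Declaring the
-- table EXTERNAL sidesteps the check; the location is illustrative.
CREATE EXTERNAL TABLE spark_catalog.db1_test.example_hive1 (
  id   INT,
  name STRING,
  age  INT
)
STORED AS parquet
LOCATION '/warehouse/tablespace/external/hive/db1_test.db/example_hive1'
TBLPROPERTIES ('external.table.purge' = 'true');
```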
airflow db upgrade error: when it occurs, run airflow db upgrade. If that command itself fails, install the following Airflow Python packages:

/usr/airflow/.pyenv/versions/venv/bin/pip install sqlalchemy==1.4.49
/usr/airflow/.pyenv/versions/venv/bin/pip install apache-airflow-providers-common-sql

Then run airflow db upgrade again.
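The same recovery sequence as a single shell sketch, assuming the virtualenv layout shown in the excerpt:

```bash
# Sketch of the recovery sequence above; the venv path follows the excerpt.
VENV=/usr/airflow/.pyenv/versions/venv

if ! "$VENV/bin/airflow" db upgrade; then
  # Pin SQLAlchemy and add the common-sql provider, then retry the upgrade.
  "$VENV/bin/pip" install sqlalchemy==1.4.49
  "$VENV/bin/pip" install apache-airflow-providers-common-sql
  "$VENV/bin/airflow" db upgrade
fi
```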
CHAPTER 6 Apache Spark: Configuration. Configuring Apache Iceberg and Spark, Configuring via the CLI. As a first step, you'll need to specify the required packages to be installed and used with the Spark session. To do so, Spark provides the --packages option, which allows Spark to easily download the specified Maven-based packages and their dependencies and add them to the classpath of your application. ..
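A minimal sketch of launching spark-sql with Iceberg via --packages. The runtime artifact coordinates and the catalog named local are assumptions; match the Spark/Scala/Iceberg versions to your environment.

```bash
# Launch spark-sql with the Iceberg runtime pulled from Maven; versions,
# catalog name, and warehouse path below are illustrative assumptions.
spark-sql \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.local.type=hadoop \
  --conf spark.sql.catalog.local.warehouse=/tmp/iceberg-warehouse
```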
CHAPTER 5 Iceberg Catalogs: Requirements of an Iceberg Catalog. Iceberg provides a catalog interface that requires the implementation of a set of functions, primarily ones to list existing tables, create tables, drop tables, check whether a table exists, and rename tables. Examples include Hive Metastore, AWS Glue, and a filesystem catalog (Hadoop). With a filesystem as the catalog, there's a file called version-hi..
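Those required operations can be exercised from Spark SQL; the sketch below assumes the local catalog configured in the CLI example above, with illustrative table names. One caveat worth noting: rename is part of the catalog interface, but the filesystem (Hadoop) catalog does not support it.

```sql
-- Exercising the catalog's required operations; the `local` catalog and
-- table names are illustrative assumptions.
CREATE TABLE local.db1_test.example (id INT, name STRING) USING iceberg;  -- create
SHOW TABLES IN local.db1_test;                                            -- list
SHOW TABLES IN local.db1_test LIKE 'example';                             -- check existence
-- Rename belongs to the interface but is unsupported by the Hadoop
-- (filesystem) catalog; with Hive Metastore or Glue it would be:
-- ALTER TABLE local.db1_test.example RENAME TO local.db1_test.example2;
DROP TABLE local.db1_test.example;                                        -- drop
```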
CHAPTER 4 Optimizing the Performance of Iceberg Tables: Compaction. When you are querying your Apache Iceberg tables, you need to open and scan each file and then close the file when you're done. The more files you have to scan for a query, the greater the cost these file operations will put on your query. It is possible to run into the "small files problem," where too many small files have an impact o..
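Compaction is typically run through Iceberg's rewrite_data_files Spark procedure. A minimal sketch, assuming the local catalog and table from the earlier examples and an illustrative 128 MB target file size:

```sql
-- Rewrite (compact) small data files into fewer, larger ones; the catalog,
-- table, and target file size are illustrative assumptions.
CALL local.system.rewrite_data_files(
  table => 'db1_test.example',
  options => map('target-file-size-bytes', '134217728')  -- 128 MB
);
```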