Posts in the 'Conquering Data Engineering/NiFi' category


Post list: Conquering Data Engineering/NiFi (2)

지구정복

[NiFi] Data Pipeline | log -> Json -> gzip.parquet -> HDFS -> Iceberg Table

Versions: NiFi 1.15.2, Spark 3.1.4, Iceberg 1.3.1, Hive 3.1.3. As the final step, run a PySpark Streaming job that reads the gzip.parquet files stored in HDFS and loads them into an Iceberg table. The Iceberg table must already exist, so create it with PySpark first. # Iceberg table creation query q="""CREATE TABLE iceberg_test_db.test_table_name ( data STRING, log_timestamp timestamp_ntz) USING iceberg PARTITIONED BY (days(log_timestamp)) TBLPROPERTIES ( 'read.parquet.vectorization.enabled..
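The streaming step described in this excerpt might look roughly like the sketch below. It is not the post's actual job, which the excerpt cuts off. The function name, the source directory, and the checkpoint directory are hypothetical, and the read schema uses TIMESTAMP as a stand-in for the timestamp_ntz column, which may need extra Spark or Iceberg settings that the excerpt does not show. It assumes a SparkSession that already has the Iceberg runtime and catalog configured.

```python
def start_parquet_to_iceberg(spark, source_dir, table_name, checkpoint_dir):
    """Stream Parquet files from a directory into an existing Iceberg table.

    `spark` is an already configured SparkSession (assumption: the Iceberg
    runtime and catalog are set up outside this function).
    """
    # File-based streaming sources need an explicit schema.
    # Column names come from the CREATE TABLE statement in the excerpt;
    # TIMESTAMP here is an assumption standing in for timestamp_ntz.
    df = (
        spark.readStream
        .schema("data STRING, log_timestamp TIMESTAMP")
        .parquet(source_dir)
    )

    # Append each micro-batch to the Iceberg table. The checkpoint directory
    # lets the query remember which files it has already processed.
    return (
        df.writeStream
        .format("iceberg")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_dir)
        .toTable(table_name)
    )
```

With a real session, a call such as start_parquet_to_iceberg(spark, "hdfs:///some/landing/dir", "iceberg_test_db.test_table_name", "hdfs:///some/checkpoint/dir") returns a StreamingQuery; both HDFS paths here are placeholders.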

Conquering Data Engineering/NiFi 2025. 4. 22. 14:42

[NiFi] Descriptions of commonly used processors

Let's study the processors I use at work. I am using NiFi 1.15.2. ListenUDP description: Listens for Datagram Packets on a given port. The default behavior produces a FlowFile per datagram, however for higher throughput the Max Batch Size property may be increased to specify the number of datagrams to batch together in a single FlowFile. This processor can be restricted to listening for datagrams from a specific remote host and..
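To make the Max Batch Size behavior concrete, here is a small plain-Python illustration. It is not NiFi code, and NiFi's real batching depends on what is queued when the processor runs. The function name is hypothetical, and joining datagrams with a newline is an assumption about how batched messages are separated.

```python
def batch_datagrams(datagrams, max_batch_size=1, delimiter=b"\n"):
    """Group received datagrams into FlowFile-like payloads.

    With max_batch_size=1 each datagram becomes its own payload, mirroring
    the default behavior described above. Larger values join up to that many
    datagrams into one payload (delimiter is an illustrative assumption).
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")

    payloads = []
    for start in range(0, len(datagrams), max_batch_size):
        chunk = datagrams[start:start + max_batch_size]
        payloads.append(delimiter.join(chunk))
    return payloads


received = [b"log-1", b"log-2", b"log-3"]
print(batch_datagrams(received))                    # three payloads, one per datagram
print(batch_datagrams(received, max_batch_size=2))  # two payloads: a pair, then the remainder
```

Fewer, larger payloads mean fewer FlowFiles for downstream processors to handle, which is the throughput trade-off the description refers to.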

Conquering Data Engineering/NiFi 2025. 4. 16. 14:15
