Apache Kafka & Kafka Connect
( ETL )
Future Architect, Inc.
Keigo Suda
2016/11/25
D&S Data Night vol.04
12
! Kafka Connect ETL
( )
( ) /
13
! Future Architect, Inc. ( )
!
! IoT
14
15
Kafka Connect
16
! Kafka ver 0.9
! Kafka
!
!
! /
Kafka Connect
17
Connectors
https://www.confluent.io/product/connectors/
Kafka Connect
Data integration patterns using Apache Kafka & Kafka Connect (retitled: implementing ETL)
21
22
!
!
!
Kafka Connect ( )
23
ETL Kafka Connect ( )
[Figure: diagram; labels not recoverable from the extraction]
Kafka Connect
25
http://docs.confluent.io/2.0.0/connect/userguide.html#getting-started
Kafka Connect
Kafka Connect
26
Kafka Connect
http://docs.confluent.io/2.0.0/connect/userguide.html#getting-started
27
Stream & Partition(RDB )
http://www.slideshare.net/KaufmanNg/data-pipelines-with-kafka-connect
28
N:1
Stream & Partition(RDB )
http://www.slideshare.net/KaufmanNg/data-pipelines-with-kafka-connect
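The N:1 relationship between partitions and tasks can be sketched roughly as follows. This is a minimal Python illustration, not Kafka Connect code: the table names and the `group_partitions` helper are hypothetical, and it assumes that each RDB table is handled as one unit of work assigned to exactly one task, with the number of tasks capped by `tasks.max`.

```python
def group_partitions(partitions, max_tasks):
    """Split partitions into at most max_tasks contiguous groups of near-equal size."""
    num_groups = min(len(partitions), max_tasks)
    if num_groups == 0:
        return []
    base, extra = divmod(len(partitions), num_groups)
    groups, start = [], 0
    for i in range(num_groups):
        # The first `extra` groups take one additional partition each.
        size = base + (1 if i < extra else 0)
        groups.append(partitions[start:start + size])
        start += size
    return groups


# Hypothetical table names standing in for the partitions of an RDB source.
tables = ["users", "orders", "items", "payments", "shipments"]
assignment = group_partitions(tables, max_tasks=2)
for task_id, group in enumerate(assignment):
    print(f"task {task_id}: {group}")
```

Each table appears in exactly one group, so several partitions may share a task but no partition is split across tasks.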
29
Kafka Connect
http://docs.confluent.io/2.0.0/connect/userguide.html#getting-started
30
Worker & Connector
http://docs.confluent.io/2.0.0/connect/userguide.html#getting-started
31
CONNECTOR
WORKER
STREAM
PARTITION
STANDALONE MODE
DISTRIBUTED MODE
TASK
32
$ bin/connect-standalone.sh config/connect-standalone.properties connector1.properties
(Standalone mode)
!
!
! Worker
! Connector
Worker Connector
33
Connector
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test
Worker
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
...
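In distributed mode, connector settings are not passed as a properties file on the command line; they are submitted to a worker over the Kafka Connect REST API as a JSON body of the form `{"name": ..., "config": {...}}` sent to `POST /connectors`. The following is a minimal Python sketch of that conversion, using the connector properties shown above. The helper names `parse_properties` and `to_rest_payload` are hypothetical, and the sketch only builds the request body; it does not contact a worker.

```python
import json


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def to_rest_payload(props):
    """Build the JSON-style body: the connector name plus its config map."""
    config = dict(props)
    name = config.pop("name")
    return {"name": name, "config": config}


connector_properties = """
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test
"""

payload = to_rest_payload(parse_properties(connector_properties))
body = json.dumps(payload, indent=2)
print(body)
```

Worker settings such as `bootstrap.servers` and the converters stay in the worker's own properties file and are not part of this payload.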
35
!
!
! ETL
36
! Kafka Connect Source Sink
! Sink/Source Sink
Kafka Connect
Source Sink
37
! Connector/Task OK
! API OK
( )
( )
38
Connector( )
39
!!(source pull)
Task
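The pull model noted for sources can be illustrated with a toy loop. In the real Java API a source task implements `poll()`, which the framework calls repeatedly and which returns a batch of records. The Python class below is only a simplified simulation under stated assumptions: `LineSourceTask`, its in-memory list of lines, and the record dictionaries are all hypothetical stand-ins, not Kafka Connect types.

```python
class LineSourceTask:
    """Toy stand-in for a source task that reads lines from an in-memory 'file'."""

    def __init__(self, lines, topic, start_offset=0):
        self.lines = lines
        self.topic = topic
        self.offset = start_offset  # position of the next unread line

    def poll(self, max_records=2):
        """Return the next batch of records; an empty list means no new data."""
        batch = []
        while self.offset < len(self.lines) and len(batch) < max_records:
            batch.append({
                "topic": self.topic,
                "value": self.lines[self.offset],
                # Offset to resume from after this record has been handled.
                "source_offset": self.offset + 1,
            })
            self.offset += 1
        return batch


task = LineSourceTask(["a", "b", "c"], topic="connect-test")
first = task.poll()
second = task.poll()
third = task.poll()
print(first, second, third)

# Restarting from a saved offset skips lines that were already read.
resumed = LineSourceTask(["a", "b", "c"], topic="connect-test", start_offset=2)
print(resumed.poll())
```

The point of the sketch is that the task never pushes data on its own: the caller drives it by polling, and the offset carried with each record is what allows a restart to continue where the previous run stopped.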
40
! Task Connector
!
Deploy
name=sample-sink
connector.class=SampleSinkConnector
tasks.max=1
topics=connect-test
Connector
42
! Kafka
!
Kafka Connect
43
!
API
Kafka Connect
44
(kafka-connect-jdbc)
45
! Confluent
Confluent Platform
Kafka Connect
46
! Kafka ( )
! Confluent….
