Capturing real-time SQL Server data with the Debezium connector
1. Download the connector
debezium-connector-sqlserver-1.2.5.Final-plugin.tar.gz (download link)
2. Configure Kafka Connect for distributed mode
Before starting Kafka Connect in distributed mode, first create three topics. They correspond to the config.storage.topic (default connect-configs), offset.storage.topic (default connect-offsets), and status.storage.topic (default connect-status) settings in the connect-distributed.properties file.
1. config.storage.topic: stores connector and task configuration. Note that this topic must have exactly one partition, and it should be replicated.
2. offset.storage.topic: stores offset information.
3. status.storage.topic: stores connector status information.
2.1 Create the topics
./kafka-topics.sh --zookeeper 10.170.130.xxx:2181 --create --topic connect-offsets --replication-factor 2 --partitions 3
./kafka-topics.sh --zookeeper 10.170.130.xxx:2181 --create --topic connect-configs --replication-factor 2 --partitions 1
./kafka-topics.sh --zookeeper 10.170.130.xxx:2181 --create --topic connect-status --replication-factor 2 --partitions 3
--replication-factor sets the number of replicas.
--partitions should also be set according to the size of the cluster.
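To confirm the topics came out with the expected partition and replica counts, they can be described (same ZooKeeper address as above):
./kafka-topics.sh --zookeeper 10.170.130.xxx:2181 --describe --topic connect-configs
./kafka-topics.sh --zookeeper 10.170.130.xxx:2181 --describe --topic connect-offsets
./kafka-topics.sh --zookeeper 10.170.130.xxx:2181 --describe --topic connect-status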
Other commands that may be useful:
kafka-topics --list --zookeeper 10.2.13.xxx:2181
kafka-topics --create --zookeeper 10.2.13.xxx:2181 --replication-factor 1 --partitions 3 --topic topic_calllog_phone6
kafka-topics --delete --topic topic_ArrayVPN_Log --zookeeper 10.160.14.xxx:2181
kafka-console-producer --broker-list 10.2.13.xxx:9092 --topic topic_calllog_phone
kafka-console-consumer --bootstrap-server 10.2.13.xx:9092 --topic topic_calllog_phone6 --from-beginning
3. Extract the required connector jars into the connectors directory (on every node)
cd /usr/local/kafka
mkdir connectors
# upload debezium-connector-sqlserver-1.2.5.Final-plugin.tar.gz to the server
tar -zxvf debezium-connector-sqlserver-1.2.5.Final-plugin.tar.gz -C /usr/local/kafka/connectors/
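After extraction, it is worth checking that the jars are actually in place on each node (a quick sanity check; the tarball should unpack into a debezium-connector-sqlserver directory):
ls /usr/local/kafka/connectors/debezium-connector-sqlserver/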
4. Edit the connect-distributed.properties configuration file
cd /usr/local/kafka/config/
# Kafka cluster addresses
bootstrap.servers=10.170.130.xxx:9092,10.170.130.xxx:9092,10.170.130.xxx:9092
# Name of the Connect cluster; all workers in the same cluster must use the same group.id
group.id=connect-cluster
# Format of the data stored in Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false
# Format of the internal converters used for offsets, config, and status; usually left unchanged (these lines must be added manually)
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
# Topic that stores offsets; it should have multiple partitions and be replicated.
# Kafka Connect can create this topic automatically, but you may create it yourself as above.
offset.storage.topic=connect-offsets
offset.storage.replication.factor=2
offset.storage.partitions=3
# Topic that stores connector and task configuration; it must have exactly 1 partition and should be replicated
config.storage.topic=connect-configs
config.storage.replication.factor=2
# Topic that stores status; it may have multiple partitions and replicas
status.storage.topic=connect-status
status.storage.replication.factor=2
status.storage.partitions=3
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000
# REST hostname; defaults to the local host
#rest.host.name=
# REST port
rest.port=18083
# The Hostname & Port that will be given out to other workers to connect to i.e. URLs that are routable from other servers.
#rest.advertised.host.name=
#rest.advertised.port=
# Path where the connector plugins are stored
#plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors
plugin.path=/usr/local/kafka/connectors
5. Distribute the configuration to the remaining nodes
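For example, with scp (a sketch; substitute your own node addresses):
scp /usr/local/kafka/config/connect-distributed.properties root@10.170.130.xxx:/usr/local/kafka/config/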
6. Start the Kafka Connect cluster (run the command on every node)
# start in daemon mode
./connect-distributed.sh -daemon /usr/local/kafka/config/connect-distributed.properties
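To confirm the worker actually came up, check the process and its console log (the .out file location depends on where Kafka writes its logs; /usr/local/kafka/logs is assumed here):
jps | grep ConnectDistributed
tail -f /usr/local/kafka/logs/connectDistributed.out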
6.1 Check the REST interface
http://10.170.130.xxx:18083/
Returns:
{"version":"2.2.2","commit":"1d348535a0a747d1","kafka_cluster_id":"9gckc_g_Q-eOFjTHSLMacA"}
http://hserver1:18083/connectors/
Returns: []
curl -s hserver1:18083/connector-plugins | jq
Returns: the plugins found under plugin.path
7. Enable SQL Server CDC
7.1 Start the SQL Server Agent service (already running; the startup steps are in the SQL Server notes)
7.2 Enable CDC on the database and the table
-- Create the database
CREATE DATABASE oneJ
GO
-- Enable CDC on the database
USE oneJ
GO
EXEC sys.sp_cdc_enable_db
GO
-- Enable CDC on the table
USE oneJ
GO
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'bs_year_task_gather_unit',
@role_name = NULL,
@supports_net_changes = 1
GO
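Whether CDC actually took effect can be checked from the shell with sqlcmd (a sketch; assumes sqlcmd is installed and reuses the credentials from this document):
sqlcmd -S 10.170.130.xxx -U sa -P '123456' -d oneJ -Q "SELECT name, is_cdc_enabled FROM sys.databases WHERE name = 'oneJ'; SELECT name, is_tracked_by_cdc FROM sys.tables WHERE name = 'bs_year_task_gather_unit';"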
8. Submit the connector configuration
Before submitting, manually create the Kafka topic referenced by database.history.kafka.topic in the configuration below:
kafka-topics --create --zookeeper 10.2.13.xxx:2181 --replication-factor 2 --partitions 3 --topic dbhistory.fullfillment
For now the configuration was submitted successfully with Postman; the curl approach below has not been tested yet:
curl -s -X POST -H "Content-Type: application/json" --data '{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "10.170.130.xxx",
    "database.port": "1433",
    "database.user": "sa",
    "database.password": "123456",
    "database.dbname": "testme",
    "database.server.name": "fullfillment",
    "table.whitelist": "dbo.bs_news_docs",
    "database.history.kafka.bootstrap.servers": "10.170.130.xxx:9092,10.170.130.xxx:9092,10.170.130.xxx:9092",
    "database.history.kafka.topic": "dbhistory.fullfillment"
  }
}' http://10.170.130.xxx:18083/connectors | jq
Notes on the parameters above:
1. name: a custom name for the connector
2. database.server.name: a custom logical name (it becomes the prefix of the Kafka topics the connector writes to)
3. table.whitelist: the list of tables to capture
4. database.history.kafka.topic: the topic in which the schema history is stored
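Once the connector is running, Debezium writes change events to a topic named <database.server.name>.<schema>.<table>; with the configuration above that is fullfillment.dbo.bs_news_docs, which can be tailed with the console consumer:
kafka-console-consumer --bootstrap-server 10.170.130.xxx:9092 --topic fullfillment.dbo.bs_news_docs --from-beginning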
8.1 Other useful operations
# view connector information
curl -s hserver1:18083/connectors/connector-mssql-online | jq
# view the tasks running under the connector
curl -s hserver1:18083/connectors/connector-mssql-online/tasks | jq
# view the connector's current status
curl -s hserver1:18083/connectors/connector-mssql-online/status | jq
# delete the connector
curl -s -X DELETE hserver1:18083/connectors/connector-mssql-online
# pause / resume the connector
curl -s -X PUT hserver1:18083/connectors/connector-mssql-online/pause
curl -s -X PUT hserver1:18083/connectors/connector-mssql-online/resume
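The Connect REST API also provides restart endpoints, which are handy when a task ends up in a FAILED state (standard Kafka Connect API, using the same connector name as above):
# restart the connector
curl -s -X POST hserver1:18083/connectors/connector-mssql-online/restart
# restart a single task (task 0 here)
curl -s -X POST hserver1:18083/connectors/connector-mssql-online/tasks/0/restart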
Troubleshooting: when the connector can no longer capture changes
Disable CDC on the database, then re-enable CDC on the table.
To locate the SQL Server Agent log:
find / -name "sqlagent.out"
-- Disable change data capture on the table
USE MyDatabase;
GO
EXEC sys.sp_cdc_disable_table
@source_schema = N'dbo',
@source_name = N'CDC_Test',
@capture_instance = N'dbo_CDC_Test'
GO
-- Disable change data capture on the database
USE MyDatabase;
GO
EXECUTE sys.sp_cdc_disable_db;
GO
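Before re-enabling, it can help to list the capture instances that currently exist (a sketch via sqlcmd, same assumptions as above):
sqlcmd -S 10.170.130.xxx -U sa -P '123456' -d MyDatabase -Q "EXEC sys.sp_cdc_help_change_data_capture"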