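Introduction to Debezium
Debezium is an open-source distributed platform for change data capture (CDC). It connects to databases such as MySQL, PostgreSQL, and MongoDB, captures the changes made to their data, and forwards those changes as event streams to a message broker such as Kafka, so that other applications can consume them in real time.
Key features of Debezium:
1. Real-time data capture: Debezium captures change events (inserts, updates, and deletes) as they happen, so downstream applications see them in real time.
2. Event streams: captured changes are published as streams of events that other applications can subscribe to and consume.
3. Support for multiple databases: Debezium provides connectors for MySQL, PostgreSQL, MongoDB, and others, which makes it flexible and general-purpose.
4. Scalability: as a distributed platform, Debezium can scale to large data streams while providing high availability and fault tolerance.
5. Built on Kafka Connect: Debezium runs on top of Kafka Connect, which lets it take full advantage of Kafka's distributed architecture.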
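Debezium deployment options
1. Kafka Connect: deploy Debezium as a connector on Kafka Connect.
2. Debezium Server: deploy the standalone Debezium Server.
3. Embedded engine: embed the Debezium engine inside an application, for example a Java application.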
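Scope of this article
This article uses the most common approach, deploying Debezium through Kafka Connect, and covers the whole path from capturing changes in SQL Server to producing messages to a Kafka topic.
Prerequisites:
- Kafka
- SQL Server 2019
- debezium-connector-sqlserver
Deploying the Kafka cluster and SQL Server 2019 is not covered here; this article describes only the Debezium setup.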
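Detailed steps
Step 1: Download debezium-connector-sqlserver
Download the SQL Server connector archive from the official Debezium website.
Step 2: Create a connectors directory under the Kafka installation directory
Extract the downloaded Debezium connector archive into this directory.
Go to Kafka's config directory and edit the connect-distributed.properties file.
At the bottom of the file, set plugin.path to the connectors directory you just created: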
# List of comma-separated URIs the REST API will listen on. The supported protocols are HTTP and HTTPS.
# Specify hostname as 0.0.0.0 to bind to all interfaces.
# Leave hostname empty to bind to default interface.
# Examples of legal listener lists: HTTP://myhost:8083,HTTPS://myhost:8084"
listeners=HTTP://:8083
# The Hostname & Port that will be given out to other workers to connect to i.e. URLs that are routable from other servers.
# If not set, it uses the value for "listeners" if configured.
#rest.advertised.host.name=
#rest.advertised.port=
#rest.advertised.listener=
rest.port=18083
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
#plugin.path=
# Path of the directory that holds the connectors
plugin.path=/opt/kafka/kafka_2.12-3.5.1/connectors
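A mistyped plugin.path is an easy way to end up with a worker that starts but cannot find the connector. The following is a minimal sketch, not part of the original setup, that reads the worker properties and reports whether each plugin.path entry exists as a directory. The parser is deliberately simplified (plain key=value lines only, no Java properties escapes), and the inline sample text stands in for the real file.

```python
from pathlib import Path


def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def plugin_dirs(props: dict[str, str]) -> list[Path]:
    """Split plugin.path on commas, ignoring empty entries."""
    entries = props.get("plugin.path", "").split(",")
    return [Path(e.strip()) for e in entries if e.strip()]


# Sample content; in practice read config/connect-distributed.properties.
sample = """
# comment lines are ignored
listeners=HTTP://:8083
plugin.path=/opt/kafka/kafka_2.12-3.5.1/connectors
"""

props = parse_properties(sample)
for directory in plugin_dirs(props):
    state = "exists" if directory.is_dir() else "missing"
    print(directory.as_posix(), state)
```

To check the real file, pass Path("connect-distributed.properties").read_text() to parse_properties instead of the sample string.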
First, start Kafka:
[root@test config]# /opt/kafka/kafka_2.12-3.5.1/bin/kafka-server-start.sh /opt/kafka/kafka_2.12-3.5.1/config/server.properties
Then start Kafka Connect in distributed mode:
../bin/connect-distributed.sh /opt/kafka/kafka_2.12-3.5.1/config/connect-distributed.properties
Run curl localhost:8083. A JSON response that includes the Connect worker version indicates that the worker started successfully.
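Besides the root endpoint, the Kafka Connect REST API exposes /connector-plugins, which lists the connector classes the worker has loaded from plugin.path. The sketch below assumes the worker listens on http://localhost:8083 as configured above; the helper names are illustrative and not part of any library.

```python
import json
from urllib.request import urlopen

SQLSERVER_CLASS = "io.debezium.connector.sqlserver.SqlServerConnector"


def fetch_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def has_plugin(plugins: list[dict], class_name: str) -> bool:
    """True if a /connector-plugins listing contains the given class."""
    return any(p.get("class") == class_name for p in plugins)


def check_worker(base_url: str = "http://localhost:8083") -> bool:
    """Print the worker version and report whether the SQL Server connector is loaded."""
    info = fetch_json(base_url + "/")
    print("Connect version:", info.get("version"))
    return has_plugin(fetch_json(base_url + "/connector-plugins"), SQLSERVER_CLASS)
```

Calling check_worker() against a running worker should return True once the connector archive has been extracted into the plugin.path directory and the worker has been restarted.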
SQL Server script to enable CDC (adjust it to your environment)
USE crmnew;
go
if exists(select 1 from sys.databases where name='crmnew' and is_cdc_enabled=0)
begin
exec sys.sp_cdc_enable_db -- enable CDC on the current database
END
go
ALTER DATABASE crmnew ADD FILEGROUP CDC; -- add a filegroup named CDC to this database
ALTER DATABASE crmnew
ADD FILE
(
NAME='CDC',
FILENAME ='D:\DB_data\CDC.ndf' -- file path; adjust it to your environment
)
TO FILEGROUP CDC;
go
IF EXISTS(SELECT 1 FROM sys.tables WHERE name='aaaa' AND is_tracked_by_cdc = 0)
BEGIN
EXEC sys.sp_cdc_enable_table
@source_schema = 'dbo', -- source_schema
@source_name = 'aaaa', -- table_name
@capture_instance = NULL, -- capture_instance
@role_name = NULL, -- role_name
@index_name = NULL, -- index_name
@filegroup_name = 'CDC' -- filegroup_name
END;
go
EXEC sys.sp_cdc_change_job
@job_type = 'cleanup'
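,@retention = 14400 -- minutes that change rows are kept in the change tables (14400 min = 10 days)
,@threshold = 5000 -- maximum number of delete entries removed by a single statement during cleanup
go
-- restart the job
EXEC sys.sp_cdc_start_job @job_type = N'cleanup';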
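Next, create a source connector. Prepare its configuration first, then send it in a POST request to localhost:8083/connectors/ with the JSON body below. Note that database.history.kafka.topic is used internally by the connector to store schema history, so it must not be the same topic as the table's change events (fullfillment.dbo.aaaa). The property names follow Debezium 1.x; Debezium 2.x renamed database.server.name to topic.prefix, database.dbname to database.names, and database.history.* to schema.history.internal.*.
{
  "name": "sqlserver-connector3",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "localhost",
    "database.port": "1433",
    "database.user": "sa",
    "database.password": "123",
    "database.dbname": "crmnew",
    "table.include.list": "dbo.aaaa",
    "database.server.name": "fullfillment",
    "database.history.kafka.bootstrap.servers": "localhost:9092",
    "database.history.kafka.topic": "dbhistory.fullfillment"
  }
}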
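The request can be sent with any HTTP client. As a rough sketch, the script below assumes the JSON above has been saved to a file (the file name sqlserver-connector.json is hypothetical) and posts it to the worker. The change_topic helper only reproduces the topic name this article consumes from, and it assumes table.include.list holds a single table; newer Debezium versions may name topics differently.

```python
import json
from pathlib import Path
from urllib.request import Request, urlopen


def load_payload(path: str) -> dict:
    """Read the connector definition and check its top-level keys."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = {"name", "config"} - payload.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return payload


def change_topic(config: dict) -> str:
    """Topic name used in this article: <server name>.<schema>.<table>."""
    return f'{config["database.server.name"]}.{config["table.include.list"]}'


def register(payload: dict, base_url: str = "http://localhost:8083") -> int:
    """POST the connector definition to Kafka Connect and return the HTTP status."""
    req = Request(
        base_url + "/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=10) as resp:
        return resp.status
```

A typical call would be register(load_payload("sqlserver-connector.json")). Kafka Connect normally answers with status 201 when the connector is created; urlopen raises HTTPError for 4xx responses such as a name conflict.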
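Now consume the Kafka topic:
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic fullfillment.dbo.aaaa
Insert a row into the table in SQL Server:
The consumer prints the corresponding change event, which confirms that change capture is working end to end.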
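Each message printed by the console consumer is a Debezium change event: an envelope with before and after row images and an op code (c for insert, u for update, d for delete, r for a snapshot read). With the JSON converter and schemas enabled, the envelope is nested under a payload key. The sketch below summarizes one such message; the id and name columns in the sample are made up, since the article does not show the schema of dbo.aaaa.

```python
import json

OPS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}


def summarize(message: str) -> dict:
    """Extract the operation and row images from one change event."""
    event = json.loads(message)
    if event is None:
        # A null value is a tombstone that follows a delete event.
        return {"operation": "tombstone", "before": None, "after": None}
    # Unwrap the schema/payload wrapper when the converter includes schemas.
    envelope = event.get("payload", event)
    op = envelope["op"]
    return {
        "operation": OPS.get(op, op),
        "before": envelope.get("before"),
        "after": envelope.get("after"),
    }


# Hypothetical event for an inserted row.
sample = json.dumps(
    {"payload": {"before": None, "after": {"id": 1, "name": "demo"}, "op": "c", "ts_ms": 0}}
)
print(summarize(sample))
```

For an insert, before is null and after holds the new row; for a delete it is the other way around.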