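Kafka Application Practice and Optimization on the Meituan Data Platform
Meituan Data Platform Center
Wang Mengmeng (王萌萌)
Lead of the Data Integration Team, Meituan Data Platform Center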
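About the speaker
Wang Mengmeng leads the data integration area of Meituan's big data platform. Joined Meituan in 2015; mainly responsible for the data collection, integration, and distribution services behind Meituan's big data production and analytics. Has extensive hands-on experience with massive-scale data collection, data synchronization across heterogeneous sources, and message queues.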
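Outline
• Background
• Practice 1: Read/write latency optimization
• Practice 2: Improving metadata service stability
• Practice 3: Platform-based management
• Future plans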
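Background - Kafka in Meituan's big data ecosystem
• Role:
  ➢ Data collection pipeline
  ➢ Real-time data distribution
• Typical use cases:
  ➢ Cache layer for offline computation
  ➢ Real-time data warehouse
  ➢ Real-time event processing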
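• Cluster scale:
  ➢ Physical clusters: 30+
  ➢ Logical clusters: 100+
  ➢ Max brokers in a single cluster: 1800+
• Topic scale:
  ➢ Total topics: 100,000+
  ➢ Max partitions in a single cluster: 100,000+
• Data scale:
  ➢ Peak throughput: 700TB/s, 300 million+ msg/s
  ➢ Daily message volume: 23 trillion+
Background - Service scale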
• SSD cache architecture
• Balancing strategy
• Request queue splitting
• Sticky partitioning strategy
Practice 1 - Read/write latency optimization
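• Business background: delayed-consumption and real-time-consumption workloads coexist
• Production problem: when PageCache capacity is insufficient, reads fall through to disk. These reads are slow, and they also pollute the PageCache, which hurts other read and write requests
[Figure: request time distribution; axis: cumulative percentage]
Read/write latency optimization - SSD cache architecture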
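• To balance cost and performance, SSD is introduced as a cache layer between PageCache and HDD
• Hot data is read from SSD; cold data is read from HDD
Read/write latency optimization - SSD cache architecture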
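A minimal sketch of the read routing described on this slide, assuming that "hot" simply means "recent": the newest segments are served from PageCache, somewhat older ones from SSD, and everything else from HDD. The class `TieredLog` and its two window sizes are hypothetical names used for illustration; the slides do not say how the production system decides what stays on SSD.

```python
from dataclasses import dataclass


@dataclass
class TieredLog:
    """Toy model of one partition log; segments are numbered, higher = newer."""
    newest_segment: int
    page_cache_segments: int  # hypothetical: how many newest segments fit in PageCache
    ssd_segments: int         # hypothetical: how many newest segments are kept on SSD

    def tier_for(self, segment: int) -> str:
        """Return the storage tier that would serve a read of `segment`."""
        age = self.newest_segment - segment
        if age < 0:
            raise ValueError("segment does not exist yet")
        if age < self.page_cache_segments:
            return "pagecache"  # real-time consumers normally land here
        if age < self.ssd_segments:
            return "ssd"        # moderately delayed consumers
        return "hdd"            # cold data, rarely read


log = TieredLog(newest_segment=100, page_cache_segments=2, ssd_segments=10)
for seg in (100, 95, 50):
    print(seg, log.tier_for(seg))
```

In this model a delayed consumer reading segment 95 is served from SSD and never touches the HDD, which is the effect the slide attributes to the cache layer.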
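• Results (on a cluster with heavy delayed reads):
  ➢ TP99 write time dropped from 40ms+ to 5ms
  ➢ TP99 read time dropped from 300ms+ to 180ms+
  ➢ Share of reads served from HDD dropped from 19.5% to 1%
Read/write latency optimization - SSD cache architecture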
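• Background:
  ➢ Heterogeneous machine types with different disk capacities
  ➢ Diverse workloads with large differences in traffic
• Problems:
  ➢ Disk utilization is unbalanced; disks occasionally fill up
  ➢ Broker processing load is unbalanced; hot spots are frequent
• Solutions:
  ➢ Proactive: an idle-disk-first replica assignment algorithm
  ➢ Reactive: multi-dimensional rebalancing by traffic and disk capacity
Read/write latency optimization - Balancing strategy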
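• The RoundRobin assignment algorithm does not take disk usage into account
• Improvement: use an idle-disk-first (IDF) strategy when assigning replicas. IDF is built on a bin model: each disk is divided into equal-sized bins (30GB each), and each partition occupies 1 to n bins
Balancing strategy - Idle-disk-first replica assignment algorithm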
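The bin model above can be sketched as follows. This is an illustration under stated assumptions, not the production algorithm: the 30GB bin size comes from the slide, while the function names, the `free_bins` bookkeeping, and the tie-breaking rule are hypothetical.

```python
import math
from typing import Dict

BIN_GB = 30  # bin size stated on the slide


def bins_needed(partition_gb: float) -> int:
    """A partition occupies 1..n whole bins."""
    return max(1, math.ceil(partition_gb / BIN_GB))


def pick_disk(free_bins: Dict[str, int], partition_gb: float) -> str:
    """Idle-disk-first: place the replica on the disk with the most free bins.

    `free_bins` maps a disk name to its number of unused bins and is updated
    in place. Ties are broken by disk name (an assumption of this sketch).
    """
    need = bins_needed(partition_gb)
    candidates = [d for d, free in free_bins.items() if free >= need]
    if not candidates:
        raise RuntimeError("no disk has enough free bins")
    disk = min(candidates, key=lambda d: (-free_bins[d], d))
    free_bins[disk] -= need
    return disk


free = {"disk-a": 10, "disk-b": 20, "disk-c": 5}
print(pick_disk(free, partition_gb=45))  # needs 2 bins, goes to the emptiest disk
print(free)
```

Unlike round-robin, the choice depends only on how much space each disk has left, so a small or nearly full disk stops receiving new replicas.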
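• Migration algorithm:
  ➢ Overall strategy: move data from high-utilization disks to low-utilization disks
  ➢ Details:
    ✓ Account for multiple disk specs; disks of different capacities carry different weights
  ➢ Constraints:
    ✓ Replicas of a partition stay on different brokers
    ✓ Partition counts stay even across disks
    ✓ TOR (top-of-rack) fault tolerance is preserved
    ✓ Partition and leader counts stay balanced as far as possible
Balancing strategy - Disk balancing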
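One step of such a migration plan might look like the sketch below. It covers only two of the listed rules: utilization is measured as a ratio, so disks of different capacity are weighted differently, and a replica is never moved onto a broker that already hosts another replica of the same partition. The TOR and partition-count constraints are omitted, and all names (`Disk`, `plan_one_move`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Disk:
    broker: int
    capacity_gb: float
    replicas: Dict[str, float] = field(default_factory=dict)  # partition -> size in GB

    @property
    def utilization(self) -> float:
        return sum(self.replicas.values()) / self.capacity_gb


def plan_one_move(disks: Dict[str, Disk]) -> Optional[Tuple[str, str, str]]:
    """Return (partition, source disk, target disk), or None if no move helps."""
    src = max(disks, key=lambda d: disks[d].utilization)
    dst = min(disks, key=lambda d: disks[d].utilization)
    if src == dst:
        return None
    gap = disks[src].utilization - disks[dst].utilization
    # Try the largest replicas first; they close the gap fastest.
    for part, size in sorted(disks[src].replicas.items(), key=lambda kv: -kv[1]):
        hosts = {d.broker for d in disks.values() if part in d.replicas}
        crosses_broker = disks[dst].broker != disks[src].broker
        if crosses_broker and disks[dst].broker in hosts:
            continue  # would put two replicas of one partition on the same broker
        new_src = disks[src].utilization - size / disks[src].capacity_gb
        new_dst = disks[dst].utilization + size / disks[dst].capacity_gb
        if abs(new_src - new_dst) < gap:  # only accept moves that narrow the gap
            return part, src, dst
    return None


disks = {
    "b1-d1": Disk(broker=1, capacity_gb=1000, replicas={"a-0": 500, "b-0": 300}),
    "b2-d1": Disk(broker=2, capacity_gb=2000, replicas={"a-0": 500}),
}
print(plan_one_move(disks))
```

Here `a-0` is skipped because broker 2 already holds a replica of it, so the planner falls back to `b-0`.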
Balancing strategy - Disk balancing results
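• Challenge: migration efficiency vs. read/write stability
• Optimizations:
  ➢ Improve migration efficiency:
    ✓ Pipeline acceleration, so long-tail partitions do not slow down the overall migration
  ➢ Improve migration stability:
    ✓ Migrate during off-peak hours
    ✓ Migration concurrency control and rate limiting
    ✓ Fetcher isolation
Data migration optimization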
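• Before: migration plans were submitted in batches (one per logical cluster) and executed serially => After: as long as the throttling limits are met, new reassign tasks are continuously submitted to zk in a pipeline
Data migration optimization - Pipeline acceleration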
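Why pipelining helps with long-tail partitions can be seen in a small simulation. It is a simplified model with assumptions of its own: each reassignment has a fixed duration, the throttle is reduced to a single `concurrency` limit, and the function names are hypothetical.

```python
import heapq
from typing import List


def batch_time(durations: List[int], concurrency: int) -> int:
    """Serial batches: the next batch starts only when the slowest task finishes."""
    total = 0
    for i in range(0, len(durations), concurrency):
        total += max(durations[i:i + concurrency])
    return total


def pipelined_time(durations: List[int], concurrency: int) -> int:
    """Pipeline: submit the next task as soon as any in-flight task finishes."""
    in_flight: List[int] = []  # min-heap of finish times
    now = 0
    for d in durations:
        if len(in_flight) == concurrency:
            now = heapq.heappop(in_flight)  # wait for the earliest finisher
        heapq.heappush(in_flight, now + d)
    return max(in_flight) if in_flight else 0


# One slow (long-tail) partition followed by five fast ones.
durations = [10, 1, 1, 1, 1, 1]
print(batch_time(durations, concurrency=2))      # 12
print(pipelined_time(durations, concurrency=2))  # 10
```

In batch mode the slow partition holds up its whole batch; in pipelined mode the remaining slot keeps draining the fast partitions while the slow one is still copying.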
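• Background: on a follower, real-time reads and delayed reads share one fetcher thread, so delayed reads interfere with real-time reads
  ➢ The request volume of delayed reads is far larger than that of real-time reads, so real-time read requests back up
  ➢ Delayed reads trigger disk reads, which significantly slows the fetcher's pull rate
• Improvement:
  ➢ All ISR followers share one fetcher
  ➢ All non-ISR followers share a separate fetcher
Data migration optimization - Fetcher isolation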
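The isolation rule amounts to routing each followed partition to one of two fetcher pools by ISR membership. The sketch below shows only that routing decision; the pool names and the function `assign_fetchers` are hypothetical and do not correspond to Kafka's internal classes.

```python
from typing import Dict, Iterable, List, Set


def assign_fetchers(followed: Iterable[str], in_isr: Set[str]) -> Dict[str, List[str]]:
    """Split followed partitions between an ISR pool and a catch-up pool.

    Partitions in `in_isr` are doing real-time reads near the log end; the
    others are catching up (for example during migration) and may hit disk.
    """
    pools: Dict[str, List[str]] = {"isr-fetcher": [], "catchup-fetcher": []}
    for tp in followed:
        pool = "isr-fetcher" if tp in in_isr else "catchup-fetcher"
        pools[pool].append(tp)
    return pools


pools = assign_fetchers(
    followed=["orders-0", "orders-1", "logs-7"],
    in_isr={"orders-0", "orders-1"},
)
print(pools)
```

With this split, a slow disk read for `logs-7` can only delay other catch-up partitions, not the replicas that are keeping up in real time.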
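• Evaluation metric: the variance of leader write rates within a cluster should be as small as possible
• Design details:
  ➢ reassign and prefer (preferred leader election) submissions are pipelined
  ➢ Region-level management capability
  ➢ Key parameters are configurable:
    ✓ Execution time
    ✓ Leader switch concurrency
    ✓ Reassign concurrency
Balancing strategy - Traffic balancing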
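The evaluation metric can be computed directly, and a candidate leader switch can be judged by whether it lowers the variance. A minimal sketch, assuming per-broker leader write rates are already available as numbers (MB/s here); the function names and sample rates are illustrative only.

```python
from statistics import pvariance
from typing import Dict


def leader_write_variance(rate_by_broker: Dict[int, float]) -> float:
    """Population variance of leader write rates across brokers in one cluster."""
    return pvariance(list(rate_by_broker.values()))


def after_leader_move(rate_by_broker: Dict[int, float], src: int, dst: int,
                      partition_rate: float) -> Dict[int, float]:
    """Rates after moving one partition's leadership from broker src to dst."""
    moved = dict(rate_by_broker)
    moved[src] -= partition_rate
    moved[dst] += partition_rate
    return moved


before = {1: 90.0, 2: 30.0, 3: 30.0}
after = after_leader_move(before, src=1, dst=2, partition_rate=30.0)
print(leader_write_variance(before), leader_write_variance(after))
```

A balancer built on this metric would accept the move in the example, since the variance falls from 800 to 200.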
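• Processing-path analysis of the core components:
  ➢ Processor
  ➢ RequestHandler
Read/write latency optimization - Request queue splitting & sticky partitioning strategy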
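• Processor bottleneck analysis:
  ➢ On slow production nodes, both io-ratio and io-wait-ratio are very low -> the bottleneck is in processing, not I/O
Read/write latency optimization - Request queue splitting & sticky partitioning strategy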
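• The Processor bottleneck:
  high-QPS produce requests -> Processor performance degrades -> responseQueueTime and responseTime rise -> totalTime rises
Read/write latency optimization - Request queue splitting & sticky partitioning strategy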
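• RequestHandler analysis:
  ➢ Multi-threaded blocking model: multiple RequestHandlers consume the RequestQueue together
  ➢ Possible bottleneck: mmap triggers disk reads/writes, which slows RequestQueue consumption, and the resulting back pressure blocks the Processor
Read/write latency optimization - Request queue splitting & sticky partitioning strategy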
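• What actually happens in production:
  ➢ High produce QPS congests the requestQueue (80%+ of cases)
    • Metric signature: network handler idle percent <= 0.2 && request handler idle percent > 0.2
  ➢ The requestHandler is slow and back pressure reaches the processor (10%+ of cases)
    • Metric signature: request handler idle percent <= 0.2
Read/write latency optimization - Request queue splitting & sticky partitioning strategy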
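The two metric signatures above translate into a small classifier. The 0.2 threshold and the two conditions come from the slide; the function name, the labels, and the "healthy" fallback are assumptions added for the example.

```python
def classify_bottleneck(network_idle: float, handler_idle: float,
                        threshold: float = 0.2) -> str:
    """Classify a broker from its two idle-percent metrics (range 0.0 to 1.0)."""
    if handler_idle <= threshold:
        # RequestHandlers are saturated; back pressure reaches the processors.
        return "request-handler-slow"
    if network_idle <= threshold:
        # Processors are saturated while handlers still have headroom.
        return "request-queue-congested"
    return "healthy"  # neither signature matches (fallback assumed here)


print(classify_bottleneck(network_idle=0.1, handler_idle=0.6))
print(classify_bottleneck(network_idle=0.5, handler_idle=0.1))
```

The handler check comes first because the slide's second signature does not constrain the network idle percent at all.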
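• Solutions:
  ➢ Split the read and write request queues so that writes do not affect reads
  ➢ Have clients send in sticky mode to reduce QPS
• Business background:
  ➢ More than half of the write requests come from log collection, where the client is under our control
Read/write latency optimization - Request queue splitting & sticky partitioning strategy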
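The queue split can be pictured as two bounded queues in place of one shared queue. This is a toy model, not broker code: the class `SplitRequestChannel` is hypothetical, and routing every non-produce request to the read queue is an assumption of the sketch.

```python
import queue
from typing import Any


class SplitRequestChannel:
    """Two bounded queues so a flood of produce requests cannot starve fetches."""

    def __init__(self, capacity: int) -> None:
        self.write_q: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)
        self.read_q: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)

    def offer(self, api: str, payload: Any) -> bool:
        """Enqueue without blocking; return False if the target queue is full."""
        target = self.write_q if api == "produce" else self.read_q
        try:
            target.put_nowait((api, payload))
            return True
        except queue.Full:
            return False


channel = SplitRequestChannel(capacity=2)
print([channel.offer("produce", i) for i in range(3)])  # third produce is rejected
print(channel.offer("fetch", "consumer-1"))             # fetch is still accepted
```

With a single shared queue of the same capacity, the fetch in the last line would have been rejected along with the third produce request.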
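• Results:
  ➢ Client sticky strategy: QPS dropped to 1/20 of the original, with little change in throughput or latency
  ➢ Read/write queue split:
    ✓ Cluster 1 (high QPS): read latency down 80%, QPS up 40%
    ✓ Cluster 2 (not high QPS): no significant change in read/write QPS or latency; ProcessorIdlePercent went up and QueueTime dropped by 70%
Read/write latency optimization - Request queue splitting & sticky partitioning strategy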
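The mechanism behind the sticky result can be illustrated with a simplified count of produce requests. The model assumes the client sends one request per destination broker per send window and that each partition leader sits on its own broker. All names and numbers here are illustrative and are not meant to reproduce the 1/20 figure.

```python
from typing import Callable, Dict


def count_requests(windows: int, msgs_per_window: int,
                   choose_partition: Callable[[int, int], int],
                   broker_of: Dict[int, int]) -> int:
    """Count produce requests, assuming one request per broker per send window."""
    total = 0
    for w in range(windows):
        brokers = {broker_of[choose_partition(w, i)] for i in range(msgs_per_window)}
        total += len(brokers)
    return total


NUM_PARTITIONS = 8
BROKER_OF = {p: p for p in range(NUM_PARTITIONS)}  # one leader per broker (assumed)


def round_robin(window: int, i: int) -> int:
    return i % NUM_PARTITIONS       # spreads each window over all partitions


def sticky(window: int, i: int) -> int:
    return window % NUM_PARTITIONS  # one partition per window, rotating over time


print(count_requests(100, 16, round_robin, BROKER_OF))  # 800
print(count_requests(100, 16, sticky, BROKER_OF))       # 100
```

Both strategies deliver the same 1,600 messages, and over time sticky still rotates across all partitions; only the number of requests the brokers must handle changes.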
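• Background:
  ➢ The cluster is very large and node changes (nodes going offline, upgrades, etc.) are frequent, so the metadata broadcast load is heavy (number of brokers * number of partitions)
  ➢ Frequent Full GC causes failover between the active and standby controllers, which affects client consumption
• Handle each scenario separately:
  ➢ Broker state change (broker startup/failure): the broadcast is not required in this case
  ➢ Controller failover: a full broadcast is required and is carried out in batches
Practice 2: Improving metadata service stability - Broadcast optimization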
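The per-scenario handling can be sketched as a small planner that returns which brokers receive metadata in each round. The event names, the function names, and the batch size are hypothetical; the slides state only that the broadcast is skipped for broker state changes and batched for controller failover.

```python
from typing import Iterator, List, Sequence


def batches(brokers: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive groups of at most batch_size brokers."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for i in range(0, len(brokers), batch_size):
        yield list(brokers[i:i + batch_size])


def plan_broadcast(event: str, brokers: Sequence[int],
                   batch_size: int) -> List[List[int]]:
    """Return the broadcast rounds for an event (event names are assumed)."""
    if event in ("broker_startup", "broker_failure"):
        return []  # the slide treats this broadcast as unnecessary
    if event == "controller_failover":
        return list(batches(brokers, batch_size))  # full broadcast, in batches
    raise ValueError(f"unknown event: {event}")


print(plan_broadcast("broker_failure", range(5), batch_size=2))
print(plan_broadcast("controller_failover", range(5), batch_size=2))
```

Sending one batch at a time bounds how much metadata the controller serializes and pushes at once, which is the pressure described in the background bullets.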
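• Background: node liveness detection relies on zk. After zk recovers from a failure, the controller misjudges node state, performs unnecessary metadata updates, and wrongly marks partitions as unavailable, which affects client consumption
Improving metadata service stability - SafeMode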
• SafeMode: the number of offline nodes is checked, and SafeMode is entered once it exceeds a threshold
Improving metadata service stability - SafeMode
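The threshold check itself is simple; a minimal sketch follows. The function name and the `max_offline` parameter are hypothetical, and the slides do not spell out what happens while SafeMode is active. The comments assume metadata updates are held back until liveness information can be trusted again.

```python
from typing import Set


def should_enter_safe_mode(known_brokers: Set[int], alive_brokers: Set[int],
                           max_offline: int) -> bool:
    """Enter SafeMode when more brokers look offline than the threshold allows.

    Many brokers "disappearing" at once is more likely a problem with the
    liveness source (for example zk recovering from a failure) than a real
    mass outage, so acting on it would spread wrong metadata (assumption).
    """
    offline = known_brokers - alive_brokers
    return len(offline) > max_offline


known = set(range(10))
print(should_enter_safe_mode(known, alive_brokers={0, 1, 2, 3, 4, 5, 6, 7, 8}, max_offline=3))
print(should_enter_safe_mode(known, alive_brokers={0, 1}, max_offline=3))
```

A single broker failure stays below the threshold and is handled normally, while eight simultaneous "failures" trip SafeMode.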
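• Core metrics monitoring
  ➢ Server-side metrics: base environment metrics (IO/Mem/CPU), per-stage request latency, inbound/outbound traffic
  ➢ Client-side metrics: request latency distribution, metadata API success rate
  ➢ Aggregated statistics across partition/broker/cluster dimensions
Practice 3 - Platform-based management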
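• Core metrics monitoring
• Automated operations: machine management, scaling and upgrades, data migration, data balancing
Practice 3 - Platform-based management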
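• Capability extensions:
  ➢ Transaction support
• Stability:
  ➢ Hardware fault tolerance
  ➢ QoS capability
• Scalability:
  ➢ Separation of storage and compute
  ➢ Elastic architecture
Future plans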
Q&A
Email: jinlai_ch@qq.com
