Guozhang Wang
Beijing Kafka Meetup, Nov. 16, 2019
The Silver Bullet for Endless Rebalances
Introduction to the Incremental Cooperative Protocol
Outline
• Review of the current eager rebalance algorithm
• Identify the known issues with common scenarios
• A new proposal: incremental cooperative rebalancing
A Short History of Consumer Groups
[Diagram: Producers write to Topic 1 and Topic 2, each split into Partitions on the Brokers; Consumers fetch from them]
A consumer group has to keep track of:
1) membership (who owns what)
2) offset (consumed up to where)
• Kafka 0.8.2-: Consumers fetch from the Brokers
• Kafka 0.9.0+: a Group Coordinator is introduced to manage both; membership changes go through a Rebalance
Consumer Rebalance Protocol
• A rebalance happens when:
  • Membership changes
    • Member crash: failure of a consumer
    • Scaling in: a member leaves the group
    • Scaling out: a new member joins
  • Partition resources change
    • Topics are created or deleted
    • More partitions are added to topics
Member Crash: Failure Detection (heartbeat)
[Diagram: C1 and C2 each send a heartbeat to the Group Coordinator (broker side) every heartbeat.interval.ms and get an ok back. When one consumer crashes, only the other's heartbeat/ok exchange continues; the coordinator declares the silent member failed once session.timeout.ms passes without a heartbeat from it.]
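The heartbeat exchange above can be sketched as a small coordinator-side session tracker. This is a simplified model under stated assumptions, not the broker's implementation: `SessionTracker` and its methods are hypothetical names, time is passed in explicitly in milliseconds, and the two timeout values are illustrative rather than Kafka defaults.

```python
from dataclasses import dataclass, field


@dataclass
class SessionTracker:
    """Toy model of coordinator-side failure detection (hypothetical class)."""
    session_timeout_ms: int
    last_heartbeat_ms: dict = field(default_factory=dict)

    def heartbeat(self, member: str, now_ms: int) -> str:
        # Record the heartbeat and acknowledge it, as in the "heartbeat / ok" exchange.
        self.last_heartbeat_ms[member] = now_ms
        return "ok"

    def expired_members(self, now_ms: int) -> list:
        # A member counts as failed once no heartbeat arrived within the session timeout.
        return sorted(
            m for m, seen in self.last_heartbeat_ms.items()
            if now_ms - seen > self.session_timeout_ms
        )


HEARTBEAT_INTERVAL_MS = 3_000   # illustrative value for heartbeat.interval.ms
SESSION_TIMEOUT_MS = 10_000     # illustrative value for session.timeout.ms

tracker = SessionTracker(SESSION_TIMEOUT_MS)
for now in range(0, 12_001, HEARTBEAT_INTERVAL_MS):
    tracker.heartbeat("C1", now)          # C1 keeps heartbeating
    if now == 0:
        tracker.heartbeat("C2", now)      # C2 crashes after its first heartbeat

print(tracker.expired_members(now_ms=12_000))
```

In this model a shorter session timeout detects the crash sooner, at the cost of treating a slow but healthy consumer as failed more easily.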
Scaling In: Consumer Shutdown (leave-group)
[Diagram: one consumer keeps exchanging heartbeat/ok with the Group Coordinator (broker side); the consumer that shuts down sends leave-group.]
Scaling Out: Consumer Startup (join-group)
[Diagram: C1 and C2 are in the group; a new consumer C3 starts up and sends join-group to the Group Coordinator (broker side).]
Resources Change: Re-Subscribe (join-group)
[Diagram: C1, C2 and C3 with the Group Coordinator (broker side); a consumer that resubscribes sends join-group again.]
Consumer Rebalance Protocol
• During the rebalance:
  • Existing consumers re-join the group
  • A single member is chosen as group leader
  • The leader determines the partition assignment (user customizable)
Consumers Re-join Group
[Diagram: C1 owns partitions 1 2 3 and C2 owns 4 5 6. C3 sends join-group to the Group Coordinator (broker side), which notifies C1 and C2 to re-join. C1 calls #onPartitionsRevoked(1,2,3) and C2 calls #onPartitionsRevoked(4,5,6), then each sends join-group. The coordinator holds a sync. barrier until the members have re-joined, bounded by rebalance.timeout.ms (connect) / max.poll.interval.ms (consumer).]
Partition Reassignment (sync-group)
[Diagram: after the re-join / join-group round the coordinator selects C1 as leader. The leader runs #assign(…) and the result is propagated with sync-group:
C1: {1, 2}
C2: {4, 5}
C3: {3, 6}
The consumers then call #onPartitionsAssigned(1,2), #onPartitionsAssigned(4,5) and #onPartitionsAssigned(3,6). C1 now owns 1 2, C2 owns 4 5 and C3 owns 3 6, after the earlier #onPartitionsRevoked(1,2,3) and #onPartitionsRevoked(4,5,6).]
Summary of Rebalance Protocol
• Failure detection and re-join notification (heartbeat and leave-group)
• (New) group generation with current members (join-group)
• Partition assignment and propagation (sync-group)
• ConsumerRebalanceListener
  • #onPartitionsRevoked (before sending join-group)
  • #onPartitionsAssigned (after receiving sync-group)
• ConsumerPartitionAssignor
  • #assign (only triggered by the leader)
  • Built-in: {range, round-robin, sticky}; Custom: {streams, …}
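The eager protocol summarized above can be simulated in a few lines. This is a sketch under assumptions, not Kafka's code: `sticky_assign` and `eager_rebalance` are hypothetical helpers, the assignor is a simplified sticky-style stand-in that assumes the partitions divide evenly among members, and the leader is simply the first member in the list.

```python
def sticky_assign(owned: dict, members: list, partitions: list) -> dict:
    """Leader-side #assign: keep prior owners where possible, then even out the load."""
    quota = len(partitions) // len(members)  # assumes an even split, for brevity
    assignment = {m: [] for m in members}
    spare = []
    for p in partitions:
        owner = next((m for m in members if p in owned.get(m, [])), None)
        if owner is not None and len(assignment[owner]) < quota:
            assignment[owner].append(p)      # stays with its previous owner
        else:
            spare.append(p)                  # must move to an under-loaded member
    for m in members:
        while len(assignment[m]) < quota and spare:
            assignment[m].append(spare.pop(0))
    return assignment


def eager_rebalance(owned: dict, members: list, partitions: list, log: list) -> dict:
    # 1) Every existing member revokes everything before sending join-group.
    for m in members:
        if owned.get(m):
            log.append(f"{m}: onPartitionsRevoked({owned[m]})")
    # 2) One member becomes leader (here: the first) and computes the assignment.
    log.append(f"leader={members[0]}")
    assignment = sticky_assign(owned, members, partitions)
    # 3) The assignment reaches every member via sync-group.
    for m in members:
        log.append(f"{m}: onPartitionsAssigned({assignment[m]})")
    return assignment


log = []
before = {"C1": [1, 2, 3], "C2": [4, 5, 6]}
after = eager_rebalance(before, ["C1", "C2", "C3"], [1, 2, 3, 4, 5, 6], log)
print(after)
print("\n".join(log))
```

With the slide's starting point (C1 owning 1 2 3, C2 owning 4 5 6, C3 joining), this reproduces the assignment C1: {1, 2}, C2: {4, 5}, C3: {3, 6}.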
Known Issue #1: Stop-the-world Rebalance
[Diagram: C3 sends join-group to the Group Coordinator (broker side); C1 and C2 are told to re-join and call #onPartitionsRevoked(all partitions) before sending join-group. After #assign(…) and sync-group, each member calls #onPartitionsAssigned(given partitions). Ownership goes from C1: 1 2 3, C2: 4 5 6 (revoked all) to C1: 1 2, C2: 4 5, C3: 3 6 (re-assigned).]
Eager rebalance: all the partitions are revoked before the rebalance, and most of them are assigned back to the same owner after it.
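The cost of the stop-the-world behaviour can be made concrete with the partition sets from the slide. The snippet below is only set arithmetic over the before and after ownership; the variable names are illustrative.

```python
before = {"C1": {1, 2, 3}, "C2": {4, 5, 6}, "C3": set()}
after = {"C1": {1, 2}, "C2": {4, 5}, "C3": {3, 6}}

# Eager protocol: every owned partition is revoked up front.
revoked = set().union(*before.values())
# Partitions that end up with the same owner they had before.
returned = set().union(*(before[m] & after[m] for m in before))
# Partitions that really had to change owner.
moved = revoked - returned

print(f"revoked={sorted(revoked)} returned={sorted(returned)} moved={sorted(moved)}")
```

Six partitions stop being processed so that two of them can change owner.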
Known Issue #2: Back-and-forth Rebalance
[Diagram: starting from C1: 1 2, C2: 4 5, C3: 3 6, a consumer is bounced. Its leave-group triggers one full rebalance (#onPartitionsRevoked(all partitions), #assign(…), sync-group, #onPartitionsAssigned(given partitions)), and its join-group after restart triggers a second one, ending at C1: 1 2, C2: 4 5, C3: 3 6 again.]
Unnecessary rebalances: the first one moves partitions from C3 to C1/C2, the second one moves them back from C1/C2 to C3.
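The back-and-forth effect of bouncing a consumer can be counted directly. In this sketch `moved` is a hypothetical helper, and the intermediate assignment while C3 is down is an assumption: the slide only says C3's partitions go to C1/C2, not which partition goes where.

```python
def moved(before: dict, after: dict) -> list:
    """Partitions whose owner differs between two assignments."""
    owner_before = {p: m for m, ps in before.items() for p in ps}
    owner_after = {p: m for m, ps in after.items() for p in ps}
    return sorted(p for p in owner_before if owner_before[p] != owner_after.get(p))


steady = {"C1": [1, 2], "C2": [4, 5], "C3": [3, 6]}
# Assumed outcome of the first rebalance (illustrative only).
c3_down = {"C1": [1, 2, 3], "C2": [4, 5, 6]}

first = moved(steady, c3_down)    # triggered by C3's leave-group
second = moved(c3_down, steady)   # triggered by C3's join-group after restart
net = moved(steady, steady)       # what changed end to end
print(first, second, net)
```

Both rebalances migrate the same two partitions, yet the net change across the bounce is empty.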
Let’s Revisit:
When to trigger a rebalance,
Who participates in a rebalance,
What to reassign during a rebalance
Rebalance Protocols
Protocol | When | Who | What
Current Protocol (Eager) | Immediately | Everyone | Everything
Proposed Protocol (Cooperative) | After determining what needs to be reassigned | Only those involved in partition reassignment | Only those partitions viable for reassignment
Rebalance Improvements
• [KIP-415]: incremental rebalance in Connect (2.3+)
• [KIP-345]: static membership in Consumer / Streams (2.3+)
• [KIP-429]: incremental rebalance in Consumer / Streams (2.4+)
Incremental Assignment in the Consumer
[Diagram: the consumer compares its owned-partitions with the new assigned-partitions]
• partitions-to-be-revoked: owned but no longer assigned, passed to #onPartitionsRevoked
• partitions-to-be-added: assigned but not yet owned, passed to #onPartitionsAssigned
• unchanged-partitions: in both sets, left untouched
Afterwards the assigned-partitions become the consumer's owned set.
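The comparison of owned and assigned partitions comes down to three set operations. A minimal sketch, assuming partitions are plain integers; `incremental_diff` is a hypothetical helper and partition 7 is an invented example value.

```python
def incremental_diff(owned: set, assigned: set) -> dict:
    """Split a new assignment into what to revoke, what to add and what to keep."""
    return {
        "to_be_revoked": owned - assigned,   # passed to onPartitionsRevoked
        "to_be_added": assigned - owned,     # passed to onPartitionsAssigned
        "unchanged": owned & assigned,       # keeps being processed throughout
    }


diff = incremental_diff(owned={1, 2, 3}, assigned={1, 2, 7})
print(diff)
```

Only the first two sets cause any work; the unchanged partitions are never interrupted.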
Cooperative Protocol
[Diagram: C1 owns 1 2 3 and C2 owns 4 5 6 when C3 sends join-group to the Group Coordinator (broker side).
First rebalance: C1 and C2 are told to re-join and send join-group while keeping their partitions, reporting what they own (C1: {1, 2, 3}, C2: {4, 5, 6}). After sync-group, #assign(…) yields
C1: {1, 2, 3}
C2: {4, 5, 6}
C3: {3*, 6*} -> {}
Partitions 3 and 6 are meant for C3 but are still owned, so C3 gets nothing yet. C1 calls #onPartitionsRevoked(3) and C2 calls #onPartitionsRevoked(6), leaving C1: 1 2 and C2: 4 5.
Second rebalance: the members send join-group again, now reporting C1: {1, 2}, C2: {4, 5}. After sync-group, #assign(…) yields
C1: {1, 2}
C2: {4, 5}
C3: {3, 6}
C3 calls #onPartitionsAssigned(3,6). Final ownership: C1: 1 2, C2: 4 5, C3: 3 6.]
Cooperative Rebalance
• Trade-off: more rebalances, but way cheaper
• Works better with a “sticky” assignor: fewer partitions to migrate
• Consumers can continue to fetch during a rebalance event (2.5+)
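The two-round flow can be reproduced with a small simulation. It is a sketch under assumptions rather than Kafka's implementation: `sticky_target` and `cooperative_round` are hypothetical helpers, the assignor is a simplified sticky-style stand-in that assumes an even split, and callbacks are logged only when their partition list is non-empty.

```python
def sticky_target(owned, members, partitions):
    """Balanced target that keeps partitions with their current owner where possible."""
    quota = len(partitions) // len(members)   # assumes an even split, for brevity
    target = {m: [] for m in members}
    spare = []
    for p in partitions:
        owner = next((m for m in members if p in owned.get(m, [])), None)
        if owner is not None and len(target[owner]) < quota:
            target[owner].append(p)
        else:
            spare.append(p)
    for m in members:
        while len(target[m]) < quota and spare:
            target[m].append(spare.pop(0))
    return target


def cooperative_round(owned, members, partitions, log):
    """One join-group / sync-group round; returns (new ownership, follow-up needed?)."""
    target = sticky_target(owned, members, partitions)
    still_owned = {p for ps in owned.values() for p in ps}
    granted, followup = {}, False
    for m in members:
        mine = set(owned.get(m, []))
        # A partition still owned by another member cannot be handed out in this round.
        granted[m] = sorted(p for p in target[m] if p in mine or p not in still_owned)
        revoked = sorted(mine - set(granted[m]))
        added = sorted(set(granted[m]) - mine)
        if revoked:
            log.append(f"{m}: onPartitionsRevoked({revoked})")
            followup = True               # a revocation triggers another rebalance
        if added:
            log.append(f"{m}: onPartitionsAssigned({added})")
    return granted, followup


owned = {"C1": [1, 2, 3], "C2": [4, 5, 6]}
members = ["C1", "C2", "C3"]
log, rounds = [], 0
while True:
    owned, followup = cooperative_round(owned, members, [1, 2, 3, 4, 5, 6], log)
    rounds += 1
    if not followup:
        break

print(rounds, owned)
print("\n".join(log))
```

Partitions 1, 2, 4 and 5 never appear in the log: their owners keep them across both rounds, which is why the extra rebalance is cheap.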
Benchmark Results for Streams
• 10 streams instances rolling bounce, measuring process rate
• …and pause time: 3522 ms vs. 37138 ms
Benchmark Results for Connect (KIP-415)
• 900 tasks across workers, measuring time spent in rebalancing
Augmented Assignor Interface
ConsumerPartitionAssignor
• #assign (subscription now includes “owned-partitions”)
• #supportedProtocols (eager and/or cooperative)
Built-in: {range, round-robin, sticky : eager}; {sticky-cooperative : cooperative}
Custom: {streams, … : eager and cooperative}
Augmented Listener Interface
ConsumerRebalanceListener
• #onPartitionsRevoked (will not be triggered if there is nothing to revoke)
• #onPartitionsAssigned (triggered at the completion of every rebalance, even if no new partitions were added)
• #onPartitionsLost (triggered instead of onPartitionsRevoked when a member falls out of the group)
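The three callback rules can be illustrated with a recording listener. This is a Python stand-in, not the Java interface: `RecordingListener`, `complete_rebalance` and `fall_out_of_group` are hypothetical names, and the dispatch logic only mirrors the three bullets above.

```python
class RecordingListener:
    """Hypothetical Python stand-in for a ConsumerRebalanceListener."""

    def __init__(self):
        self.events = []

    def on_partitions_revoked(self, partitions):
        self.events.append(("revoked", sorted(partitions)))

    def on_partitions_assigned(self, partitions):
        self.events.append(("assigned", sorted(partitions)))

    def on_partitions_lost(self, partitions):
        self.events.append(("lost", sorted(partitions)))


def complete_rebalance(listener, owned: set, assigned: set) -> set:
    revoked = owned - assigned
    if revoked:                                   # skipped when nothing is revoked
        listener.on_partitions_revoked(revoked)
    # Fired at the end of every rebalance, even when nothing new was added.
    listener.on_partitions_assigned(assigned - owned)
    return assigned


def fall_out_of_group(listener, owned: set) -> set:
    # The member can no longer hand its partitions over cleanly,
    # so "lost" replaces "revoked".
    listener.on_partitions_lost(owned)
    return set()


listener = RecordingListener()
owned = complete_rebalance(listener, owned={1, 2, 3}, assigned={1, 2})  # revoke 3
owned = complete_rebalance(listener, owned=owned, assigned={1, 2})      # no change
owned = fall_out_of_group(listener, owned)
print(listener.events)
```

A listener written for the eager protocol may assume that every rebalance starts with a revocation of everything; the second call shows that this no longer holds.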
Switch to Cooperative Rebalancing
In Consumer
• first rolling bounce: add “sticky-cooperative” / “my-cooperative” to [partition.assignment.strategy]
• second rolling bounce: remove old assignor (e.g., “range”) from the config
In Streams
• first rolling bounce: set [upgrade.from = old version (“2.3”)]
• second rolling bounce: remove [upgrade.from] config
In Connect
• one rolling bounce: set [connect.protocol = “compatible”]
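The rolling-bounce steps can be written down as the configuration each phase would carry. The dictionaries below are a sketch, not a deployment recipe: the `strategy` helper is hypothetical, the assignor class names are the ones shipped in the Java client, and the order of the two assignors during the first bounce is an assumption.

```python
RANGE = "org.apache.kafka.clients.consumer.RangeAssignor"
COOPERATIVE_STICKY = "org.apache.kafka.clients.consumer.CooperativeStickyAssignor"


def strategy(*assignors: str) -> dict:
    """Build a consumer config fragment listing assignors in preference order."""
    return {"partition.assignment.strategy": ",".join(assignors)}


# Consumer: two rolling bounces.
consumer_before = strategy(RANGE)
consumer_first_bounce = strategy(COOPERATIVE_STICKY, RANGE)   # both assignors listed
consumer_second_bounce = strategy(COOPERATIVE_STICKY)         # old assignor removed

# Streams: set upgrade.from for the first bounce, drop it for the second.
streams_first_bounce = {"upgrade.from": "2.3"}
streams_second_bounce = {}

# Connect: a single rolling bounce.
connect_bounce = {"connect.protocol": "compatible"}

for name, cfg in [("first", consumer_first_bounce), ("second", consumer_second_bounce)]:
    print(name, cfg["partition.assignment.strategy"].split(","))
```

Two bounces are needed in the consumer because, until every member lists the cooperative assignor, the group has to settle on an assignor that all members still support.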
Take-aways
• We have extended the rebalance protocol to enable smarter assignment (when, who, and what)
• No more stop-the-world rebalances with the incremental cooperative protocol!
THANKS!
Guozhang Wang | guozhang@confluent.io | @guozhangwang
