A study on migration scheduling in distributed stream processing engines
M Lindeberg, T Plagemann - … of the 23rd International Conference on …, 2022 - dl.acm.org
M Lindeberg, T Plagemann
Proceedings of the 23rd International Conference on Distributed Computing …, 2022•dl.acm.orgThe cost of migrating stateful operators in distributed stream processing has attracted
research attention. Reactive migration in response to context changes is the common
approach. Other migration scheduling strategies, like proactive migration based on
prediction, and delayed migration are nearly neglected. This paper investigates four
algorithms that explore these alternative scheduling strategies. The algorithms are
implemented in a prototype stream processing overlay and run over an emulated network …
research attention. Reactive migration in response to context changes is the common
approach. Other migration scheduling strategies, like proactive migration based on
prediction, and delayed migration are nearly neglected. This paper investigates four
algorithms that explore these alternative scheduling strategies. The algorithms are
implemented in a prototype stream processing overlay and run over an emulated network …
The cost of migrating stateful operators in distributed stream processing has attracted research attention. Reactive migration in response to context changes is the common approach. Other migration scheduling strategies, like proactive migration based on prediction, and delayed migration are nearly neglected.
This paper investigates four algorithms that explore these alternative scheduling strategies. The algorithms are implemented in a prototype stream processing overlay and run over an emulated network. Experiments with synthetic workload reveal that (1) proactive migration can reduce average event delivery latency, and (2) that it is important to handle noise in the data to avoid wrong adaptations. Experiments with real workload demonstrate that (1) pro-activity is not always beneficial, and (2) careful timing of migration depending on operator state, has a large potential to limit overhead. The experiments demonstrate a reduction in state size of 38 %, resulting in a 30 % reduction in freeze time. Consideration of operator state size is especially important. The state transfer can lead to contention that further harms event delivery and/or causes network timeouts for cases with limited network resources.
Showing the best result for this search. See all results