Contents
GlobalPartitioner: DataStream => DataStream
ShufflePartitioner: DataStream => DataStream
RebalancePartitioner: DataStream => DataStream
RescalePartitioner: DataStream => DataStream
BroadcastPartitioner: DataStream => DataStream
What are Flink's eight partitioning strategies?
GlobalPartitioner: DataStream => DataStream
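GlobalPartitioner implements the GLOBAL partitioning strategy: every record is sent to the first instance (subtask 0) of the downstream operator, regardless of how many instances that operator has.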
Source code walkthrough:
/**
 * Sends all records to the first task (ID = 0) of the downstream operator.
 *
 * @param <T> type of the elements in the stream
 */
@Internal
public class GlobalPartitioner<T> extends StreamPartitioner<T> {
    private static final long serialVersionUID = 1L;

    @Override
    public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
        // Always return 0, so every record goes to the first downstream task
        return 0;
    }

    @Override
    public StreamPartitioner<T> copy() {
        // The partitioner holds no state, so the same instance can be shared
        return this;
    }

    @Override
    public String toString() {
        return "GLOBAL";
    }
}
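To make the effect of `return 0` concrete, here is a minimal sketch that routes a few records without any Flink dependency. The class `GlobalRoutingSketch` and its `route` helper are hypothetical names made up for illustration; only the "always pick channel 0" rule is taken from the source above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; this is not a Flink class.
class GlobalRoutingSketch {

    // Same rule as GlobalPartitioner.selectChannel: the record is ignored
    // and channel 0 is always chosen.
    static int selectChannel(String record) {
        return 0;
    }

    // Distributes the records over numberOfChannels buckets, one bucket
    // per simulated downstream subtask.
    static List<List<String>> route(List<String> records, int numberOfChannels) {
        List<List<String>> channels = new ArrayList<>();
        for (int i = 0; i < numberOfChannels; i++) {
            channels.add(new ArrayList<>());
        }
        for (String record : records) {
            channels.get(selectChannel(record)).add(record);
        }
        return channels;
    }

    public static void main(String[] args) {
        List<List<String>> channels = route(Arrays.asList("a", "b", "c", "d"), 3);
        for (int i = 0; i < channels.size(); i++) {
            System.out.println("subtask " + i + " -> " + channels.get(i));
        }
    }
}
```

With three simulated subtasks, all four records end up in bucket 0 and the other two buckets stay empty. In a real job this strategy is selected with `DataStream#global()`.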
ShufflePartitioner: DataStream => DataStream
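ShufflePartitioner implements the SHUFFLE partitioning strategy: each record is sent to a randomly chosen instance of the downstream operator, so over many records every instance receives a share of the data.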
/**
 * Picks a random channel for each record.
 *
 * @param <T> type of the elements in the stream
 */
@Internal
public class ShufflePartitioner<T> extends StreamPartitioner<T> {
    private static final long serialVersionUID = 1L;

    private Random random = new Random();

    @Override
    public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
        // Draw a pseudo-random number in [0, numberOfChannels) and send the
        // record to that downstream task; numberOfChannels is inherited from
        // StreamPartitioner
        return random.nextInt(numberOfChannels);
    }

    @Override
    public StreamPartitioner<T> copy() {
        // Each copy gets its own Random instance
        return new ShufflePartitioner<T>();
    }

    @Override
    public String toString() {
        return "SHUFFLE";
    }
}
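The random selection can be tried out in isolation with a small sketch that counts how many records each channel receives. `ShuffleRoutingSketch`, `countPerChannel`, and the fixed seed are assumptions added for illustration (the Flink class shown above uses an unseeded `Random`); only the `random.nextInt(numberOfChannels)` rule comes from the source.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical stand-in for illustration only; this is not a Flink class.
class ShuffleRoutingSketch {
    private final Random random;
    private final int numberOfChannels;

    // The seed is an assumption that makes runs repeatable.
    ShuffleRoutingSketch(int numberOfChannels, long seed) {
        this.numberOfChannels = numberOfChannels;
        this.random = new Random(seed);
    }

    // Same rule as ShufflePartitioner.selectChannel: a pseudo-random
    // index in [0, numberOfChannels).
    int selectChannel() {
        return random.nextInt(numberOfChannels);
    }

    // Simulates routing the given number of records and returns how many
    // landed on each channel.
    int[] countPerChannel(int records) {
        int[] counts = new int[numberOfChannels];
        for (int i = 0; i < records; i++) {
            counts[selectChannel()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = new ShuffleRoutingSketch(4, 42L).countPerChannel(10_000);
        System.out.println(Arrays.toString(counts));
    }
}
```

Because each record is placed independently, the per-channel counts come out roughly equal but not identical, which is the main difference from a round-robin scheme. In a real job this strategy is selected with `DataStream#shuffle()`.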