Flink Source Code Study|Serialization Handling of TimestampAssigner and WatermarkGenerator


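During Flink's distributed execution, both WatermarkGenerator and TimestampAssigner may need to be shipped between nodes. However, neither interface extends Serializable directly. Instead, each is paired with a serializable supplier interface. With this supplier pattern, API methods do not have to require that the generator or assigner itself be serializable: only the supplier is serialized and shipped to the workers, where it instantiates the actual object.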
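
The idea can be tried out without Flink on the classpath. The following is a minimal sketch, not Flink code: `Generator`, `GeneratorSupplier`, `LaggingGenerator`, and the helper methods are hypothetical stand-ins invented for illustration, and plain Java serialization plays the role of shipping an object to a worker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SupplierPatternSketch {

    /** Hypothetical stand-in for WatermarkGenerator: deliberately NOT Serializable. */
    interface Generator {
        long currentWatermark(long maxTimestampSeen);
    }

    /** Hypothetical stand-in for WatermarkGeneratorSupplier: a Serializable factory. */
    @FunctionalInterface
    interface GeneratorSupplier extends Serializable {
        Generator create();
    }

    /** A generator that trails the highest timestamp seen by a fixed lag. */
    static class LaggingGenerator implements Generator {
        private final long lagMillis;

        LaggingGenerator(long lagMillis) {
            this.lagMillis = lagMillis;
        }

        @Override
        public long currentWatermark(long maxTimestampSeen) {
            return maxTimestampSeen - lagMillis;
        }
    }

    /** Builds a supplier; the lambda is serializable because its target type is. */
    static GeneratorSupplier laggingSupplier(long lagMillis) {
        return () -> new LaggingGenerator(lagMillis);
    }

    /** Returns true if Java serialization accepts the object. */
    static boolean isShippable(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the supplier, reads it back (the "worker" side), then builds the generator. */
    static Generator shipAndCreate(GeneratorSupplier supplier) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(supplier);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return ((GeneratorSupplier) in.readObject()).create();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The generator itself cannot be serialized, but its supplier can.
        System.out.println(isShippable(new LaggingGenerator(500L)));
        System.out.println(isShippable(laggingSupplier(500L)));
        // The generator is only instantiated after the supplier has been "shipped".
        System.out.println(shipAndCreate(laggingSupplier(500L)).currentWatermark(2_000L));
    }
}
```

Only the small factory crosses the serialization boundary, so the generator is free to hold state that is not serializable.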

WatermarkGeneratorSupplier

First, let's look at WatermarkGeneratorSupplier, the supplier interface for WatermarkGenerator.

Source (GitHub): flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarkGeneratorSupplier.java

package org.apache.flink.api.common.eventtime;

import org.apache.flink.annotation.PublicEvolving;
import org.apache.flink.metrics.MetricGroup;

import java.io.Serializable;

@PublicEvolving
@FunctionalInterface
public interface WatermarkGeneratorSupplier<T> extends Serializable {

    /** Instantiates a {@link WatermarkGenerator}. */
    WatermarkGenerator<T> createWatermarkGenerator(Context context);

    /**
     * Additional information available to {@link #createWatermarkGenerator(Context)}. This can be
     * access to {@link org.apache.flink.metrics.MetricGroup MetricGroups}, for example.
     */
    interface Context {

        /**
         * Returns the metric group for the context in which the created {@link WatermarkGenerator}
         * is used.
         *
         * <p>Instances of this class can be used to register new metrics with Flink and to create a
         * nested hierarchy based on the group names. See {@link MetricGroup} for more information
         * for the metrics system.
         *
         * @see MetricGroup
         */
        MetricGroup getMetricGroup();
    }
}
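  • The type parameter T is the element type of the input stream handled by the WatermarkGenerator that is created
  • The interface is serializable because it extends Serializable
  • It declares the method createWatermarkGenerator, which builds a WatermarkGenerator from a Context that exposes a MetricGroup, so the generator can register its own metrics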
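
To illustrate how the Context is meant to be used, here is a minimal sketch that runs without Flink. All types in it (`Metrics`, `Context`, `Generator`, `GeneratorSupplier`, `CountingGenerator`) are simplified, hypothetical stand-ins; in particular `Metrics` only imitates the counter-registration aspect of Flink's MetricGroup, and `Generator` reduces WatermarkGenerator to a single per-event callback.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class ContextSketch {

    /** Hypothetical stand-in for MetricGroup: supports named counters only. */
    static class Metrics {
        private final Map<String, AtomicLong> counters = new HashMap<>();

        AtomicLong counter(String name) {
            return counters.computeIfAbsent(name, key -> new AtomicLong());
        }
    }

    /** Hypothetical stand-in for WatermarkGeneratorSupplier.Context. */
    interface Context {
        Metrics getMetricGroup();
    }

    /** Hypothetical stand-in for WatermarkGenerator, reduced to one callback. */
    interface Generator {
        long onEvent(long eventTimestamp);
    }

    /** Hypothetical stand-in for WatermarkGeneratorSupplier. */
    @FunctionalInterface
    interface GeneratorSupplier extends Serializable {
        Generator createWatermarkGenerator(Context context);
    }

    /** Tracks the highest timestamp seen and counts events in a metric taken from the Context. */
    static class CountingGenerator implements Generator {
        private final AtomicLong eventsSeen;
        private long maxTimestamp = Long.MIN_VALUE;

        CountingGenerator(Metrics metrics) {
            this.eventsSeen = metrics.counter("eventsSeen");
        }

        @Override
        public long onEvent(long eventTimestamp) {
            eventsSeen.incrementAndGet();
            maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
            return maxTimestamp;
        }
    }

    /** Returns {highest timestamp seen, value of the "eventsSeen" counter}. */
    static long[] run(long... eventTimestamps) {
        // The supplier is what would be shipped; it holds no generator state.
        GeneratorSupplier supplier = ctx -> new CountingGenerator(ctx.getMetricGroup());

        // On the "worker", the runtime provides the Context when it creates the generator.
        Metrics workerMetrics = new Metrics();
        Generator generator = supplier.createWatermarkGenerator(() -> workerMetrics);

        long highest = Long.MIN_VALUE;
        for (long ts : eventTimestamps) {
            highest = generator.onEvent(ts);
        }
        return new long[] {highest, workerMetrics.counter("eventsSeen").get()};
    }

    public static void main(String[] args) {
        long[] result = run(5L, 9L, 7L);
        System.out.println("highest=" + result[0] + ", eventsSeen=" + result[1]);
    }
}
```

In a real job the equivalent lambda would typically be passed to `WatermarkStrategy.forGenerator(...)`, and the runtime, not user code, would supply the Context.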

TimestampAssignerSupplier

Next, let's look at TimestampAssignerSupplier, the supplier interface for TimestampAssigner.

Source (GitHub): flink-core/src/main/java/org/apache/flink/api/common/eventtime/TimestampAssignerSupplier.java

package org.apache.flink.api.common.eventtime;

import org.apache.flink.annotation.PublicEvolving;
import org.apache.flink.api.java.ClosureCleaner;
import org.apache.flink.metrics.MetricGroup;

import java.io.Serializable;

/**
 * A supplier for {@link TimestampAssigner TimestampAssigners}. The supplier pattern is used to
 * avoid having to make {@link TimestampAssigner} {@link Serializable} for use in API methods.
 *
 * <p>This interface is {@link Serializable} because the supplier may be shipped to workers during
 * distributed execution.
 */
@PublicEvolving
@FunctionalInterface
public interface TimestampAssignerSupplier<T> extends Serializable {

    /** Instantiates a {@link TimestampAssigner}. */
    TimestampAssigner<T> createTimestampAssigner(Context context);

    static <T> TimestampAssignerSupplier<T> of(SerializableTimestampAssigner<T> assigner) {
        return new SupplierFromSerializableTimestampAssigner<>(assigner);
    }

    /**
     * Additional information available to {@link #createTimestampAssigner(Context)}. This can be
     * access to {@link org.apache.flink.metrics.MetricGroup MetricGroups}, for example.
     */
    interface Context {

        /**
         * Returns the metric group for the context in which the created {@link TimestampAssigner}
         * is used.
         *
         * <p>Instances of this class can be used to register new metrics with Flink and to create a
         * nested hierarchy based on the group names. See {@link MetricGroup} for more information
         * for the metrics system.
         *
         * @see MetricGroup
         */
        MetricGroup getMetricGroup();
    }

    /**
     * We need an actual class. Implementing this as a lambda in {@link
     * #of(SerializableTimestampAssigner)} would not allow the {@link ClosureCleaner} to "reach"
     * into the {@link SerializableTimestampAssigner}.
     */
    class SupplierFromSerializableTimestampAssigner<T> implements TimestampAssignerSupplier<T> {

        private static final long serialVersionUID = 1L;

        private final SerializableTimestampAssigner<T> assigner;

        public SupplierFromSerializableTimestampAssigner(
                SerializableTimestampAssigner<T> assigner) {
            this.assigner = assigner;
        }

        @Override
        public TimestampAssigner<T> createTimestampAssigner(Context context) {
            return assigner;
        }
    }
}
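  • The type parameter T is the element type of the input stream handled by the TimestampAssigner that is created
  • createTimestampAssigner and Context work the same way as their counterparts in WatermarkGeneratorSupplier
  • For assigners that are already serializable, the interface provides the static factory method of: it takes a SerializableTimestampAssigner and returns a SupplierFromSerializableTimestampAssigner, an implementation class of TimestampAssignerSupplier whose createTimestampAssigner method simply returns the assigner instance that was passed to of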
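
The behavior of the of method can be reproduced in a small, Flink-free sketch. The types below (`Assigner`, `SerializableAssigner`, `AssignerSupplier`, `FromSerializableAssigner`, `Event`) are hypothetical stand-ins that mirror the shape of the Flink interfaces; the Context parameter is omitted for brevity, and nothing here models the ClosureCleaner that the Javadoc gives as the reason for using a named class instead of a lambda.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class OfWrapperSketch {

    /** Hypothetical stand-in for TimestampAssigner: not Serializable. */
    interface Assigner<T> {
        long extractTimestamp(T element, long recordTimestamp);
    }

    /** Hypothetical stand-in for SerializableTimestampAssigner. */
    @FunctionalInterface
    interface SerializableAssigner<T> extends Assigner<T>, Serializable {}

    /** Hypothetical stand-in for TimestampAssignerSupplier (Context omitted). */
    @FunctionalInterface
    interface AssignerSupplier<T> extends Serializable {
        Assigner<T> createTimestampAssigner();

        static <T> AssignerSupplier<T> of(SerializableAssigner<T> assigner) {
            return new FromSerializableAssigner<>(assigner);
        }
    }

    /** Named wrapper class: holds the serializable assigner and hands it back on request. */
    static class FromSerializableAssigner<T> implements AssignerSupplier<T> {
        private static final long serialVersionUID = 1L;

        private final SerializableAssigner<T> assigner;

        FromSerializableAssigner(SerializableAssigner<T> assigner) {
            this.assigner = assigner;
        }

        @Override
        public Assigner<T> createTimestampAssigner() {
            return assigner;
        }
    }

    /** Example element type; it is never serialized here, so it need not be Serializable. */
    static class Event {
        final String name;
        final long timestamp;

        Event(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }
    }

    /** Wraps a lambda assigner via of(...), then serializes and deserializes the supplier. */
    static AssignerSupplier<Event> shippedSupplier() {
        AssignerSupplier<Event> supplier =
                AssignerSupplier.<Event>of((event, recordTimestamp) -> event.timestamp);
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(supplier);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                AssignerSupplier<Event> copy = (AssignerSupplier<Event>) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Assigner<Event> assigner = shippedSupplier().createTimestampAssigner();
        System.out.println(assigner.extractTimestamp(new Event("click", 1_000L), -1L));
    }
}
```

Because the wrapper stores the assigner in a field, the assigner is serialized together with the supplier, and every call to createTimestampAssigner on the same supplier instance returns the same assigner object.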