https://zhuanlan.zhihu.com/p/362788755
Overview
There are multiple ways to estimate the ACE (Average Causal Effect) or the ITE (Individual Treatment Effect). Here we focus on ITE estimation via uplift modeling: for each sample (e.g., a user), we separately estimate the outcome with and without the intervention, obtaining the uplift of treatment (intervention) relative to control (no intervention). Uplift modeling methods can be categorized in several ways: by whether they model the uplift directly or indirectly, or by implementation, e.g., Meta-Learning versus Tree-Based approaches.
Drawing on the references, and for ease of understanding, this article groups the methods into four families: The Class Transformation Method, Meta-Learning Methods, Tree-Based Methods, and NN-Based Methods.
The Class Transformation Method
According to [1], this method properly belongs to the Meta-Learning family; it is listed separately here to avoid confusion. It applies when both the treatment and the outcome are binary: by transforming the prediction target into a single class label, a single model suffices.
First, consider a randomized experiment and define a new target variable
$$Z_{i}=Y_{i}^{obs} W_{i}+\left(1-Y_{i}^{obs}\right)\left(1-W_{i}\right)$$
Then Z = 1 exactly when Y = 1, W = 1 or when Y = 0, W = 0, i.e., when the observed outcome "matches" the treatment assignment. The uplift then satisfies
$$\tau\left(X_{i}\right)=2 P\left(Z_{i}=1 \mid X_{i}\right)-1$$
Note: derivation.
$$P(Z_i=1 \mid X_i)=P(Y_i=1, W_i=1 \mid X_i)+P(Y_i=0, W_i=0 \mid X_i)$$

Assuming a balanced randomized assignment, $P(W_i=1 \mid X_i)=P(W_i=0 \mid X_i)=\frac{1}{2}$, so:

$$
\begin{aligned}
\tau(X_i) &= P(Y_i=1 \mid X_i, W_i=1)-P(Y_i=1 \mid X_i, W_i=0) \\
&= \frac{P(Y_i=1, W_i=1 \mid X_i)}{P(W_i=1 \mid X_i)}-\frac{P(Y_i=1, W_i=0 \mid X_i)}{P(W_i=0 \mid X_i)} \\
&= 2\left[P(Y_i=1, W_i=1 \mid X_i)-P(Y_i=1, W_i=0 \mid X_i)\right] \\
&= \left[P(Y_i=1, W_i=1 \mid X_i)-P(Y_i=1, W_i=0 \mid X_i)\right] \\
&\quad +\left[\tfrac{1}{2}-P(Y_i=0, W_i=1 \mid X_i)-\tfrac{1}{2}+P(Y_i=0, W_i=0 \mid X_i)\right] \\
&= P(Z_i=1 \mid X_i)-P(Z_i=0 \mid X_i) \\
&= 2P(Z_i=1 \mid X_i)-1
\end{aligned}
$$

(The fourth equality rewrites one copy of each term using $P(Y_i=1, W_i=w \mid X_i)=\tfrac{1}{2}-P(Y_i=0, W_i=w \mid X_i)$, which follows from $P(W_i=w \mid X_i)=\tfrac{1}{2}$.)
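The derivation above can be sketched end to end: build the transformed label Z, fit any binary classifier on it, and read off the uplift as 2P(Z=1|X) − 1. This is a minimal sketch on hypothetical simulated data with a 50/50 randomized assignment; `LogisticRegression` is just one choice of base model, and the data-generating process is invented for illustration.

```python
# Class Transformation Method sketch (hypothetical simulated data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=(n, 2))
W = rng.integers(0, 2, size=n)            # randomized treatment, P(W=1) = 1/2

# Invented outcome model: treatment lifts the response when X[:, 0] > 0
base = 1.0 / (1.0 + np.exp(-X[:, 1]))
lift = 0.2 * (X[:, 0] > 0)
p_y = base * (1 - W) + np.clip(base + lift, 0, 1) * W
Y = (rng.random(n) < p_y).astype(int)

# Transformed target: Z = 1 iff (Y=1, W=1) or (Y=0, W=0)
Z = Y * W + (1 - Y) * (1 - W)

# Single model on Z; uplift estimate is tau(X) = 2 P(Z=1|X) - 1
clf = LogisticRegression().fit(X, Z)
uplift = 2 * clf.predict_proba(X)[:, 1] - 1

print(uplift[X[:, 0] > 0].mean(), uplift[X[:, 0] <= 0].mean())
```

Since `predict_proba` lies in (0, 1), the estimated uplift is automatically bounded in (−1, 1); the group whose true lift is positive should score higher on average.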
Finally, consider the general case of non-balanced, non-randomized data. The following transformed outcome can be estimated with any machine learning algorithm (as a CATE estimate):

$$Y_{i}^{*}=Y_{i}(1) \frac{W_{i}}{\hat{p}\left(X_{i}\right)}-Y_{i}(0) \frac{1-W_{i}}{1-\hat{p}\left(X_{i}\right)}$$

where $\hat{p}(X_i)$ is the estimated propensity score $P(W_i=1 \mid X_i)$, and $E[Y_i^{*} \mid X_i]=\tau(X_i)$.