Second-order Attention Network for Single Image Super-Resolution
- Performance: (the original note shows the paper's benchmark figures here; not reproduced)
- SSRG (share-source residual group) is the backbone proposed in the paper; the figure says it all. Each group computes $F_g = W_{SSC}(F_0) + H_g(F_{g-1})$, where $W_{SSC}$ is a shared weight (the share-source skip connection). See the skeleton sketch after the LSRAG item below.
- RL-NL (region-level non-local) divides the feature map into $k \times k$ grids and applies a non-local operation to each grid independently (sketched below). For background on non-local operations, see https://zhuanlan.zhihu.com/p/33345791
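A minimal PyTorch sketch (not the authors' released code) of how RL-NL could be assembled from one common non-local form, the embedded-Gaussian block of Wang et al.; the class names `NonLocalBlock` / `RegionNonLocal` and the `reduction` argument are my own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block (Wang et al., 2018)."""
    def __init__(self, channels, reduction=2):
        super().__init__()
        inter = channels // reduction
        self.theta = nn.Conv2d(channels, inter, 1)
        self.phi = nn.Conv2d(channels, inter, 1)
        self.g = nn.Conv2d(channels, inter, 1)
        self.out = nn.Conv2d(inter, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (b, hw, c')
        k = self.phi(x).flatten(2)                     # (b, c', hw)
        v = self.g(x).flatten(2).transpose(1, 2)       # (b, hw, c')
        attn = F.softmax(q @ k, dim=-1)                # pairwise position similarities
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # residual connection

class RegionNonLocal(nn.Module):
    """Apply a shared non-local block to each cell of a k x k grid."""
    def __init__(self, channels, k=2):
        super().__init__()
        self.k = k
        self.nl = NonLocalBlock(channels)

    def forward(self, x):
        b, c, h, w = x.shape
        gh, gw = h // self.k, w // self.k   # assumes h, w divisible by k
        out = x.clone()
        for i in range(self.k):
            for j in range(self.k):
                region = x[:, :, i*gh:(i+1)*gh, j*gw:(j+1)*gw]
                out[:, :, i*gh:(i+1)*gh, j*gw:(j+1)*gw] = self.nl(region)
        return out
```

Restricting attention to regions keeps the $hw \times hw$ similarity matrix small, which is why per-grid non-local is affordable on the large feature maps used in super-resolution.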
- LSRAG (local-source residual attention group) is just the internal structure of each block in SSRG; the figure says it all.
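A minimal sketch of how the SSRG/LSRAG skeleton could look under this reading of the figure; all class and argument names are my own, and the attention module is stubbed with `nn.Identity()` (the SOCA sketch further below would take its place).

```python
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c, c, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class LSRAG(nn.Module):
    """m residual blocks plus channel attention at the tail, with a local skip."""
    def __init__(self, c, m=10, attention=None):
        super().__init__()
        self.body = nn.Sequential(*[ResBlock(c) for _ in range(m)])
        self.attn = attention or nn.Identity()  # stand-in for SOCA

    def forward(self, x):
        return x + self.attn(self.body(x))

class SSRG(nn.Module):
    """G LSRAGs with the share-source skip: F_g = W_SSC(F_0) + H_g(F_{g-1})."""
    def __init__(self, c, G=20, m=10):
        super().__init__()
        self.groups = nn.ModuleList([LSRAG(c, m) for _ in range(G)])
        self.w_ssc = nn.Conv2d(c, c, 3, padding=1)  # shared weight W_SSC

    def forward(self, f0):
        f = f0
        for g in self.groups:
            f = self.w_ssc(f0) + g(f)   # the same W_SSC is reused by every group
        return f
```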
- SOCA (second-order channel attention), on the other hand, is more involved (see the sketch after these sub-items):
    - First, reshape the feature map $F = [f_1, \cdots, f_C]$ coming from the previous layer into shape $(C, s)$, where $s = H \times W$; call the reshaped matrix $X$.
    - Then normalize $X$: $\Sigma = X \bar{I} X^T$, where (per the paper) $\bar{I} = \frac{1}{s}(I - \frac{1}{s}\mathbf{1})$ with $\mathbf{1}$ the $s \times s$ all-ones matrix, so $\Sigma$ is the sample covariance matrix. $\Sigma$ can be eigendecomposed as $\Sigma = U \Lambda U^T$ with $\Lambda$ diagonal, and then $\hat{Y} = \Sigma^{\alpha} = U \Lambda^{\alpha} U^T$. Taking $\alpha < 1$ pulls the entries of $\Lambda$ toward 1; the paper sets $\alpha = 0.5$. Note that $\hat{Y}$ has shape $(C, C)$.
    - $\hat{Y} = [y_1, \cdots, y_C]$, $z_c = H_{GCP}(y_c) = \frac{1}{C}\sum\limits_{i=1}^{C} y_c(i)$, $w = f(W_U\,\delta(W_D z))$, where $z_c$ is the scalar obtained by global average pooling over channel $c$'s covariance row $y_c$ (shape $(1, C)$), $z$ stacks the $z_c$ into a $(C, 1)$ vector, $\delta$ is ReLU, and $f$ is sigmoid. The resulting $w$ gives the channel-wise attention weights: $\hat{f_c} = w_c \cdot f_c$.
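A minimal sketch of SOCA following the equations above, assuming a straightforward eigendecomposition for the matrix power (the paper instead accelerates this step with a Newton-Schulz iteration); variable names are my own.

```python
import torch
import torch.nn as nn

class SOCA(nn.Module):
    def __init__(self, channels, reduction=16, alpha=0.5):
        super().__init__()
        self.alpha = alpha
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),  # W_D
            nn.ReLU(inplace=True),                          # delta
            nn.Conv2d(channels // reduction, channels, 1),  # W_U
            nn.Sigmoid())                                   # f

    def forward(self, feat):
        b, c, h, w = feat.shape
        s = h * w
        x = feat.reshape(b, c, s)              # X, one (C, s) matrix per sample
        # Sigma = X I_bar X^T, computed via row-centering, which is equivalent
        # to multiplying by I_bar = (1/s)(I - (1/s)1) without building it
        x_c = x - x.mean(dim=2, keepdim=True)
        sigma = x_c @ x_c.transpose(1, 2) / s  # covariance, shape (b, C, C)
        # Y_hat = Sigma^alpha = U Lambda^alpha U^T via eigendecomposition
        lam, u = torch.linalg.eigh(sigma)
        lam = lam.clamp(min=1e-8) ** self.alpha
        y_hat = u @ torch.diag_embed(lam) @ u.transpose(1, 2)
        # global covariance pooling: z_c = (1/C) sum_i y_c(i), stacked to (b, C, 1, 1)
        z = y_hat.mean(dim=2).reshape(b, c, 1, 1)
        w_attn = self.fc(z)                    # channel-wise weights in (0, 1)
        return feat * w_attn                   # f_c_hat = w_c * f_c
```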
- Some other detail parameters are as follows:
We set LSRAG number as G = 20 in the SSRG structure, and embed RL-NL modules (k = 2) at the head and tail of SSRG. In each LSRAG, we use m = 10 residual blocks plus single SOCA module at the tail. In SOCA module, we use 1 × 1 convolution filter with reduction ratio r = 16. For other convolution filter outside SOCA, the size and number of filter are set as 3 × 3 and C = 64, respectively. For upscale part H↑(·), we follow the works in [20, 39] and apply ESPCNN [28] to upscale the deep features, followed by one final convolution layer with three filters to produce color images (RGB channels).
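For the upscale part, ESPCNN's sub-pixel convolution corresponds to `nn.PixelShuffle` in PyTorch; a minimal sketch of the tail described above (the function name `make_tail` is my own):

```python
import torch.nn as nn

def make_tail(channels=64, scale=2):
    """ESPCNN-style upscaling followed by a final 3-filter conv for RGB."""
    return nn.Sequential(
        nn.Conv2d(channels, channels * scale ** 2, 3, padding=1),
        nn.PixelShuffle(scale),                 # (C*r^2, H, W) -> (C, rH, rW)
        nn.Conv2d(channels, 3, 3, padding=1))   # three filters -> RGB channels
```

For example, `make_tail(channels=64, scale=2)` maps a `(N, 64, H, W)` feature map to a `(N, 3, 2H, 2W)` color image.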