Second-order Attention Network for Single Image Super-Resolution
- Performance: (the original note shows the paper's benchmark figures here; not reproduced)
- SSRG (share-source residual group) is the backbone proposed in the paper; the figure says it all. Each group computes $F_g = W_{SSC}(F_0) + H_g(F_{g-1})$, where $W_{SSC}$ is a shared weight (the share-source skip connection). See the skeleton sketch after the LSRAG item below.
- RL-NL (region-level non-local) divides the feature map into $k \times k$ grids and applies a non-local operation to each grid independently (sketched below). For background on non-local operations, see https://zhuanlan.zhihu.com/p/33345791
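A minimal PyTorch sketch (not the authors' released code) of how RL-NL could be assembled from one common non-local form, the embedded-Gaussian block of Wang et al.; the class names `NonLocalBlock` / `RegionNonLocal` and the `reduction` argument are my own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block (Wang et al., 2018)."""
    def __init__(self, channels, reduction=2):
        super().__init__()
        inter = channels // reduction
        self.theta = nn.Conv2d(channels, inter, 1)
        self.phi = nn.Conv2d(channels, inter, 1)
        self.g = nn.Conv2d(channels, inter, 1)
        self.out = nn.Conv2d(inter, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (b, hw, c')
        k = self.phi(x).flatten(2)                     # (b, c', hw)
        v = self.g(x).flatten(2).transpose(1, 2)       # (b, hw, c')
        attn = F.softmax(q @ k, dim=-1)                # pairwise position similarities
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # residual connection

class RegionNonLocal(nn.Module):
    """Apply a shared non-local block to each cell of a k x k grid."""
    def __init__(self, channels, k=2):
        super().__init__()
        self.k = k
        self.nl = NonLocalBlock(channels)

    def forward(self, x):
        b, c, h, w = x.shape
        gh, gw = h // self.k, w // self.k   # assumes h, w divisible by k
        out = x.clone()
        for i in range(self.k):
            for j in range(self.k):
                region = x[:, :, i*gh:(i+1)*gh, j*gw:(j+1)*gw]
                out[:, :, i*gh:(i+1)*gh, j*gw:(j+1)*gw] = self.nl(region)
        return out
```

Restricting attention to regions keeps the $hw \times hw$ similarity matrix small, which is why per-grid non-local is affordable on the large feature maps used in super-resolution.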
- LSRAG (local-source residual attention group) is just the internal structure of each block in SSRG; the figure says it all.
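A minimal sketch of how the SSRG/LSRAG skeleton could look under this reading of the figure; all class and argument names are my own, and the attention module is stubbed with `nn.Identity()` (the SOCA sketch further below would take its place).

```python
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c, c, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class LSRAG(nn.Module):
    """m residual blocks plus channel attention at the tail, with a local skip."""
    def __init__(self, c, m=10, attention=None):
        super().__init__()
        self.body = nn.Sequential(*[ResBlock(c) for _ in range(m)])
        self.attn = attention or nn.Identity()  # stand-in for SOCA

    def forward(self, x):
        return x + self.attn(self.body(x))

class SSRG(nn.Module):
    """G LSRAGs with the share-source skip: F_g = W_SSC(F_0) + H_g(F_{g-1})."""
    def __init__(self, c, G=20, m=10):
        super().__init__()
        self.groups = nn.ModuleList([LSRAG(c, m) for _ in range(G)])
        self.w_ssc = nn.Conv2d(c, c, 3, padding=1)  # shared weight W_SSC

    def forward(self, f0):
        f = f0
        for g in self.groups:
            f = self.w_ssc(f0) + g(f)   # the same W_SSC is reused by every group
        return f
```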
- SOCA (second-order channel attention), on the other hand, is more involved (see the sketch after these sub-items):
    - First, reshape the feature map $F = [f_1, \cdots, f_C]$ coming from the previous layer into shape $(C, s)$, where $s = H \times W$; call the reshaped matrix $X$.
    - Then normalize $X$: $\Sigma = X \bar{I} X^T$, where (per the paper) $\bar{I} = \frac{1}{s}(I - \frac{1}{s}\mathbf{1})$ with $\mathbf{1}$ the $s \times s$ all-ones matrix, so $\Sigma$ is the sample covariance matrix. $\Sigma$ can be eigendecomposed as $\Sigma = U \Lambda U^T$ with $\Lambda$ diagonal, and then $\hat{Y} = \Sigma^{\alpha} = U \Lambda^{\alpha} U^T$. Taking $\alpha < 1$ pulls the entries of $\Lambda$ toward 1; the paper sets $\alpha = 0.5$. Note that $\hat{Y}$ has shape $(C, C)$.
    - $\hat{Y} = [y_1, \cdots, y_C]$, $z_c = H_{GCP}(y_c) = \frac{1}{C}\sum\limits_{i=1}^{C} y_c(i)$, $w = f(W_U\,\delta(W_D z))$, where $z_c$ is the scalar obtained by global average pooling over channel $c$'s covariance row $y_c$ (shape $(1, C)$), $z$ stacks the $z_c$ into a $(C, 1)$ vector, $\delta$ is ReLU, and $f$ is sigmoid. The resulting $w$ gives the channel-wise attention weights: $\hat{f_c} = w_c \cdot f_c$.
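A minimal sketch of SOCA following the equations above, assuming a straightforward eigendecomposition for the matrix power (the paper instead accelerates this step with a Newton-Schulz iteration); variable names are my own.

```python
import torch
import torch.nn as nn

class SOCA(nn.Module):
    def __init__(self, channels, reduction=16, alpha=0.5):
        super().__init__()
        self.alpha = alpha
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),  # W_D
            nn.ReLU(inplace=True),                          # delta
            nn.Conv2d(channels // reduction, channels, 1),  # W_U
            nn.Sigmoid())                                   # f

    def forward(self, feat):
        b, c, h, w = feat.shape
        s = h * w
        x = feat.reshape(b, c, s)              # X, one (C, s) matrix per sample
        # Sigma = X I_bar X^T, computed via row-centering, which is equivalent
        # to multiplying by I_bar = (1/s)(I - (1/s)1) without building it
        x_c = x - x.mean(dim=2, keepdim=True)
        sigma = x_c @ x_c.transpose(1, 2) / s  # covariance, shape (b, C, C)
        # Y_hat = Sigma^alpha = U Lambda^alpha U^T via eigendecomposition
        lam, u = torch.linalg.eigh(sigma)
        lam = lam.clamp(min=1e-8) ** self.alpha
        y_hat = u @ torch.diag_embed(lam) @ u.transpose(1, 2)
        # global covariance pooling: z_c = (1/C) sum_i y_c(i), stacked to (b, C, 1, 1)
        z = y_hat.mean(dim=2).reshape(b, c, 1, 1)
        w_attn = self.fc(z)                    # channel-wise weights in (0, 1)
        return feat * w_attn                   # f_c_hat = w_c * f_c
```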
- Some other detail parameters are as follows:
We set LSRAG number as G = 20 in the SSRG structure, and embed RL-NL modules (k = 2) at the head and tail of SSRG. In each LSRAG, we use m = 10 residual blocks plus single SOCA module at the tail. In SOCA module, we use 1 × 1 convolution filter with reduction ratio r = 16. For other convolution filter outside SOCA, the size and number of filter are set as 3 × 3 and C = 64, respectively. For upscale part H↑(·), we follow the works in [20, 39] and apply ESPCNN [28] to upscale the deep features, followed by one final convolution layer with three filters to produce color images (RGB channels).
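For the upscale part, ESPCNN's sub-pixel convolution corresponds to `nn.PixelShuffle` in PyTorch; a minimal sketch of the tail described above (the function name `make_tail` is my own):

```python
import torch.nn as nn

def make_tail(channels=64, scale=2):
    """ESPCNN-style upscaling followed by a final 3-filter conv for RGB."""
    return nn.Sequential(
        nn.Conv2d(channels, channels * scale ** 2, 3, padding=1),
        nn.PixelShuffle(scale),                 # (C*r^2, H, W) -> (C, rH, rW)
        nn.Conv2d(channels, 3, 3, padding=1))   # three filters -> RGB channels
```

For example, `make_tail(channels=64, scale=2)` maps a `(N, 64, H, W)` feature map to a `(N, 3, 2H, 2W)` color image.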