迷迭香、鹏 - CSDN Blog

[Original] Unsupervised Learning for MVS

Semantic consistency (Self-supervised multi view stereo via effective co-segmentation and data-augmentation); dual-level contrastive learning (CL-MVSNet: Unsupervised multi-view stereo with dual-level contrastive learning); normal consistency (M3VSNet: Unsupervised multi-metric multi-view stereo network).

2025-04-08 21:49:29 282

[Original] Links Related to 3D Reconstruction

Links.

2025-03-18 16:01:36 206

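[Original] Defect Detection

For setting up a baseline: survey the existing reviews of road defect detection, plus Trans. journal and top-conference papers published after 2018. Work backward from the newest papers to learn how the current SOTA methods work, and download the mainstream datasets. Compile a performance summary table that lists the metrics existing methods in the same track report on each dataset (e.g., mAP, IoU, F1 Score, Precision, Recall; that is a lot of metrics, so we simply follow whichever metrics the existing methods use and report as many as they do). Also record each method's publication venue, its date, and whether its code and models are open source.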

2025-03-18 15:59:09 318
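The metrics named in the Defect Detection note above are related through the same true-positive, false-positive, and false-negative counts. The following is a minimal sketch, assuming those counts are already available for one class on one dataset. The function name `detection_metrics` and its argument names are hypothetical and do not come from any particular benchmark toolkit. mAP is left out because it needs ranked detections with confidence scores.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute Precision, Recall, F1 and IoU from raw counts.

    tp, fp, fn are assumed to be counts of true positives, false
    positives and false negatives (pixels or boxes, depending on the task).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # IoU (Jaccard index) divides the overlap by the union of prediction and ground truth.
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# One row of a summary table for a hypothetical method and dataset.
row = detection_metrics(tp=8, fp=2, fn=2)
```

A summary table can then hold one such row per method and dataset, next to the venue, date, and open-source columns.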

[Original] 3D Reconstruction Datasets

Public datasets.

2025-02-26 17:09:46 701

[Original] Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation

Inferring scene geometry from images via Structure from Motion is a long-standing and fundamental problem in computer vision. While classical approaches and, more recently, depth map predictions only focus on the visible parts of a scene, the task of scene

2025-01-07 09:52:59 913

[Original] Weekly Summary_2

Obtaining a mesh from 3DGS.

2024-12-10 17:40:50 150

[Original] Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference

Deep learning has recently demonstrated its excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is the scalability: the memory-consuming cost volume regularization makes the learned MVS hard to

2024-12-10 15:36:30 918

[Original] DeepSFM: Structure From Motion Via Deep Bundle Adjustment

Structure from motion (SfM) is an essential computer vision problem which has not been well handled by deep learning. One of the promising trends is to apply explicit structural constraint, e.g. 3D cost volume, into the network. However, existing methods u

2024-12-10 15:35:36 881

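[Original] UAV 3D Reconstruction

For monocular cameras, one mainstream pipeline today runs SfM first and, once the camera poses are available, uses MVS for dense reconstruction. Other approaches produce a dense point cloud directly, such as the recent DUSt3R and VGGSfM, but they have significant drawbacks: they consume a lot of GPU memory, and their processing speed is not much better.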

2024-11-19 10:19:51 469

[Original] DUSt3R: Geometric 3D Vision Made Easy

Multi-view stereo reconstruction (MVS) in the wild requires to first estimate the camera parameters e.g. intrinsic and extrinsic parameters. These are usually tedious and cumbersome to obtain, yet they are mandatory to triangulate corresponding pixels in 3

2024-07-02 13:32:23 1899

[Original] Common Ubuntu Commands

[Code] Common Ubuntu commands.

2023-12-26 21:18:19 443 1

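[Original] MVSTER: Epipolar Transformer for Efficient Multi-View Stereo

The MVSTER network architecture is shown in the figure. Given a reference image and its corresponding source images, 2D multi-scale features are first extracted with a feature pyramid network. The source image features are then warped into the reference camera frame via differentiable homography to construct the source volumes (Sec. 3.1). Next, the epipolar Transformer aggregates the source volumes into a cost volume, while an auxiliary branch performs monocular depth estimation to enhance context. The cost volume is regularized by a lightweight 3D CNN for depth estimation (Sec. 3.2). The network is further built as a cascade that propagates depth maps in a coarse-to-fine manner (Sec. 3.3).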

2023-12-26 13:31:44 1965 1
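The homography warping mentioned in the MVSTER summary above can be illustrated with a small NumPy sketch. This is not the MVSTER implementation. It assumes a pinhole camera model, a fronto-parallel plane at a single depth hypothesis in the reference frame, and a relative pose convention `X_src = R @ X_ref + t`. The function names `plane_homography` and `warp_pixel` are hypothetical. A real network would apply this to whole feature maps with bilinear sampling over many depth hypotheses, which is omitted here.

```python
import numpy as np


def plane_homography(K_ref, K_src, R, t, depth):
    """Homography taking reference pixels to source pixels for the
    plane z = depth in the reference camera frame.

    Assumed convention: X_src = R @ X_ref + t.
    """
    n = np.array([0.0, 0.0, 1.0])  # plane normal in the reference frame
    # For points on the plane, n @ X_ref == depth, so
    # X_src = (R + outer(t, n) / depth) @ X_ref.
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_ref)


def warp_pixel(H, u, v):
    """Apply a homography to one pixel and dehomogenize the result."""
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]


# Illustrative values only: shared intrinsics and a 0.1-unit sideways baseline.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
H = plane_homography(K, K, np.eye(3), np.array([0.1, 0.0, 0.0]), depth=2.0)
uv_src = warp_pixel(H, 320.0, 240.0)  # approximately (345, 240)
```

Sweeping `depth` over a set of hypotheses and sampling source features at the warped locations gives one volume per source view, which is the input the aggregation step operates on.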

[Original] csadasda

```python
# TODO: use K-means to cluster the filter-response vectors of a texture image
import numpy as np
from skimage import color
from skimage.filters import gabor
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

def extract_features(image, kernels):
    feats = np.zeros((imag
```

2023-12-10 19:02:54 399 1

[Original] NeRF-SLAM: Real-Time Dense Monocular SLAM with Neural Radiance Fields

We propose a novel geometric and photometric 3D mapping pipeline for accurate and real-time scene reconstruction from monocular images. To achieve this, we leverage recent advances in dense monocular SLAM and real-time hierarchical volumetric neural radian

2023-11-21 13:05:49 327 1

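[Original] LEAP: LIBERATE SPARSE-VIEW 3D MODELING FROM CAMERA POSES

Existing methods mostly perform 3D reconstruction on the premise that accurate camera poses are known, which largely holds for dense views; accurately estimating camera poses from sparse views, however, is usually difficult. Our analysis shows that noisy estimated poses degrade the performance of existing sparse-view 3D modeling methods. To address this, we propose LEAP, a novel pose-free approach that challenges the prevailing notion that camera poses are indispensable. LEAP discards pose-based operations and learns geometric knowledge from data.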

2023-11-14 16:45:15 157 1

[Original] Power Bundle Adjustment for Large-Scale 3D Reconstruction

We introduce Power Bundle Adjustment as an expansion type algorithm for solving large-scale bundle adjustment problems. It is based on the power series expansion of the inverse Schur complement and constitutes a new family of solvers that we call inverse e

2023-11-14 12:26:14 170 1

[Original] Deep Learning Library Configuration Issues

Versions before 1.2.0 do not have imageio.write.

2023-11-09 16:06:50 81 1

[Original] Deep Learning Environment Setup

After downloading the conda version, switch between versions using symbolic links.

2023-11-09 15:53:24 127 1

[Original] How to Install Libraries on Ubuntu

[Code] How to install libraries on Ubuntu.

2023-11-09 15:14:24 239 1

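[Original] Ghost-free High Dynamic Range Imaging with Context-aware Transformer

Multi-frame high dynamic range (HDR) imaging aims to merge several low dynamic range (LDR) images taken at different exposures into an image with a wider dynamic range and more realistic detail. If the LDR images are perfectly aligned, they can be fused well into an HDR image. In practice, however, the captured images are easily disturbed by camera and object motion, so the three LDR images often cannot be aligned well, and fusing them directly tends to produce artifacts and ghosting. This paper proposes a new context-aware vision Transformer (CA-ViT) for HDR imaging.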

2023-11-05 14:37:29 244 1

[Original] MVSNet: Depth Inference for Unstructured Multi-view Stereo

We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features, and then build the 3D cost volume upon the reference camera frustum via the differentiable homo

2023-10-23 21:37:12 165

[Original] SAMLoc: Structure-Aware Constraints With Multi-task Distillation for Long-term Visual Localization

Real-time and robust long-term visual localization is a key technology for autonomous driving. Season and illumination variance, as well as limited computing power make this problem more challenging. At present, most of the excellent visual localization al

2023-10-17 12:30:34 182

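[Original] ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer

With these methods, we not only preserve the correlations but also achieve fine-grained attention among highly relevant pixels, which compensates for essential local information. Meanwhile, in each local attention module the network must regress an auxiliary flow map as guidance. This requires cross-view context, so a lightweight global cross attention block is added. In image matching for SfM and visual SLAM, keypoint detectors such as SIFT and ORB are widely used, but because they depend on a keypoint detector and lose context in the feature descriptors, these detector-based matching methods struggle in extreme cases (including large viewpoint changes and textureless regions).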

2023-10-16 22:10:05 1511 5

qq_34426949's blog