Code Contributors: Ao Gao, Luosong Guo, Chaoyang Li, Jiangming Shi, Zilong Xie
Supervisors: Jingyu Gong, Xin Tan, Zhizhong Zhang, Yuan Xie† (Project Leader)
- [2025-7-28] Version 1.0 released! 🎉
EmboSceneExplorer is a multimodal scene perception, understanding, and navigation system built on the Habitat simulation environment. It enables Embodied AI Agents to perform 3D perception and reconstruction, LLM-based grounding, and goal-oriented navigation within virtual 3D scenes (e.g., ScanNet). The workflow comprises four core components:
- Multimodal Data Collection
  Captures multimodal data, including:
  - RGB image sequences
  - Depth maps and semantic segmentation maps
  - COLMAP-style camera intrinsics and extrinsics (supporting 3D Gaussian Splatting training)
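  For reference, the COLMAP-style intrinsics and extrinsics mentioned above are conventionally stored as `cameras.txt` and `images.txt`. The minimal sketch below writes that layout from collected poses; the file names, the single PINHOLE camera, and the `frames` structure are illustrative assumptions, not necessarily the exact output of the data collection scripts.

```python
# Minimal sketch: write COLMAP-style cameras.txt / images.txt from collected poses.
# Assumes a single PINHOLE camera and a `frames` list of (image_name, qvec, tvec);
# the actual collection pipeline may organize its outputs differently.
def write_colmap_text(out_dir, width, height, fx, fy, cx, cy, frames):
    # cameras.txt: CAMERA_ID MODEL WIDTH HEIGHT PARAMS[...]
    with open(f"{out_dir}/cameras.txt", "w") as f:
        f.write(f"1 PINHOLE {width} {height} {fx} {fy} {cx} {cy}\n")
    # images.txt: IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME, then an (empty) 2D-points line
    with open(f"{out_dir}/images.txt", "w") as f:
        for i, (name, qvec, tvec) in enumerate(frames, start=1):
            qw, qx, qy, qz = qvec  # world-to-camera rotation (quaternion)
            tx, ty, tz = tvec      # world-to-camera translation
            f.write(f"{i} {qw} {qx} {qy} {qz} {tx} {ty} {tz} 1 {name}\n\n")
```

  Together with the RGB frames (and, depending on the trainer, a points3D file for initialization), this is the layout COLMAP-based 3D Gaussian Splatting pipelines typically read.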
- Scene Reconstruction
  Builds multimodal scene representations, including:
  - Dense point clouds
  - High-fidelity meshes
  - Occupancy grid maps (Occ)
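  As a rough illustration of this step (the actual `reconstruction.sh` pipeline may use a different backend and conventions), the sketch below fuses posed RGB-D frames into a dense point cloud with Open3D; the file layout, intrinsics, frame count, and pose format are placeholders.

```python
# Illustrative sketch: fuse posed RGB-D frames into one point cloud with Open3D.
# Paths, intrinsics, frame count, and pose conventions are placeholders.
import numpy as np
import open3d as o3d

intrinsic = o3d.camera.PinholeCameraIntrinsic(640, 480, 525.0, 525.0, 319.5, 239.5)
merged = o3d.geometry.PointCloud()
num_frames = 100  # placeholder

for i in range(num_frames):
    color = o3d.io.read_image(f"rgb/{i:05d}.png")
    depth = o3d.io.read_image(f"depth/{i:05d}.png")
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color, depth, depth_scale=1000.0, convert_rgb_to_intensity=False)
    pose = np.loadtxt(f"poses/{i:05d}.txt")  # 4x4 camera-to-world matrix
    pcd = o3d.geometry.PointCloud.create_from_rgbd_image(
        rgbd, intrinsic, extrinsic=np.linalg.inv(pose))  # Open3D expects world-to-camera
    merged += pcd

merged = merged.voxel_down_sample(voxel_size=0.02)  # keep the fused cloud manageable
o3d.io.write_point_cloud("scene_points.ply", merged)
```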
- 3D Visual Grounding
  Bridging language and spatial understanding, we have developed a 3D visual grounding model that currently achieves state-of-the-art performance across multiple metrics. It supports:
  - Parsing natural language instructions (in both English and Chinese) into actionable goals
  - Grounding semantic concepts to precise 3D locations
  - Generating accurate, point-cloud-level object localizations from textual queries
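  The grounding model's real interface is defined by the pretrained checkpoint and `visual_grounding.sh`; the sketch below is purely hypothetical and only illustrates the typical shape of such a call (point cloud plus text query in, per-point mask and a 3D goal out). All names and fields are invented for illustration.

```python
# Hypothetical sketch of a 3D visual grounding call; names and fields are
# illustrative and do not correspond to the repository's actual API.
import numpy as np

def ground_query(model, points_xyzrgb: np.ndarray, query: str) -> dict:
    """Run a language query against a reconstructed scene point cloud.

    points_xyzrgb: (N, 6) array of XYZ + RGB for the scene.
    query: natural-language instruction, e.g. "the chair next to the window"
           (a Chinese query such as "靠窗的椅子" would be handled the same way).
    """
    pred = model(points_xyzrgb, query)               # model inference (placeholder)
    mask = pred["point_mask"]                        # per-point membership of the target object
    goal_xyz = points_xyzrgb[mask, :3].mean(axis=0)  # a 3D goal the planner can navigate to
    return {"mask": mask, "goal_xyz": goal_xyz}
```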
- Autonomous Navigation
  Integrates scene representations to:
  - Build navigable occupancy maps
  - Plan optimal collision-free paths
  - Execute exploration and goal-reaching behaviors
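  Path planning on an occupancy map usually reduces to a shortest-path search over free cells. The sketch below is a generic 4-connected A* over a 2D occupancy grid, not the planner actually used by `navigation.sh`; grid resolution and connectivity are assumptions.

```python
# Generic A* over a 2D occupancy grid (True = occupied); a textbook sketch,
# not the planner shipped with the repository.
import heapq
import itertools
import numpy as np

def astar(occ: np.ndarray, start: tuple, goal: tuple):
    h = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])  # Manhattan heuristic
    tie = itertools.count()                                  # break heap ties deterministically
    open_set = [(h(start), 0, next(tie), start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, _, cur, parent = heapq.heappop(open_set)
        if cur in came_from:          # already expanded with an equal or better cost
            continue
        came_from[cur] = parent
        if cur == goal:               # reconstruct the path back to the start
            path = [cur]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):    # 4-connected neighbors
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < occ.shape[0] and 0 <= nxt[1] < occ.shape[1]
                    and not occ[nxt] and g + 1 < g_cost.get(nxt, float("inf"))):
                g_cost[nxt] = g + 1
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, next(tie), nxt, cur))
    return None  # no collision-free path exists
```

  The returned grid cells can then be mapped back to world coordinates using the map origin and cell resolution before being handed to the low-level controller.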
Prerequisites:
- Miniconda/Anaconda
- NVIDIA GPU (CUDA 11.8)
- Linux
```bash
# Clone the repository (HTTPS)
git clone --recursive https://github.com/ECNU-AILab-SII/EmboSceneExplorer.git
# or (SSH)
git clone --recursive [email protected]:ECNU-AILab-SII/EmboSceneExplorer.git
```

```bash
# Create the environment from the provided YAML
conda env create -f environment.yml
# Activate the environment
conda activate emboscene
```

```bash
# Download the example scenes
gdown https://drive.google.com/file/d/1jwboFEruYFIG9c31qWga6X-vgbraKIt-/view?usp=sharing
unzip scenes.zip -d example_data/
```
Edit `example_data/scanet/scannet.yaml` and set the dataset path (`data_path`) to the project's absolute path:

```yaml
data_path: /xxx/xxx/EmboSceneExplorer/....
```
```bash
# Copy pointnav_scannet.yaml and scannet.yaml to the corresponding locations in the submodules
cp example_data/scanet/pointnav_scannet.yaml ./submodules/habitat-lab/habitat-lab/habitat/config/benchmark/nav/pointnav
cp example_data/scanet/scannet.yaml ./submodules/habitat-lab/habitat-lab/habitat/config/habitat/dataset/pointnav/
```
```bash
# Download the pretrained 3D visual grounding model
gdown https://drive.google.com/file/d/1OlBSTpcyIlcCqxqKgYztss6bBIIPJDFc/view?usp=sharing
```

Run the pipeline from the `bash_scripts` directory:

```bash
cd bash_scripts
# 1. Data collection:
bash data_collection.sh
# 2. RGB-D Reconstruction:
bash reconstruction.sh
# 3. Occupancy map reconstruction:
bash occupancy.sh
# 4. 3D Visual grounding (supporting both English and Chinese):
bash visual_grounding.sh
# 5. Navigation:
bash navigation.sh
```

- Report bugs or request features via GitHub Issues.
- Join discussions or ask questions on GitHub Discussions.
EmboSceneExplorer is MIT licensed. See the LICENSE file for details.
EmboSceneExplorer's development has been made possible thanks to these open-source projects:
- Habitat-sim: A flexible, high-performance 3D simulator for embodied AI research.
- Habitat-lab: A modular high-level library for end-to-end development in embodied AI.
- 3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer.
If you use EmboSceneExplorer in your research, please consider citing:
```bibtex
@misc{EmboSceneExplorer,
  author = {Ao Gao and Luosong Guo and Chaoyang Li and Jiangming Shi and Zilong Xie and Jingyu Gong and Xin Tan and Zhizhong Zhang and Yuan Xie},
  title  = {EmboSceneExplorer: Embodied Scene Explorer for Multimodal Perception and Navigation},
  month  = {July},
  year   = {2025},
  url    = {https://github.com/ECNU-AILab-SII/EmboSceneExplorer/}
}
```