🔔 DORAEMON: Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation

📚 Contents

Abstract
Update
Demo
Get Started
Evaluation
Citation

✨ Abstract

Adaptive navigation in unfamiliar environments is crucial for household service robots but remains challenging because it requires both low-level path planning and high-level scene understanding. While recent vision-language model (VLM) based zero-shot approaches reduce dependence on prior maps and scene-specific training data, they face significant limitations: spatiotemporal discontinuity from discrete observations, unstructured memory representations, and insufficient task understanding, all of which lead to navigation failures. We propose DORAEMON (Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation), a novel cognitive-inspired, zero-shot, end-to-end framework consisting of Ventral and Dorsal Streams that mimics human navigation capabilities. The Dorsal Stream implements the Hierarchical Semantic-Spatial Fusion and Topology Map to handle spatiotemporal discontinuities, while the Ventral Stream combines CoDe-VLM and Exec-VLM to improve decision-making. We also develop Nav-Ensurance to ensure navigation safety and efficiency. We evaluate DORAEMON on HM3Dv1, HM3Dv2, and MP3D, where it achieves state-of-the-art performance on both the success rate (SR) and success weighted by path length (SPL) metrics, significantly outperforming existing methods. We also introduce a new evaluation metric (AORI) to better assess navigation intelligence. Comprehensive experiments demonstrate DORAEMON's effectiveness in zero-shot and end-to-end navigation without requiring prior map building or pre-training.
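
For readers unfamiliar with the SR and SPL metrics mentioned above, the sketch below computes both from per-episode records using their standard definitions (SPL is the mean of S_i * l_i / max(p_i, l_i)). The `Episode` class and its field names are illustrative assumptions, not part of this repository, and AORI is not covered here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    # Hypothetical record; field names are assumptions for illustration.
    success: bool          # did the agent stop at the goal?
    shortest_path: float   # shortest-path length from start to goal
    agent_path: float      # length of the path the agent actually travelled


def success_rate(episodes: Sequence[Episode]) -> float:
    """Fraction of episodes that ended in success."""
    return sum(e.success for e in episodes) / len(episodes)


def spl(episodes: Sequence[Episode]) -> float:
    """Success weighted by Path Length, averaged over all episodes."""
    total = 0.0
    for e in episodes:
        if e.success:
            # Failed episodes contribute 0; efficient successes contribute up to 1.
            total += e.shortest_path / max(e.agent_path, e.shortest_path)
    return total / len(episodes)
```

A successful episode that takes twice the shortest path contributes 0.5 to SPL, so SPL is never larger than SR.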

💥 Update

🔥 We've reorganized and cleaned up the repository to ensure a clear, well-structured codebase. Please give the training and inference scripts a try, and feel free to open an issue if you run into any problems. We apologize for any confusion caused by our original codebase release. (May 15, 2025)

🔥 We've released some demos. (May 22, 2025)

📺 Demo

🛋️ SOFA

🟦 TABLE

🛏️ BED

🌳 PLANT

🗄️ CABINET

💺 CHAIR

🌳 PLANT

🛋️ SOFA

📺 TV

🚽 TOILET

🛋️ SOFA

💺 CHAIR

🎬 Real: an orange cushion that fell on the ground

🚀 Get Started

⚙️ Installation and Setup

Clone this repo.

Create the conda environment and install all dependencies.

conda create -n doraemon python=3.9 cmake=3.14.0
conda activate doraemon
conda install habitat-sim=0.3.1 withbullet headless -c conda-forge -c aihabitat
pip install -r requirements.txt
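
After installation, a quick check along the following lines can confirm that the key packages are importable in the `doraemon` environment. This is a minimal sketch: the helper `missing_modules` is hypothetical, and the module list is an assumption (habitat-sim is imported as `habitat_sim`); extend it with whatever `requirements.txt` installs.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # Assumed module list; adjust to match requirements.txt.
    required = ["habitat_sim"]
    print("python", sys.version.split()[0])
    print("missing:", missing_modules(required) or "none")
```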

🛢 Prepare Dataset

This project is built on the Habitat simulator. The HM3D and MP3D datasets are available here. Our code requires all of the above data to be placed in a data folder. Move the downloaded HM3D v0.1, HM3D v0.2, and MP3D folders into the following layout:

├── <DATASET_ROOT>
│  ├── hm3d_v0.1/
│  │  ├── val/
│  │  │  ├── 00800-TEEsavR23oF/
│  │  │  │  ├── TEEsavR23oF.navmesh
│  │  │  │  ├── TEEsavR23oF.glb
│  │  ├── hm3d_annotated_basis.scene_dataset_config.json
│  ├── objectnav_hm3d_v0.1/
│  │  ├── val/
│  │  │  ├── content/
│  │  │  │  ├── 4ok3usBNeis.json.gz
│  │  │  ├── val.json.gz
│  ├── hm3d_v0.2/
│  │  ├── val/
│  │  │  ├── 00800-TEEsavR23oF/
│  │  │  │  ├── TEEsavR23oF.basis.navmesh
│  │  │  │  ├── TEEsavR23oF.basis.glb
│  │  ├── hm3d_annotated_basis.scene_dataset_config.json
│  ├── objectnav_hm3d_v0.2/
│  │  ├── val/
│  │  │  ├── content/
│  │  │  │  ├── 4ok3usBNeis.json.gz
│  │  │  ├── val.json.gz
│  ├── mp3d/
│  │  ├── 17DRP5sb8fy/
│  │  │  ├── 17DRP5sb8fy.glb
│  │  │  ├── 17DRP5sb8fy.house
│  │  │  ├── 17DRP5sb8fy.navmesh
│  │  │  ├── 17DRP5sb8fy_semantic.ply
│  │  ├── mp3d.scene_dataset_config.json
│  ├── objectnav_mp3d/
│  │  ├── val/
│  │  │  ├── content/
│  │  │  │  ├── 2azQ1b91cZZ.json.gz
│  │  │  ├── val.json.gz
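
Before launching an evaluation, it can help to confirm that the folder layout matches the tree above. The following is a minimal sketch, not part of the repository: `missing_entries` is a hypothetical helper, and `EXPECTED` lists only the top-level entries shown in the tree (individual scene folders are not checked).

```python
from pathlib import Path

# Paths relative to <DATASET_ROOT>, copied from the tree above.
EXPECTED = [
    "hm3d_v0.1/val",
    "hm3d_v0.1/hm3d_annotated_basis.scene_dataset_config.json",
    "objectnav_hm3d_v0.1/val/content",
    "objectnav_hm3d_v0.1/val/val.json.gz",
    "hm3d_v0.2/val",
    "hm3d_v0.2/hm3d_annotated_basis.scene_dataset_config.json",
    "objectnav_hm3d_v0.2/val/content",
    "objectnav_hm3d_v0.2/val/val.json.gz",
    "mp3d",
    "objectnav_mp3d/val/content",
    "objectnav_mp3d/val/val.json.gz",
]


def missing_entries(dataset_root):
    """Return the expected relative paths that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries("<DATASET_ROOT>")` with your actual root returns an empty list when every listed entry is present.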

🔑 Prepare Gemini API

Set your own Gemini API key as an environment variable: export GEMINI_API_KEY=xxx
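
On the Python side, reading the key and failing early with a clear message avoids confusing errors mid-episode. The sketch below is an assumption about how one might do this; `load_gemini_api_key` is a hypothetical helper, and only the variable name `GEMINI_API_KEY` comes from this README.

```python
import os


def load_gemini_api_key(env=os.environ):
    """Return the Gemini API key from the environment, or raise a clear error."""
    key = env.get("GEMINI_API_KEY", "").strip()
    # Treat the README placeholder "xxx" as unset.
    if not key or key == "xxx":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; run `export GEMINI_API_KEY=<your key>` first."
        )
    return key
```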

📈 Evaluation

Run python scripts/main.py to visualize the result of an episode.

To evaluate DORAEMON, we use a parallel evaluation framework (HM3D v0.1 contains 1000 episodes, HM3D v0.2 contains 2000 episodes, and MP3D contains 2195 episodes). The script parallel_gpu0.sh distributes K instances over N GPUs, with each instance running M episodes. A local Flask server aggregates the results, which are then logged to wandb. Make sure you are logged in with wandb login.
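
To make the K/N/M relationship concrete, the sketch below plans which GPU and which episode range each instance would get. It is an illustration only: `plan_instances` is a hypothetical helper, and the round-robin GPU assignment and contiguous episode ranges are assumptions, not necessarily what parallel_gpu0.sh does.

```python
def plan_instances(num_instances, num_gpus, episodes_per_instance, total_episodes):
    """Assign each instance a GPU (round-robin) and a contiguous episode range."""
    plan = []
    for i in range(num_instances):
        start = i * episodes_per_instance
        if start >= total_episodes:
            break  # no episodes left for further instances
        end = min(start + episodes_per_instance, total_episodes)
        plan.append({"instance": i, "gpu": i % num_gpus, "episodes": range(start, end)})
    return plan


if __name__ == "__main__":
    # Example: cover the 1000 HM3D v0.1 episodes with K=4, N=2, M=250.
    for entry in plan_instances(4, 2, 250, 1000):
        print(entry["instance"], entry["gpu"], entry["episodes"])
```

To cover a whole split, choose K and M so that K * M is at least the number of episodes in that split.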

📖 Citation

If you find our work useful, please cite:

@misc{gu2025doraemondecentralizedontologyawarereliable,
      title={DORAEMON: Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation}, 
      author={Tianjun Gu and Linfeng Li and Xuhong Wang and Chenghua Gong and Jingyu Gong and Zhizhong Zhang and Yuan Xie and Lizhuang Ma and Xin Tan},
      year={2025},
      eprint={2505.21969},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2505.21969},
}

📫 Contact

For questions about this work, please contact:

Tianjun Gu: [email protected]

Project Page: https://grady10086.github.io/DORAEMON/
