adding a new recipe on post training cr2 for driving captioning #131
jingyijin2 merged 7 commits into main
Pull request overview
This PR adds comprehensive documentation for post-training the Cosmos Reason 2 model for autonomous vehicle (AV) video captioning and visual question answering (VQA), developed in collaboration between Uber and NVIDIA. The recipe demonstrates how to adapt Cosmos Reason 2 for domain-specific AV captioning through targeted supervised fine-tuning.
Key changes include:
- A detailed post-training recipe with benchmark definition, zero-shot evaluation, data curation, fine-tuning, and re-evaluation sections
- Supporting assets (video samples, result visualizations) demonstrating training data and evaluation metrics
- Integration into the documentation structure with updated navigation files
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| docs/recipes/post_training/reason2/video_caption_vqa/post_training.md | Main recipe documentation covering workflow, benchmarks, training configuration, and evaluation results |
| docs/recipes/post_training/reason2/video_caption_vqa/assets/*.mp4 | Video assets demonstrating training samples and annotation examples |
| docs/recipes/post_training/reason2/video_caption_vqa/assets/*.png | Visualization charts for BLEU, MCQ-based VQA, and LingoQA evaluation results |
| docs/recipes/post_training/reason2/video_caption_vqa/assets/*.json | Sample annotation file showing structured ground truth format |
| docs/recipes/post_training/reason2/video_caption_vqa/SUMMARY.md | Navigation file linking to the post-training documentation |
| docs/recipes/post_training/SUMMARY.md | Updated table of contents adding the new Reason 2 recipe entry |
| docs/recipes/inference/reason2/intbot_showcase/inference.md | Typo fix in existing documentation (unrelated to main PR purpose) |
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.
…ning.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ning.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.