The document presents experiments assessing the effectiveness of training computer vision models using synthetic RGB images extracted from video games. The authors:
1) Collected over 60,000 synthetic samples from a video game under conditions similar to those of real-world datasets such as CamVid and Cityscapes, along with ground-truth labels, depth, and other data.
2) Trained convolutional networks on the synthetic data for tasks such as image segmentation and depth estimation, finding that these networks performed similarly to networks trained on real data.
3) Further improved performance by pre-training networks on synthetic data and then fine-tuning them on real data, outperforming networks pre-trained only on real data.
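
The two-stage schedule in the last point (pre-train on plentiful synthetic data, then fine-tune on scarcer real data) can be illustrated with a toy sketch. This is not the authors' code: a two-parameter linear model stands in for a convolutional network, the "synthetic" and "real" datasets are randomly generated 1-D regression data, and every name, dataset size, and hyperparameter below is a hypothetical choice made for illustration.

```python
import random

def make_data(rng, n, slope, intercept, noise):
    """Toy 1-D regression data standing in for an image dataset (hypothetical)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [slope * x + intercept + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def mse(params, data):
    """Mean squared error of the linear model y = w * x + b on a dataset."""
    w, b = params
    xs, ys = data
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(params, data, lr, epochs):
    """Full-batch gradient descent on the MSE; returns the updated parameters."""
    w, b = params
    xs, ys = data
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

rng = random.Random(0)

# Assumption: the synthetic domain is large and slightly shifted from the real one.
synthetic = make_data(rng, 1000, slope=2.0, intercept=0.5, noise=0.1)
real = make_data(rng, 20, slope=2.2, intercept=0.4, noise=0.1)

init = (0.0, 0.0)
# Stage 1: pre-train on the large synthetic set.
pretrained = train(init, synthetic, lr=0.1, epochs=200)
# Stage 2: fine-tune on the small real set, starting from the pre-trained
# parameters and using a smaller learning rate.
finetuned = train(pretrained, real, lr=0.02, epochs=50)

print("real-data MSE after pre-training:", mse(pretrained, real))
print("real-data MSE after fine-tuning: ", mse(finetuned, real))
```

The sketch only shows the order of the two stages and that fine-tuning starts from the pre-trained parameters rather than from scratch; it says nothing about how large the benefit is for real segmentation or depth-estimation networks.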