This document provides an overview of the CUDA programming model for parallel computing on GPUs. It introduces key CUDA concepts such as the host/device memory model, the thread and block hierarchy, and how to define and launch kernels, and it includes examples of the CUDA APIs used for memory management and host-device data transfer. The goal is to familiarize developers with the basic features of CUDA programming.
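
To give a flavor of the workflow covered in the rest of the document, the sketch below ties these pieces together: it allocates device memory with cudaMalloc, copies input data to the GPU with cudaMemcpy, launches a simple element-wise kernel, and copies the result back to the host. The kernel name vecAdd, the block size of 256 threads, and the array size are placeholder choices for this illustration, not values prescribed by the document.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Element-wise vector addition: each thread computes one output element.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

int main(void) {
    const int n = 1 << 20;              // example problem size
    size_t bytes = n * sizeof(float);

    // Allocate and initialize host (CPU) arrays.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device (GPU) memory.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);

    // Copy inputs from host memory to device memory.
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch the kernel: one thread per element, grouped into blocks of 256 threads.
    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vecAdd<<<blocks, threadsPerBlock>>>(d_a, d_b, d_c, n);

    // Copy the result back to the host; this cudaMemcpy waits for the kernel to finish.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);      // expect 3.0

    // Release device and host memory.
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

The sections that follow discuss each of these steps in more detail.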