This document discusses applying deep learning techniques, namely long short-term memory (LSTM) networks and connectionist temporal classification (CTC), to optical character recognition (OCR), with a focus on Vietnamese documents. It gives an overview of OCR and its history, including Tesseract, and describes the challenges of traditional approaches. CTC is introduced as a way to train recurrent neural networks (RNNs) directly on unsegmented sequence data. Combined with LSTM networks, CTC allows an OCR system to be trained end to end without pre-segmented text. The document then demonstrates how this approach can be applied to OCR on Vietnamese documents.
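To make the idea of training on unsegmented text more concrete, the following is a minimal sketch of the label-collapsing rule at the heart of CTC: the network emits one symbol (or a special blank) per input frame, repeated symbols are merged, and blanks are then removed. It is an illustration only, not code from the document; the names `ctc_collapse` and `greedy_decode`, the `-` blank symbol, and the toy scores are all assumptions made for this example.

```python
from itertools import groupby

BLANK = "-"  # hypothetical blank symbol chosen for this sketch


def ctc_collapse(path, blank=BLANK):
    """Map a frame-level path to a label sequence.

    Step 1: merge runs of identical consecutive symbols.
    Step 2: drop every blank symbol.
    """
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(symbol for symbol in merged if symbol != blank)


def greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Pick the highest-scoring symbol in each frame, then collapse the path.

    frame_scores: one list of scores per frame, aligned with `alphabet`.
    This is best-path decoding, a simple approximation; it is not the
    CTC training loss itself.
    """
    best_path = [
        alphabet[max(range(len(row)), key=row.__getitem__)]
        for row in frame_scores
    ]
    return ctc_collapse(best_path, blank)


if __name__ == "__main__":
    # Several different frame-level paths collapse to the same text,
    # which is why no per-character segmentation is needed.
    print(ctc_collapse("VV-ii-ệ-tt"))
    print(ctc_collapse("V-i-ệệ-t--"))

    # Toy scores over the alphabet ["-", "a", "b"] for four frames.
    alphabet = ["-", "a", "b"]
    scores = [
        [0.1, 0.8, 0.1],
        [0.1, 0.7, 0.2],
        [0.9, 0.05, 0.05],
        [0.1, 0.2, 0.7],
    ]
    print(greedy_decode(scores, alphabet))
```

Because many frame-level paths map to the same label sequence, the training objective can sum over all of them rather than requiring one fixed alignment between image columns and characters. The blank also lets the network emit genuinely doubled letters: `aa-a` collapses to `aa`, while `aaa` collapses to `a`.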