This document summarizes the author's master's project on a document translation system. The system runs a multi-step pipeline: text detection with the CRAFT model, text recognition with STR, text merging, image inpainting with DeepFillV2, and translation via the Google Translate API. For each step of the pipeline, it describes the models used, the data processing, and the approach, with the goal of translating documents while preserving their layout and design elements.
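The order of the pipeline steps can be sketched roughly as below. This is only an illustrative outline, not the project's code: the stage functions (`detect`, `recognize`, `inpaint`, `translate`) are hypothetical callables standing in for CRAFT, STR, DeepFillV2, and the Google Translate API, and the merging rule (join word boxes that overlap vertically and sit within a small horizontal gap) is an assumed heuristic, since the summary does not say how text merging is done.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class TextRegion:
    box: Box
    text: str


def vertical_overlap(a: Box, b: Box) -> float:
    """Vertical overlap of two boxes as a fraction of the shorter box's height."""
    overlap = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return overlap / shorter if shorter > 0 else 0.0


def merge_into_lines(words: List[TextRegion], max_gap: int = 20,
                     min_overlap: float = 0.5) -> List[TextRegion]:
    """Assumed heuristic: attach each word to a line it overlaps vertically
    and whose right edge is at most max_gap pixels to its left."""
    lines: List[TextRegion] = []
    for word in sorted(words, key=lambda w: w.box[0]):  # left to right
        for line in lines:
            gap = word.box[0] - line.box[2]
            if vertical_overlap(line.box, word.box) >= min_overlap and gap <= max_gap:
                line.text += " " + word.text
                line.box = (min(line.box[0], word.box[0]), min(line.box[1], word.box[1]),
                            max(line.box[2], word.box[2]), max(line.box[3], word.box[3]))
                break
        else:
            # No suitable line found: start a new one (copy, so inputs stay untouched).
            lines.append(TextRegion(word.box, word.text))
    return sorted(lines, key=lambda r: (r.box[1], r.box[0]))  # top to bottom


def translate_document(
    image: Any,
    detect: Callable[[Any], List[Box]],        # stand-in for CRAFT
    recognize: Callable[[Any, Box], str],      # stand-in for STR
    inpaint: Callable[[Any, List[Box]], Any],  # stand-in for DeepFillV2
    translate: Callable[[str], str],           # stand-in for the translation API
) -> Tuple[Any, List[TextRegion]]:
    """Run the stages in order; return the text-free image and translated regions."""
    words = [TextRegion(box, recognize(image, box)) for box in detect(image)]
    lines = merge_into_lines(words)
    clean_image = inpaint(image, [line.box for line in lines])
    translated = [TextRegion(line.box, translate(line.text)) for line in lines]
    return clean_image, translated
```

Passing the stages in as callables keeps the sketch independent of any particular model library. Drawing the translated text back into each box, which the layout-preserving goal implies, is left out here.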






















