This document discusses a Python library for parsing Hadoop Record files.
The library includes a parser that reads Hadoop's Data Definition Language (DDL) and produces generic Python data types. The parser returns that data structure as is; turning it into a class structure is left to the user.
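As a rough illustration of that last step, the sketch below turns a parsed record definition into Python classes. The shape of `parsed` (module name, then record name, then a list of type and field-name pairs) is an assumption made for this example and is not the library's documented output; `build_classes` and `DDL_TO_PYTHON` are likewise hypothetical names. Only the primitive DDL types are mapped, and composite types such as `vector<...>` and `map<...>` fall back to `object`.

```python
from dataclasses import make_dataclass

# Hypothetical parser output (an assumption, not the library's real format):
# module name -> record name -> list of (DDL type, field name) pairs.
parsed = {
    "links": {
        "Link": [
            ("ustring", "URL"),
            ("boolean", "isRelative"),
            ("ustring", "anchorText"),
        ],
    },
}

# Primitive Hadoop DDL types and the Python types used to represent them here.
DDL_TO_PYTHON = {
    "byte": int,
    "boolean": bool,
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "ustring": str,
    "buffer": bytes,
}


def build_classes(parsed_ddl):
    """Build one dataclass per record, keyed by 'module.Record'."""
    classes = {}
    for module_name, records in parsed_ddl.items():
        for record_name, fields in records.items():
            # Unknown or composite types are annotated as plain `object`.
            specs = [
                (field_name, DDL_TO_PYTHON.get(ddl_type, object))
                for ddl_type, field_name in fields
            ]
            key = f"{module_name}.{record_name}"
            classes[key] = make_dataclass(record_name, specs)
    return classes


classes = build_classes(parsed)
Link = classes["links.Link"]
link = Link(URL="http://example.com", isRelative=False, anchorText="example")
```

Generating dataclasses at runtime keeps the sketch short; a real translator might instead emit Python source files so the classes can be imported and inspected.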
The parsing library is only part of the solution: a DDL translator is still required to fully convert the data definition language into Python classes. Feedback to improve the library is welcome.