This paper describes ADAM, a genomics pipeline built on Apache Spark and Parquet that achieves a 28x speedup over current pipelines while reducing costs by 63%. ADAM improves performance through techniques such as columnar storage, Spark's distributed processing, and data locality. In the evaluation, ADAM outperforms tools such as GATK and Sambamba on tasks including variant calling and duplicate marking. The system scales near-linearly to 128 nodes, enabling faster and cheaper genomic analysis on commodity clusters.
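
To illustrate why columnar storage can help analyses that touch only a few fields per read, the following minimal Python sketch pivots row-oriented records into per-field columns and then answers a query from a single column. It is not ADAM's implementation or Parquet's API: the record fields (`name`, `contig`, `start`, `mapq`, `sequence`), the sample values, and the helper names `to_columns`, `project`, and `count_high_quality` are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

# Hypothetical read records. Field names and values are illustrative only
# and do not reflect ADAM's actual schema.
ROWS: List[Dict[str, Any]] = [
    {"name": "r1", "contig": "chr1", "start": 100, "mapq": 60, "sequence": "ACGT"},
    {"name": "r2", "contig": "chr1", "start": 100, "mapq": 10, "sequence": "ACGA"},
    {"name": "r3", "contig": "chr2", "start": 250, "mapq": 40, "sequence": "TTGC"},
]


def to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row records into one list per field (a column-oriented layout).

    Assumes every row carries the same set of fields.
    """
    columns: Dict[str, List[Any]] = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def project(columns: Dict[str, List[Any]], fields: List[str]) -> Dict[str, List[Any]]:
    """Return only the requested columns; the remaining columns are not read."""
    return {field: columns[field] for field in fields}


def count_high_quality(columns: Dict[str, List[Any]], min_mapq: int) -> int:
    """Count reads at or above a mapping-quality threshold using one column."""
    return sum(1 for quality in columns["mapq"] if quality >= min_mapq)


COLUMNS = to_columns(ROWS)
POSITIONS = project(COLUMNS, ["contig", "start"])
HIGH_QUALITY = count_high_quality(COLUMNS, min_mapq=30)
```

In this toy layout, the quality filter scans only the `mapq` list and never touches the comparatively large `sequence` values. A columnar file format applies the same idea on disk, so a query that needs a few fields can skip the bytes belonging to the others.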