Human centromeric regions harbor substantial genetic variation that is not represented in current reference genomes. This document proposes a two-part approach to characterizing sequence variation in these regions: (1) construct chromosome-specific reference maps of centromeric DNA, and (2) expand the human variation reference map to include centromeric regions. Key aspects include using long reads to assemble higher-order repeats, using short reads to estimate array sizes and variant frequencies, and using graph representations to model structural variation while retaining haplotype information. Together, these resources would provide new insight into centromere biology and make it possible to test centromeric variants for association with disease.
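As a rough illustration of how short reads could be used to estimate array size, the sketch below applies a simple depth-based calculation: the total number of sequenced bases assigned to a satellite array, divided by the mean depth observed over single-copy sequence. The function name, parameters, and numbers are hypothetical and are not taken from the proposal. The calculation assumes uniform coverage and unambiguous assignment of reads to the array, neither of which is guaranteed for real centromeric data.

```python
def estimate_array_size_bp(satellite_read_count, read_length_bp, mean_single_copy_depth):
    """Depth-based estimate of a satellite array's size in base pairs.

    Hypothetical helper, assuming:
      - satellite_read_count: short reads assigned to the array
      - read_length_bp: (mean) read length in base pairs
      - mean_single_copy_depth: mean depth over single-copy regions
        of the same sample
    For a diploid sample, the result is the array size averaged
    over the two haplotypes.
    """
    if mean_single_copy_depth <= 0:
        raise ValueError("mean_single_copy_depth must be positive")
    # Total bases sequenced from the array across all assigned reads.
    total_satellite_bases = satellite_read_count * read_length_bp
    # Dividing by per-base depth converts sequenced bases to array length.
    return total_satellite_bases / mean_single_copy_depth


# Illustrative numbers only: 300,000 reads of 100 bp at 10x depth.
size_bp = estimate_array_size_bp(300_000, 100, 10.0)
print(f"Estimated array size: {size_bp / 1e6:.1f} Mb")
```

An estimate of this kind gives only an aggregate size. It says nothing about the order of repeats within the array, which is why the approach pairs short reads with long-read assembly.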