Alignment and variant detection
DRAGEN v4.0.5b is used to align short reads to the GRCh38 reference genome. A graph mapper is used to improve variant calling accuracy in segmental duplications and other difficult-to-map regions. To turn the FASTA reference into a graph reference, DRAGEN augments it with around 900,000 short alternate contigs derived from population haplotypes of phased variants. The mapper is alt-aware: reads that match a population haplotype are projected onto the corresponding primary assembly alignment using a precise lift-over alignment. For more information on the DRAGEN aligner, see the supporting documentation.
Alignments are stored in CRAM files, which contain both mapped and unmapped reads. Small variants (single nucleotide variants (SNVs) and indels) and copy number variants (CNVs) are detected using the DRAGEN small variant caller and the DRAGEN CNV caller, respectively. Detection of smaller CNVs in the 2-10 kb size range is supplemented by the DRAGEN SV caller; in this size range, only CNVs whose presence is supported by both the DRAGEN CNV and SV callers are annotated and reported. SMN1 and SMN2 copy numbers are inferred using the DRAGEN SMN caller.
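The 2-10 kb concordance rule above can be illustrated with a minimal sketch. This is not the pipeline's implementation: the `Call` record, the function names, the inclusive size boundaries, and the use of a 50% reciprocal overlap with matching event type as the definition of "supported by both callers" are all assumptions made for illustration. Calls outside the 2-10 kb range are passed through unchanged here, which is also an assumption.

```python
from dataclasses import dataclass

MIN_LEN = 2_000    # lower bound of the size range (2 kb), assumed inclusive
MAX_LEN = 10_000   # upper bound of the size range (10 kb), assumed inclusive
MIN_RECIPROCAL_OVERLAP = 0.5  # assumed support threshold, not from the source


@dataclass(frozen=True)
class Call:
    """Hypothetical minimal record for a CNV or SV call."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    kind: str   # "DEL" or "DUP"

    @property
    def length(self) -> int:
        return self.end - self.start


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Fraction of the shared interval relative to the longer of the two calls."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / a.length, shared / b.length)


def is_supported(cnv: Call, sv_calls: list[Call]) -> bool:
    """True if any SV call of the same type sufficiently overlaps the CNV."""
    return any(
        cnv.kind == sv.kind
        and reciprocal_overlap(cnv, sv) >= MIN_RECIPROCAL_OVERLAP
        for sv in sv_calls
    )


def reportable_cnvs(cnv_calls: list[Call], sv_calls: list[Call]) -> list[Call]:
    """Keep in-range CNVs only when an SV call supports them."""
    kept = []
    for cnv in cnv_calls:
        in_range = MIN_LEN <= cnv.length <= MAX_LEN
        if not in_range or is_supported(cnv, sv_calls):
            kept.append(cnv)
    return kept
```

With this sketch, a 5 kb deletion reported by the CNV caller is kept only if the SV caller reports a matching deletion at roughly the same coordinates, while a 50 kb deletion is kept regardless of SV support.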
The inferred sex karyotype is taken into account during variant calling, so that the ploidy of the X chromosome (1 or 2 copies) is considered and haploid calls are produced where appropriate. Variant calling assumes a haploid model for chromosome X in individuals inferred to have a single copy of chromosome X (for example, XY, XO, XYY karyotypes) and a diploid model in individuals inferred to have two or more copies of chromosome X (for example, XX, XXX, XXY karyotypes).
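The mapping from inferred karyotype to the chromosome X calling model can be sketched as follows. The function name and the representation of the karyotype as a string such as "XXY" are hypothetical, chosen only for illustration; the sketch does not describe how DRAGEN infers or encodes the karyotype. Karyotypes with no X chromosome are not covered by the description above, so the sketch rejects them.

```python
def chrx_ploidy_model(karyotype: str) -> int:
    """Return the ploidy (1 or 2) assumed for chromosome X variant calling.

    One inferred copy of X selects the haploid model; two or more copies
    select the diploid model.
    """
    x_copies = karyotype.upper().count("X")
    if x_copies == 0:
        # Not described in the text above, so treated as an error here.
        raise ValueError(f"no X chromosome in karyotype {karyotype!r}")
    return 1 if x_copies == 1 else 2


# The example karyotypes listed in the text above.
for k in ("XY", "XO", "XYY", "XX", "XXX", "XXY"):
    print(k, chrx_ploidy_model(k))
```

Note that the ploidy model is capped at 2: an XXX karyotype is called under the diploid model, matching the possible values of 1 or 2 copies stated above.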