Sequencing data quality
Quality control checks are performed upon receipt of the data from the sequencing provider. The metrics checked are agreed with the sequencing provider in the service contract as the minimum data quality required for each genome. If any check fails, the sample is not processed through the pipeline and Genomics England troubleshoots with the sequencing provider. If the genomic data is redelivered after troubleshooting and meets the contractual metrics, the sample is processed through the pipeline. Otherwise, a sample failure is reported and the sample does not proceed through the pipeline. The details of this procedure are given in the Data QC issues section of the Generation Study Unhappy Paths SOP.
The following checks are performed upon receipt of the sample genome sequencing data:
- md5sum check to confirm integrity of the genomic data transferred from the sequencing provider.
- At least 95% of the autosomal genome covered at ≥15x, calculated from reads with mapping quality >10, after removing duplicate reads and after adaptor and quality trimming.
- More than 85 × 10^9 bases with base quality ≥ Q30, after removing duplicate reads and after adaptor and quality trimming.
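
The three checks above can be sketched as a simple pass/fail evaluation. This is a minimal illustration, not the pipeline's actual implementation. The names `SampleMetrics`, `md5_of_file` and `failed_checks` are hypothetical, and the coverage fraction and Q30 base count are assumed to have been computed upstream on deduplicated, trimmed reads. The 95% coverage figure is treated here as a minimum, which is an interpretation of the contractual metric.

```python
import hashlib
from dataclasses import dataclass

# Thresholds taken from the checks listed above.
MIN_FRACTION_AT_15X = 0.95      # fraction of autosomal genome at >=15x (assumed to be a minimum)
MIN_Q30_BASES = 85 * 10**9      # the Q30 base count must be strictly greater than this


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


@dataclass
class SampleMetrics:
    """Hypothetical container for the values needed by the three checks."""
    md5_expected: str                # checksum supplied by the sequencing provider
    md5_observed: str                # checksum computed on the received file
    fraction_autosome_at_15x: float  # from reads with mapping quality > 10
    q30_bases: int                   # bases with base quality >= Q30


def failed_checks(metrics):
    """Return the names of the checks that fail; an empty list means all pass."""
    failures = []
    if metrics.md5_observed.lower() != metrics.md5_expected.lower():
        failures.append("md5sum")
    if metrics.fraction_autosome_at_15x < MIN_FRACTION_AT_15X:
        failures.append("coverage")
    if metrics.q30_bases <= MIN_Q30_BASES:
        failures.append("q30_bases")
    return failures
```

In this sketch, a sample would proceed to the pipeline only when `failed_checks` returns an empty list. Any non-empty result corresponds to the troubleshooting route described above.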