
Genomic sex inference and check

As part of the data quality check process, possible sample swaps and data entry errors are identified by comparing the baby's phenotypic sex, as reported by the NHS recruiting site in the Generation Study Portal (GSP), with the inferred genomic sex. A discrepancy between phenotypic and genomic sex can also result from a disorder of sex development.

Genomic sex is inferred from the ratios of sex chromosome coverage to autosome coverage. The check passes if the inferred sex karyotype is consistent with the reported phenotypic sex: for reported males, at least one Y chromosome must be present; for reported females, no Y chromosome should be present.
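The rule above can be illustrated with a minimal Python sketch. It is not the pipeline's implementation: the function names and the `Y_PRESENT_MIN_RATIO` cut-off are hypothetical, since this page does not state the thresholds actually used, and only the Y chromosome presence test described above is modelled.

```python
# Hypothetical cut-off for calling a Y chromosome "present"; the real
# pipeline's thresholds are not given in this document.
Y_PRESENT_MIN_RATIO = 0.1


def y_chromosome_present(y_coverage: float, autosome_coverage: float) -> bool:
    """Return True if Y coverage relative to autosomal coverage meets the cut-off."""
    if autosome_coverage <= 0:
        raise ValueError("autosome_coverage must be positive")
    return y_coverage / autosome_coverage >= Y_PRESENT_MIN_RATIO


def sex_check_passes(reported_sex: str, y_present: bool) -> bool:
    """Reported males need a Y chromosome; reported females must have none."""
    if reported_sex == "male":
        return y_present
    if reported_sex == "female":
        return not y_present
    raise ValueError(f"unsupported reported sex: {reported_sex!r}")
```

For example, with mean autosomal coverage of 30x and mean Y coverage of 15x, the ratio is 0.5, so under the assumed cut-off a Y chromosome is called present and a sample reported as male passes the check.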

If genomic sex differs from phenotypic sex (i.e. no Y chromosome coverage is detected for a reported male, or a Y chromosome is present in a reported female), an NGDCS (Newborns Genomic data check) ticket is raised. The pipeline still proceeds to completion, and variant detection and prioritisation are performed. If there are prioritised variants in the genome, the sample is dispatched to VRT for clinical scientist review and flagged with the INFERRED GENETIC AND REPORTED SEX DISCORDANT flag. Other downstream actions are summarised in the Unhappy Paths SOP.

In a small number of cases the pipeline cannot unambiguously infer the sex karyotype. In those cases, an NGDCS ticket is raised and the data is reviewed by a trained Genomics England staff member, who manually assigns the most appropriate karyotype. The most common causes are sex chromosome mosaicism and sex chromosome structural rearrangements. After the sex karyotype is assigned manually, the pipeline is restarted and proceeds in the same way as if the sex karyotype had been inferred automatically.
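The handling of the three outcomes described above (concordant, discordant, ambiguous) can be summarised in a short Python sketch. The `SexCheckOutcome` class, the `handle_sex_check` function, and the representation of a karyotype as a string such as "XX" or "XY" (with `None` for an ambiguous result) are assumptions made for illustration; only the flag text and the actions themselves come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Flag text as given in this document.
DISCORDANT_FLAG = "INFERRED GENETIC AND REPORTED SEX DISCORDANT"


@dataclass
class SexCheckOutcome:
    """Hypothetical summary of the actions triggered by the sex check."""
    raise_ngdcs_ticket: bool = False
    needs_manual_karyotype: bool = False
    vrt_flags: List[str] = field(default_factory=list)


def handle_sex_check(
    reported_sex: str,
    karyotype: Optional[str],
    has_prioritised_variants: bool,
) -> SexCheckOutcome:
    if reported_sex not in ("male", "female"):
        raise ValueError(f"unsupported reported sex: {reported_sex!r}")

    # Ambiguous inference: ticket plus manual review. The pipeline is
    # restarted once a karyotype has been assigned.
    if karyotype is None:
        return SexCheckOutcome(raise_ngdcs_ticket=True, needs_manual_karyotype=True)

    y_present = "Y" in karyotype
    concordant = y_present if reported_sex == "male" else not y_present
    if concordant:
        return SexCheckOutcome()

    # Discordant: ticket is raised, the pipeline still completes, and the
    # flag is attached only if the sample goes to VRT for review.
    outcome = SexCheckOutcome(raise_ngdcs_ticket=True)
    if has_prioritised_variants:
        outcome.vrt_flags.append(DISCORDANT_FLAG)
    return outcome
```

In this sketch, a sample reported as female with an inferred "XY" karyotype and prioritised variants yields a ticket and the discordant flag, whereas the same sample without prioritised variants yields a ticket only.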

Reported vs genomic sex check

Figure 2: Handling of samples with non-matching genomic vs reported sex