GA4GH WGS Quality Control Standards


Benchmarking Resources

Our Benchmarking Sample Manifest includes predefined input files (such as CRAM/BAM and VCF) along with corresponding expected outputs in JSON format, generated by the reference QC tool.
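As a rough illustration of how an expected JSON output could be checked against the output of a tool under test, the sketch below compares two flat metric dictionaries with a numeric tolerance. The function name `compare_metrics`, the metric key `mean_coverage`, the values, and the tolerance are all hypothetical; the manifest's actual JSON schema is not described here and may be nested or use different field names.

```python
import json
import math


def compare_metrics(expected, observed, rel_tol=1e-6):
    """Return a list of mismatch descriptions between two flat metric dicts."""
    mismatches = []
    for key in sorted(set(expected) | set(observed)):
        if key not in observed:
            mismatches.append(f"{key}: missing from observed output")
        elif key not in expected:
            mismatches.append(f"{key}: not present in expected output")
        else:
            exp, obs = expected[key], observed[key]
            # Compare numbers with a tolerance; compare everything else exactly.
            numeric = all(
                isinstance(v, (int, float)) and not isinstance(v, bool)
                for v in (exp, obs)
            )
            same = math.isclose(exp, obs, rel_tol=rel_tol) if numeric else exp == obs
            if not same:
                mismatches.append(f"{key}: expected {exp!r}, observed {obs!r}")
    return mismatches


# Hypothetical metric names and made-up values, for illustration only.
expected = json.loads('{"mean_coverage": 30.1, "sample_id": "HG00100"}')
observed = json.loads('{"mean_coverage": 29.4, "sample_id": "HG00100"}')

for problem in compare_metrics(expected, observed):
    print(problem)
```

An empty list means the two outputs agree within the tolerance. In practice the tolerance would likely need to be chosen per metric.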

We execute our benchmarking pipelines on publicly available WGS data: specifically, Genome in a Bottle (GIAB) samples from the 1000 Genomes Phase 3 Reanalysis, accessible through the Registry of Open Data on AWS.

The 1000 Genomes dataset generated by Illumina DRAGEN Germline is widely adopted across the genomics community, including in major population-scale efforts such as the UK Biobank and the All of Us Research Program. To promote diversity and representation, we curated a subset of 100 samples from this dataset, ensuring balanced coverage across multiple populations and both sexes. The selected samples and their metadata are listed below.

Sample_name Sex Biosample_ID Population_code Population_name Superpopulation_code Superpopulation_name Population_elastic_ID Data_collections
HG00100 female SAME125154 GBR British EUR European Ancestry GBR 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
HG00101 male SAME125153 GBR British EUR European Ancestry GBR 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
HG00102 female SAME123945 GBR British EUR European Ancestry GBR 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
HG00103 male SAME125151 GBR British EUR European Ancestry GBR 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
HG00176 female SAME124956 FIN Finnish EUR European Ancestry FIN 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
HG00177 female SAME124957 FIN Finnish EUR European Ancestry FIN 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
HG00183 male SAME123642 FIN Finnish EUR European Ancestry FIN 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
HG00185 male SAME123648 FIN Finnish EUR European Ancestry FIN 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
HG00403 male SAME123160 CHS Southern Han Chinese EAS East Asian Ancestry CHS 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
HG00404 female SAME123158 CHS Southern Han Chinese EAS East Asian Ancestry CHS 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
HG00551 female SAME124252 PUR Puerto Rican AMR American Ancestry PUR 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
HG00552 male SAME124253 PUR Puerto Rican AMR American Ancestry PUR 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG00759 female SAME125020 CDX Dai Chinese EAS East Asian Ancestry CDX 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG00766 female SAME124458 CDX Dai Chinese EAS East Asian Ancestry CDX 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG00844 male SAME123952 CDX Dai Chinese EAS East Asian Ancestry CDX 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01028 male SAME123524 CDX Dai Chinese EAS East Asian Ancestry CDX 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01112 male SAME123485 CLM Colombian AMR American Ancestry CLM 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
HG01113 female SAME123484 CLM Colombian AMR American Ancestry CLM 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
HG01114 female SAME123483 CLM Colombian AMR American Ancestry CLM 1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release
HG01121 male SAME1840127 CLM Colombian AMR American Ancestry CLM 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01500 male SAME124941 IBS Iberian EUR European Ancestry IBS 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01501 female SAME124942 IBS Iberian EUR European Ancestry IBS 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01502 male SAME124943 IBS Iberian EUR European Ancestry IBS 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01507 female SAME124948 IBS Iberian EUR European Ancestry IBS 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01565 male SAME124566 PEL Peruvian AMR American Ancestry PEL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01566 female SAME124564 PEL Peruvian AMR American Ancestry PEL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01567 female SAME124565 PEL Peruvian AMR American Ancestry PEL 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01571 male SAME124346 PEL Peruvian AMR American Ancestry PEL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01583 male SAME1839895 PJL Punjabi SAS South Asian Ancestry PJL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01595 female SAME123569 KHV Kinh Vietnamese EAS East Asian Ancestry KHV 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01596 male SAME125227 KHV Kinh Vietnamese EAS East Asian Ancestry KHV 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release
HG01597 female SAME125226 KHV Kinh Vietnamese EAS East Asian Ancestry KHV 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01840 male SAME123552 KHV Kinh Vietnamese EAS East Asian Ancestry KHV 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01879 male SAME1839037 ACB African Caribbean AFR African Ancestry ACB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01880 female SAME1839109 ACB African Caribbean AFR African Ancestry ACB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01881 female SAME1839193 ACB African Caribbean AFR African Ancestry ACB 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG01882 male SAME122849 ACB African Caribbean AFR African Ancestry ACB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG02461 male SAME1839993 GWD Gambian Mandinka AFR African Ancestry GWD 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38)
HG02462 female SAME1839991 GWD Gambian Mandinka AFR African Ancestry GWD 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38)
HG02463 male SAME1839983 GWD Gambian Mandinka AFR African Ancestry GWD 1000 Genomes 30x on GRCh38
HG02465 female SAME1839973 GWD Gambian Mandinka AFR African Ancestry GWD 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38)
HG02490 male SAME1839625 PJL Punjabi SAS South Asian Ancestry PJL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG02601 female SAME1840293 PJL Punjabi SAS South Asian Ancestry PJL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG02922 female SAME1839441 ESN Esan AFR African Ancestry ESN 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG02923 male SAME1839436 ESN Esan AFR African Ancestry ESN 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG02924 male SAME1839516 ESN Esan AFR African Ancestry ESN 1000 Genomes 30x on GRCh38
HG02945 female SAME1839075 ESN Esan AFR African Ancestry ESN 1000 Genomes 30x on GRCh38
HG03052 female SAME1839496 MSL Mende AFR African Ancestry MSL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03054 male SAME1839507 MSL Mende AFR African Ancestry MSL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03056 male SAME1839518 MSL Mende AFR African Ancestry MSL 1000 Genomes 30x on GRCh38
HG03073 female SAME1839019 MSL Mende AFR African Ancestry MSL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03593 male SAME1839171 BEB Bengali SAS South Asian Ancestry BEB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03598 female SAME1839194 BEB Bengali SAS South Asian Ancestry BEB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03604 female SAME1840236 BEB Bengali SAS South Asian Ancestry BEB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03606 male SAME1840248 BEB Bengali SAS South Asian Ancestry BEB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03616 female SAME1839955 BEB Bengali SAS South Asian Ancestry BEB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03642 female SAME1839627 STU Tamil SAS South Asian Ancestry STU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03644 male SAME1839638 STU Tamil SAS South Asian Ancestry STU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03646 male SAME1839643 STU Tamil SAS South Asian Ancestry STU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03718 male SAME1839834 ITU Telugu SAS South Asian Ancestry ITU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03720 male SAME1839562 ITU Telugu SAS South Asian Ancestry ITU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03721 female SAME1839659 ITU Telugu SAS South Asian Ancestry ITU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38
HG03722 female SAME1839664 ITU Telugu SAS South Asian Ancestry ITU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
HG03890 male SAME1839864 STU Tamil SAS South Asian Ancestry STU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA06991 female SAME124936 CEU CEPH EUR European Ancestry CEU 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA06993 male SAME124934 CEU CEPH EUR European Ancestry CEU 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA06994 male SAME124931 CEU CEPH EUR European Ancestry CEU 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
NA06997 female SAME124930 CEU CEPH EUR European Ancestry CEU 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA18484 female SAME124101 YRI Yoruba AFR African Ancestry YRI 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA18485 male SAME124100 YRI Yoruba AFR African Ancestry YRI 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA18486 male SAME124099 YRI Yoruba AFR African Ancestry YRI 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
NA18488 female SAME124103 YRI Yoruba AFR African Ancestry YRI 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
NA18525 female SAME124672 CHB Han Chinese EAS East Asian Ancestry CHB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA18526 female SAME124673 CHB Han Chinese EAS East Asian Ancestry CHB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,90 Han Chinese high coverage genomes
NA18530 male SAME124484 CHB Han Chinese EAS East Asian Ancestry CHB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA18546 male SAME124263 CHB Han Chinese EAS East Asian Ancestry CHB 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA18939 female SAME123041 JPT Japanese EAS East Asian Ancestry JPT 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA18941 female SAME124311 JPT Japanese EAS East Asian Ancestry JPT 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA18943 male SAME124309 JPT Japanese EAS East Asian Ancestry JPT 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA18944 male SAME124314 JPT Japanese EAS East Asian Ancestry JPT 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA19017 female SAME124259 LWK Luhya AFR African Ancestry LWK 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA19019 female SAME124000 LWK Luhya AFR African Ancestry LWK 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA19020 male SAME124855 LWK Luhya AFR African Ancestry LWK 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA19025 male SAME124852 LWK Luhya AFR African Ancestry LWK 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA19648 female SAME123307 MXL Mexican Ancestry AMR American Ancestry MXL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA19649 male SAME123442 MXL Mexican Ancestry AMR American Ancestry MXL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA19650 male SAME124976 MXL Mexican Ancestry AMR American Ancestry MXL 1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release
NA19651 female SAME123123 MXL Mexican Ancestry AMR American Ancestry MXL 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA19700 male SAME124233 ASW African Ancestry SW AFR African Ancestry ASW 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA19701 female SAME124232 ASW African Ancestry SW AFR African Ancestry ASW 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA19702 male SAME124231 ASW African Ancestry SW AFR African Ancestry ASW 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA19704 female SAME124236 ASW African Ancestry SW AFR African Ancestry ASW 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release
NA20502 female SAME124358 TSI Toscani EUR European Ancestry TSI 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
NA20503 female SAME124357 TSI Toscani EUR European Ancestry TSI 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
NA20509 male SAME124354 TSI Toscani EUR European Ancestry TSI 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
NA20510 male SAME124562 TSI Toscani EUR European Ancestry TSI 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis
NA20845 male SAME123430 GIH Gujarati SAS South Asian Ancestry GIH 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA20853 female SAME123243 GIH Gujarati SAS South Asian Ancestry GIH 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA20854 female SAME123248 GIH Gujarati SAS South Asian Ancestry GIH 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
NA20858 male SAME123207 GIH Gujarati SAS South Asian Ancestry GIH 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release
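
The balance of the subset across superpopulations and sexes can be checked by tallying the rows above. The following is a minimal sketch, assuming the rows are available as whitespace-separated text exactly as listed. Because population names and data collections contain spaces, it reads only the first four fields by position and locates the superpopulation code by matching against the five codes that appear in the table. The helper names `parse_row` and `tally` are illustrative, and only four rows are embedded here to keep the example short.

```python
from collections import Counter

# The five superpopulation codes that appear in the sample table.
SUPERPOPULATIONS = {"AFR", "AMR", "EAS", "EUR", "SAS"}


def parse_row(line):
    """Extract the fields that can be read unambiguously from a flattened row."""
    tokens = line.split()
    sample, sex, biosample, population = tokens[:4]
    # The population name may span several words, so find the superpopulation
    # code by value rather than by position.
    superpopulation = next(t for t in tokens[4:] if t in SUPERPOPULATIONS)
    return {
        "sample": sample,
        "sex": sex,
        "biosample": biosample,
        "population": population,
        "superpopulation": superpopulation,
    }


def tally(lines):
    """Count samples per (superpopulation, sex), skipping blanks and the header."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("Sample_name"):
            continue
        row = parse_row(line)
        counts[(row["superpopulation"], row["sex"])] += 1
    return counts


# Four rows copied from the table above; in practice, pass all 100 rows.
rows = [
    "HG00100 female SAME125154 GBR British EUR European Ancestry GBR 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis",
    "HG00101 male SAME125153 GBR British EUR European Ancestry GBR 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis",
    "HG00403 male SAME123160 CHS Southern Han Chinese EAS East Asian Ancestry CHS 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release",
    "NA19700 male SAME124233 ASW African Ancestry SW AFR African Ancestry ASW 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release",
]

for (superpopulation, sex), count in sorted(tally(rows).items()):
    print(superpopulation, sex, count)
```

Running the same tally over the full table gives the per-group counts needed to confirm the stated balance.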