GA4GH WGS Quality Control Standards
Benchmarking Resources
Our Benchmarking Sample Manifest includes predefined input files (such as CRAM/BAM and VCF) along with corresponding expected outputs in JSON format, generated by the reference QC tool.
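As a rough illustration of how an implementation might be checked against the expected outputs, the sketch below compares one produced JSON result with its expected counterpart. The metric names (`mean_coverage`, `pct_reads_mapped`), the values, and the tolerance are hypothetical placeholders chosen for the example; they are not taken from the manifest or from the reference QC tool.

```python
import json
import math


def compare_metrics(expected, observed, rel_tol=1e-6):
    """Return a list of (metric, expected_value, observed_value) mismatches.

    Numeric values are compared with a relative tolerance; everything else
    (strings, booleans, missing keys) must match exactly.
    """
    mismatches = []
    for name, exp_value in expected.items():
        obs_value = observed.get(name)
        exp_is_number = isinstance(exp_value, (int, float)) and not isinstance(exp_value, bool)
        obs_is_number = isinstance(obs_value, (int, float)) and not isinstance(obs_value, bool)
        if exp_is_number:
            ok = obs_is_number and math.isclose(exp_value, obs_value, rel_tol=rel_tol)
        else:
            ok = obs_value == exp_value
        if not ok:
            mismatches.append((name, exp_value, obs_value))
    return mismatches


# Made-up example payloads (assumed field names, not real benchmark results).
expected = json.loads(
    '{"sample_id": "HG00100", "mean_coverage": 30.5, "pct_reads_mapped": 99.1}'
)
observed = json.loads(
    '{"sample_id": "HG00100", "mean_coverage": 30.5, "pct_reads_mapped": 98.0}'
)

mismatches = compare_metrics(expected, observed)
for name, exp_value, obs_value in mismatches:
    print(f"{name}: expected {exp_value}, observed {obs_value}")
```

An appropriate tolerance depends on the metric; exact equality may suit counts, while a small relative tolerance may suit floating-point ratios.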
We run our benchmarking pipelines on publicly available WGS data, specifically Genome in a Bottle (GIAB) samples from the 1000 Genomes Phase 3 Reanalysis, accessible through the Registry of Open Data on AWS.
The 1000 Genomes dataset generated by Illumina DRAGEN Germline is widely adopted across the genomics community, including in major population-scale efforts such as the UK Biobank and the All of Us Research Program. To promote diversity and representation, we curated a subset of 100 samples from this dataset, ensuring balanced coverage across multiple populations and both sexes.
Sample_name | Sex | Biosample_ID | Population_code | Population_name | Superpopulation_code | Superpopulation_name | Population_elastic_ID | Data_collections |
---|---|---|---|---|---|---|---|---|
HG00100 | female | SAME125154 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
HG00101 | male | SAME125153 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
HG00102 | female | SAME123945 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
HG00103 | male | SAME125151 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
HG00176 | female | SAME124956 | FIN | Finnish | EUR | European Ancestry | FIN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
HG00177 | female | SAME124957 | FIN | Finnish | EUR | European Ancestry | FIN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
HG00183 | male | SAME123642 | FIN | Finnish | EUR | European Ancestry | FIN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
HG00185 | male | SAME123648 | FIN | Finnish | EUR | European Ancestry | FIN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
HG00403 | male | SAME123160 | CHS | Southern Han Chinese | EAS | East Asian Ancestry | CHS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
HG00404 | female | SAME123158 | CHS | Southern Han Chinese | EAS | East Asian Ancestry | CHS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
HG00551 | female | SAME124252 | PUR | Puerto Rican | AMR | American Ancestry | PUR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
HG00552 | male | SAME124253 | PUR | Puerto Rican | AMR | American Ancestry | PUR | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG00759 | female | SAME125020 | CDX | Dai Chinese | EAS | East Asian Ancestry | CDX | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG00766 | female | SAME124458 | CDX | Dai Chinese | EAS | East Asian Ancestry | CDX | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG00844 | male | SAME123952 | CDX | Dai Chinese | EAS | East Asian Ancestry | CDX | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01028 | male | SAME123524 | CDX | Dai Chinese | EAS | East Asian Ancestry | CDX | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01112 | male | SAME123485 | CLM | Colombian | AMR | American Ancestry | CLM | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
HG01113 | female | SAME123484 | CLM | Colombian | AMR | American Ancestry | CLM | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
HG01114 | female | SAME123483 | CLM | Colombian | AMR | American Ancestry | CLM | 1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release |
HG01121 | male | SAME1840127 | CLM | Colombian | AMR | American Ancestry | CLM | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01500 | male | SAME124941 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01501 | female | SAME124942 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01502 | male | SAME124943 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01507 | female | SAME124948 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01565 | male | SAME124566 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01566 | female | SAME124564 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01567 | female | SAME124565 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01571 | male | SAME124346 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01583 | male | SAME1839895 | PJL | Punjabi | SAS | South Asian Ancestry | PJL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01595 | female | SAME123569 | KHV | Kinh Vietnamese | EAS | East Asian Ancestry | KHV | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01596 | male | SAME125227 | KHV | Kinh Vietnamese | EAS | East Asian Ancestry | KHV | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release |
HG01597 | female | SAME125226 | KHV | Kinh Vietnamese | EAS | East Asian Ancestry | KHV | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01840 | male | SAME123552 | KHV | Kinh Vietnamese | EAS | East Asian Ancestry | KHV | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01879 | male | SAME1839037 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01880 | female | SAME1839109 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01881 | female | SAME1839193 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG01882 | male | SAME122849 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG02461 | male | SAME1839993 | GWD | Gambian Mandinka | AFR | African Ancestry | GWD | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38) |
HG02462 | female | SAME1839991 | GWD | Gambian Mandinka | AFR | African Ancestry | GWD | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38) |
HG02463 | male | SAME1839983 | GWD | Gambian Mandinka | AFR | African Ancestry | GWD | 1000 Genomes 30x on GRCh38 |
HG02465 | female | SAME1839973 | GWD | Gambian Mandinka | AFR | African Ancestry | GWD | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38) |
HG02490 | male | SAME1839625 | PJL | Punjabi | SAS | South Asian Ancestry | PJL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG02601 | female | SAME1840293 | PJL | Punjabi | SAS | South Asian Ancestry | PJL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG02922 | female | SAME1839441 | ESN | Esan | AFR | African Ancestry | ESN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG02923 | male | SAME1839436 | ESN | Esan | AFR | African Ancestry | ESN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG02924 | male | SAME1839516 | ESN | Esan | AFR | African Ancestry | ESN | 1000 Genomes 30x on GRCh38 |
HG02945 | female | SAME1839075 | ESN | Esan | AFR | African Ancestry | ESN | 1000 Genomes 30x on GRCh38 |
HG03052 | female | SAME1839496 | MSL | Mende | AFR | African Ancestry | MSL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03054 | male | SAME1839507 | MSL | Mende | AFR | African Ancestry | MSL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03056 | male | SAME1839518 | MSL | Mende | AFR | African Ancestry | MSL | 1000 Genomes 30x on GRCh38 |
HG03073 | female | SAME1839019 | MSL | Mende | AFR | African Ancestry | MSL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03593 | male | SAME1839171 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03598 | female | SAME1839194 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03604 | female | SAME1840236 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03606 | male | SAME1840248 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03616 | female | SAME1839955 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03642 | female | SAME1839627 | STU | Tamil | SAS | South Asian Ancestry | STU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03644 | male | SAME1839638 | STU | Tamil | SAS | South Asian Ancestry | STU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03646 | male | SAME1839643 | STU | Tamil | SAS | South Asian Ancestry | STU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03718 | male | SAME1839834 | ITU | Telugu | SAS | South Asian Ancestry | ITU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03720 | male | SAME1839562 | ITU | Telugu | SAS | South Asian Ancestry | ITU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03721 | female | SAME1839659 | ITU | Telugu | SAS | South Asian Ancestry | ITU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38 |
HG03722 | female | SAME1839664 | ITU | Telugu | SAS | South Asian Ancestry | ITU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
HG03890 | male | SAME1839864 | STU | Tamil | SAS | South Asian Ancestry | STU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA06991 | female | SAME124936 | CEU | CEPH | EUR | European Ancestry | CEU | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA06993 | male | SAME124934 | CEU | CEPH | EUR | European Ancestry | CEU | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA06994 | male | SAME124931 | CEU | CEPH | EUR | European Ancestry | CEU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
NA06997 | female | SAME124930 | CEU | CEPH | EUR | European Ancestry | CEU | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA18484 | female | SAME124101 | YRI | Yoruba | AFR | African Ancestry | YRI | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA18485 | male | SAME124100 | YRI | Yoruba | AFR | African Ancestry | YRI | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA18486 | male | SAME124099 | YRI | Yoruba | AFR | African Ancestry | YRI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
NA18488 | female | SAME124103 | YRI | Yoruba | AFR | African Ancestry | YRI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
NA18525 | female | SAME124672 | CHB | Han Chinese | EAS | East Asian Ancestry | CHB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA18526 | female | SAME124673 | CHB | Han Chinese | EAS | East Asian Ancestry | CHB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,90 Han Chinese high coverage genomes |
NA18530 | male | SAME124484 | CHB | Han Chinese | EAS | East Asian Ancestry | CHB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA18546 | male | SAME124263 | CHB | Han Chinese | EAS | East Asian Ancestry | CHB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA18939 | female | SAME123041 | JPT | Japanese | EAS | East Asian Ancestry | JPT | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA18941 | female | SAME124311 | JPT | Japanese | EAS | East Asian Ancestry | JPT | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA18943 | male | SAME124309 | JPT | Japanese | EAS | East Asian Ancestry | JPT | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA18944 | male | SAME124314 | JPT | Japanese | EAS | East Asian Ancestry | JPT | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA19017 | female | SAME124259 | LWK | Luhya | AFR | African Ancestry | LWK | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA19019 | female | SAME124000 | LWK | Luhya | AFR | African Ancestry | LWK | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA19020 | male | SAME124855 | LWK | Luhya | AFR | African Ancestry | LWK | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA19025 | male | SAME124852 | LWK | Luhya | AFR | African Ancestry | LWK | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA19648 | female | SAME123307 | MXL | Mexican Ancestry | AMR | American Ancestry | MXL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA19649 | male | SAME123442 | MXL | Mexican Ancestry | AMR | American Ancestry | MXL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA19650 | male | SAME124976 | MXL | Mexican Ancestry | AMR | American Ancestry | MXL | 1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release |
NA19651 | female | SAME123123 | MXL | Mexican Ancestry | AMR | American Ancestry | MXL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA19700 | male | SAME124233 | ASW | African Ancestry SW | AFR | African Ancestry | ASW | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA19701 | female | SAME124232 | ASW | African Ancestry SW | AFR | African Ancestry | ASW | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA19702 | male | SAME124231 | ASW | African Ancestry SW | AFR | African Ancestry | ASW | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA19704 | female | SAME124236 | ASW | African Ancestry SW | AFR | African Ancestry | ASW | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
NA20502 | female | SAME124358 | TSI | Toscani | EUR | European Ancestry | TSI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
NA20503 | female | SAME124357 | TSI | Toscani | EUR | European Ancestry | TSI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
NA20509 | male | SAME124354 | TSI | Toscani | EUR | European Ancestry | TSI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
NA20510 | male | SAME124562 | TSI | Toscani | EUR | European Ancestry | TSI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
NA20845 | male | SAME123430 | GIH | Gujarati | SAS | South Asian Ancestry | GIH | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA20853 | female | SAME123243 | GIH | Gujarati | SAS | South Asian Ancestry | GIH | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA20854 | female | SAME123248 | GIH | Gujarati | SAS | South Asian Ancestry | GIH | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
NA20858 | male | SAME123207 | GIH | Gujarati | SAS | South Asian Ancestry | GIH | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
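The balance of the manifest can be checked by tallying rows per superpopulation and sex. The following is a minimal sketch that assumes the table above is available as pipe-delimited text lines in the same layout; the helper names `parse_row` and `tally` are illustrative, and only four rows are embedded here to keep the example short.

```python
from collections import Counter

HEADER = (
    "Sample_name | Sex | Biosample_ID | Population_code | Population_name | "
    "Superpopulation_code | Superpopulation_name | Population_elastic_ID | "
    "Data_collections |"
)

# Four rows copied from the manifest table; a full check would load all 100.
ROWS = [
    "HG00100 | female | SAME125154 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |",
    "HG00101 | male | SAME125153 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |",
    "HG00403 | male | SAME123160 | CHS | Southern Han Chinese | EAS | East Asian Ancestry | CHS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |",
    "HG00404 | female | SAME123158 | CHS | Southern Han Chinese | EAS | East Asian Ancestry | CHS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |",
]


def parse_row(header, line):
    """Split one pipe-delimited line into a dict keyed by the header names."""
    keys = [cell.strip() for cell in header.strip().rstrip("|").split("|")]
    values = [cell.strip() for cell in line.strip().rstrip("|").split("|")]
    return dict(zip(keys, values))


def tally(records):
    """Count samples per (superpopulation code, sex) pair."""
    return Counter((r["Superpopulation_code"], r["Sex"]) for r in records)


records = [parse_row(HEADER, line) for line in ROWS]
counts = tally(records)
for (superpop, sex), n in sorted(counts.items()):
    print(f"{superpop}\t{sex}\t{n}")
```

The `Data_collections` column is kept as a single string because some collection names contain commas themselves (for example "Human Genome Structural Variation Consortium, Phase 2"), so splitting it on commas would not be reliable.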