Benchmarking Resources
Our Benchmarking Sample Manifest includes predefined input files (such as CRAM/BAM and VCF) along with corresponding expected outputs in JSON format, generated by the reference QC tool.
We run our benchmarking pipelines on publicly available WGS data: specifically, Genome in a Bottle (GIAB) samples from the 1000 Genomes Phase 3 Reanalysis, accessible through the Registry of Open Data on AWS.
This dataset was generated with Illumina DRAGEN Germline, a pipeline widely adopted across the genomics community, including in major population-scale efforts such as the UK Biobank and the All of Us Research Program. To promote diversity and representation, we curated a subset of 100 samples from this dataset, listed in the table below, aiming for balanced coverage across 26 populations in five superpopulations and across both sexes.
| Sample_name | Sex | Biosample_ID | Population_code | Population_name | Superpopulation_code | Superpopulation_name | Population_elastic_ID | Data_collections |
|---|---|---|---|---|---|---|---|---|
| HG00100 | female | SAME125154 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| HG00101 | male | SAME125153 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| HG00102 | female | SAME123945 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| HG00103 | male | SAME125151 | GBR | British | EUR | European Ancestry | GBR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| HG00176 | female | SAME124956 | FIN | Finnish | EUR | European Ancestry | FIN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| HG00177 | female | SAME124957 | FIN | Finnish | EUR | European Ancestry | FIN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| HG00183 | male | SAME123642 | FIN | Finnish | EUR | European Ancestry | FIN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| HG00185 | male | SAME123648 | FIN | Finnish | EUR | European Ancestry | FIN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| HG00403 | male | SAME123160 | CHS | Southern Han Chinese | EAS | East Asian Ancestry | CHS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| HG00404 | female | SAME123158 | CHS | Southern Han Chinese | EAS | East Asian Ancestry | CHS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| HG00551 | female | SAME124252 | PUR | Puerto Rican | AMR | American Ancestry | PUR | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| HG00552 | male | SAME124253 | PUR | Puerto Rican | AMR | American Ancestry | PUR | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG00759 | female | SAME125020 | CDX | Dai Chinese | EAS | East Asian Ancestry | CDX | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG00766 | female | SAME124458 | CDX | Dai Chinese | EAS | East Asian Ancestry | CDX | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG00844 | male | SAME123952 | CDX | Dai Chinese | EAS | East Asian Ancestry | CDX | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01028 | male | SAME123524 | CDX | Dai Chinese | EAS | East Asian Ancestry | CDX | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01112 | male | SAME123485 | CLM | Colombian | AMR | American Ancestry | CLM | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| HG01113 | female | SAME123484 | CLM | Colombian | AMR | American Ancestry | CLM | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| HG01114 | female | SAME123483 | CLM | Colombian | AMR | American Ancestry | CLM | 1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release |
| HG01121 | male | SAME1840127 | CLM | Colombian | AMR | American Ancestry | CLM | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01500 | male | SAME124941 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01501 | female | SAME124942 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01502 | male | SAME124943 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01507 | female | SAME124948 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01565 | male | SAME124566 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01566 | female | SAME124564 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01567 | female | SAME124565 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01571 | male | SAME124346 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01583 | male | SAME1839895 | PJL | Punjabi | SAS | South Asian Ancestry | PJL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01595 | female | SAME123569 | KHV | Kinh Vietnamese | EAS | East Asian Ancestry | KHV | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01596 | male | SAME125227 | KHV | Kinh Vietnamese | EAS | East Asian Ancestry | KHV | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release |
| HG01597 | female | SAME125226 | KHV | Kinh Vietnamese | EAS | East Asian Ancestry | KHV | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01840 | male | SAME123552 | KHV | Kinh Vietnamese | EAS | East Asian Ancestry | KHV | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01879 | male | SAME1839037 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01880 | female | SAME1839109 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01881 | female | SAME1839193 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01882 | male | SAME122849 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG02461 | male | SAME1839993 | GWD | Gambian Mandinka | AFR | African Ancestry | GWD | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38) |
| HG02462 | female | SAME1839991 | GWD | Gambian Mandinka | AFR | African Ancestry | GWD | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38) |
| HG02463 | male | SAME1839983 | GWD | Gambian Mandinka | AFR | African Ancestry | GWD | 1000 Genomes 30x on GRCh38 |
| HG02465 | female | SAME1839973 | GWD | Gambian Mandinka | AFR | African Ancestry | GWD | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,Gambian Genome Variation Project (GRCh38) |
| HG02490 | male | SAME1839625 | PJL | Punjabi | SAS | South Asian Ancestry | PJL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG02601 | female | SAME1840293 | PJL | Punjabi | SAS | South Asian Ancestry | PJL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG02922 | female | SAME1839441 | ESN | Esan | AFR | African Ancestry | ESN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG02923 | male | SAME1839436 | ESN | Esan | AFR | African Ancestry | ESN | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG02924 | male | SAME1839516 | ESN | Esan | AFR | African Ancestry | ESN | 1000 Genomes 30x on GRCh38 |
| HG02945 | female | SAME1839075 | ESN | Esan | AFR | African Ancestry | ESN | 1000 Genomes 30x on GRCh38 |
| HG03052 | female | SAME1839496 | MSL | Mende | AFR | African Ancestry | MSL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03054 | male | SAME1839507 | MSL | Mende | AFR | African Ancestry | MSL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03056 | male | SAME1839518 | MSL | Mende | AFR | African Ancestry | MSL | 1000 Genomes 30x on GRCh38 |
| HG03073 | female | SAME1839019 | MSL | Mende | AFR | African Ancestry | MSL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03593 | male | SAME1839171 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03598 | female | SAME1839194 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03604 | female | SAME1840236 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03606 | male | SAME1840248 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03616 | female | SAME1839955 | BEB | Bengali | SAS | South Asian Ancestry | BEB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03642 | female | SAME1839627 | STU | Tamil | SAS | South Asian Ancestry | STU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03644 | male | SAME1839638 | STU | Tamil | SAS | South Asian Ancestry | STU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03646 | male | SAME1839643 | STU | Tamil | SAS | South Asian Ancestry | STU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03718 | male | SAME1839834 | ITU | Telugu | SAS | South Asian Ancestry | ITU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03720 | male | SAME1839562 | ITU | Telugu | SAS | South Asian Ancestry | ITU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03721 | female | SAME1839659 | ITU | Telugu | SAS | South Asian Ancestry | ITU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38 |
| HG03722 | female | SAME1839664 | ITU | Telugu | SAS | South Asian Ancestry | ITU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG03890 | male | SAME1839864 | STU | Tamil | SAS | South Asian Ancestry | STU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA06991 | female | SAME124936 | CEU | CEPH | EUR | European Ancestry | CEU | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA06993 | male | SAME124934 | CEU | CEPH | EUR | European Ancestry | CEU | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA06994 | male | SAME124931 | CEU | CEPH | EUR | European Ancestry | CEU | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| NA06997 | female | SAME124930 | CEU | CEPH | EUR | European Ancestry | CEU | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA18484 | female | SAME124101 | YRI | Yoruba | AFR | African Ancestry | YRI | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA18485 | male | SAME124100 | YRI | Yoruba | AFR | African Ancestry | YRI | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA18486 | male | SAME124099 | YRI | Yoruba | AFR | African Ancestry | YRI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| NA18488 | female | SAME124103 | YRI | Yoruba | AFR | African Ancestry | YRI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| NA18525 | female | SAME124672 | CHB | Han Chinese | EAS | East Asian Ancestry | CHB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA18526 | female | SAME124673 | CHB | Han Chinese | EAS | East Asian Ancestry | CHB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,90 Han Chinese high coverage genomes |
| NA18530 | male | SAME124484 | CHB | Han Chinese | EAS | East Asian Ancestry | CHB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA18546 | male | SAME124263 | CHB | Han Chinese | EAS | East Asian Ancestry | CHB | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA18939 | female | SAME123041 | JPT | Japanese | EAS | East Asian Ancestry | JPT | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA18941 | female | SAME124311 | JPT | Japanese | EAS | East Asian Ancestry | JPT | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA18943 | male | SAME124309 | JPT | Japanese | EAS | East Asian Ancestry | JPT | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA18944 | male | SAME124314 | JPT | Japanese | EAS | East Asian Ancestry | JPT | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA19017 | female | SAME124259 | LWK | Luhya | AFR | African Ancestry | LWK | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA19019 | female | SAME124000 | LWK | Luhya | AFR | African Ancestry | LWK | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA19020 | male | SAME124855 | LWK | Luhya | AFR | African Ancestry | LWK | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA19025 | male | SAME124852 | LWK | Luhya | AFR | African Ancestry | LWK | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA19648 | female | SAME123307 | MXL | Mexican Ancestry | AMR | American Ancestry | MXL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA19649 | male | SAME123442 | MXL | Mexican Ancestry | AMR | American Ancestry | MXL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA19650 | male | SAME124976 | MXL | Mexican Ancestry | AMR | American Ancestry | MXL | 1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release |
| NA19651 | female | SAME123123 | MXL | Mexican Ancestry | AMR | American Ancestry | MXL | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA19700 | male | SAME124233 | ASW | African Ancestry SW | AFR | African Ancestry | ASW | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA19701 | female | SAME124232 | ASW | African Ancestry SW | AFR | African Ancestry | ASW | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA19702 | male | SAME124231 | ASW | African Ancestry SW | AFR | African Ancestry | ASW | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA19704 | female | SAME124236 | ASW | African Ancestry SW | AFR | African Ancestry | ASW | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release |
| NA20502 | female | SAME124358 | TSI | Toscani | EUR | European Ancestry | TSI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| NA20503 | female | SAME124357 | TSI | Toscani | EUR | European Ancestry | TSI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| NA20509 | male | SAME124354 | TSI | Toscani | EUR | European Ancestry | TSI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,Human Genome Structural Variation Consortium, Phase 2,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| NA20510 | male | SAME124562 | TSI | Toscani | EUR | European Ancestry | TSI | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release,1000 Genomes phase 1 release,Geuvadis |
| NA20845 | male | SAME123430 | GIH | Gujarati | SAS | South Asian Ancestry | GIH | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA20853 | female | SAME123243 | GIH | Gujarati | SAS | South Asian Ancestry | GIH | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA20854 | female | SAME123248 | GIH | Gujarati | SAS | South Asian Ancestry | GIH | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| NA20858 | male | SAME123207 | GIH | Gujarati | SAS | South Asian Ancestry | GIH | 1000 Genomes on GRCh38,1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
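
The balance of the curated subset can be checked directly from the table above. The following is a minimal sketch, not part of the benchmarking pipeline itself: the function names `parse_markdown_table` and `tally` are hypothetical, and the embedded `EXCERPT` reuses four rows of the table only so that the example is self-contained. In practice the full table text would be passed in instead.

```python
from collections import Counter

# Four rows copied from the sample table, used only as demo input.
EXCERPT = """\
| Sample_name | Sex | Biosample_ID | Population_code | Population_name | Superpopulation_code | Superpopulation_name | Population_elastic_ID | Data_collections |
|---|---|---|---|---|---|---|---|---|
| HG00552 | male | SAME124253 | PUR | Puerto Rican | AMR | American Ancestry | PUR | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01502 | male | SAME124943 | IBS | Iberian | EUR | European Ancestry | IBS | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01567 | female | SAME124565 | PEL | Peruvian | AMR | American Ancestry | PEL | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
| HG01881 | female | SAME1839193 | ACB | African Caribbean | AFR | African Ancestry | ACB | 1000 Genomes 30x on GRCh38,1000 Genomes phase 3 release |
"""


def parse_markdown_table(text):
    """Parse a pipe-delimited markdown table into a list of dicts keyed by header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first table row holds the column names
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # skip the |---|---| separator row
        rows.append(dict(zip(header, cells)))
    return rows


def tally(rows):
    """Count samples per population, per sex, and per (superpopulation, sex)."""
    by_population = Counter(row["Population_code"] for row in rows)
    by_sex = Counter(row["Sex"] for row in rows)
    by_super_and_sex = Counter(
        (row["Superpopulation_code"], row["Sex"]) for row in rows
    )
    return by_population, by_sex, by_super_and_sex


rows = parse_markdown_table(EXCERPT)
by_population, by_sex, by_super_and_sex = tally(rows)

for code, count in sorted(by_population.items()):
    print(f"{code}: {count}")
print(dict(by_sex))
```

Counting by `(Superpopulation_code, Sex)` pairs rather than by each column separately makes it easier to spot a group that is balanced overall but skewed within one superpopulation.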