Reference panel for imputation of Indian SNP data

India has one of the most diverse human gene pools in the world, with 4635 anthropologically well-defined populations. These populations speak languages from four major families, Indo-European (IE), Dravidian (DR), Austro-Asiatic (AA) and Tibeto-Burman (TB), and have maintained endogamy for thousands of years. We have earlier demonstrated that contemporary Indian populations are an admixture of two ancestral populations: Ancestral North Indians (ANI) and Ancestral South Indians (ASI) [1]. While ANI has genetic affinities with Middle Eastern, West Eurasian and European populations, ASI is not related to any group outside the Indian subcontinent. Hence, genetic data on Indian populations should be analyzed with caution.

To increase the statistical power of genome-wide association studies (GWAS), it is common practice to impute missing genotypes with various tools, including Beagle [2]. Since this tool needs a reference panel of haplotypes for imputation, we used existing reference panels: African, European and East Asian HapMap population samples; South Asians of the 1000 Genomes Project [3]; Indian population samples (Indo-Europeans and Dravidians); and combined HapMap and Indian population samples. We found that the Indian reference samples performed better than the other reference samples. Based on this, we generated our own reference panel (8,717,71 SNPs) for imputation, which includes the haplotypes of the founders in 15 Dravidian trios and 13 Indo-European trios [4,5].
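Comparing reference panels requires some measure of imputation accuracy. One widely used measure is genotype concordance: known genotypes are masked, imputed against a panel, and then compared with the original calls. The Python sketch below illustrates that idea only; it is not the script distributed with this panel, and the function name, the 0/1/2 genotype coding and the example values are assumptions made for illustration.

```python
from typing import Optional, Sequence


def genotype_concordance(
    true_genotypes: Sequence[Optional[int]],
    imputed_genotypes: Sequence[Optional[int]],
) -> float:
    """Return the fraction of compared genotypes that were imputed correctly.

    Assumed coding (hypothetical, for illustration): each genotype is the
    count of alternate alleles (0, 1 or 2); None marks a missing call.
    Positions where either value is missing are skipped.
    """
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must have the same length")

    compared = 0
    matches = 0
    for truth, imputed in zip(true_genotypes, imputed_genotypes):
        if truth is None or imputed is None:
            continue  # nothing to compare at this site
        compared += 1
        if truth == imputed:
            matches += 1

    if compared == 0:
        raise ValueError("no genotypes available for comparison")
    return matches / compared


# Made-up example values: masked true calls and the calls imputed for them.
masked_truth = [0, 1, 2, 1, 0, None, 2, 1]
imputed_calls = [0, 1, 2, 0, 0, 1, 2, 1]
concordance = genotype_concordance(masked_truth, imputed_calls)
print(f"concordance = {concordance:.3f}")
```

Running the same comparison once per candidate reference panel, on the same masked sites, gives one concordance value per panel that can be ranked directly.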

We believe that this Indian reference panel will be highly useful for those working on Indian population genetics. Hence, the data and the script (for examining imputation accuracy) are made freely available for research purposes.


Please feel free to contact us in case of any difficulty: /


