DNA Ethnicity
Debbie Kennett explains how to make sense of your DNA test – and how you can follow your roots around the world
DNA testing has now become an integral part of family history research. For genealogy purposes it’s the list of cousin matches that is the most important part of the test, but all of the companies also provide a biogeographical ancestry report where they try to assign our DNA to different countries or regions of the world. These reports are becoming increasingly granular in nature. They can sometimes help to inform your research, and will occasionally lead to surprise discoveries.
How Do Companies Calculate Your Ethnicity?
Each company tries to put together a panel of reference populations. These are modern individuals who are deemed to be representative of a particular country, region or population. The samples are collected either from publicly available research projects or from the companies’ own databases. The research samples are taken from projects such as the 1000 Genomes Project, the Human Genome Diversity Project, the Simons Genome Diversity Project, the Asian Diversity Project and the People of the British Isles Project. Samples are generally collected from people with deep roots in a particular location. For some projects the requirement is that the individual should have four grandparents from the same country, region or county.
The next stage is to analyse the reference samples and to put them into a predefined number of genetic clusters. Outliers are removed. The clusters are then given names based on the present-day countries or regions they represent. The clusters will inevitably have some overlap because our DNA is not confined to modern political boundaries, so a French cluster might also include people from other countries in North-West Europe.
Each individual customer’s DNA is then compared with the reference populations, and you are given percentages based on your closest matching populations. You can of course only be matched to populations represented in the company database. For example, if you are from Denmark and the company has no Danish samples, you will be matched to the next closest population.
The companies sell most of their tests in Europe, North America and Australasia, and it is these populations that are best represented in the databases. It is now possible to get quite granular results for people of European ancestry, but the results for people of African or Asian heritage are much less detailed and can often be disappointing. Academic research has also focused on Europe, although this is slowly starting to change.
The testing companies started out by comparing individual markers that are mainly found in specific populations. These are known as ‘ancestry informative markers’. Living DNA and AncestryDNA have adopted a different approach and use ‘haplotypes’ – markers that are linked together. This approach is more representative of recent ancestry within the past 500 years or so. Each company uses different algorithms and compares your results to different reference populations, so results vary from company to company.
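To make the comparison step more concrete, here is a minimal sketch in Python of how a customer's markers might be turned into percentages against a set of reference populations. It is an illustration only, not the method used by any of the companies named here, whose algorithms are proprietary and differ from one another. The function name `estimate_ancestry`, the three populations (`Region A`, `Region B`, `Region C`) and all of the frequencies are invented for the example. It uses a standard expectation-maximisation update for mixture proportions and treats every marker independently, which is closer to the older individual-marker approach than to haplotype-based methods.

```python
def estimate_ancestry(genotype, reference_freqs, iterations=200):
    """Estimate ancestry proportions with a simple EM algorithm.

    genotype: list of 0/1/2 counts of a chosen allele at each marker.
    reference_freqs: dict mapping a population name to a list of that
        allele's frequency at each marker (values strictly between 0 and 1).
    Returns a dict mapping each population name to its estimated share.
    """
    populations = list(reference_freqs)
    n_markers = len(genotype)
    # Start by assuming an equal share from every reference population.
    q = {pop: 1.0 / len(populations) for pop in populations}

    for _ in range(iterations):
        updated = {pop: 0.0 for pop in populations}
        for j, g in enumerate(genotype):
            # Probability of seeing the counted allele, or the other allele,
            # at marker j under the current mixture of populations.
            p_allele = sum(q[pop] * reference_freqs[pop][j] for pop in populations)
            p_other = sum(q[pop] * (1.0 - reference_freqs[pop][j]) for pop in populations)
            for pop in populations:
                f = reference_freqs[pop][j]
                # Credit each population with its expected share of the
                # g copies of the counted allele and 2 - g copies of the other.
                updated[pop] += g * q[pop] * f / p_allele
                updated[pop] += (2 - g) * q[pop] * (1.0 - f) / p_other
        # Each marker contributes two allele copies in total.
        q = {pop: updated[pop] / (2 * n_markers) for pop in populations}
    return q


# Invented allele frequencies for three hypothetical reference populations.
REFERENCE = {
    "Region A": [0.9, 0.8, 0.1, 0.2, 0.7, 0.3],
    "Region B": [0.2, 0.3, 0.8, 0.9, 0.4, 0.6],
    "Region C": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}

# A hypothetical customer whose markers resemble Region A most closely.
customer = [2, 2, 0, 0, 2, 0]

proportions = estimate_ancestry(customer, REFERENCE)
for pop, share in proportions.items():
    print(f"{pop}: {share:.1%}")
```

The sketch also shows why you can only be matched to populations in the database: the estimate is always a split between the reference populations supplied, whatever the customer's real origins. A real test uses hundreds of thousands of markers rather than six.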
The companies are also constantly striving to improve their results, so your results with any one company will change over time. For example, at AncestryDNA I came out as 21 per cent Great Britain when there were just 3,000 people in the reference panel. But when the panel increased to 16,000 my results changed and I was 94 per cent England and Wales. With its most recent update Ancestry now has 40,000 reference samples and I became 80 per cent England, Wales and North-West Europe.
Both MyHeritage and FamilyTreeDNA are working on updating their ancestry estimates, and we can expect to see these rolled out later this year. FamilyTreeDNA provided a preview of myOrigins 3.0 at the family history show RootsTech in Salt Lake City, Utah, in February. The number of its reference populations will increase from 24 to 90. The number of populations in Africa will increase from four to about 21, and there will be four distinct Jewish populations.
Interpreting The Results
In general, the percentages are most accurate at the continental level. Within Europe the companies can broadly separate North-Western European, Southern European and Eastern European ancestry, even if the countries assigned within these regions are not correct. The ancestry proportions with the largest percentages tend to be the most reliable. Small percentages under 1 per cent are often nothing more than noise, and percentages under about 20 per cent will not necessarily provide a true reflection of your recent ancestry.
If you find that a company gives you 8 per cent Iberian or 9 per cent Italian and you have no documented ancestry from these locations, you shouldn’t start looking for Spanish or Italian ancestors in your family tree. You’ll probably find that these percentages disappear the next time the company updates its product. Results are more likely to be reliable if the admixture is found consistently in the results from a number of different companies. The most reliable indicator of your ancestry is not your ethnicity estimate but the
the early 1800s. However, the results can be expected to improve as more people join the database.
MyHeritage is working on a similar feature known as genetic groups, which was previewed at its user conference in Amsterdam in September. MyHeritage has a large customer base in continental Europe, Finland and Scandinavia, so we can look forward to good sub-regional resolution in these locations.
Living DNA has specialised in providing fine-scale regional breakdowns in the UK. Rather than assign you to communities or subregions, it incorporates the UK subregions in its main ancestry report. A combination of clustering and chromosome painting assigns people to subregions. If more than 20 per cent of your ancestry is assigned to Great Britain, your DNA is compared with the 21 subregions and the results are displayed on a map. Living DNA takes its UK reference data from the People of the British Isles Project, which collected samples from individuals who have four grandparents born in the same rural county.
Chromosome Painting
If you have distinctive ancestry from a particular location then it can be interesting to map your chromosome information so that you can assign specific segments to particular ancestors. This approach works best at the continental level. For example, if most of your ancestry is British but you have an ancestor from Africa or India then the African or Indian segments will be easy to identify. 23andMe is currently the only company offering an ethnicity chromosome painting feature allowing you to download the segment data for the ethnicity assignments, but FamilyTreeDNA’s myOrigins 3.0 update will also provide access to segment data and an ethnicity chromosome painting feature.
Conclusion
The most effective way of using DNA results is to combine your DNA matches with the genealogical records. However, the biogeographical ancestry reports are improving all the time, and can provide useful genealogical information. Everyone will have their own favourite company whose results correspond most closely with their expectations, and people with European ancestry receive the most granular results. As more people test we can expect to see more reference populations added to the databases, and can look forward to updated results.
DEBBIE KENNETT is the author of DNA and Social Networking (The History Press, 2011)