PCWorld (USA)

DNA testing is more detailed for white people

IT'S ALL IN THE DATA.

BY DIETER HOLGER
ILLUSTRATION BY HARRY CAMPBELL

As DNA tests for ancestry explode in popularity, a fundamental problem remains: The tests deliver more detailed results for people of European descent, as evidenced by the ethnicities and data that major DNA testing companies represent. While this bias should recede as more people take the tests and add their genetic data to the mix, the companies have some work to do before their kits can work reasonably well on a worldwide population.

In 2017, more people took DNA tests than in all the previous years combined (go.pcworld.com/mitr), according to the MIT Technology Review, and that number keeps climbing. According to the International Society of Genetic Genealogy (ISOGG; go.pcworld.com/isgg), more than 18 million people have tested their DNA to learn about their ethnic identity or to find relatives. DNA testing companies like AncestryDNA (go.pcworld.com/adna) and 23andMe (go.pcworld.com/23me) have become household names as a result, while new tests (go.pcworld.com/ldna) claiming more specialized results crop up every few years.

It’s easy to see the appeal. For $99, 23andMe and AncestryDNA simply require that you spit in a cup, send it off to a lab for testing, and then wait a matter of weeks to learn the ethnic breakdown of your genes by region. (See our comparison of these two popular kits starting on page 90.)

THE DATA PROBLEM

The risk for racial bias starts with the data used by DNA tests. AncestryDNA, for instance, bases its ethnicity estimate on a reference panel (go.pcworld.com/rfpn) sourced from the DNA of 16,638 people representing 43 different populations. The people in the reference panel are screened to ensure they represent a certain ethnicity strongly: “people with a long family history in one place or within one group,” the company explains. The screening involves controls, such as removing close relatives, to avoid skewing the ethnicity profile.

While this pre-screened data can identify ethnicity on a broad level, more detail comes only with more data. Every DNA test kit sent in adds to the company’s database. That’s why leading contenders AncestryDNA and 23andMe have some of the best estimates available: they have more customers, and therefore more data.

Because DNA tests like AncestryDNA and 23andMe were at first available only in the United States and have expanded mostly to European countries or former colonies, the customer base continues to be fairly uniform. ISOGG estimates that four-fifths of the people who have taken DNA tests are U.S. citizens, meaning their data reflects a population with majority European ancestry.

Challenges in funding and poor infrastructure make it more difficult to gather genetic data on underrepresented DNA groups like Africans, Asians, and indigenous peoples. Sarah Tishkoff (go.pcworld.com/stlb), a professor at the University of Pennsylvania who has studied African genomics for 18 years, told PCWorld, “Right now, it’s not possible to infer the exact sources of ancestry of African Americans, and it would be unfortunate if they have the expectation that they will be able to get that information.” Tishkoff said that gathering a more diverse set of DNA data brings its own challenges, both financial and ethical. “There needs to be better funding and resources for generating that data. It’s also important to do the research in an ethical manner. I personally think there should be caution about using information from indigenous populations for commercial purposes such as ancestry testing.”

REGIONAL REPRESENTATION: A BREAKDOWN

Now that you know where the data for these DNA tests comes from, the ethnicity breakdowns should be no surprise. Both AncestryDNA and 23andMe skew toward people of European descent.

AncestryDNA is the most popular DNA test in the world, having sampled more than 10 million people. Yet 296 of the 392 ethnic regions it represents are for people of European heritage. That’s more than three-fourths European.

23andMe, the world’s second-most popular DNA test, became more representative of non-European ethnicities earlier this year after it added regions for Asia (go.pcworld.com/xanc) and Africa (go.pcworld.com/nwaf). The company has tested the DNA of more than 5 million people.

Of the ethnicities represented in the test’s Ancestry Composition panel, 52 of 171, or 30 percent, are European. That’s nearly 50 percent more regions than East Asia, which has the second highest number.

What’s more, half of the DNA reference samples 23andMe uses to test a customer’s genes and estimate ethnicity come from Europeans, suggesting it’s better at evaluating people of European descent.

AncestryDNA also has a disproportionately high number of reference samples (go.pcworld.com/rfpn) from people of European heritage. Of the 16,638 samples AncestryDNA uses, more than 65 percent come from people of European ethnicity.

Even though Africa is geographically larger than Europe, China, and the U.S. combined, AncestryDNA offers only 33 ethnic regions for people of African descent, while 23andMe has 34 regions. Compare that to the 296 regions AncestryDNA offers for people of European descent, and 23andMe’s 52 regions.

In the case of AncestryDNA, many of these regions include migrations out of Europe. The company lists 173 ethnic regions where Europeans settled in America, South Africa, and elsewhere. It does something similar for African Americans, but only 24 of the 33 regions in its Africa category track the lineage of Africans forced into slavery.
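The regional shares quoted in this section are simple ratios, and readers who want to check them can do so in a few lines. The following is a minimal sketch in Python that uses only the region counts reported in this article; the names REGION_COUNTS and share are illustrative choices of ours, not anything published by either company.

```python
# Region counts exactly as quoted in this article; nothing here comes
# from the companies' own software or data files.
REGION_COUNTS = {
    "AncestryDNA": {"total": 392, "european": 296, "african": 33},
    "23andMe": {"total": 171, "european": 52, "african": 34},
}


def share(part: int, total: int) -> float:
    """Return part/total as a percentage, rounded to one decimal place."""
    return round(100 * part / total, 1)


for company, counts in REGION_COUNTS.items():
    european = share(counts["european"], counts["total"])
    african = share(counts["african"], counts["total"])
    print(f"{company}: {european}% European regions, {african}% African regions")
```

Run as written, the sketch reports European shares of 75.5 percent for AncestryDNA and 30.4 percent for 23andMe, in line with the "more than three-fourths" and "30 percent" figures above. It counts regions only and says nothing about how many reference samples or customers stand behind each region.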

HOW DNA TESTERS ARE DIVERSIFYING THEIR DATA

In a statement to PCWorld, an AncestryDNA spokesperson said the company plans for its test to include more than 500 regions by early 2019, with a particular focus on African American and Hispanic communities. To improve its test, AncestryDNA is gathering more DNA reference samples from around the world, updating its algorithms, and adding and updating the genetic markers of diverse global populations.

“Our company’s history is one of continued evolution and progress and our platform is constantly improving as more and more people participate through AncestryDNA and build family trees,” the spokesperson said.

When 23andMe first offered its ethnicity estimate in 2008, the company included only three regions. Now, it represents 171.

The rapid growth is a testament to how algorithms and big data can quickly improve genetic science. But there’s still more to be done.

To better serve underrepresented DNA groups, 23andMe launched the Global Genetics Project (go.pcworld.com/glgn) in February of this year to gather more genetic data. If you have a grandparent from one of 59 underrepresented countries, 23andMe provides you with a free test and access to its more than 90 genetic reports.

Joanna Mountain, senior director of research at 23andMe, told PCWorld in an interview that the Global Genetics Project has, in less than a year, already exceeded its original two-year goal of collecting 5,000 samples.

“We really have captured the genetic diversity of the world in a way that I would never have imagined 20 years ago,” she said.

Mountain said 23andMe is also collaborating with researchers and academics to gather more data and better educate the world about genetic science.

“Many people in this country and beyond have very little understanding of genetics and concerns about privacy,” Mountain said. “So there is a lot of education to be done.”

Mountain said 23andMe noticed early on that there was a bias in its reference sample data because it had more U.S. customers. “We have more representatives of Italy than we have of Devon (go.pcworld.com/dvon), [South Africa], for instance, which is not surprising given our customer base.”

But she said that doesn’t always mean 23andMe is less detailed for people of non-European descent. Someone from Mexico could learn about both their indigenous and Spanish ancestry, for example.

“It varies so much from person to person depending on your family’s history,” Mountain said. “You could at a very crude level say that Europeans might get a bit more detail, but that’s going to be very much variable.”

The good news is that 23andMe and AncestryDNA are regularly updating their models to improve the accuracy and detail of their tests.

“We are going to be looking where people get less detail and working to fill those gaps and to provide more detail to as many people as we can,” 23andMe’s Mountain said. “So that’s going to be something we continue to push on in the next five years.”

AncestryDNA’s ethnic regions. Colored areas represent locations that came up in PCWorld’s review (go.pcworld.com/anrv).

23andMe’s map of its ethnic populations.
