Ancient-DNA Study Identifies Originators of Indo-European Language Family

Findings resolve “steppe hypothesis” enigma

Photorealistic illustration of a wheel with a horse at the center and the word “horse” in different languages along the rim
Ancient DNA locates the geographic origin of Proto-Indo-Europeans and Proto-Anatolians and describes their early migrations out of the Caucasus Lower Volga and North Pontic regions of present-day Russia and Ukraine. The languages these cultures brought with them are depicted in this artist’s rendition as spokes of a wooden wheel, a technology that facilitated their dispersal. Some spokes are intact, denoting living languages; others are broken, hinting at those lost or known only from archaeology. The word for “horse,” an important animal in Indo-European culture, adorns the rim in diverse scripts from across Europe and Asia. A scattering of red ochre represents a part of early Indo-European burial rites. Image: Oliver Uberti

At a glance:

  • Ancient-DNA analyses identify a Caucasus Lower Volga people as the ancient originators of Proto-Indo-European, the precursor to the massive Indo-European language family.

  • The population lived on the Eurasian steppe within the borders of current-day Russia during the Copper Age about 6,500 years ago, data show.

  • Findings indicate members of the group mixed with people to the west to form the distinct genome of the Yamnaya people, who went on to spread Indo-European languages across the world.

A pair of landmark studies has genetically identified the originators of the massive Indo-European family of 400-plus languages.

Results of the international ancient-DNA studies, published Feb. 5 in Nature and supported in part by the National Institutes of Health and National Science Foundation, place these linguistic pioneers within the borders of current-day Russia during the Eneolithic or Copper Age about 6,500 years ago. They were spread from the steppe grasslands along the lower Volga River to the northern foothills of the Caucasus Mountains.

The discovery marks a collaborative triumph that builds on decades of work by linguists, archaeologists, and geneticists.

It provides the missing piece from the century-old “steppe hypothesis,” which positions the birthplace of Indo-European languages — and their precursor, Proto-Indo-European, which predated writing — on the Eurasian steppe, where Russia and Ukraine stand today.

Earlier work had pointed to the ancient Yamnaya people of the steppe as the originators of Proto-Indo-European, but a sticking point was that ancient speakers of one extinct branch of Indo-European languages didn’t have Yamnaya ancestry. Some geneticists, including co-senior author David Reich of Harvard Medical School and Harvard University, hypothesized that an even older population was the ultimate source. The new studies identify that population as the Caucasus Lower Volga people.

Authorship, funding, disclosures

“The genetic origin of the Indo-Europeans”

Reich is co-senior author with Ron Pinhasi of the University of Vienna. Leonid Vyazov is co-first author with Lazaridis, Patterson, and Anthony. Additional authors can be found in the paper.

The study was funded in part by Polish scientific project grant NCN OPUS 2015/17/B/HS3/01327, the Russian Science Foundation (grants 21-18-00026 and 22-18-00470), the Museum of the Institute of Plant and Animal Ecology (UB RAS) grant FWRZ-2021-0006, the National Institutes of Health (R01HG012287), the John Templeton Foundation (grant 61220), J.-F. Clin, the Allen Discovery Center (a Paul G. Allen Frontiers Group advised program of the Paul G. Allen Family Foundation), and the Howard Hughes Medical Institute. This study depended on support from the research computing group at HMS. Additional supporters can be found in the paper.

“A genomic history of the North Pontic Region from the Neolithic to the Bronze Age”

Lazaridis is co-first author with Alexey G. Nikitin of Grand Valley State University in Allendale, Michigan. Additional authors can be found in the paper.

The research was supported in part by the National Science Foundation (grants BCS-0922374 and BCS-2208558), the National Institutes of Health (grant HG012287), the John Templeton Foundation (grant 61220), J.-F. Clin, the Allen Discovery Center (a Paul G. Allen Frontiers Group advised program of the Paul G. Allen Family Foundation), and the Howard Hughes Medical Institute. Additional funders can be found in the paper.