Haplogroup R1a

A 2014 study by Peter A. Underhill et al., using 16,244 individuals from over 126 populations from across Eurasia, concluded that there was "a compelling case for the Middle East, possibly near present-day Iran, as the geographic origin of hg R1a".

[10] According to Underhill et al. (2014), the downstream M417 (R1a1a1) subclade diversified into Z282 (R1a1a1b1a) and Z93 (R1a1a1b2) circa 5,800 years ago "in the vicinity of Iran and Eastern Turkey".

[13] Semino et al. (2000) proposed Ukrainian origins, and a postglacial spread of the R1a1 haplogroup during the Late Glacial Maximum, subsequently magnified by the expansion of the Kurgan culture into Europe and eastward.

[14] Spencer Wells proposes Central Asian origins, suggesting that the distribution and age of R1a1 point to an ancient migration corresponding to the spread of the Kurgan people in their expansion from the Eurasian steppe.

[15] According to Pamjav et al. (2012), R1a1a diversified in the Eurasian Steppes or the Middle East and Caucasus region: "Inner and Central Asia is an overlap zone for the R1a1-Z280 and R1a1-Z93 lineages [which] implies that an early differentiation zone of R1a1-M198 conceivably occurred somewhere within the Eurasian Steppes or the Middle East and Caucasus region as they lie between South Asia and Central- and Eastern Europe."

According to those studies, haplogroups R1b and R1a, now the most common in Europe (R1a is also common in South Asia), would have expanded from the Pontic–Caspian steppes along with the Indo-European languages; they also detected an autosomal component present in modern Europeans but absent in Neolithic Europeans, which would have been introduced with the paternal lineages R1b and R1a, as well as the Indo-European languages.

[17][18][19] Silva et al. (2017) noted that R1a in South Asia most "likely spread from a single Central Asian source pool, there do seem to be at least three and probably more R1a founder clades within the Indian subcontinent, consistent with multiple waves of arrival."

[26] According to Marc Haber, the absence of haplogroup R1a-M458 in Afghanistan does not support a Pontic-Caspian steppe origin for the R1a lineages in modern Central Asian populations.

[1] A number of studies from 2006 to 2010 concluded that South Asian populations have the highest STR diversity within R1a1a,[36][37][12][3][1][38] and correspondingly older TMRCA datings.

[1][38] From these findings some researchers argued that R1a1a originated in South Asia,[37][1][note 5] excluding a more recent, yet minor, genetic influx from Indo-European migrants in northwestern regions such as Afghanistan, Balochistan, Punjab, and Kashmir.

[note 9] Part of the South Asian genetic ancestry derives from west Eurasian populations, and some researchers have implied that Z93 may have come to India via Iran[43] and expanded there during the Indus Valley civilization.

[2][44] Mascarenhas et al. (2015) proposed that the roots of Z93 lie in West Asia, and that "Z93 and L342.2 expanded in a southeasterly direction from Transcaucasia into South Asia",[43] noting that such an expansion is compatible with "the archeological records of eastward expansion of West Asian populations in the 4th millennium BCE culminating in the so-called Kura-Araxes migrations in the post-Uruk IV period."

According to Underhill et al. (2014), the diversification of Z93 and the "early urbanization within the Indus Valley ... occurred at [5,600 years ago] and the geographic distribution of R1a-M780 (Figure 3d[note 11]) may reflect this."

As of 2025, ten ancient basal R1a* genotypes have been recovered and published, from remains found in Estonia, Poland, Russia, and Ukraine; the oldest sample (Vasilevka 497) is dated to c. 8700 BCE and was excavated at Vasylivka, Bakhmut Raion, Donetsk Oblast.

[37] Underhill et al. (2009) reported 1/51 in Norway, 3/305 in Sweden, 1/57 among Greek Macedonians, 1/150 (or 2/150) among Iranians, 2/734 among ethnic Armenians, 1/141 among Kabardians, 1/121 among Omanis, 1/164 in the United Arab Emirates, and 3/612 in Turkey.
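The counts above are easier to compare as percentages. The following is a minimal sketch in Python that performs the conversion; the dictionary layout and the function name `percent` are illustrative assumptions, and for Iranians the lower of the two quoted counts (1/150) is used.

```python
# Counts as quoted in the text: population -> (carriers, sample size).
# Assumption: for Iranians the lower quoted count (1/150) is used.
counts = {
    "Norway": (1, 51),
    "Sweden": (3, 305),
    "Greek Macedonians": (1, 57),
    "Iranians": (1, 150),
    "Armenians": (2, 734),
    "Kabardians": (1, 141),
    "Omanis": (1, 121),
    "United Arab Emirates": (1, 164),
    "Turkey": (3, 612),
}


def percent(carriers, sample_size):
    """Return the share of carriers in a sample as a percentage."""
    return 100.0 * carriers / sample_size


percentages = {pop: percent(k, n) for pop, (k, n) in counts.items()}

# Print populations from highest to lowest share.
for pop, pct in sorted(percentages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pop}: {pct:.2f}%")
```

With samples this small, a single additional carrier changes the percentage noticeably (1/150 versus 2/150 doubles the Iranian figure), so the values are best read as rough indications.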

[16] R-M458 is a mainly Slavic SNP, characterized by its own mutation, and was first called cluster N. Underhill et al. (2009) found it to be present in modern European populations roughly between the Rhine catchment and the Ural Mountains and traced it to "a founder effect that ... falls into the early Holocene period, 7.9±2.6 KYA."

[61] R1a1a1b1a1a (R-L260), commonly referred to as West Slavic or Polish, is a subclade of the larger parent group R-M458, and was first identified as an STR cluster by Pawlowski et al. (2002).

[72] According to archaeologist David Anthony, the paternal R1a-Z93 was found at the Oskol river near a no-longer-existing kolkhoz "Alexandria", Ukraine, c. 4000 BCE, "the earliest known sample to show the genetic adaptation to lactase persistence (13910-T)."

[83] The skeletal remains of a father and his two sons, from an archaeological site discovered in 2005 near Eulau (in Saxony-Anhalt, Germany) and dated to about 2600 BCE, tested positive for the Y-SNP marker SRY10831.2.

The ancestral clade was thus present in Europe at least 4600 years ago, in association with one site of the widespread Corded Ware culture.
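The figure of "at least 4600 years" follows from the approximate burial date of 2600 BCE. A minimal sketch of that arithmetic in Python is given below; the function name `years_before` and the reference year of 2000 CE are illustrative assumptions, not values taken from the source.

```python
def years_before(bce_year, reference_year_ce):
    """Years elapsed between a BCE date and a CE reference year.

    The BCE/CE calendar has no year 0, hence the subtraction of 1.
    """
    return bce_year + reference_year_ce - 1


# Assumption: 2000 CE is used as a round reference year.
elapsed = years_before(2600, 2000)

# Round to the nearest century, matching the precision of "about 2600 BCE".
elapsed_rounded = round(elapsed, -2)
print(elapsed, elapsed_rounded)
```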

[14] Other groups with significant R1a1a, ranging from 27% to up to 58%, include Czechs, Poles, Slovenians, Slovaks, Moldovans, Belarusians, Rusyns, Ukrainians, and Russians.

[90][91] Vikings and Normans may have also carried the R1a1a lineage further out, accounting for at least part of the small presence in the British Isles, the Canary Islands, and Sicily.

[95] In Southern Europe R1a1a is not common, but significant levels have been found in pockets, such as in the Pas Valley in Northern Spain, areas of Venice, and Calabria in Italy.

[96][better source needed] The Balkans shows wide variation between areas with significant levels of R1a1a, for example 36–39% in Slovenia,[97] 27–34% in Croatia,[87][98][99][100][101] and over 30% in Greek Macedonia, but less than 10% in Albania, Kosovo and parts of Greece south of Olympus gorge.

Hundreds of Slovenian and Czech samples lack the Z92 subclade of Z280, while Poles, Slovaks, Croats and Hungarians show only a very low frequency of Z92.

[2] The Balts, East Slavs, Serbs, Macedonians, Bulgarians and Romanians show more Z280 than M458 and a high, in some cases predominant, share of Z92.

In Pakistan it is found at 80% among the Yusufzai tribe of Pashtuns (51%) from Swat District,[112] 71% among the Mohanna community in Sindh province to the south and 46% among the Baltis of Gilgit-Baltistan to the north.

[117] Note that Darya Boyi Village is located in a remote oasis formed by the Keriya River in the Taklamakan Desert.

According to Changmai et al. (2022), these haplogroup frequencies originate from South Asians, who have left a cultural and genetic legacy in Southeast Asia since the first millennium CE.

Several populations studied have shown no sign of R1a1a, while the highest levels so far discovered in the region appear to belong to speakers of the Karachay-Balkar language, among whom about one quarter of men tested so far are in haplogroup R1a1a.

Map showing frequency of R1a haplogroup in Europe
R1a origins (Underhill 2009);[3] R1a1a origins (Pamjav et al. 2012); possible migration of R1a to the Baltic coast; and R1a1a oldest expansion and highest frequency (Underhill et al. 2014)
European middle-Neolithic period. Comb Ware culture c. 4200 – c. 2000 BCE
Corded Ware culture (c. 2900 – c. 2350 BCE)
Frequency distribution of R-M458
Distribution of R1a (purple) and R1b (red)