Having introduced the basics of archaeological use of DNA evidence, and discussed some other applications of DNA studies in archaeology, let’s take a look at the data relevant to the Southwest specifically. For modern populations in North America overall, there are some broad trends that have been identified in mitochondrial haplogroup distribution by region, as first elucidated by Joseph Lorenz and David Glenn Smith of UC Davis in 1996. They only looked at haplogroups A, B, C, and D, since haplogroup X had not yet been identified as a founding haplogroup at that time. Their results showed that there are definite patterns in haplogroup distributions by region. For the Southwest specifically, they found most groups showed very high levels of B and low levels of A, despite the fact that A was the most common haplogroup in their sample overall. The main Southwestern groups that showed high levels of A were the Athabascan-speaking tribes (Navajo and Apache), which is unsurprising since northern Athabascan groups, along with most other groups in the Arctic and Subarctic, are almost exclusively A, and it’s well established that the southern Athabascans immigrated into the Southwest from the north relatively recently. Some other Southwestern groups show some representation of A as well, which Lorenz and Smith attribute to intermixing with the Athabascans (although as I’ll discuss below this doesn’t seem to be the whole story). Similarly, the Navajos and Apaches showed substantial representation of B and C, unlike their northern cousins, and this is probably due to intermixing with the Pueblos and other Southwestern populations.
A subsequent study by Smith, Lorenz, and some of their students at Davis looked specifically at haplogroup X, which had been identified in both modern and ancient Native American samples by then and was established as a founding haplogroup. They found it widely distributed among modern populations speaking a variety of languages but particularly among speakers of Algonquian and Kiowa-Tanoan languages. The Kiowa-Tanoan connection is of particular interest for Southwestern purposes, of course, as this is one of the main language families spoken by the eastern Pueblos in New Mexico. In this case, haplogroup X was found in the Kiowa and Jemez samples. This is very interesting since the Jemez are Pueblo and the Kiowa are not, and the relationship between the Kiowa and the Tanoan-speaking Pueblos is a longstanding mystery. It’s hard to know how to interpret the haplogroup X data in this connection. Since X is so rare overall the fact that it is so concentrated in certain groups seems meaningful somehow, but since it’s still pretty rare in those groups and little follow-up research on this has since been done it remains quite mysterious.
Turning to the ancient evidence, the first work in the Southwest was associated mostly with the University of Utah. In 1996 Ryan Parr, Shawn Carlyle, and Dennis O’Rourke published a paper reporting on aDNA research on the remains of 47 Fremont individuals from the Great Salt Lake area, 30 of which could be assigned to a haplogroup. The Fremont have always been something of a mystery, with many Southwestern cultural features but living on the northern fringes of the Southwest and having some notable differences from Pueblo cultures to the south. What the Utah researchers found, however, seemed to show the Fremont patterning genetically with the Pueblos rather than with other groups in the Great Basin or Plains. Haplogroup A was completely missing from their sample, while B was by far the most common haplogroup and C and D were also present in small numbers. This seems to clearly rule out one theory about the Fremont, which is that they were composed in part of Athabascans on their way south from the Subarctic, and also casts serious doubt on other theories linking them to later cultures on the Plains (where haplogroup A is also very common). It’s true that there is internal cultural variation within the construct “Fremont” and it’s quite possible there was genetic variation as well, but the Great Salt Lake Fremont were the furthest north of the identified subdivisions and the closest to the Plains, so if even they show more genetic similarities to the Southwest that is strong evidence against theories associating them with areas to the north and east.
It’s also noteworthy that the Fremont distribution is in contrast to what Lorenz and Smith found among modern Numic peoples who now occupy the Fremont’s Great Basin home. The Numic Paiute/Shoshone sample that Lorenz and Smith looked at lacked haplogroup A, but it showed a very high proportion of haplogroup D (the highest in their whole study, in fact) and a low proportion of B and C. This doesn’t totally rule out some Fremont contribution to Numic ancestry, but it makes it seem unlikely that there was substantial genetic continuity between Fremont and Numic populations, which supports the “Numic Expansion” hypothesis for the late prehistory of the Great Basin. Smith and his student Frederika Kaestle later published a paper making this exact argument, using not only the Fremont data but additional ancient remains from the western Great Basin to argue that the differences in haplogroup frequencies supported a replacement of the earlier Basin inhabitants by the Numa.
Following up on this research, a subsequent paper by the same Utah researchers added in data from the Anasazi. They successfully assigned 27 Anasazi samples to haplogroups. Of these, 12 were from southeastern Utah, 9 were from Canyon del Muerto, 4 were from Canyon de Chelly, and 2 were from Chaco Canyon. Of the Chaco remains, one came from the debris in Room 56 at Pueblo Bonito, a part of the north burial cluster in Old Bonito which was very crudely worked over by Warren K. Moorehead in the 1890s. The other I can’t seem to find any specific information on. All of the Anasazi remains analyzed in this study were from the collections of the American Museum of Natural History, which makes me surprised that only two Chaco samples were involved. It’s possible that more were analyzed but only these two produced enough DNA to work with. In any case, if in fact there are more Chaco remains at the AMNH that have not yet been analyzed for DNA it would be very helpful to analyze them.
The results of this analysis were consistent with the standard archaeological understanding that the modern Pueblos are the descendants of the Anasazi. B was the most common haplogroup, with smaller levels of A and C. D wasn’t present at all, and two of the specimens didn’t fall into any of the four haplogroups, implying that they might have belonged to X. (The two Chaco samples belonged to haplogroups B and C; the sample from Room 56 belonged to haplogroup B.) Note that A is present here in populations dating well before any likely admixture with Athabascans, which is evidence against Lorenz and Smith’s contention that the presence of A in modern Pueblos can be attributed entirely to mixture with Athabascans.
Based on the dominance of B and low levels of other haplogroups, these researchers concluded that the Anasazi remains they analyzed were not significantly different from the Fremont remains they had analyzed earlier, adding further support to their contention that the Fremont pattern with the Pueblos. Note, however, that the Fremont hadn’t shown haplogroup A at all, while the Anasazi had it at a low but still respectable level (22%). Also, the Fremont showed a low level of haplogroup D, which the Anasazi didn’t have at all. These differences don’t necessarily mean the Fremont and Anasazi weren’t related, of course, but they do show how much that similarity is a judgment call supported by questionable statistics. In this case one big problem with the statistical analysis was treating the haplogroup frequencies as ratio-level data, which implies that they are meaningfully representative of the underlying populations despite the very small and non-random samples. This is highly implausible. This problem means that the authors’ conclusions about whether differences between samples were “significant” or not in a statistical sense are not really meaningful, since they can’t reasonably be expected to generalize to the populations, which are what we really care about.
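To give a rough sense of how little samples this small can pin down, here is a minimal sketch that computes approximate 95% Wilson score intervals for the share of haplogroup A in the two samples. The counts are my assumptions rather than figures quoted from the papers: 0 of 30 for the Fremont, and 6 of 27 for the Anasazi (the count that rounds to the 22% mentioned above). The calculation also makes the optimistic assumption that each sample was drawn at random from its population, which is exactly what is in doubt here, so the real uncertainty is presumably larger still.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb tiny floating-point overshoot
    return max(0.0, center - half), min(1.0, center + half)

# Assumed counts (not quoted from the papers): haplogroup A in each sample
fremont_lo, fremont_hi = wilson_interval(0, 30)
anasazi_lo, anasazi_hi = wilson_interval(6, 27)

print(f"Fremont A: {fremont_lo:.3f} to {fremont_hi:.3f}")
print(f"Anasazi A: {anasazi_lo:.3f} to {anasazi_hi:.3f}")
```

Under these assumptions the two intervals overlap slightly, and each spans a wide range of possible population frequencies, which fits the point that calling the two samples “similar” or “different” is largely a judgment call.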
In addition, as Connie Mulligan pointed out in the general paper on aDNA that I discussed previously, the differences that the Davis researchers found between the haplogroup frequencies of the Fremont and Numic samples, which they used as evidence of a lack of population continuity, were actually quite similar statistically to the differences the Utah researchers found between the Fremont and Anasazi, which they interpreted as not being significant! This disconnect goes to show that there’s actually quite a bit of subjective judgment in interpreting results like this, despite the superficial impression of “objective” statistical data.
One way to overcome this confusion would be to increase the number of samples analyzed and try to make them as close to representative of the underlying populations as possible. That would certainly help, but the fundamental problem of defining the ancient population of interest, and the apparent impossibility of analyzing a sample from it that could be assumed to be truly representative, are daunting challenges. A more productive approach, which subsequent research has in fact been following, is to do more in-depth analysis of available samples, so that more detailed data than crude haplogroup assignments are possible.
One way to do more in-depth analysis would be to move away from relying exclusively on haplogroup assignments and look instead at the nuclear genome. Sequencing the whole nuclear genome provides vastly more, and more statistically robust, information than mitochondrial haplogroup assignment, as commenter ohwilleke pointed out in response to my initial DNA post. Most of the studies from other parts of the world mentioned in my previous post have used this methodology, with very informative results. To my knowledge, however, this type of analysis has not been done on ancient remains from the American Southwest. I’m not sure why exactly, but there are various reasons, including cost and the level of preservation of the remains, that could account for this lacuna.
Instead, Southwestern researchers have mostly doubled down on mitochondrial haplotype analysis and extended its reach by looking at further mutations within the defined haplogroups to identify sub-haplogroups that can further narrow down genetic relationships. This has been a productive line of investigation, as exemplified by a very interesting paper from 2010 dealing with Chaco-era sites in the area of Farmington, New Mexico.
The paper, by Meradeth Snow and David Glenn Smith of Davis and Kathy Durand of Eastern New Mexico University, analyzed human remains from two sites on the B-Square Ranch, a large ranch that includes most of the land south of the San Juan River in Farmington. The ranch is owned by the Bolack family, which has long been prominent in local and statewide affairs. Its patriarch for many years was Tom Bolack, who was governor of New Mexico for a brief period in the 1960s and was also well known for his elaborate produce displays at the State Fair. His son Tommy Bolack, who took over management of the ranch when Tom died, has long had an interest in archaeology and did his own excavations in various of the many archaeological sites on the ranch. In recent years rather than continuing his own excavations he has worked with Linda Wheelbarger, a professional archaeologist who teaches at San Juan College in Farmington, to conduct field schools in the summers for SJC students as well as analyses of artifacts and human remains from both these recent excavations and his own earlier amateur work.
Among these analyses was the aDNA analysis of remains that Bolack excavated from the Tommy and Mine Canyon sites, two small-house sites on the ranch dating to the Chaco era. The Tommy site is slightly earlier, dating to approximately AD 800 to 1100, while the Mine Canyon site dates to approximately AD 1100 to 1300. Since the Tommy site seems to have been abandoned at approximately the same time the Mine Canyon site was founded, one obvious interpretation is that the Mine Canyon site was founded by the same people who had previously lived at the Tommy site. The DNA evidence, however, challenges this interpretation and suggests a more complicated story.
For this study, 73 samples were sent to Davis for aDNA analysis. This included a mix of tooth and bone samples. Of these samples, 48 (65.7%) could be assigned to a mitochondrial haplogroup. Of these, 36 were from the Tommy site and 12 from the Mine Canyon site.
The successfully analyzed samples from the Tommy site showed a typical distribution of haplogroups for a Southwestern population: 3% A, 69% B, 14% C, and 14% D. (This study didn’t look for haplogroup X, and all successfully analyzed samples fell into one of the other founding haplogroups.) The Mine Canyon sample, however, showed a very unusual distribution: 58% A, 33% B, 8% C, and 0% D. This is an exceptionally high proportion of haplogroup A, which is generally fairly rare in the Southwest except in Athabascan groups which are generally thought to have arrived in the region well after these sites were abandoned. Haplogroup A is also very common in Mesoamerica, which makes its dominance in a Chaco-associated site particularly intriguing given the evidence for contact with Mexico seen at Chaco Canyon itself and some outlying Chacoan sites.
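Since the percentages hide just how few individuals are involved, it can help to convert them back into counts. The sketch below does that, assuming 36 haplogroup-assigned samples from the Tommy site and 12 from Mine Canyon (the split that adds up to the 48 assigned samples and reproduces the reported percentages). The resulting counts are my back-calculation from rounded figures, not numbers quoted from the paper.

```python
def counts_from_percentages(n, percentages):
    """Back-compute whole-sample counts from rounded percentages."""
    return {hap: round(n * pct / 100) for hap, pct in percentages.items()}

# Reported percentages; the sample sizes (36 and 12) are assumptions
tommy = counts_from_percentages(36, {"A": 3, "B": 69, "C": 14, "D": 14})
mine_canyon = counts_from_percentages(12, {"A": 58, "B": 33, "C": 8, "D": 0})

print("Tommy:", tommy, "total", sum(tommy.values()))
print("Mine Canyon:", mine_canyon, "total", sum(mine_canyon.values()))
```

On these assumptions, the striking 58% figure for haplogroup A at Mine Canyon comes down to 7 individuals, and the 3% at the Tommy site to a single one.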
The authors are careful to note that these are very small sample sizes, which makes sampling bias a very real possible explanation for this sort of striking result. They compare these distributions to several other ancient and modern Southwestern and Mesoamerican populations using Fisher’s exact test and find, unsurprisingly, that the Tommy site sample isn’t significantly different from other ancient Southwestern populations but is significantly different from all the modern populations as well as the ancient Mesoamerican ones. The Mine Canyon sample, on the other hand, was found to be significantly different from all the ancient Southwestern samples as well as all the modern Southwestern ones except the Athabascan Navajo and Apache, while it wasn’t significantly different from any of the ancient or modern Mesoamerican samples. This result is clearly driven primarily by the unusually high proportion of haplogroup A at Mine Canyon, which means it doesn’t really add much to the paper. Although Fisher’s exact test does take into account the small sample sizes, it doesn’t address the more fundamental problem with this sort of use of statistics on this type of data, which can’t really be trusted to be representative of the underlying population of interest. This is the sort of thing I was talking about in the earlier post under the somewhat tongue-in-cheek label of “elaborate statistical techniques” on data that don’t necessarily fit the necessary requirements for their use. This sort of technique is not actually very elaborate compared to more sophisticated statistical analyses used for studies of whole genomes, where the number of data points is immense and they can actually be assumed to be representative of the analyzed individual’s full ancestry.
Calculating P-values for differences between two samples based on four data points for each, when neither sample is necessarily representative of its underlying population of interest, is not very useful, though it is very common in mtDNA studies, at least in the Southwest. To their credit, the authors of this paper are well aware of the weaknesses of this part of it and are careful to downplay the significance of the statistical analysis.
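For readers curious what a test like this actually computes, here is a minimal sketch of a two-sided Fisher’s exact test on a 2x2 table, comparing haplogroup A against everything else at the two sites. This is not the authors’ analysis: they compared the full four-haplogroup distributions against many other populations, and the counts here (1 of 36 at the Tommy site, 7 of 12 at Mine Canyon) are my assumptions, back-calculated from the rounded percentages.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Assumed counts: rows are sites, columns are (haplogroup A, not A)
p_value = fisher_exact_2x2(1, 35, 7, 5)
print(f"Tommy vs. Mine Canyon, A vs. non-A: p = {p_value:.6f}")
```

The P-value comes out very small, but all the test says is that a split this lopsided would be unlikely if both sets of remains were random draws from a single population. It says nothing about whether either set of remains represents the community it came from, which is the real issue.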
With these intriguing preliminary results, the researchers attempted further sequencing to identify more specific mutations that might define sub-haplogroups and clarify relationships on a more granular scale. Of the 48 samples that could be assigned to haplogroups, 23 were successfully sequenced for mutations in a region of the mitochondrial genome known to be highly variable. (Note how small the sample gets with subsequent levels of analysis.) Poor preservation was a major problem at this point, and there wasn’t enough genetic material remaining to construct the sort of network diagram that is often included in papers like this, showing specific mutations and the relationships they imply between specific ancient and modern samples.
The most interesting results from this further sequencing were with haplogroup A. Of the 8 samples initially identified as belonging to this haplogroup, 6 samples from the Mine Canyon site showed two distinctive mutations that are otherwise known only from 3 modern Zuni samples, along with one Tohono O’odham and one Chumash sample. Importantly, this set of mutations is unknown from both Mesoamerican and Athabascan groups. This is strong evidence that the dominance of haplogroup A at the Mine Canyon site does not indicate either migration from Mesoamerica or an early Athabascan presence in the Southwest; instead, it seems that this site just happens to have had an unusually high proportion of a rare but natively Southwestern lineage which survived into modern times at Zuni (and may have had some connections further west). The samples belonging to haplogroup B similarly showed the dominance of a sub-haplogroup distinctive to the Southwest and unknown in Mesoamerica.
The differences between the Tommy site and the Mine Canyon site in haplogroup frequencies, while they may well be a function in part of the small sample sizes, may also provide evidence for complex population movements within the late prehistoric Southwest. The exact parameters of these movements can’t be defined until more evidence is available from other areas, however, especially Chaco Canyon and the Mesa Verde region.
Overall, despite the poor preservation of the samples involved, this study provides important support for a finding that has come out consistently across all lines of evidence relating ancient to modern Pueblo people: there is a lot of evidence for continuity over time on a regional scale with complex movements within the Southwest, but little to no evidence of significant population movement into or out of the Southwest in recent centuries. (There is a whole other debate about the extent of population movement into the Southwest much earlier, at the time when agriculture was first introduced, which I haven’t discussed much in these posts and which isn’t of much importance for the specific issue I’m addressing here.) I think there is a lot of potential for more detailed reconstruction of movement within the Southwest based on a combination of lines of evidence, but we’re certainly not there yet.
I’ve gotten some questions about how the DNA evidence relates to the issue of hierarchy at Chaco. I’ll have a fuller post on the evidence for social hierarchy, which I think is extensive, but the short answer is that DNA doesn’t really provide any evidence one way or the other on this point. Since all evidence points to a general pattern of population continuity in the Southwest at least since the introduction of agriculture, the genetic patterns of any elites that arose wouldn’t be likely to differ in any noticeable way from those of the commoners they rose from. Indeed, the one sample to be analyzed for mitochondrial DNA that is very likely to come from an elite Chacoan context, the sample from Room 56 at Pueblo Bonito, belonged to haplogroup B, the most common in both ancient and modern Southwestern populations. It’s theoretically possible to imagine an elite group immigrating into the Southwest from Mesoamerica, and theories have been proposed along these lines, but the DNA evidence doesn’t particularly support this, and it’s much more likely based on all lines of evidence that the rise of an elite at Chaco was a primarily indigenous development involving some indirect influence from Mexico but little to no permanent population movement over that distance.
This is the last substantive post in my series about “tracing the connections” between the ancient and modern Southwest, although I will probably do a follow-up post linking to all the others for the convenience of readers. Overall, I think these posts have shown that we have substantial evidence from various perspectives that the modern Pueblos are the descendants of the ancient Anasazi (and other prehistoric Southwestern groups), but the evidence we have so far is not sufficient to connect any specific ancient sites with any specific modern pueblos. I am hopeful, however, that that may change as more evidence comes in and we are able to tie together new data with the evidence we already have to make some more specific connections.
Carlyle SW, Parr RL, Hayes MG, & O’Rourke DH (2000). Context of maternal lineages in the Greater Southwest. American journal of physical anthropology, 113 (1), 85-101 PMID: 10954622
Kaestle FA, & Smith DG (2001). Ancient mitochondrial DNA evidence for prehistoric population movement: the Numic expansion. American journal of physical anthropology, 115 (1), 1-12 PMID: 11309745
Lorenz JG, & Smith DG (1996). Distribution of four founding mtDNA haplogroups among Native North Americans. American journal of physical anthropology, 101 (3), 307-23 PMID: 8922178
Smith DG, Malhi RS, Eshleman J, Lorenz JG, & Kaestle FA (1999). Distribution of mtDNA haplogroup X among Native North Americans. American journal of physical anthropology, 110 (3), 271-84 PMID: 10516561
Snow M, Durand K, & Smith D (2010). Ancestral Puebloan mtDNA in context of the greater southwest. Journal of Archaeological Science, 37 (7), 1635-1645 DOI: 10.1016/j.jas.2010.01.024