Races, geography, and genetic clusters
Every so often, a correspondent asks me about something I haven't really given much thought to. Having had the question raised, I am attuned to things I might otherwise have passed over. You know how that happens?
One person asked me for my view on human races. My default view was Richard Lewontin's - that race is a social construct [RealAudio stream], and that the biological differences between "races" are smaller than the differences within a "race". But I hadn't given it much more thought than that. So I was rather surprised to read this on the Gene Expression blog, with which I normally find myself in agreement:
"represent each individual's genome as a point in a space of extremely high dimension, and define a race as a set of points whose distance from each other is less than some radius. These clusters map onto intuitive self-identified race with a very high degree of accuracy."
The idea of mapping genetic differences as coordinates in a space of the alleles in a genome is old. It was originally proposed by Sewall Wright, as a "space of genetic recombination", which became the "adaptive landscape". Clustering genomes in a genomic space is a way of identifying groups that share ancestry, although a similar approach in taxonomy known as phenetics failed because of several problems - the clusters were often not conserved when the variables changed, and the division was fundamentally arbitrary. Some proposed a 95% rule, others a 70% rule, and so on.
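The arbitrariness of the cutoff is easy to see in a toy case. What follows is only a minimal sketch, not the method of the quoted post or of any paper: the points are made-up coordinates rather than genomes, the function name cluster_by_radius is hypothetical, and "closer than some radius" is simplified to chains of close neighbours (single linkage), which is one reading among several.

```python
import math
from itertools import combinations


def cluster_by_radius(points, radius):
    """Group points into clusters, linking any two points whose
    Euclidean distance is below `radius` (single-linkage assumption)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative of i's cluster.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every pair of points closer than the radius.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < radius:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


# Hypothetical toy data: six "individuals" strung out along one axis.
points = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (6.0, 0.0), (7.0, 0.0), (12.0, 0.0)]

for radius in (1.2, 2.0, 4.0, 6.0):
    print(radius, len(cluster_by_radius(points, radius)))
```

On this toy data the same six points fall into four, three, two or one cluster as the radius grows, with nothing in the data to say which cutoff is the right one. That is the sense in which the 95% and 70% rules were arbitrary.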
The paper linked to in the quote, Tang et al. 2004, is by 12 authors from a number of institutions of repute, though, and it cannot be dismissed. The context is the medical problem of identifying genetic diseases by race. Some diseases, such as sickle cell anemia, are strongly "racial" in distribution, largely because the diseases come from a particular geographic region. And other authors, particularly Luigi Cavalli-Sforza and Marcus Feldman at Stanford, have argued that genes and cultural ethnicities, such as language, often covary. So perhaps there is something to race.
Racial classifications such as those found in the United States derive from the writings of the 18th century biologist Johann Blumenbach, who in 1775 assigned both physiological and psychological characters to five "races": Caucasian, Mongoloid, Malay, American, and Ethiopian (which became known as "Negroes"). Other countries did not usually weave this scheme so tightly into the social fabric. In Australia, for example, we have what is known as racism, but no great attachment to these categories. Racism is pure xenophobia, irrespective of racial classification. In the historical context of American society, "Hispanic" is added, although 42% of the Latino social group failed to self-identify on the 2000 census due to their admixture of American, African and European ancestry (see this paper, to which we shall return).
The paper by Tang et al. takes several alleles and maps them onto a genome space and finds that the self-reported racial groups match the genetic clusters very well. Others have criticised this sort of work as being statistically questionable and subject to all kinds of artifacts - I'm not competent to discuss this, but I'll link to the papers for those who are.
What I find worrisome here is that it is another example of lumping versus splitting. Begin with a set category structure, and you can find covariance. Try to derive a covariance out of the data, though, and you get a much higher degree of differentiation. Perhaps we find covariance between alleles and races because we set out to.
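A toy calculation shows how starting from fixed categories can manufacture a pattern. This is only an illustrative sketch under stated assumptions: the "trait" is a hypothetical value that varies smoothly along a single geographic gradient with no gaps at all, and the function name between_group_share is my own. It illustrates only the first half of the point above (categories first, covariance found), not what a data-driven analysis would return.

```python
def between_group_share(values, k):
    """Share of total variance that lies between groups when the sorted
    values are cut into k bins of (nearly) equal size."""
    values = sorted(values)
    n = len(values)
    grand_mean = sum(values) / n
    total = sum((v - grand_mean) ** 2 for v in values)

    size = n // k
    between = 0.0
    for b in range(k):
        start = b * size
        # The last bin absorbs any remainder when n is not divisible by k.
        end = n if b == k - 1 else start + size
        group = values[start:end]
        group_mean = sum(group) / len(group)
        between += len(group) * (group_mean - grand_mean) ** 2
    return between / total


# Hypothetical cline: 1000 individuals evenly spread along a gradient,
# with no natural breaks anywhere.
cline = [(i + 0.5) / 1000 for i in range(1000)]

for k in (2, 3, 5):
    print(k, round(between_group_share(cline, k), 3))
```

Even though the data contain no breaks, cutting them into two imposed groups puts about three quarters of the variance between groups, and five groups put about 96% there. The groups look well separated only because they were drawn first.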
That the human species has geographic variation is not at issue. It clearly does, as Feldman's colleagues have argued. It just doesn't support the standard racial typology. Alleles had to evolve and spread somewhere. But they do spread, too. The human species (convention makes me want to type "human race") is massively interbreeding. A friend, Marc Buhler, did his PhD on an allele that shows gene flow from the Vikings to the Ashkenazi Jews in the Middle Ages. And more recently, geneticist Alan Templeton has argued that there were at least three major "out of Africa" migrations, and at least one "back to Africa" event.
So while ancestry is a useful way to classify species (because species are isolated gene pools, most of the time), it is rarely a good way to classify populations within species. There are haplotype groups in many species, including humans, and in some species, such as the California seal, this shows geography fairly well. But not in humans. We move about too much.
So, do I think there are races in biology as well as culture? No. Nothing I have seen indicates that humans group nicely into fewer distinct populations than the 54 found by Feldman's group (and probably a lot more - for instance, Papua New Guinea is not represented in their sample set). And this leads us to the paper by the Human Race and Ethnicity Working Group (rare to see a paper that doesn't list all the authors). They rightly observe that while there are continental differences in genetics, there is no hard division, and genetic variation doesn't match up with cultural differences per se. There is a genetic substructure to the human population, but it isn't racial.
"Race" is a difficult term. It was invented to account for within-species groupings in the early days of modern taxonomy. Other terms for it are "sub-species" (which doesn't mean a group that is less valuable or advanced than the species as a whole, but just a part of the species), "breed", "variant", "cultivar" (in botany), and so forth. Blumenbach's value-laden scheme, however, makes it a matter of social valuation. And this is magnified by the social inequalities of so-named races. No biologist would identify Hispanics as a race under any circumstances, for example. They are identified purely in terms of social factors, like migration, social status and political influence.
Within-species groupings are, evolutionarily speaking, ephemeral. Ten thousand years ago, almost none of the non-African races existed. Ten thousand years from now, almost none of the modern races will continue to exist, I warrant. And Africa is so genetically diverse (being the source of all genetic variation that hasn't evolved in the past 60,000 years) that one cannot fairly call "African" (or Ethiopian) a single group. While it is true that some alleles, like the sickle cell allele that confers a measure of resistance to malarial complications, have geographic origins, they do not mark out races. Ironically, this was pointed out in terms of physiological, or as we would now call them, phenotypic, traits by a contemporary critic of Blumenbach's, Buffon. The more things change...