Steven Mithen, The Singing Neanderthals
Kris Shaffer, Sound and Mind, January 2007
Steven Mithen: The Singing Neanderthals: The Origins of Music, Language, Mind and Body
Weidenfeld & Nicolson, 2005 (UK)
Harvard University Press, 2006 (US)
page numbers in this review refer to the UK edition
Overview and Summary
Steven Pinker’s claim that music is no more than ‘auditory cheesecake,’ a ‘technology’ not an adaptive trait (1997, pp. 529-39), has prompted a number of reactions from the academic music community attempting to defend the honor of the discipline by insisting that music is indeed an adaptation. The Singing Neanderthals, however, is not such a book. Coming from a ‘cognitive archaeologist’ and not a music scholar, Mithen’s stance on the evolutionary relationship between music and language is more even-keeled. While Mithen disagrees with Pinker’s assessment, his premise in this book is not to restore music to the honorable status of ‘adaptation,’ but rather to demonstrate the role that certain aspects of music played in the evolutionary and social development of early Homo sapiens and other hominids.
Generally speaking, Mithen presents the view that music and language as we know them today had a common precursor in early hominids. This form of communication was holistic, not compositional (that is, humans communicated in complete statements/messages, not by a system of words and rules), manipulative (meant to affect the behavior of others), multi-modal (included both vocalizations and gestures), musical (made use of pitch contours), and mimetic (incorporated onomatopoeia and sound synaesthesia). Thus he names it ‘Hmmmmm’ communication (holistic, manipulative, multi-modal, musical, and mimetic). Around 200,000 years ago, early Homo sapiens in Africa began to segment these holistic messages into words which could be recombined into new messages. This process, combined with the newly evolved ‘cognitive fluidity’ (the ability to combine thoughts from multiple cognitive domains to generate abstract and metaphorical thoughts), led to the development of modern language. The musical and gestural traits of Hmmmmm communication which remained evolved separately from language into music/dance (these two most often remained tied together and were likely only separated in very recent human history). While music (and dance) lacked the specificity of language, it still retained its power to express emotion and promote group cohesion, and thus it was not lost, but rather took on a social role separate from that of its cousin, language.
Thus proceeds Mithen’s account of the evolutionary role of music and musical aspects of early hominid communication. Naturally, such an account will be filled with a substantial amount of speculation. We know little enough about the specifics of the music of the ancient Greeks, and yet that knowledge is infinitely greater than that of the music of early hominids. However, despite the speculative nature of a number of the images Mithen paints of Neanderthal and early Homo sapiens societies, I found Mithen to have a good critical eye when laying out the issues most central to his thesis, and when evaluating the works of others. Far from being a wild tale, his evolutionary account of music and Hmmmmm communication grew out of a sense in Mithen’s earlier works and those of the rest of his field that something was missing. This feeling grew particularly strong after he encountered the work of Alison Wray on a ‘holistic proto-language’ and the work of John Blacking on man’s innate musical characteristics. Mithen writes:
I am embarrassed by my own previous neglect of music, persuaded by Alison Wray’s theory of a holistic proto-language, ambitious to understand how our prehistoric ancestors communicated, and convinced that the evolution of music must also hold the key to language (p. 5).
Mithen’s presentation of music as ‘the key to language’ is by and large a convincing one. I would like to provide a more detailed look into two ways in which he explains the co-evolution of music and language: the relationship of music, vocalization, and gesture in Hmmmmm communication, and the rise of bipedalism and its implications for Hmmmmm and the acquisition of cognitive capacities for language.
‘Grunts, barks and gestures’: the stuff of Hmmmmm
For Mithen, the study of communication among apes and other primates is integral to knowing what capabilities early humans and earlier primate ancestors had. If modern apes and humans share a cognitive ability, they most likely inherited it from their most recent common ancestor, and so it would most likely have been possessed by all Homo species. The comparison of modern humans, what we know about humans in the past (from archaeological and genetic evidence), and the social and communicative practices of modern apes is therefore an invaluable part of studying the communicative efforts of early humans.
In chapter 8, ‘Grunts, barks and gestures,’ Mithen explores the vocal call repertoire of various species of modern apes. He writes, ‘By understanding how these differences relate to the particular lifestyle of each species, we should be able to predict how the call repertoire of our hominid ancestors is likely to have evolved as their lifestyles diverged from that of the forest-living apes’ (p. 114). Mithen presents a few studies of the call repertoires of several species of African apes and shows them to be ‘holistic and manipulative in character’ (p. 118), in addition to having distinct musical traits (that is, pitch and pitch contour are critical parts of the message). So far, so good, as an account of Hmmmmm grounded in our nearest evolutionary relatives. However, these call repertoires are remarkably small, and, due to the limitations of the apes’ vocal organs, they are unlikely to expand substantially into anything approaching language. Thus Mithen also explores these apes’ gestural communication. Following the research of Joanne Tanner and Richard Byrne (p. 118ff.), Mithen explains that ape gestural communication is ‘iconic’ (rather than arbitrary in the selection of particular gestures), holistic, and manipulative.
The differences between ape communication and early Homo communication can largely be accounted for by biological and lifestyle/social changes, as Mithen lays out in chapter 9, ‘Songs on the savannah.’ Early Homo species had significantly larger brains than their ape-like ancestors, which typically means larger domestic groups (p. 127) and possibly an ‘enhanced theory of mind capability’ (p. 128), which would lead to more complex communicative behavior. Mithen claims that the larger social groups likely would have resulted in an increase in the number and complexity of emotions experienced and communicated by the early hominids.
At the same time that hominids were growing in the cognitive ability to communicate, and probably in the social need to communicate more complex messages, they were also experiencing biological changes which would have greatly enhanced Homo’s ability to communicate vocally. Mithen follows Michael Studdert-Kennedy in the idea that ‘we can think of sounds emitted from the mouth as deriving from “gestures,” each created by a particular position of the so-called articulatory machinery’ (p. 129). I’ll quote Mithen at length here, because I find this explanation fascinating:
As motor actions, such gestures ultimately derive from ancient mammalian capacities for sucking, licking, swallowing and chewing. These began the neuroanatomical differentiation of the tongue that has enabled the tongue tip, tongue body and tongue root to be used independently from each other in order to create particular gestures, which in turn create particular sounds, some of which involve a combination of gestures. Consequently, even though we should think of the hominid vocalizations as holistic in character, they must have been constituted by a series of syllables deriving from oral gestures. These, therefore, had the potential ultimately to be identified as discrete units in themselves, or in combination with one another, which could be used in a compositional language (p. 129).
Studdert-Kennedy’s explanation of gesture-based phonemes provides a potential physical basis (alongside the cognitive developments which took place at the same stage in human history) for the evolution of compositional, words-and-rules-based language out of an earlier, holistic system of communication. Also interesting is that the mimetic property of Hmmmmm communication that became possible with these biological changes (having until then been limited to the iconic gestures of early hominids, not their vocal communication) can still be heard in infant ‘speech’ today. Australian Priscilla Dunstan has recently released a DVD explaining the basic ‘words’ of infants, all of which are aural manifestations of mouth-based gestures such as sucking, swallowing, etc. She claims these ‘words’ are universal among human infants, and her interpretation of their origin is right in line with Mithen and Studdert-Kennedy. In fact, infant speech and infant-directed speech (with their ‘musical’ extended vowels and exaggerated pitch contours) may be the closest thing we have today to Hmmmmm communication.
Bipedalism and Rhythm
Though hominids possessed the cognitive abilities to use pitch and pitch contour in their expression and interpretation of simple messages in the early development of Hmmmmm communication, human rhythmic capabilities came much later. Homo ergaster (c. 1.8 million years ago) first managed the task of walking on two legs. As Mithen explains, this rise of bipedalism had a profound impact on human language and music. Mithen outlines a number of research projects which argue that bipedalism brought about the following:
- a larger brain to accommodate ‘new demands on sensorymotor control’ (p. 146) and the regular rhythm of complex, periodic body movement (p. 150)
- a lower-placed, new and improved ‘valvular’ larynx, due to the repositioning of the spinal cord relative to the brain in upright Homo ergaster and to promote chest stabilization (p. 147)
- vocal cords capable of producing a more diverse range of sounds (p. 147)
The increase in cognitive capacity, temporal measurement and replication, and vocal capacity all contributed to an enhanced linguistic and particularly musical ability, ‘even though there were no selective pressures for speaking or singing’ (p. 147).
Mithen goes on to explain a number of social and lifestyle changes that would have had an impact on linguistic and musical development, and I would like to outline one which I find particularly fascinating: the mother-infant relationship. The rise of bipedalism brought with it a change in pelvic structure which would mean that infants would have to be born smaller, and thus earlier, than previously (p. 196). These premature infants would have been more parent-dependent than previous generations of infants. Further, there is strong reason to think (though not conclusive evidence) that Homo ergaster would have lost most of its body hair around the same time in its evolutionary development (p. 154), in order to combat heat when running on the African savannah. So infants would have needed more direct parental care early in life, would have been more dependent on physical parental contact, and would have had less of their mother’s hair to which to cling. The mother would have had to set the infant down frequently while working. Mithen provides one potential solution to this problem, from Dean Falk:
Falk suspects that such ‘putting down’ did indeed occur and was essential to the development of ‘prelinguistic communication.’ For, once the baby is ‘put down,’ the mother would still have eye contact, gestures, expressions and utterances to reassure the infant, these being substitutes for the physical contact that the infant would desire. The emotionally manipulative prosodic utterances that we associate with IDS [infant-directed speech] would, Falk suggests, have been a ‘disembodied extension of the mother’s cradling arms’ (p. 201).
In other words, ‘music is a form of touch’ (so said percussion soloist Evelyn Glennie). We still see this IDS today as a means of settling a baby’s cries. It is a means of establishing an awareness and an emotional connection when physical touch is not possible. This is very much what music is today for many people, not just mothers and infants. Taking this social development (assuming that the speculation here is not too far-fetched) along with the increase in rhythmic capabilities and general cognitive capacity, it seems that the rise of bipedalism 1.8 million years ago and the other physical changes which accompanied it constitute the single greatest development in the evolution of human musical capacity.
Final thoughts on The Singing Neanderthals
I very much enjoyed this read. I found Mithen’s approach to be informed and intuitive, and it was interesting to hear the thoughts of an archaeologist on the evolution of music and language, after hearing a number of linguists and musicologists on the subject. However, since Mithen is a scientist and not a music scholar, I did find a few things lacking in his discussion of contemporary musical practice and in some of his assumptions about past musical abilities. For instance, when he discussed absolute pitch (the ability of an individual to hear an isolated pitch and know its name without a reference pitch), he cited only a small number of the studies that have been done and concluded that humans are born with absolute pitch, but typically lose it later. This is by no means the consensus of the musical community. He also later assumes that early humans would have possessed absolute pitch, since we are born with it but lose it, which is too big a stretch given the broad range of opinions on the issue today.
He also fell into the trap that many cognitive scientists fall into when studying music: relying on the word of folks like Lerdahl and Jackendoff or Isabelle Peretz. (A search on Sound and Mind for those names will likely return my previous criticism of their work and of others who have relied on it too heavily.) For example, his acceptance of Peretz’s work on the ‘modularity’ of musical processing (p. 62ff.) is not critical enough for me. Her model in general is consistent with neurological data (that is, regarding the general idea of modularity and some of the broad categories). However, more detailed experiments by David Huron and others whom he references in his work demonstrate the deficiency of a contour -> interval -> tonal encoding process like the one Peretz outlines, when compared head-to-head with a scale-degree-based model. Fortunately, though, Mithen is careful not to rely too heavily on research from far outside his field in formulating the crux of his argument, and none of these unconvincing or less-than-widely-accepted ideas hinders his thesis.
There were two aspects of this book, though, which pleased me greatly in a study of this kind. First, he was not stuck in a production-consumption model of music (i.e., where there is a strict dichotomy between composer/performer and listener). Second, he did not a priori separate music from dance. In fact, he made the point that for most cultures, these go hand in hand. Studies of non-Western music can often render themselves less than useful by erring in one of these two ways (or by relying on the written score, which, fortunately, could not even begin to be an issue in The Singing Neanderthals).
All in all, this was an interesting and well-thought-out study. Of course, given the nature of the topic, there is much speculation involved. However, Mithen seemed to be careful about keeping his speculation reasonable and transparent, which makes for successful theory building on his part and valuable critical discussion on ours. I also admire Mithen for being the first to tie many of the issues in this study together in a cohesive way. It is an admirable act of sensitive interdisciplinarity. I heartily recommend this book. Though it is in many ways the first word on many of the topics discussed (it is surely not the last), it should give us a lot of interesting things to talk about.
