
Hmmmmm, cheesecake . . . Pinker and Mithen on the origins of music
Kris Shaffer, Sound and Mind, November 2007

Is Pinker's 'auditory cheesecake' consistent with Mithen's 'Hmmmmm' communication?

This was the topic of an in-class debate today in the music cognition course I am assisting as a teaching fellow. While the discussion in class was brief, it got my gears turning, and I thought it would make a good discussion topic here on Sound and Mind.

First, some background on Pinker and Mithen. Pinker ignited a cross-disciplinary firestorm in his 1997 book How the Mind Works (and various talks leading up to that publication) when he asserted that music is 'auditory cheesecake.' That is, music is not an adaptation; rather, it draws on a number of evolved cognitive features of the human mind, which give it its emotive and communicative power. Music, along with recreational drugs and pornography, is a 'pleasure technology' (528) 'crafted to tickle the sensitive spots of at least six of our mental faculties' (534), not an adaptive, selected trait. These six faculties (all adaptations) are language, auditory scene analysis, emotional calls, habitat selection, motor control, and 'something else' (538). In other words, we do not have an evolved preference for the taste of cheesecake (after all, there were no cheesecake trees on the African savannah), but makers of cheesecake do take advantage of our evolved preferences for sugars, fats, etc. Likewise, we do not have an evolved capacity for making or appreciating music, though we do have evolved capacities for making and understanding meaningful sound sequences (both referential and emotional) in hierarchical structures, a preference for certain types of acoustic consonance, etc. Of course, what is most controversial in these statements about music's evolutionary status is not the equating of music and the arts with pornography and recreational drugs (which I would have thought most musicians and music scholars would find most offensive), but rather the statement that language is an adaptation and human music piggybacks on our capacity for language. This is what musicians and music scholars have reacted to most aggressively in the past 10 years, and this is what is central to the question at hand.

Steven Mithen, as my review of his book The Singing Neanderthals points out, does not subscribe to the view that language is an adaptation and music an exaptation. Nor, however, does he believe that music is the adaptation and language the exaptation (an idea as old as J.-J. Rousseau, which some music scholars argue in response to Pinker). Following Alison Wray and others, he argues for a common precursor to music and language: a 'musilanguage,' or 'holistic proto-language,' or simply 'Hmmmmm' communication. The following paragraph, excerpted from my previous review, summarizes Mithen's view:

Generally speaking, Mithen presents the view that music and language as we know them today had a common precursor in early hominids. This form of communication was holistic, not compositional (that is, humans communicated in complete statements/messages, not by a system of words and rules), manipulative (meant to affect the behavior of others), multi-modal (included both vocalizations and gestures), musical (made use of pitch contours), and mimetic (incorporated onomatopoeia and sound synaesthesia). Thus he names it ‘Hmmmmm’ communication (holistic, manipulative, multi-modal, musical, and mimetic). Around 200,000 years ago, early Homo sapiens in Africa began to segment these holistic messages into words which could be recombined into new messages. This process, combined with the newly evolved ‘cognitive fluidity’ (the ability to combine thoughts from multiple cognitive domains to generate abstract and metaphorical thoughts), led to the development of modern language. The musical and gestural traits of Hmmmmm communication that remained evolved separately from language into music/dance (these two most often remained tied together and were likely only separated in very recent human history). While music (and dance) lacked the specificity of language, it still retained its power to express emotion and promote group cohesion, and thus it was not lost, but rather took on a social role separate from that of its cousin, language.

So the question at hand is: are Pinker's and Mithen's views on music consistent with each other? That's not to say 'are they the same?' because clearly they are not. Rather, the question is 'are they both true?' or, at least, 'can they both be true, or does one preclude the other?' I have a few thoughts, which I will share below, but I am curious to hear yours as well, since this discussion is far from my area of expertise, but close to my area of interest. Also, opening up a question like this (if today's class debate was any indication) leads to theoretical and ontological questions that are difficult if not impossible to answer, but which are at the heart of music cognition, so I think this could lead to some interesting discussion. Here are my thoughts, and I look forward to reading yours in the comments below.

First, I think that these two views are consonant, or at least do not clearly contradict each other, on the general status of music. That is, Pinker claims that music is not an adaptation, but piggybacks on things that are. Mithen is silent on music's adaptation status, but claims that Hmmmmm, the common precursor to music and language, is an adaptation. Thus, in Mithen's theory, music could be vestigial, a 'pleasure technology' built upon the selected traits of Hmmmmm. Pinker's general point about music is consistent, then, with Mithen's framework. Further, language as we know it today could be seen as an adaptation within Mithen's framework as well: developments between the Hmmmmm and language stages provided survival or reproductive advantages, and cognitive faculties selected for Hmmmmm which became unnecessary for language could still be used in the development of a musical 'pleasure technology.' Mithen does not lay things out in this way, and he may very well disagree with such an explanation (I have a feeling that he would in fact disagree), but one could work out the details of his general theory in such a way, making it generally consistent with Pinker's theory.

However, there is a major point of dissonance between the two theories, and this is the crux of the cross-disciplinary debate. Pinker does not claim that music is an exaptation which builds upon a selected trait which is a common precursor to language and music. Rather, he claims that music is an exaptation which builds upon the selected trait of language. For Pinker, language is wholly prior, and music piggybacks on it entirely. (Note, too, that Pinker is a linguist making these claims, which is also central to the controversy.) This is, of course, in contradiction to Mithen's theory, which puts a heavy emphasis on Hmmmmm as the selected trait. Mithen's theory doesn't provide a possible mechanism for the evolution of language without 'musical' properties (and here we can almost feel the ontology of these terms 'language' and 'music' being stretched beyond comfort), but rather finds that the 'musical' properties of Hmmmmm are indispensable to the trait, given the biological, environmental, and social situation of early hominids. Further, it was the lack of consideration given to music in earlier accounts (including his own) of early hominids, their evolution, and their social structures that largely prompted the writing of The Singing Neanderthals, so it is unlikely that one could simply remove the 'musical' aspects of Mithen's theory and be left with something fundamentally identical.

So there are clearly some significant differences between these two theories, but the question remains: can they both be true, or does one preclude the other? I think that the answer to this question depends on our interpretation of what is fundamental to each theory. The 'musical' properties of Hmmmmm are fundamental to Mithen's, that's for sure. But is Pinker's main point simply that music is an exaptation (in which case both theories can be true, at least on the fundamental level)? Or is his main point that music piggybacks on language (in which case they are mutually exclusive, both in detail and in general)?