Music: The Universal Language of Mankind

Aditya Kumar · Published in DataDrivenInvestor · 14 min read · May 16, 2020

How closely are music and language ability related?

Photo by Mohammad Metri on Unsplash

Musicality and Language Comprehension

Music is to the soul what words are to the mind. It has always been suggested that learning music at an early age makes one better at mathematics and critical thinking, and that writing from an early age cultivates creativity and imagination. But what exactly is music? And as silly as it may sound, what exactly are words?

Music, unlike language, is not based in practical communication; it is rawer in nature, a feature often attributed to emotion and art. And music offers no obvious evolutionary benefit, whereas a common language fosters societal unification and persistence.

Yet, despite their apparent differences, both music and language consist of fundamental units of sound strung together in formal patterns to create meaningful phrases and passages. And in many animals, from songbirds to whales, song effectively serves as language. So what about humans?

In this article, I will report findings on the musicality of language and the grammar of music to discuss whether musical ability is related to language comprehension in the development of human cognition. I will connect relevant approaches from linguistics, emotion, and evolution in cognitive science to examples from recent studies, and ask whether their results corroborate a common origin of music and language.

As an amateur writer and classical pianist of sixteen years, I have often found myself thinking in music and ascribing lyrical qualities to common prose; hence my personal interest in discovering whether or not there is a divide between these two realms. To quote Jane Swan: how is it that music can, without words, evoke our laughter, our fears, our highest aspirations?

An Introduction to Language and Music Structure

It is best to begin this discussion with a formal introduction to language structure. All spoken human languages, from English and Spanish to Sanskrit and Japanese, are composed of units of sound. This is to say that what we speak and what we write represent the same underlying language, an important point: unlike in science fiction stories such as Ted Chiang’s Story of Your Life (which became the film Arrival), our verbal and written communication are based in the same “language”, i.e. what we think is what we say and write.

Image courtesy of IMDb. In the film, the alien ‘heptapods’ have entirely separate languages for speaking and writing. Their written language, as a result, is based not in sound but in time (imagine writing a complete sentence with both hands, one starting at the beginning and the other at the end, until they meet in the middle, all without planning what exactly you want to write).

The most basic unit of spoken language is the phoneme, which has no intrinsic meaning but corresponds to the distinct sounds we can produce, typically categorized as vowels and consonants.

Phonemes, in turn, group together to form morphemes, the smallest units of language that carry meaning; these can be categorized as prefixes, suffixes, and root words. In addition, languages have rules by which phonemes and morphemes can be combined, called phonology and morphology, respectively. Lastly, syntax governs the arrangement of words in sentences, and semantics the meanings of those words. Grammar embodies the collective set of all these rules, from basic phonology to higher-level semantics (Friedenberg & Silverman, 2011, pp. 271–272).
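To make this hierarchy concrete, here is a minimal Python sketch. The phoneme and morpheme inventories and the single morphological rule are invented purely for illustration (with ASCII stand-ins for real phonemes) and are not drawn from any of the sources cited here.

```python
# Toy sketch of the linguistic hierarchy: phonemes -> morphemes -> words.
# The inventories and the rule below are illustrative inventions only.

# Phonemes: sound units with no meaning of their own (ASCII stand-ins).
PHONEMES = {"k", "a", "t", "d", "o", "g", "s"}

# Morphemes: the smallest meaningful units, each spelled out in phonemes.
MORPHEMES = {
    "cat": ["k", "a", "t"],  # root
    "dog": ["d", "o", "g"],  # root
    "-s":  ["s"],            # plural suffix
}

def build_word(root, suffix=None):
    """A toy morphological rule: optionally attach a suffix to a root."""
    phones = list(MORPHEMES[root])
    if suffix:
        phones += MORPHEMES[suffix]
    return phones

print(build_word("cat"))        # ['k', 'a', 't']
print(build_word("dog", "-s"))  # ['d', 'o', 'g', 's']  -> "dogs"
```

Syntax would then order such words into sentences, and semantics would assign them meaning.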

Similarly, music derives from basic formal elements. The most fundamental unit of sound in musical notation is the note, which represents a particular pitch (frequency) such as “middle C” or “A”; in this respect it is quite like a phoneme. But whereas letters are combined into words, notes are combined within a fixed time interval known as a bar (or measure). These bars, in turn, give rise to musical phrases, which then combine into musical passages.
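For a concrete sense of what a note’s frequency means, the short sketch below (my own illustration, not taken from the article’s sources) uses the standard equal-temperament relation f = 440 · 2^((n − 69)/12), where n is the MIDI note number and the A above middle C (n = 69) is tuned to 440 Hz.

```python
# Equal-temperament pitch: each semitone multiplies frequency by 2**(1/12).
# MIDI note 69 is A4 (the A above middle C), tuned to 440 Hz by convention.

def note_to_frequency(midi_note: int, a4_hz: float = 440.0) -> float:
    """Return the frequency in Hz of a MIDI note under equal temperament."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)

print(round(note_to_frequency(60), 2))  # middle C (C4) ~ 261.63 Hz
print(round(note_to_frequency(69), 2))  # A4 = 440.0 Hz
```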

Just like written and spoken language, music has “rules” by which notes can be ordered. Musical grammar is not like linguistic grammar, where meaning is carried by semantics (Western Michigan University School of Music); rather, meaning is carried by emotional perception within a key convention (i.e. a piece written in C major will follow the C major scale and the harmonies built from it, rather than creating unnecessary, off-key discord).
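A minimal sketch of this “stay in key” convention, assuming we only care about pitch-class membership in a major scale (a deliberate simplification of real tonal harmony), might look like this:

```python
# A toy "musical grammar" check: does a melody stay within a major key?
# Pitch classes are numbered 0-11 (C=0, C#=1, ..., B=11).

MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # interval pattern of a major scale

def major_scale(tonic: int) -> set:
    """Pitch classes of the major scale starting on the given tonic."""
    return {(tonic + step) % 12 for step in MAJOR_SCALE_STEPS}

def stays_in_key(melody: list, tonic: int) -> bool:
    """True if every note of the melody belongs to the tonic's major scale."""
    scale = major_scale(tonic)
    return all(note % 12 in scale for note in melody)

c_major_melody = [60, 62, 64, 65, 67]  # C D E F G
off_key_melody = [60, 61, 64]          # C C# E (C# lies outside C major)
print(stays_in_key(c_major_melody, tonic=0))  # True
print(stays_in_key(off_key_melody, tonic=0))  # False
```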

Another difference between language and music interpretation is that language has distinct external referents (e.g. the word ‘apple’ refers to a physical apple, or the word ‘happiness’ refers to a state of being) whereas music is not linked to any set meanings (i.e. a group of notes does not refer to any object or idea), physical or abstract (Ravignani et al., 2018).

Photo by Hannah Wright on Unsplash. All spoken languages from across the world are built in units of sound that can be reflected in a written format.

Language Attunement: Our Preferences for Sound

To begin comparing musicality to language acquisition in cognitive development, we can consider the extent to which language acquisition and speech perception are shaped by music-based auditory principles. In a study presented at the Institute of Musicology of the University of Cologne, Dr. Barbara Höhle (Spring School, 2018) addressed this question by comparing head-turn preferences between German and French infants.

The German language is stress-based (typically on the first syllable, a pattern called trochaic), whereas French has no lexical stress. The hypothesis was that if infants show a preference for a rhythm with the same lexical stress pattern as their household language, then language attunement and language acquisition are indeed correlated in cognitive development.

Höhle’s results showed that “German-learning infants were sensitive to a trochaic bias as was demonstrated in experiments using the head-turn preference procedure. At the same time, age-matched French-learning infants did not show this bias.” While these results link language attunement with language learning, since infants become conditioned to the rhythm of their mother tongue, and demonstrate the strong effect of rhythm on language, they only hint at an underlying understanding of melody.

To dig deeper into the basis of this correlation between language attunement and acquisition, I searched for a neural basis of intonation in speech and music. From a study by Koelsch and Rohrmeier (Spring School, 2018),

“pitches are arranged according to music-syntactic rules that are learned simply through exposure to music. Knowledge of these rules guides listeners’ music perception, shown by early right anterior negativity (ERAN) responses evoked by syntactic violations in music (similar to (early) left anterior negativity ((E)LAN) responses in linguistic violations), localized to regions in the inferior frontal gyrus.”

These music-syntactic rules allude to the “grammar” of music I mentioned earlier, such as staying within key with a preference for consonance over dissonance. And because violations of musical syntax evoke responses in the same region of the inferior frontal gyrus as violations of linguistic grammar, it is reasonable to infer that music and language are linked in human cognition. Furthermore, as this result displays underlying musicality, it ties language attunement and acquisition to an understanding of rhythm and melody.

Lastly, it is interesting to consider case studies of patients afflicted with amusia, an inability to detect, recognize, or reproduce musical tones. From the evidence above, we see that musicality and language acquisition are interrelated in cognition; so if an individual lacks such musicality, we would expect a corresponding detriment in linguistic auditory perception.

A case study report from the NIH (2018) shows

“evidence from brain imaging research [that] indicates that amusic participants show deficits in fine-grained perception of pitch; patients fail to distinguish a question from a statement solely on the basis of changes in pitch direction. This observed difficulty in a sample of amusic patients supports the hypothesis that music and speech intonation share specific neural resources for processing pitch patterns. Further brain imaging studies report a considerable overlap in the brain areas involved in the perception of pitch and rhythm patterns in words and songs, and in sound pattern processing in melodies and linguistic phrases. In adults and children, musical training facilitates syllabic and pitch processing in language.”

In other words, amusia corresponds with an inability to understand vocal inflections and the meanings these pitch changes confer. Thus, given the role musicality plays in speech perception, it is safe to conclude that musicality and language comprehension are shaped together in cognitive development by music-based auditory principles.

Labeling Emotion

Another feature of acquiring language is the ability to label environmental cues and produce an appropriate emotional response. From the emotional approach to cognitive science, there is a set of seven universally recognized human emotions: happiness, sadness, contempt, surprise, fear, disgust, and anger (Friedenberg & Silverman, 2011, pp. 300–301). This set traces back to Darwin’s theory of evolution, which suggests that because emotion is a product of evolution, the expressions associated with emotion are evolved, and therefore shared, as well.

From Wikimedia. The seven universally-recognized human emotions. Can you tell which is which?

The importance of a universal basis of emotion when comparing musicality with language acquisition is exhibited in cross-cultural, cross-linguistic studies of music and emotion. Labeling emotion requires language, and music evokes emotion. Setting aside nature-versus-nurture arguments, if groups of people from across the world experience similar emotions in response to the same music (i.e. if they cognitively label music with the same fundamental emotions), then we can surmise that auditory and verbal cognition (i.e. musicality and language comprehension) are related.

A group of researchers from McGill University, the Berlin Institute of Technology, and the University of Montreal conducted a study in 2015 to compare the subjective and psychophysiological emotional responses to music between an isolated population of Mebenzélé Pygmies in the Congolese rainforests (who had no prior exposure to Western music) and a group of Canadian Western music listeners (who had no prior exposure to Congolese music).

The researchers found that

“[t]here were several similarities in both groups’ responses to the different musical stimuli. When Canadians rated Western music as subjectively arousing, both groups responded with increased physiological arousal (heart rate, skin conductance, and respiration). Additionally, in Pygmies, subjective arousal ratings increased for arousing Western music. Taken together, these similarities observed in emotional responding could be explained by universal reaction patterns to several low-level features. Both groups responded with increased subjective and physiological arousal (SCL) when the music had a higher tempo. Furthermore, increases in spectral centroid and pitch lead to increased skin conductance for both participant groups” (Egermann et al., 2015).

The results from this study indicate that emotional labeling and physiological reactions to music were quite similar across isolated groups, suggesting a common set of emotions for a common set of musical principles. It is therefore plausible that musicality has a shared basis in cognition and, because both groups assigned similar cognitive labels via similar physiological reactions, that language and music are related in human cognition.

Stress Appraisal

Continuing with cognitive labeling in emotion, acquiring language gives us the ability to evaluate the environment and produce a stress response; this is called stress appraisal. A stressor is a biological element, external condition, or event that leads to a change in the status quo of an immediate or future state. A negative change is distress (e.g. an injury), and a positive change is eustress (e.g. a marriage).

In language, an example of a stress response is grammatical error detection, reflected, as mentioned above, in activity in the inferior frontal gyrus. Similarly, stress in music occurs when the melody and harmony take an unexpected turn.

Photo by Austin Neill on Unsplash. Music, as an art form, is strongly affected by our emotional perceptions. When you hear a piece that moves you or gives you “chills”, it is often because the artist successfully ties the song to a very personal memory and strikes a good balance between what you expect to hear next and what actually comes.

This might take the form of a key change, which creates eustress by establishing a new tonal center (think of Bon Jovi’s minor-third leap in “Livin’ on a Prayer”, or Celine Dion’s E major-to-F minor/A-flat major jump in “My Heart Will Go On”), or of dissonance, which creates distress by straying from the melody and harmony (think of jazz or some works of Beethoven and Chopin; note, however, that such pieces use dissonance to capture attention and always return to consonance after a brief lapse, whereas a child banging on a piano is dissonance in the truest sense).

Stress response in music is an important point of comparison with linguistic comprehension, because the element that determines the magnitude of the stressor is expectation. Just as nonsensical words or phrases induce activity in the inferior frontal gyrus, and just as we can sense correct sentence flow, “[w]hen music establishes a pitch centre (tonality), we rapidly learn to anticipate its continuation. What happens when anticipations are challenged by new events? … [P]roviding a melodic challenge to an established tonality leads to progressive changes in the impact of the features of the stimulus on listeners’ expectations” (Bailes et al., 2013).

This study on the temporal flux of musical expectations shows how predicting melody is similar to the “assessment of any event stream of biological relevance,” such as linguistic auditory perception or simply spoken language. If music and language are similar in cognition, then we expect the stress response to an unexpected turn to scale with the magnitude of the deviation: the more a melody strays from its tonal center, the more activity we see in the inferior frontal gyrus, and likewise, the more a sentence violates basic grammatical principles, the more activity we see in the inferior frontal gyrus.
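As a highly simplified picture of such graded expectation, loosely in the spirit of the statistical-learning models of melodic expectation in the literature cited here (e.g. Pearce, 2018) but in no way the model those studies actually use, a first-order Markov sketch can learn note-to-note transition probabilities and score how “surprising” each continuation is:

```python
import math
from collections import Counter, defaultdict

# Toy first-order Markov model of melody: the "surprise" at each note is its
# negative log probability given the previous note. This is an illustrative
# simplification, not the model used in the cited studies.

def train_bigrams(melody):
    """Count note-to-note transitions in a training melody."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(melody, melody[1:]):
        counts[prev][nxt] += 1
    return counts

def surprise(counts, prev, nxt, smoothing=1.0, alphabet_size=12):
    """Negative log probability of nxt given prev (with add-one smoothing)."""
    total = sum(counts[prev].values()) + smoothing * alphabet_size
    p = (counts[prev][nxt] + smoothing) / total
    return -math.log2(p)

# Train on a melody that mostly moves stepwise within C major (pitch classes).
training = [0, 2, 4, 5, 4, 2, 0, 2, 4, 5, 7, 5, 4, 2, 0]
model = train_bigrams(training)

print(surprise(model, prev=4, nxt=5))  # familiar continuation: low surprise
print(surprise(model, prev=4, nxt=1))  # out-of-key leap: higher surprise
```

In this toy model, the further a continuation strays from what the training melody made familiar, the larger its surprise score, mirroring the graded reappraisal described above.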

From the discussion of the paper, the results of this study “suggest that very unstable stimuli are likely to be progressively and then integrally reappraised, more stable stimuli less so” (Bailes et al., 2013). In other words, unstable stimuli, or more strongly deviating stressors, produce greater stress reappraisal than stable stimuli do. Because both music and language follow the same mechanism of stress appraisal, they likely share a common basis in cognitive development.

The Evolutionary Approach

Evolutionarily, however, music offers no obvious survival benefit, whereas language directly makes group communication efficient and builds culture, a phenomenon that fosters safety. From the evolutionary approach to cognitive science, we learn that “[l]anguage is a form of social bonding and a way for individuals who possess this capacity to form alliances within groups.” In turn, forming larger groups “prompted further innovations, which were adaptive in nature. Larger groups are better able to defend themselves against predators in an open savanna environment. Language allows for complex, coordinated social behavior. Language would have facilitated hunting, foraging, child care, and other crucial survival behaviors” (Friedenberg & Silverman, 2011, p. 254).

Photo by Johannes Plenio on Unsplash

Essentially, language helped power the development of modern civilization, the growth of our ancestors from hunter-gatherer societies in danger from predatory animals to settled, agrarian societies with the ability to create and defend.

But where might music fit in this picture? The core of music, as with language, is cohesion. Just as language is only useful if it can be understood and reproduced by others, so music can only be enjoyed if others experience emotion from it.

It may seem that enjoyment is a purely subjective term, but here it simply refers to what a group finds collectively pleasing. For example, a major key generally indicates happiness, a minor key sadness; a fast tempo creates energy, whereas a ballad evokes sentimentality.

These features are quite common across human societies, but note that, objectively, there is no particular reason why a major key should sound “happy” and a minor key “sad”; rather, it is likely that the co-evolution of human linguistic comprehension with musicality selected for traits that promoted societal cohesion.

A Royal Society study from Laurel Trainor (2015) mentions that these “emotional and social consequences of music may have conferred survival benefits, leading to adaptations that promoted and enhanced musical behaviour.” Essentially, the uniqueness of higher-order musicality within humans and its early emergence in social development are consistent with music as an evolutionary adaptation.

Another study from the NIH (2018) takes a more technical approach to the co-evolution of language and music by comparing the auditory principles of both. The study finds that

“in both music and verbal language, emotions are expressed through similar patterns of pitch, tempo and intensity. For instance, in both channels, happiness is expressed by fast speech rate/tempo, medium-high voice intensity/sound level, medium-high frequency energy, high fundamental frequency (F0)/pitch level, high F0/pitch variability, rising F0/pitch contour, fast voice onsets/tone attacks. Importantly, the use of voice modulation to express emotional information within interpersonal communication might have had adaptive value in the early species of our genus Homo, improving their ability to respond appropriately to survival opportunities.”
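Purely as an illustration of that feature pattern, and with feature names and thresholds that are my own placeholders rather than values from the study, a toy rule might flag a “happy-sounding” excerpt when tempo, intensity, and pitch measures are all on the high side:

```python
from dataclasses import dataclass

# Toy illustration of the feature pattern described above. The feature names
# and thresholds are invented placeholders, not values from the cited study.

@dataclass
class AcousticFeatures:
    tempo_bpm: float          # speech rate / musical tempo
    intensity_db: float       # voice intensity / sound level
    pitch_hz: float           # fundamental frequency (F0) / pitch level
    pitch_variability: float  # spread of F0 / pitch over the excerpt

def sounds_happy(f: AcousticFeatures) -> bool:
    """Crude rule of thumb: 'happy' cues = fast, fairly loud, high and varied pitch."""
    return (f.tempo_bpm > 120 and f.intensity_db > 65
            and f.pitch_hz > 220 and f.pitch_variability > 30)

print(sounds_happy(AcousticFeatures(140, 70, 260, 45)))  # True
print(sounds_happy(AcousticFeatures(70, 55, 150, 10)))   # False
```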

The tight correlation among language, emotion, and music suggests that linguistic comprehension and musicality, underpinned by emotion, develop alongside one another in human cognition.

Conclusion

So how can music, without words, evoke our laughter, our fears, our highest aspirations? Well, it might not be inconceivable to suggest that music is, in a sense, a form of language. Just as words enable us to translate our thoughts, music enables us to express our emotions. And there is plenty of overlap.

Through this article, I have compared musicality to language acquisition in human cognition via the linguistic, emotional, and evolutionary approaches to cognitive science. For each approach, I have reviewed relevant literature that connects music and language. From the linguistic approach, we learned that violations of grammar and of musical syntax activate the same regions of the brain, and that infants become conditioned to the auditory principles of their home language and show a preference for them.

From the emotional approach to cognitive science, we reviewed the fundamental set of emotions and how cultures across the world express them in strikingly similar ways when listening to music. Both approaches suggest a strong link between musicality and language comprehension.

And yet, evolutionarily, I noted that music has no direct benefits that improve chances of survival, whereas language is a key difference separating advanced human civilization from other social primates, and from other animals in general. So why would the two be linked? Looking deeper into evolution, we discovered the role music and language played in fostering group cohesion to a degree far beyond primitive social grooming. These findings indicate a strong correlation between musicality and language comprehension in human cognition.

So can music be our universal language? It certainly seems so!

And in the words of de Lamartine, music is simply the literature of the heart; it commences where speech ends.

References

Asano, R., Bornus, P., Craft, J. T., et al. (2018, October 10). Spring School on Language, Music, and Cognition: Organizing Events in Time — 2018. Retrieved from https://journals.sagepub.com/doi/10.1177/2059204318798831

Bailes, F., Dean, R. T., & Pearce, M. T. (2013, September 18). Music Cognition as Mental Time Travel. Retrieved from https://www.nature.com/articles/srep02690

Boll-Avetisyan, N., Bhatara, A., & Höhle, B. (2017, April 27). Effects of Musicality on the Perception of Rhythmic Structure in Speech. Retrieved from https://www.journal-labphon.org/articles/10.5334/labphon.91/

Egermann, H., Fernando, N., Chuen, L., & McAdams, S. (2015, January 7). Music induces universal emotion-related psychophysiological responses: comparing Canadian listeners to Congolese Pygmies. Retrieved from https://www.frontiersin.org/articles/10.3389/fpsyg.2014.01341/full

Friedenberg, J., & Silverman, G. (2011). The Philosophical Approach. In Cognitive Science: An Introduction to the Study of Mind (2nd ed., pp. 259–321).

Honing, H., ten Cate, C., Peretz, I., & Trehub, S. E. (2015, March 19). Without it no music: cognition, biology and evolution of musicality. Retrieved from https://royalsocietypublishing.org/doi/full/10.1098/rstb.2014.0088

Pearce, M. T. (2018, May 11). Statistical learning and probabilistic prediction in music cognition: mechanisms of stylistic enculturation. Retrieved from https://nyaspubs.onlinelibrary.wiley.com/doi/full/10.1111/nyas.13654

Ravignani, A., Thompson, B., & Filippi, P. (2018, February 6). The Evolution of Musicality: What Can Be Learned from Language Evolution Research? Retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5808206/

Trainor, L. J. (2015, March 19). The origins of music in auditory scene analysis and the roles of evolution and culture in musical creation. Retrieved from https://royalsocietypublishing.org/doi/10.1098/rstb.2014.0089

Western Michigan University School of Music. (n.d.). The Elements of Music. Retrieved from https://wmich.edu/mus-gened/mus170/170notes/Ch1-elements.pdf


Aditya Kumar is a medical student at the University of Colorado with a background in computer science. He researches AI mechanisms in economics and medicine.