Sperm Whales Have Vowels: What Project CETI's Latest Discovery Means for Language

A whale named Pinchy is making linguists rethink what counts as language.

She lives off the coast of Dominica, in the Eastern Caribbean, and she belongs to a clan of sperm whales that has been recorded continuously since 2005 by the Dominica Sperm Whale Project. When researchers at Project CETI sped up her vocalizations and ran them through the kind of acoustic analysis phoneticians usually reserve for human speech, they found something they did not expect to find in any non-human animal. Pinchy makes vowels. Two of them are distinct enough that you can write them with the Latin letters "a" and "i." She combines them in patterns that look, on a spectrogram, uncannily close to the diphthongs in the English word "site" or the rising tones of Mandarin.

The latest paper, published earlier this month in Proceedings of the Royal Society B, goes further. It argues that these whale vowels do not just exist as acoustic curiosities. They behave the way human vowels behave inside a phonological system. They have length contrasts. They interact with the older, click-based categories researchers have been cataloguing for half a century. And the first click of a coda carries special structural weight, much as the first sound of a syllable does in many human languages.

This is the third major Project CETI paper in two years, and the trajectory matters more than any single result. In May 2024, a team from MIT's CSAIL and Project CETI proposed what they called the sperm whale phonetic alphabet, identifying 143 distinct combinations of rhythm, tempo, rubato, and ornamentation across roughly 8,700 codas. In November 2025, Project CETI researchers reported that whales also modulate the spectral properties of their clicks, producing the vowel-like a and i patterns Pinchy demonstrates so clearly. The new paper closes the loop by showing that those vowels are not isolated acoustic features but participants in a structured phonology.

If you have followed any of this work over the past few years, you have probably noticed how carefully the researchers phrase their claims. They are not saying whales have language. They are saying whale communication shares specific structural features that linguists used to consider exclusive to human language. That is a narrower claim, and a much more interesting one, because it forces a question the field has been avoiding for decades. If discrete vowel categories, combinatorial structure, contextual modulation, and coarticulation are not uniquely human, then what is?

What the recordings actually show

Sperm whales communicate in short bursts of clicks called codas. Each coda lasts under two seconds. The clicks themselves are produced not by vocal cords but by a structure called the phonic lips, located in the whale's nose. Air forced past the lips makes them slap together, and the resulting sound passes through a complex of sacs that acts as an acoustic filter. The result is a sound that, played at normal speed, hits the human ear as something between Morse code and a snare drum being struck by a nervous percussionist.

For most of the history of cetacean acoustics, researchers categorized codas the way you might categorize bar codes. You counted the clicks. You measured the gaps between them. You assigned each unique pattern a name like 1+1+3 or 5R, where the numbers and letters described the rhythm and the regularity. This is still useful work, and it produced the first evidence that different sperm whale clans use different coda repertoires, what biologists now call vocal dialects.
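
For readers who think in code, the bar-code description can be made concrete with a small Python sketch. It assumes a coda arrives as a list of click times in seconds. The function name describe_coda is hypothetical, and the click times are invented for illustration rather than taken from any recording. The sketch counts the clicks, measures the gaps between them, and divides each gap by the coda's total duration so that rhythm can be compared independently of tempo.

```python
def describe_coda(click_times):
    """Summarize a coda by click count, duration, and normalized rhythm.

    click_times: click onset times in seconds, in ascending order.
    """
    if len(click_times) < 2:
        raise ValueError("a coda needs at least two clicks")
    # Inter-click intervals: the gaps between consecutive clicks.
    gaps = [later - earlier for earlier, later in zip(click_times, click_times[1:])]
    duration = click_times[-1] - click_times[0]
    # Dividing by the duration separates rhythm (relative spacing)
    # from tempo (overall speed).
    rhythm = [round(gap / duration, 3) for gap in gaps]
    return {"clicks": len(click_times), "duration": round(duration, 3), "rhythm": rhythm}


# Invented example: two widely spaced clicks followed by three quick ones,
# loosely in the spirit of a "1+1+3" pattern. Not real whale data.
example = [0.0, 0.4, 0.8, 0.9, 1.0]
summary = describe_coda(example)
print(summary)
```

Two codas with the same normalized rhythm but different durations would count as the same pattern at different tempos, which is roughly the distinction the traditional naming schemes were built to capture.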

The problem with the bar-code approach is that it threw away most of the information. A click is not a single point in time. It has internal acoustic structure, including resonant frequencies that linguists call formants. Human vowels are defined by their formants. The difference between ah and ee is not a difference in pitch or loudness but in which frequency bands resonate when you shape your vocal tract a certain way. Open your mouth wide, and the lower formant rises. Spread your lips into a smile, and the upper formants climb.

When Gašper Beguš and his colleagues at Berkeley applied formant analysis to the Dominica recordings, they found that whale clicks fall into two clearly distinct frequency profiles. A-codas have one prominent peak in their spectrum. I-codas have two. The categories are discrete, not continuous. A whale produces one or the other, almost never something in between. That is exactly how human vowel systems work. Spanish has five vowels. English has roughly fourteen, depending on the dialect. In every language, the inventory is finite even though the underlying acoustic space is not.
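
The one-peak versus two-peak distinction can be illustrated with a toy Python sketch. This is not the researchers' formant analysis. It is a deliberately simple peak counter run on synthetic spectra, and every name and number in it (count_prominent_peaks, synthetic_spectrum, the peak positions, the 0.5 height threshold) is an assumption made for illustration. It shows only how a continuous spectrum can be reduced to a discrete label.

```python
import math


def count_prominent_peaks(spectrum, min_relative_height=0.5):
    """Count local maxima at least min_relative_height times the tallest value."""
    threshold = min_relative_height * max(spectrum)
    peaks = 0
    for i in range(1, len(spectrum) - 1):
        # Strict rise on the left, no rise on the right: a flat-topped
        # peak spanning two equal samples is counted only once.
        is_local_max = spectrum[i - 1] < spectrum[i] >= spectrum[i + 1]
        if is_local_max and spectrum[i] >= threshold:
            peaks += 1
    return peaks


def synthetic_spectrum(centers_khz, width_khz=0.6, n_bins=200, max_khz=10.0):
    """Build a toy spectrum as a sum of Gaussian bumps (not real whale data)."""
    freqs = [max_khz * k / (n_bins - 1) for k in range(n_bins)]
    return [
        sum(math.exp(-((f - c) / width_khz) ** 2) for c in centers_khz)
        for f in freqs
    ]


# Peak positions are arbitrary, chosen only to give one bump and two bumps.
one_peak = synthetic_spectrum([4.0])
two_peaks = synthetic_spectrum([3.0, 7.0])

labels = {1: "a-like", 2: "i-like"}
for name, spec in [("one_peak", one_peak), ("two_peaks", two_peaks)]:
    n = count_prominent_peaks(spec)
    print(name, n, labels.get(n, "unclassified"))
```

Real recordings are noisy, so an actual analysis would need smoothing and a far more careful definition of what counts as a prominent peak. The point of the sketch is only that the output is a category, not a position on a sliding scale.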

Then there are the diphthongs. Some a-codas show a sweeping change in formant frequency over the course of the coda, rising or falling, sometimes both. The acoustic shape mirrors what happens when an English speaker says the word "site," and the vowel slides from one position to another mid-syllable. In Mandarin, contours of a comparable shape, carried there by pitch rather than by formants, distinguish entirely different words. The whales appear to be doing something analogous, though nobody yet knows what they are saying.

The new April 2026 paper adds a wrinkle that is worth understanding even if you have never thought about phonology before. When a whale switches from an a-coda to an i-coda or back, the first click in the new coda is sometimes acoustically intermediate, partially shaped by the vowel that came before it. Linguists call this coarticulation. It is the reason your tongue is already moving toward the "t" position while you are still finishing the "a" in "cat." Coarticulation is a hallmark of fluent speech production in humans. Finding it in whales, in a system that evolved entirely independently for nearly a hundred million years, is the kind of result that makes evolutionary biologists put down their coffee.

The independence problem

Whales and humans last shared a common ancestor in the Cretaceous, when Triceratops was still a few tens of millions of years away from existing. Whatever that ancestor was, it almost certainly did not have language. So if both lineages now show structured, combinatorial vocal communication with discrete categorical units, the resemblance is convergent. Two evolutionary paths, separated by ninety million years and the entire problem of how to make sounds underwater versus in air, arrived at similar solutions.

I find this more philosophically uncomfortable than it should be. The standard story we tell ourselves about human language goes something like this. At some point in the last few hundred thousand years, our ancestors developed a unique cognitive capacity, possibly underwritten by a specific genetic change, that allowed them to combine a small inventory of meaningless sounds into an unbounded number of meaningful units. This is what Noam Chomsky has called the language faculty, and it is supposed to be the thing that separates us, cognitively, from every other species on Earth.

The whale data does not refute that story. We still do not know whether sperm whale codas carry meaning the way human words do, whether they have anything resembling syntax, or whether the combinatorial structure researchers have identified actually generates an unbounded space of expressions. What the data does is shrink the gap. Each new finding pushes a feature that was supposed to be uniquely ours into the ocean. First it was tool use, then culture, then teaching, then mourning. Now it is the discrete combinatorial structure of phonology.

You can argue, and many linguists do, that none of this matters until we can show that the whales are using these structures to talk about something. A parrot can produce human phonemes without having a language. A starling can string together complex acoustic patterns without meaning a word of it. The bar for calling something a language has always been semantic, not acoustic. Fair enough. But the same critique applies in reverse. We assumed for a long time that animals could not be doing anything semantically rich because their acoustic systems looked too simple. That assumption now needs a defense it did not need ten years ago.

The thing that gets lost in the technical debate is how strange the whale data is on its own terms. Sperm whales have the largest brains of any animal that has ever lived. They live for seventy years or more. They form matrilineal social units that pass cultural information across generations. They babysit each other's calves. Females help one another give birth, a behavior Project CETI documented on video last year, with adult whales physically supporting a laboring mother near the surface. They sleep vertically in groups, suspended in the water column like enormous bottles. Whatever they are saying to each other, they have had a very long time to develop something to say.

Project CETI's working dataset is now somewhere over twelve thousand codas and growing. The team uses a combination of generative adversarial networks and more traditional acoustic analysis, and the GANs have been useful precisely because they do not know what they are looking for. They surface patterns that human researchers then go back and verify by ear and by eye on the spectrograms. This is the inverse of how machine learning is usually deployed in linguistics, where the patterns are known and the algorithms are tuned to detect them. Here, the algorithm is the prospector, and the linguists are the assayers.

The next phase, according to David Gruber, who runs Project CETI, is to start correlating specific coda structures with observable behaviors. The Dominica clan has been observed and tagged for about two decades. Researchers know which whales are related, which ones travel together, which ones are diving, and which ones are nursing. If a particular sequence of codas reliably precedes a coordinated dive or shows up only when a calf is present, that is the beginning of an interpretive framework. It is not a Rosetta Stone. There is no bilingual text. But it is the kind of patient, longitudinal work that might eventually let us assign tentative meaning to specific patterns.
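
In its simplest form, that kind of correlation is a counting exercise, sketched below in Python. The observations are invented for illustration and do not come from Project CETI's data. The function name behavior_rates and the pairing of each coda type with a single following behavior are also assumptions, not a description of the team's actual method.

```python
from collections import Counter


def behavior_rates(observations, behavior):
    """For each coda type, the fraction of observations followed by `behavior`."""
    totals = Counter(coda for coda, _ in observations)
    hits = Counter(coda for coda, seen in observations if seen == behavior)
    # Counter returns 0 for coda types never followed by the behavior.
    return {coda: hits[coda] / totals[coda] for coda in totals}


# Invented (coda_type, following_behavior) pairs, for illustration only.
observations = [
    ("1+1+3", "dive"), ("1+1+3", "dive"), ("1+1+3", "dive"), ("1+1+3", "rest"),
    ("5R", "rest"), ("5R", "dive"), ("5R", "rest"), ("5R", "rest"),
]

rates = behavior_rates(observations, "dive")
print(rates)
```

A gap between the two rates would be a hint, not a translation. With real data, the harder work lies in ruling out chance, confounds such as which whales happen to be present, and the possibility that the behavior prompts the coda rather than the other way around.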

What it will not do, and this needs saying, is allow us to have a conversation with a sperm whale. The fantasy of cetacean translation, where we put on headphones and chat with the leviathans, is not a serious near-term goal. The whales are not waiting to talk to us. They are talking to each other in a system optimized for their purposes, transmitted through water at frequencies and timescales that our auditory systems cannot natively process. Even if we decoded every coda in the Dominica clan's repertoire, we would have learned what those particular whales say to each other. The whales of the Pacific, who use different dialects, would still be opaque.

Maybe that is the part of this story that bothers me most when I think about where it is heading. There is a real possibility that we will, within the next decade or two, demonstrate conclusively that another species on this planet has something we are willing to call language. The legal and ethical implications of that are enormous, and the people working on it know it. Beguš has talked publicly about what it would mean for animal welfare law if sperm whales meet whatever criteria we eventually settle on. The honest answer is that no legal system anywhere is prepared for that question.

In the meantime, the whales continue. The Dominica clan went about its business this morning, somewhere off the leeward coast of the island, hunting squid at depths where light does not reach and surfacing in tight social groups to breathe and to talk. The hydrophones are still recording. Pinchy, presumably, is still making vowels.

The question is not whether we will eventually understand her. The question is what we will owe her once we do.