Why We Have Billions of Whale Sounds and Still Cannot Understand Them

Project CETI is collecting 4 billion sperm whale clicks, but decoding them requires finding a mathematical signature of language without a Rosetta Stone.

Sankalp
Engineering Lead

Humans have lived alongside other intelligent species for thousands of years, yet we remain functionally deaf to their communication. For centuries, biology treated animal vocalizations as simple, instinctual signals: noises meant to signify hunger, danger, or the desire to mate. We assumed that because their brains were different from ours, their communication lacked the "grammar" and abstract "thought" that define human language. This human-centric view has been challenged as our ability to collect data from the natural world has exploded.

Today, we use underwater recording arrays and microphones in the deep canopy to capture trillions of data points from species like sperm whales and fruit bats. We have moved from the era of the lone researcher with a notebook to the era of Big Data bioacoustics. However, more data has not yet translated into understanding. We are effectively listening to a global conversation in a language we have never heard, with no dictionary to guide us.

The challenge of "animal translation" is fundamentally different from translating between two human languages. When we translate French to English, we rely on a shared cultural context and on existing dictionaries and parallel texts that act as a Rosetta Stone. When we attempt to decode the clicks of a whale, we have neither. We must first find a way to detect the presence of language itself, by testing whether the sounds follow the same mathematical rules as human speech, before we can even begin to guess what they mean.

The search for animal language has collided with a massive data problem. Project CETI is working toward a dataset on the order of four billion sperm whale clicks, which the whales group into rhythmic sequences called "codas," but researchers have no translation key for them. We possess the raw tokens of a global communication network, yet we have no way to verify whether these sounds carry abstract meaning or are simply complex biological reflexes.

The Search for Mathematical Structure

Decoding a language without a translation key requires looking for the mathematical signatures of information. Researchers are applying Zipf's Law, a statistical rule stating that a word's frequency is roughly inversely proportional to its frequency rank, to the whale data. The pattern holds approximately across human languages. If the click patterns follow this distribution, that is consistent with the vocalizations being a structured system rather than random signals. It is not proof on its own, since some simple random processes also produce Zipf-like curves, but it offers a baseline test for whether a species' signals merit a closer search for structure.
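As a rough illustration of the Zipf test described above, the sketch below fits a straight line to log(frequency) against log(rank) for a list of labeled coda types. The function name `zipf_slope` and the token list are illustrative assumptions, not CETI code or CETI data; a slope near -1 is the classic Zipfian signature.

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    # Frequencies sorted from most to least common give ranks 1, 2, 3, ...
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct token types")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Made-up counts chosen so that frequency = 60 / rank (an ideal Zipf curve).
# The labels are placeholders for coda types, not measurements.
tokens = (["type_a"] * 60 + ["type_b"] * 30 + ["type_c"] * 20
          + ["type_d"] * 15 + ["type_e"] * 12)
slope = zipf_slope(tokens)
print(round(slope, 3))  # close to -1.0 for this constructed example
```

Real recordings would first need to be segmented and clustered into coda types, and that labeling step strongly affects the resulting curve.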

Beyond Zipf's Law, bioacoustics researchers use Shannon entropy to measure the "information density" of animal calls. High entropy suggests a system capable of transmitting complex, unpredictable messages, while low entropy indicates a repetitive, simple signal. Comparing sperm whale codas across different social groups has also revealed distinct "dialects" that suggest cultural transmission: knowledge passed down through generations by learning rather than purely through genetics.
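For readers who want the entropy measure made concrete, here is a minimal sketch of Shannon entropy, H = sum over types of p * log2(1/p), computed over a sequence of coda-type labels. The function name and the toy label sequences are assumptions for illustration only.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy in bits of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one label")
    # p * log2(1/p), written with counts to avoid a separate division
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Toy sequences (not real data): one repetitive, one evenly varied
repetitive = ["type_a"] * 8
varied = ["type_a", "type_b", "type_c", "type_d"] * 2

print(shannon_entropy(repetitive))  # 0.0 bits: fully predictable
print(shannon_entropy(varied))      # 2.0 bits: four equally likely types
```

This treats each coda as an independent draw. Measures that account for ordering, such as conditional entropy over adjacent codas, would be needed to say anything about sequence structure.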

Combinatorial Syntax: The Whale Alphabet

A 2024 study in Nature Communications by the Project CETI team identified what they call a "phonetic alphabet" in sperm whale codas. By analyzing the precise timing, tempo, and rhythm of clicks, the researchers found that whales combine these elements to create a vastly larger set of signals than previously documented. This suggests that whales may not just be signaling "danger" or "food," but could be communicating through a structured combinatorial system that allows for a very large number of distinct codas.

For example, the researchers found that whales adjust the "tempo" (the speed of the click sequence) and the "ornamentation" (the addition of extra clicks at the end) of a coda, which may change what it conveys. This is loosely analogous to how humans use suffixes or tone of voice to alter a word's function. Mapping these variations reveals a combinatorial system in which a small set of basic units (clicks) is assembled into complex structures. This "combinatoriality" is one of the hallmark features of human language, and seeing it in a marine mammal suggests that it may not be unique to humans.
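To show how tempo and rhythm can be separated as features, the sketch below describes a coda by its click timestamps. It assumes a simplified definition: tempo is the time from first to last click, and rhythm is the pattern of inter-click intervals normalized by that duration. The class and function names and the timestamps are hypothetical, and ornamentation is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CodaFeatures:
    n_clicks: int
    duration: float  # tempo proxy: seconds from first to last click
    rhythm: tuple    # inter-click intervals as fractions of the duration

def describe_coda(click_times):
    """Split a coda into a tempo feature and a tempo-independent rhythm."""
    if len(click_times) < 2:
        raise ValueError("a coda needs at least two clicks")
    intervals = [b - a for a, b in zip(click_times, click_times[1:])]
    duration = click_times[-1] - click_times[0]
    # Rounding absorbs floating-point noise so equal rhythms compare equal
    rhythm = tuple(round(i / duration, 3) for i in intervals)
    return CodaFeatures(len(click_times), duration, rhythm)

# Two made-up codas: same rhythm, one played twice as fast
slow = describe_coda([0.0, 0.2, 0.4, 0.8])
fast = describe_coda([0.0, 0.1, 0.2, 0.4])

print(slow.rhythm == fast.rhythm)     # True: same rhythmic pattern
print(slow.duration > fast.duration)  # True: different tempo
```

Because the two features vary independently, a handful of rhythms and a handful of tempos multiply into many distinguishable codas, which is the sense of "combinatorial" used here.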

The Contextual Mapping Problem

Even with a phonetic alphabet, we are still missing the most critical piece: context. A word only has meaning in relation to the world. To solve this, researchers are using AI to correlate vocalizations with high-resolution behavioral data. D-TAGs (sensors attached to the whales) record their depth, orientation, and proximity to other whales at the exact moment a coda is emitted.

By feeding both the audio and the sensor data into a multimodal model, researchers hope to identify "action-vocalization pairs." If a specific coda is consistently emitted when a whale is nursing or socializing after a deep dive, we can begin to infer the "semantic" meaning of the sound. This is a process of mapping the latent space of whale behavior onto the latent space of their audio, looking for the points where the two systems intersect.
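A far simpler stand-in for the multimodal model is a co-occurrence table. The sketch below assumes each coda has already been labeled with a type and with the behavior recorded by the tag at that moment, then estimates how often each behavior accompanies each coda type. The function name, labels, and events are all hypothetical.

```python
from collections import Counter, defaultdict

def behavior_given_coda(events):
    """Estimate P(behavior | coda type) from (coda_type, behavior) pairs."""
    table = defaultdict(Counter)
    for coda_type, behavior in events:
        table[coda_type][behavior] += 1
    return {
        coda_type: {b: n / sum(counts.values()) for b, n in counts.items()}
        for coda_type, counts in table.items()
    }

# Made-up observations, not real tag data
events = [
    ("type_a", "surfacing"),
    ("type_a", "surfacing"),
    ("type_a", "socializing"),
    ("type_b", "socializing"),
]
probabilities = behavior_given_coda(events)
print(probabilities["type_b"])  # {'socializing': 1.0}
```

A strong association in such a table is only a candidate pairing. It shows that a coda and a behavior tend to co-occur, not that the coda means that behavior.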

Anthropomorphic Hallucinations

The primary danger in animal bioacoustics is the anthropomorphic hallucination. When researchers apply large language models like GPT-4 to animal data, the models can "find" English-like grammar and sentence structures that do not exist. These models were pre-trained on trillions of human words, which biases them toward seeing human logic in non-human data.

The discovery of a phonetic alphabet in whales does not guarantee a shared emotional or intellectual context. We are decoding the rules of a communication system, but we may find that the internal life of a deep-diving mammal remains fundamentally inaccessible to the human mind. The ultimate tension of interspecies communication is the risk of hearing only our own echo in the vast, structured silence of the deep.

Insight

Project CETI (Nature Communications, 2024) discovered a phonetic alphabet in sperm whale codas, revealing that click timing and rhythm form a structured combinatorial system.

Frequently Asked Questions

Can AI actually translate what animals are saying?
Not yet in the way we translate English to French. However, AI can identify repeating patterns (phonemes), sentence-like structures (syntax), and even distinct 'dialects' between different groups of the same species.

What is Project CETI?
Project CETI (Cetacean Translation Initiative) is a multidisciplinary project using state-of-the-art machine learning to listen to, contextualize, and eventually communicate with sperm whales.

The author of this article utilized generative AI (Google Gemini 3.1 Pro) to assist in part of the drafting and editing process.
