Humans have lived alongside other intelligent species for thousands of years, yet we remain functionally deaf to their communication. For centuries, biology treated animal vocalizations as simple, instinctual signals: noises meant to signify hunger, danger, or the desire to mate. We assumed that because their brains were different from ours, their communication lacked the "grammar" and abstract "thought" that define human language. This human-centric view has been challenged as our ability to collect data from the natural world has exploded.
Today, we use underwater recording arrays and microphones in the deep canopy to capture trillions of data points from species like sperm whales and fruit bats. We have moved from the era of the lone researcher with a notebook to the era of Big Data bioacoustics. However, more data has not yet translated into understanding. We are effectively listening to a global conversation in a language we have never heard, with no dictionary to guide us.
The challenge of "animal translation" is fundamentally different from translating between two human languages. When we translate French to English, we rely on a shared cultural context and on existing bilingual texts and dictionaries that act as a Rosetta Stone. When we attempt to decode the clicks of a whale, we have neither. We must first find a way to detect the presence of language itself, by testing whether the sounds follow the same statistical regularities as human speech, before we can even begin to guess what they mean.
The search for animal language has collided with a massive data problem. Project CETI has collected over four billion "codas" (the rhythmic click sequences used by sperm whales), but researchers lack a Rosetta Stone to translate them. We possess the raw tokens of a global communication network, yet we have no way to verify whether these sounds carry abstract meaning or are simply complex biological reflexes.
The Search for Mathematical Structure
Decoding a language without a translation key requires looking for the mathematical signatures of information. Researchers are applying Zipf's Law, a statistical rule that describes how word frequency and rank are related in human languages (the r-th most frequent word occurs with a frequency roughly proportional to 1/r), to the whale data. If the click patterns follow this distribution, that is consistent with a structured combinatorial system rather than random signals, although a Zipf-like curve alone is not conclusive, because some simple random processes can produce one as well. The test offers a first-pass way to separate noise from information, providing a baseline for asking whether a species is capable of abstract communication.
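As a rough illustration of the rank-frequency test, the sketch below fits a straight line to log(frequency) against log(rank) and reports its slope; a slope near -1 is the classic Zipf signature. The function name `zipf_slope` and the toy coda labels are invented for this example and do not come from Project CETI's data or code.

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    # Frequencies sorted from most to least common; rank 1 is the most common type.
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct token types")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical coda-type labels (assumption): counts are 120 / rank,
# so this toy sample is Zipfian by construction.
toy_codas = ["A"] * 120 + ["B"] * 60 + ["C"] * 40 + ["D"] * 30 + ["E"] * 24
slope = zipf_slope(toy_codas)
print(f"fitted slope: {slope:.3f}")
```

On real recordings the fit would be noisier, and a slope near -1 would be a starting point for further tests rather than proof of language.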
Beyond Zipf's Law, bioacoustics researchers use Shannon entropy to measure the "information density" of animal calls. High entropy suggests a system capable of transmitting complex, unpredictable messages, while low entropy indicates a repetitive, simple signal. By measuring the entropy of sperm whale codas across different social groups, AI models have identified distinct "dialects" that suggest cultural transmission: knowledge passed down through generations by learning rather than purely through genetics.
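To make the entropy measure concrete, here is a minimal sketch that computes Shannon entropy in bits, H = sum over types of p * log2(1/p), for a sequence of call labels. The labels and the two toy sequences are assumptions chosen only to contrast a repetitive signal with a varied one.

```python
import math
from collections import Counter

def shannon_entropy(tokens):
    """Shannon entropy in bits of the empirical distribution over token types."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # p * log2(1/p) summed over every observed type.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical call sequences (assumption): one fully repetitive, one evenly varied.
repetitive = ["A"] * 8
varied = ["A", "B", "C", "D"] * 2

h_repetitive = shannon_entropy(repetitive)  # one type only: 0 bits
h_varied = shannon_entropy(varied)          # four equally likely types: 2 bits
print(h_repetitive, h_varied)
```

Comparing such values between social groups would only hint at differences in repertoire; it does not by itself show what any call means.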
Combinatorial Syntax: The Whale Alphabet
A 2024 study in Nature Communications by the Project CETI team identified what they call a "phonetic alphabet" in sperm whale codas. By analyzing the precise timing, tempo, and rhythm of clicks, the AI-assisted analysis found that whales combine these elements to create a vastly larger set of signals than previously documented. This suggests that whales may not just be signaling "danger" or "food," but communicating through a structured combinatorial system that allows for a very large number of variations.
For example, the researchers found that whales adjust the "tempo" (the speed of the click sequence) and the "ornamentation" (the addition of extra clicks at the end) to vary a coda, possibly changing its function. This is analogous to how humans use suffixes or tone of voice to alter a word's function. By mapping these variations, the AI has revealed a combinatorial system in which a small set of basic units (clicks) is assembled into complex structures. This "combinatoriality" is one of the hallmark features of human language, and seeing it in a marine mammal suggests that such rules of information may not be unique to humans.
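The combinatorial idea can be sketched in a few lines: if a coda is described by independent features, the number of distinguishable codas is the product of the feature inventories. Everything below is hypothetical, including the feature values, the inventory sizes, the `describe_coda` helper, and the click timestamps; it illustrates the principle and is not the study's method.

```python
from itertools import product

# Hypothetical feature inventories (assumption): labels and sizes are invented.
RHYTHMS = ["r1", "r2", "r3"]
TEMPOS = ["slow", "medium", "fast"]
ORNAMENTED = [False, True]

# Every combination of one rhythm, one tempo, and one ornamentation setting.
inventory = list(product(RHYTHMS, TEMPOS, ORNAMENTED))
print(len(inventory))  # 3 * 3 * 2 = 18 combinations from 8 feature values

def describe_coda(click_times):
    """Split click timestamps (seconds) into overall duration and a normalised rhythm."""
    duration = click_times[-1] - click_times[0]
    gaps = [later - earlier for earlier, later in zip(click_times, click_times[1:])]
    # Dividing each gap by the duration removes tempo and leaves the rhythm pattern.
    rhythm = tuple(round(gap / duration, 2) for gap in gaps)
    return {"duration": duration, "rhythm": rhythm}

# Two invented codas with the same rhythm pattern played at different tempos.
fast = describe_coda([0.0, 0.2, 0.4, 0.8])
slow = describe_coda([0.0, 0.3, 0.6, 1.2])
print(fast["rhythm"] == slow["rhythm"], fast["duration"] < slow["duration"])
```

Separating duration from the normalised gaps is one simple way to treat tempo and rhythm as independent dimensions, which is what lets a small set of units yield many distinct signals.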
The Contextual Mapping Problem
Even with a phonetic alphabet, we are still missing the most critical piece: context. A word only has meaning in relation to the world. To solve this, researchers are using AI to correlate vocalizations with high-resolution behavioral data. D-TAGs, sensors attached to the whales, record their depth, orientation, and proximity to other whales at the exact moment a coda is emitted.
By feeding both the audio and the sensor data into a multimodal model, researchers hope to identify "action-vocalization pairs." If a specific coda is consistently emitted when a whale is nursing or socializing after a deep dive, we can begin to infer the "semantic" meaning of the sound. This is a process of mapping the latent space of whale behavior onto the latent space of their audio, looking for the points where the two systems intersect.
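A far simpler stand-in for the multimodal model is a co-occurrence table: for each coda type, estimate how often each behavior was recorded at the moment it was emitted. The sketch below does exactly that; the coda names, behavior labels, and counts are invented for illustration, and a real pipeline would derive them from audio and tag data.

```python
from collections import Counter, defaultdict

def behavior_given_coda(events):
    """Estimate P(behavior | coda type) from (coda_type, behavior) observations."""
    pair_counts = Counter(events)
    coda_counts = Counter(coda for coda, _ in events)
    table = defaultdict(dict)
    for (coda, behavior), n in pair_counts.items():
        table[coda][behavior] = n / coda_counts[coda]
    return dict(table)

# Hypothetical observations (assumption): each pair is one coda and the
# behavior label logged by the tag at the same moment.
toy_events = (
    [("coda_X", "socializing")] * 9
    + [("coda_X", "diving")] * 1
    + [("coda_Y", "socializing")] * 5
    + [("coda_Y", "diving")] * 5
)

probs = behavior_given_coda(toy_events)
for coda, row in probs.items():
    print(coda, row)
```

In this toy data coda_X is strongly tied to one behavior while coda_Y is not, which is the kind of asymmetry that would flag a candidate action-vocalization pair. Such a correlation would still need controls before being read as meaning.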
Anthropomorphic Hallucinations
A primary danger in animal bioacoustics is the anthropomorphic hallucination. When researchers apply large language models like GPT-4 to animal data, the models can "find" English-like grammar and sentence structures that do not exist. These models were pre-trained on trillions of human words, which biases them toward seeing human logic in non-human data.
The discovery of a phonetic alphabet in whales does not guarantee a shared emotional or intellectual context. We are decoding the rules of a communication system, but we may find that the internal life of a deep-diving mammal remains fundamentally inaccessible to the human mind. The ultimate tension of interspecies communication is the risk of hearing only our own echo in the vast, structured silence of the deep.
Project CETI (Nature Communications, 2024) discovered a phonetic alphabet in sperm whale codas, revealing that click timing and rhythm form a structured combinatorial system.
The author of this article utilized generative AI (Google Gemini 3.1 Pro) to assist in part of the drafting and editing process.