This is Animalese.
KK Slider: [unintelligible babble]
It's the pseudo-language of Animal Crossing. This is what
it sounds like in the Japanese version:
KK Slider: [unintelligible babble]
And here's what it sounds like in the
English version:
KK Slider: [slightly deeper unintelligible babble]
They sound different, which is weird because it's supposed to
be nonsense, right?
So... why does Nintendo dub KK Slider?
To understand we need to
do a taxonomy of all the ways games have tried to represent -- or avoid representing --
human speech.
In the beginning, there was the Word, and the Word was:
Oop! That... was supposed to say voice synthesis.
Early attempts at adding audio to games were a
mix of pre-recorded voices and genuine voice synthesizers. But they were mostly
gimmicky, expensive add-ons. Voice chips made more sense in arcade machines
because they were already a huge investment of space and money -- but it was
still a technical struggle to get them to work. Like Q*bert, known for his mad ups
and foul mouth, this drop of Tang was originally supposed
to speak English instead of:
Q*bert: [garbled synthetic phonemes]
but audio engineer David Thiel couldn't get the voice chip to
produce the sounds he was hoping for. So instead of continuing to mess with it,
he just said [Q*bert garble curse] and had it string together some incoherent phonemes instead.
Thiel, like many designers that followed, came to the conclusion that
human voices just weren't worth the fuss. Other developers opted for a style
that's entirely unique to video games, and I looked around but I couldn't find
a single definitive phrase used to describe this style -- which I think speaks
to how much we take it for granted, even though it is super weird.
I'm talking about using nonsensical sound effects to stand in for language, or simply put
[slow, low beeps that appear in time with the words]
[very high-pitched piercing beeps that appear in time with the words]
[sharp, high-pitch beeps that appear in time with the words]
for the purpose of this video -- and because it's cool to name things --
I'm going to call this beep speech.
The earliest examples of beep speech I could
find were in JRPGs like Star Arthur Legend - Planet Mephius [short pattern of beeps]
and Legend of Zelda [mid-pitch beeps]
Some American games used a similar trope of mimicking on-screen text, but
it's not meant to stand in for a voice so it's not quite the same.
That distinction is important because of beep speech's peculiar function; games that
use beep speech slowly reveal text and accompany each word with audio, which
makes the player process information as if they were really listening to somebody speak.
It's not a straight info-dump; it replicates the act of listening,
which makes it easier to stay engaged with the written text. That's
assuming you enjoy listening to bebe bebe be bebeep which is a great weakness
of the beep speech of the cartridge era. Because audio capabilities were still
limited, most games used the same beep for every character in every situation. Later
games - including Animal Crossing - could pitch the beeps higher or lower, and that
really helped spice things up.
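As a rough illustration of the mechanic described above, here is a minimal sketch of how a dialogue box might reveal text while scheduling one beep per visible character. Everything in it is an assumption made for illustration: the names `BeepEvent` and `beep_speech`, the reveal speed, and the pitch values are hypothetical and are not taken from any actual game.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class BeepEvent:
    time_s: float       # when the event fires, in seconds from the start of the line
    visible_text: str   # the portion of the line shown in the text box so far
    pitch_hz: float     # frequency of the beep to play; 0.0 means stay silent


def beep_speech(
    text: str,
    base_pitch_hz: float = 440.0,     # hypothetical per-character "voice" pitch
    chars_per_second: float = 20.0,   # hypothetical reveal speed
) -> Iterator[BeepEvent]:
    """Reveal text one character at a time, beeping on letters and digits."""
    step = 1.0 / chars_per_second
    for i, ch in enumerate(text):
        # Spaces and punctuation are revealed silently, so the beeps
        # roughly follow the rhythm of the words.
        pitch = base_pitch_hz if ch.isalnum() else 0.0
        yield BeepEvent(time_s=i * step, visible_text=text[: i + 1], pitch_hz=pitch)


# The same line "spoken" by a low-pitched and a high-pitched character.
low_voice = list(beep_speech("Hey, listen!", base_pitch_hz=220.0))
high_voice = list(beep_speech("Hey, listen!", base_pitch_hz=880.0))
```

Changing only `base_pitch_hz` is the kind of per-character variation that later games used to tell speakers apart. A real implementation would hand each event to the audio and text-rendering systems instead of collecting them in a list.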
Then there were games like Star Fox which gave each
character a different kind of "voice" so you could easily distinguish your kind
friends Slippy [synthetic sounds similar to frog croaks]
from that no-good hotshot Falco [deeper babble]
These were synthetic voices and
total nonsense with no real association with the text. Another strategy was to
use vocal grunts -- things like sighs and yells and other non-language forms
of communication. These were great for adding variety,
conveying emotion, and giving a character a voice without giving them language.
Although they use different strategies, Star Fox and Ocarina of Time have two
weird things in common:
first of all, both have friendly frogs that never get their due.
[rhythmic frog croaking]
Second of all, both have English language lines even in the original Japanese versions.
Navi in Ocarina of Time: Hello!
The [GOOD LUCK] and [HEY, LISTEN] were the same in every
version of the games. And that points to one of the biggest strengths of beep speech
and vocal grunts: you DON'T have to translate them. A shiver is a shiver
in every language.
Link in Ocarina of Time: [shivers]
Localizing a game was - and is - a huge expenditure of time and
money, which makes these non-voice options the perfect replacement for
voice lines. Quality localization is basically a requirement for most games
now, but the 90s and early 2000s were a dark time for translations and voice
acting alike, leaving us with such gems as:
Dracula in Castlevania: What is a man?!
Barry in Resident Evil: A Jill Sandwich!
And that's when localization happened at all - sorry Earthbound fans. During this period,
beep speech was usually a stand-in for a real language. But Banjo-Kazooie made a
huge innovation in that their gibberish was... just what it was.
Bottles: [a gentle honking]
Like Star Fox, the characters
had distinct voices - but they weren't synthetic. They were powered by
real human pipes, which is wild because it's human voices replicating a
synthetic style, that was made to replace human voices, like an aural ouroboros.
An auralboros.
Plenty of games of this era had full voice acting... but they weren't
on the N64. Nintendo's insistence on using cartridges would continue to
restrict their options for speech representation. On other consoles, games
became more invested in the cinematic experience of having characters say real shit.
Which meant a greater investment in voice acting. That caused a split in style where beep speech,
previously just fine for serious stories, came to represent a more lighthearted
cartoonish feel.
Mushi's Mama in Okami: [soft mid-tone beeps]
By the early 2000s, a trend emerged of
entirely fictional spoken languages. Whereas beep speech stood in for the
player's native speech, these constructed languages were more about making certain
characters and settings appear foreign -- while still empowering the player to
understand what they're saying. It's during this period that
both Animal Crossing and The Sims arrived.
Simlish actually predates The Sims; it first
appeared in SimCopter. But for The Sims, the team at Maxis knew they'd need
something more elaborate. Because the game was so much about the human
condition, they wanted to communicate emotions which would encourage players
to connect with their creations. Plus the practical considerations - anything that's
comprehensible can become repetitive, and having a huge scroll of dialogue meant
writing, translating, and redubbing a huge scroll of dialogue. Following the style
of Banjo-Kazooie, they captured the real human voices of two improvisers and then
spent a year remixing that audio to become the perfect blend of nonsense.
Sim 1: Dag dag aulf, Sim 2: Anamana blastamana
But that strategy can't work with every franchise. Animal Crossing had different
intentions and different styles, and so they needed a different approach. When
you hear Animalese for the first time, it sounds a lot like a standard
voice synthesis. But KK Slider is actually saying REAL WORDS.
Here he is slowed down:
KK Slider: [deeper and slower than normal, words that match the text box identifiable]
The synthetic voice doesn't exactly nail the pronunciation of each word, but
that works to its advantage; once it's sped back up, it's even harder to tell
that KK is speaking English. Dōbutsu no Mori, the original N64 Japanese
version of Animal Crossing, features Animalese in Japanese. Region-specific
Animalese is also the default language in New Leaf... but not Wild World or City
Folk. Instead they use a pretty standard sounding voice synthesis called Bebebese.
That's because Animal Crossing was never intended to be localized. In fact,
Nintendo didn't localize the first version of Animal Crossing; the American
release was based on the updated GameCube game Dōbutsu no Mori+.
Members of the Nintendo Treehouse had to advocate for it to be translated,
partially because they had already gotten addicted to playing it. Because
they never intended to localize the game, Nintendo included a lot of specific
Japanese cultural elements, including of course the language. All of those had to
be changed in the American version because the style of translation at the
time called for completely eradicating any hint of a foreign culture. The
prevailing notion was that American audiences didn't want anything that had
what cultural theorist Kōichi Iwabuchi called "cultural odor,"
a phrase I hate to say out loud but have to respect the usefulness of. The localizers for the
first Animal Crossing did an amazing job replacing content and adding new
events for American audiences -- so much so that their game was actually re-localized
back into Japanese and released for the GameCube as Dōbutsu no Mori e+.
So when it came time to make Animal Crossing: Wild World, Nintendo needed a
localization strategy from the start.
And that strategy was to make a game with no regionality at ALL.
No cherry blossoms.
No Halloween.
And no regional Animalese.
The Bebebese of Wild World stuck around for City Folk,
but by New Leaf, Animalese had made a triumphant,
multilingual return. Why?
Well!
I don't know.
But my theory is this! City Folk got
a lot of criticism for being too similar to early entries in the franchise. The
next game had to distinguish itself significantly to avoid another letdown.
Aya Kyogoku, who co-directed New Leaf alongside Isao Moro,
viewed the game as a tool to communicate with both animal characters and other
players. So it made sense that the communication in game would be more
elaborate than Bebebese.
But doing full voice acting would have
been exorbitantly expensive;
City Folk had a huge script, around 640,000 words.
For perspective, Infinite Jest clocks in at about 483,000 words,
so this cute little game about bugs and letters has it beat
by over a hundred thousand words, and that means
it's better.
On top of that it's just plain science that when creatures speak in adorable baby talk
they're cute and you just want to squish their widdle faces.
All of that probably
made it worthwhile to switch from Bebebese back to Animalese, an easy way
to show that you're turning over a New Leaf.
That brings us up to New Horizons and Nintendo has once again flipped the
script by making the language...
Tom Nook: [a few recognizable English words then nonsense babble]
semi-Animalese?
It's not quite Bebebese; there are
parts that still sound like words, and certain sounds do repeat with specific
text like "I" being pronounced like:
Goose: Ah
Rocket: Ah
Gulliver: Ah
but the weird thing is this has still been localized!
You can hear the difference in the way Nook addresses the audience in
the Japanese and English Nintendo Direct.
He's using the same quasi-English here that appears in New Horizons, which
brings us to an important question:
Why localize Animalese?
And why not localize Simlish?
Melissa Baese-Berk: One of the ways
that we best understand how languages
differ from each other is in terms of what's called their prosody or their
sort-of rhythmic and pitch information.
Jenna: This is Dr. Baese-Berk,
a psycholinguist at the University of Oregon, studying how people process speech,
especially as a second language. Prosody is a linguistic concept that
covers a lot of speech elements that aren't explicitly phonetic -- like if you
hear somebody talking through a wall, you can often tell if they're speaking your
native language or not, even if you can't hear specific words.
Dr. Baese-Berk: We know that the rhythm matters a ton for recognizability, and when you disturb the rhythm
information and pitch information, it can have really big consequences for how you
understand the speech. That said there's a lot of variability, so I could have
sort of weird prosody, weird pitch and rhythm information, and you could
probably still tell that I'm the native speaker of English.
Jenna: Which is exactly why
Animalese is so difficult to parse, even though it's just been peppered with a
few audio artifacts. The sped-up pace alters the prosody and makes it very
hard to understand.
Hard... but not impossible.
Dr. Baese-Berk: There are ways in which you can distort the signal so much that it feels at
first just like it's impossible for you to understand it, but once you have
started to figure it out, it becomes easy to understand.
Jenna: I can vouch for that,
having listened to a lot of Animal Crossing clips while researching this
video. I've gotten to the point where I can sort of like... half-understand the
villagers while they're speaking, and I've also begun to dream in Animalese,
and that's... that's probably fine right?
Dr. Baese-Berk: How we define gibberish is
going to be based on our native language, right? So how gibberish-y something
sounds is going to be related to how similar or different it might sound to
your native language, and... but I could imagine if it sounded so distinct from
your native language, it might not even sound like gibberish; it might just sound
like something that isn't really language-y.
Jenna: So even though it's gibberish,
the distance from your native language can determine, even
subconsciously, whether you perceive it as a language. Which means that Simlish
probably isn't as universal as the team at Maxis hopes.
Dr. Baese-Berk: The specific sounds that
they're using are English-like sounds, in part because we know producing
non-English-like sounds is something that's really hard if you're a native
English speaker. So if you're improvising and producing gibberish, you're going to
produce the sounds that are within your inventory.
Jenna: A lot of effort was put into
mixing and chopping up this audio but the raw materials were still inevitably
lacking in variety. Simlish could still be localized to make more regionally
accessible forms of gibberish.
But you're not a member of the community in The Sims;
you're an overseer God who occasionally intervenes to drown
somebody. So it matters less how familiar the nonsense sounds.
But for Animal Crossing, the villagers can't speak a gibberish that's too distant from the
player's native language, because the games are about becoming part of a
community. Animal Crossing needs to feel - and sound - warm and familiar.
Dr. Baese-Berk: That level of comfort and familiarity
is something that is probably easy to induce via language.
Jenna: Localized Animalese can make you feel more comfortable and at home,
because even if you don't realize it, everybody is speaking your language.