The Human Cost of AI

A couple of years ago, a good friend* achieved their lifelong dream: After a decade of grueling agency work, they had finally secured an in-house localization job on a game they loved. This was a highly coveted position, as it offered stable work and income in a field otherwise ruled by uncertainty.

Sadly, that all came crashing down when one day, the company behind the game decided to introduce AI translation systems to their localization processes. The entire department went through an upheaval, and the translators who were once tasked with crafting compelling narratives for international audiences suddenly found themselves forced to “proofread” (“rewrite” being a more apt term) machine-generated texts under ever-shrinking deadlines. Unable to adapt to this new paradigm and unwilling to compromise on their quality standards, my friend was unceremoniously let go a few months after this change was introduced.

Stories like these are a dime a dozen in game localization. They’ve proliferated over the last few years, as Silicon Valley giants persist in their quest to reduce creative endeavors to mindless word-assembly lines. With tighter budgets and heightened profit expectations came the need to always produce more, faster, for cheaper, sending the gaming industry into a craze for disposable “content”.

The Promise of Automation

Of course, this new gold rush attracted plenty of shovel sellers, among which publisher DMM Games, who led their 2025 GDC talk with the bold claim that AI agents would “revolutionize” the way games are translated. Having localized games professionally for over 10 years and studied both traditional MT models and LLMs extensively, I think those claims are worth examining with a critical eye.

If you’ve ever talked to a professional translator, you’ll know that most of us have an adversarial relationship with Machine Translation (MT), especially for games and other creative content that require a high degree of artistry. Since the dawn of Babel Fish and Google Translate, companies have sought to integrate MT technologies in order to save time and reduce costs, oftentimes incurring a loss in quality. And while MT has practical and valid applications, it has historically been used to force translators to accept gradually worsening conditions, much in the way many creatives are now being pressured into shoehorning ChatGPT or Midjourney into their pipelines.

The mechanical part — converting text from Language A to Language B — can be mostly automated.

One argument used by DMM is that traditional MT + post-editing models (MTPE) are both unsuitable and unsustainable for game localization, due to the high number of variables (both literal and figurative), interconnected elements, neologisms, etc. On that, at least, we agree.

However, their proposed solution is a rather dubious one: What if we were to try to eliminate as much of the human element as possible?

In this scenario, translators overseeing the output would be elevated to the role of “directors” [sic], leaving the menial task of “converting text from Language A to Language B” to the AI agent. If you work in the localization business, there are two ways you can interpret this statement:

  1. You’re in an executive role, and your eyes are flashing dollar signs
  2. You’re in a role that involves working directly with the content (translators, editors, etc.), and you’re frantically looking for the nearest emergency exit

Is it an oversimplification? Absolutely. However, it’s also representative of how this industry has operated for decades – The introduction of new technologies that could have benefited those “in the trenches” is inevitably used to disempower them. This time is no different, and trying to spin this as a positive evolution is totally disingenuous.

First off, the definition of translation is specifically not to convert words, but meaning. And while LLMs are able to replicate human speech patterns with eerie accuracy, it bears reminding that they don’t think nor understand like the human brain does. Moreover, most translators don’t want to play a constant game of whack-a-mole to try and get an AI agent to translate in a way that sounds natural, as opposed to logical. In the amount of time it takes to re-prompt, they could have simply typed the correct translation a dozen times over.

Another argument used by DMM is that by letting the AI take care of the “basic, repetitive lines”, translators will be able to focus on more crucial parts of a game, such as characterization or highly creative content (poems, wordplays, key cutscenes, etc.). I see three major issues with this approach:

  • First, no game text exists in a vacuum, and many of those repetitive lines are tied to other narrative/gameplay elements, which means they deserve as much attention as the rest of the game. (Also, all modern CAT tools already take care of the repetitive parts through Fuzzy matching systems, something they conveniently left out).
  • Second, highly creative translations incur a certain measure of intellectual strain, which is not sustainable on a daily basis. Assuming this approach still employs the standard pay-per-word model (which seems to be the case, according to DMM’s presentation, and goes as low as 3 ¢/word), it would make no financial sense for translators to work exclusively on this type of content.
  • Third, translating basic lines is critical for upskilling: I know AI companies rarely concern themselves with long-term vision, but even assuming an optimistic scenario where all experienced translators jump on the bandwagon, where does that leave juniors? How will they learn the basics of writing engaging dialogues or working with tight character limits when LLMs fall short?

DMM claimed that “In just six months, [they’ve] tackled about 50 titles in 10 languages, some clients even releasing the AI output with minimal or no post-editing”. If that is true, this is a remarkable achievement, one that will no doubt put them far ahead of their “traditional” competitors in the upcoming years. Although, if their confidence is well-placed, surely they wouldn’t mind showcasing the full raw output for anyone to judge?

Or… could it be that all this AI veneer hides a much grimier reality?

Putting Claims to the Test

After all that yapping about theory, I’d like to illustrate my point with a couple of case studies. Since I don’t have access to the advanced products touted by AI companies (and even if I did, I would never entrust them with anyone’s IP without their knowledge or consent), I’ll be relying on basic prompts using Claude and Gemini. Bear in mind that while this methodology has its flaws, it’s fairly representative of the AI workflows used by many localization companies.

Let’s start with an excerpt from Cyberpunk 2077, as it’s both thematically fitting (the game’s corpo-tech dystopia is starting to look a whole lot like our reality), and because its French translation has earned praise from both fans and media. For this exercise, I’ve picked a piece of information on an in-game element called “braindance”, which allows users to record and play back someone’s experience, including their physical sensations, emotions, and thoughts. Given that this text was written in an editorial style, the results should be better than average, as most LLMs were trained on large amounts of editorial content.

The prompt given to both LLMs was: “Translate the following text for a video game set in a dystopian universe from English to French. Make sure that the text reads naturally to native French speakers, while preserving the accuracy and intent of the original English text.” (Click to enlarge)

Text version – Click to expand
Source text
Official translation

Claude output

Gemini output
GUEST EDITORIAL

Braindance editors have long strived to strike a balance between real, lived experience and technological, experiential purity. The more heavily processed the material, the more abstracted the pathways, the clearer the braindance recording; these fundamental elements of design have guided editors since the first wave of brain-to-brain experience sharing technology took hold. In their pursuit of balance, however, editors have clearly shown a bias for purity over naturalism over the years, even going so far to use it as a point of pride in the quality of their production.

But in the industry’s latest push for greater purity has the purpose of the technology already been forgotten? Will we not find ourselves processing and filtering a braindance recording to the point that the emotional experience no longer extends beyond what we receive from film, television and video games? After reliving some of the latest titles on my feeder unit, this once-academic question now feels all too inevitable with the industry’s current trajectory.

For a moment, let’s consider why some reports suggest more and more users are searching for unlicensed titles on the black market (so-called black braindances, extreme braindances, or XBDs). Are we so sure it’s the illicit content they are after? Or maybe the real draw is the residual grit we editors try so hard to remove? Distracting thoughts, irrelevant memories, loose associative threads, emotions stretching beyond the desired spectrum… What if this noise is not so superfluous as we believe it to be? What if these peripheral experiences hold the potential to elevate a good braindance to an exquisite one? We do ourselves a disservice by not exploring these questions before our blind crusade for braindance purity leads this industry straight into the bin of obsolete, flash-in-the-pan technology.

– J.A.

Relive.it – The quarterly magazine for braindance editors, amateurs, and enthusiasts, vol. 4/78, December 2076
INVITÉ DE LA RÉDACTION

Depuis des années, les éditeurs de danse sensorielle s’efforcent de trouver un juste équilibre entre l’expérience réelle, vécue, et la pureté expérientielle et technologique. Plus le matériau est traité et les connexions dissociées, plus l’enregistrement de danse sensorielle est clair. Les éditeurs sont guidés par ces éléments de conception fondamentaux depuis l’arrivée des toutes premières technologies de partage entre cerveaux connectés. Pourtant, dans leur quête d’équilibre, ils ont clairement démontré avec le temps une préférence pour la pureté plutôt que le naturalisme, allant même jusqu’à en faire un argument de vente quant à la qualité de leurs produits.

On peut d’ailleurs se demander si l’industrie, dans ses derniers efforts pour atteindre une pureté optimale, n’a pas oublié l’objectif de la technologie. N’allons-nous pas nous retrouver à traiter et filtrer les enregistrements de danse sensorielle à un point tel que leur intérêt en tant qu’expériences émotionnelles cessera de dépasser celui des films, de la télévision et des jeux vidéo ? Au vu de la trajectoire actuelle de l’industrie, et après avoir revécu quelques titres récents de ma collection, cette question auparavant théorique me semble désormais inévitable.

Réfléchissons un instant aux raisons pour lesquelles certains rapports suggèrent que de plus en plus d’utilisateurs recherchent des titres non autorisés sur le marché noir (danses sensorielles clandestines, danses sensorielles extrêmes ou DSX). Pouvons-nous être sûrs que le contenu illicite soit ce qui intéresse ces gens ? Peut-être que ce qui leur plaît vraiment, c’est ce grain « sale » que nous autres éditeurs nous efforçons de faire disparaître ? Ces pensées distrayantes, ces souvenirs inutiles, ces associations d’idées confuses, ces émotions non désirées… Et si ce « bruit » n’était pas aussi superflu que nous le pensons ? Si ces expériences périphériques détenaient le potentiel de faire d’une bonne danse sensorielle un chef-d’œuvre ? Nous ne nous rendons pas service en refusant d’explorer ces questions avant que notre croisade pour la pureté des danses sensorielles ne conduise l’industrie à rejoindre une longue liste de technologies éphémères et obsolètes.

J.A.

Relive.it – Le magazine trimestriel des éditeurs, amateurs et passionnés de danse sensorielle, vol. 4/78, décembre 2076
ÉDITORIAL INVITÉ

Les éditeurs de braindance s’efforcent depuis longtemps de trouver un équilibre entre l’expérience réelle, vécue et la pureté technologique et expérientielle. Plus le matériel est traité en profondeur, plus les voies neuronales sont abstraites, plus l’enregistrement de braindance est clair ; ces éléments fondamentaux de conception ont guidé les éditeurs depuis l’émergence de la première vague de technologie de partage d’expérience de cerveau à cerveau. Dans leur quête d’équilibre, cependant, les éditeurs ont clairement montré une préférence pour la pureté au détriment du naturalisme au fil des ans, allant même jusqu’à en faire un point de fierté dans la qualité de leur production.

Mais dans cette dernière poussée de l’industrie vers une plus grande pureté, le but de la technologie n’a-t-il pas déjà été oublié ? Ne risquons-nous pas de traiter et filtrer un enregistrement de braindance au point que l’expérience émotionnelle ne dépasse plus ce que nous recevons du cinéma, de la télévision et des jeux vidéo ? Après avoir revécu certains des derniers titres sur mon unité d’alimentation, cette question autrefois académique semble désormais inévitable compte tenu de la trajectoire actuelle de l’industrie.

Réfléchissons un instant à la raison pour laquelle certains rapports suggèrent que de plus en plus d’utilisateurs recherchent des titres non autorisés sur le marché noir (les soi-disant braindances noirs, braindances extrêmes, ou XBD). Sommes-nous si certains que c’est le contenu illicite qu’ils recherchent ? Ou peut-être que l’attrait réel réside dans cette rugosité résiduelle que nous, éditeurs, nous efforçons tant d’éliminer ? Pensées distrayantes, souvenirs non pertinents, fils associatifs lâches, émotions s’étendant au-delà du spectre souhaité… Et si ce bruit n’était pas aussi superflu que nous le croyons ? Et si ces expériences périphériques avaient le potentiel d’élever un bon braindance à un niveau exquis ? Nous nous desservons en n’explorant pas ces questions avant que notre croisade aveugle pour la pureté du braindance ne mène cette industrie tout droit dans la poubelle des technologies obsolètes et éphémères.

– J.A.

Relive.it – Le magazine trimestriel pour éditeurs, amateurs et passionnés de braindance, vol. 4/78, décembre 2076
ÉDITORIAL INVITÉ

Les éditeurs de braindance se sont longtemps efforcés de trouver un équilibre entre l’expérience réelle et vécue, et la pureté technologique et expérientielle. Plus le matériel est traité, plus les chemins sont abstraits, plus l’enregistrement de braindance est clair; ces éléments fondamentaux de conception ont guidé les éditeurs depuis la première vague de technologies de partage d’expérience cerveau-à-cerveau. Dans leur quête d’équilibre, cependant, les éditeurs ont clairement montré une préférence pour la pureté au détriment du naturalisme au fil des ans, allant même jusqu’à en faire un point de fierté dans la qualité de leur production.

Mais dans la dernière poussée de l’industrie vers une plus grande pureté, le but de la technologie a-t-il déjà été oublié? Ne nous retrouverons-nous pas à traiter et à filtrer un enregistrement de braindance au point que l’expérience émotionnelle ne s’étende pas au-delà de ce que nous recevons des films, de la télévision et des jeux vidéo? Après avoir revécu certains des derniers titres sur mon unité de flux, cette question autrefois académique semble désormais trop inévitable avec la trajectoire actuelle de l’industrie.

Un instant, considérons pourquoi certains rapports suggèrent que de plus en plus d’utilisateurs recherchent des titres non autorisés sur le marché noir (les soi-disant braindances noires, braindances extrêmes, ou XBDs). Sommes-nous si sûrs que c’est le contenu illicite qu’ils recherchent? Ou peut-être que le véritable attrait est le grain résiduel que nous, éditeurs, essayons si fort d’éliminer? Des pensées distrayantes, des souvenirs non pertinents, des fils associatifs lâches, des émotions s’étendant au-delà du spectre désiré… Et si ce bruit n’était pas aussi superflu que nous le croyons? Et si ces expériences périphériques avaient le potentiel d’élever une bonne braindance à une braindance exquise? Nous nous desservons en n’explorant pas ces questions avant que notre croisade aveugle pour la pureté de la braindance ne conduise cette industrie directement dans la poubelle des technologies obsolètes et éphémères.

– J.A.

Relive.it – Le magazine trimestriel pour les éditeurs de braindance, les amateurs et les passionnés, vol. 4/78, décembre 2076.

And for good measure, here’s a raw diffcheck between the official translation of the 1st paragraph and Claude’s results:

In itself, this doesn’t prove much, as you could compare two great human translations and still get significant differences. However, you can see a stark contrast in the way each sentence is constructed, including terminology, placement of adjectives, splicing of various clauses, etc., all of which affect the reader’s perception. You could argue that the AI-generated output is “legible”, but it’s certainly not pleasant to read. And if this output was sent as part of a translation test, it would have a 99% chance of being rejected.

Let’s now move on to the content itself, starting with the elephant in the room: Neither Claude nor Gemini knows what a “braindance” is, so they just left it in English. In contrast, the official translation reads “sensorial dance”, which perfectly encapsulates the concept of “braindance”.

And therein lies the first real problem with translating anything using AI: By default, LLMs do not ask questions – They just work with what they’re given, and when you don’t give them much, you can’t expect miracles. So, all you have to do is feed them more context, right? The problem is that even human translators are often not provided with any reference material whatsoever (for a variety of reasons, but that’s a topic for another day). Which begs the question: Why would it be different once you start throwing AI into the mix?

Here’s an assortment of other major issues with the automated outputs:

  • The very first line, “GUEST EDITORIAL”, was translated word-for-word (ÉDITORIAL INVITÉ), which makes no sense in French, whereas the official translation, “INVITÉ DE LA RÉDACTION”, reads “GUEST OF THE EDITORIAL [TEAM]”, a common journalistic term. One could assume the translator asked the dev team to confirm J.A.’s identity, since they used the male word for guest. It’s unlikely any LLM would make the connection between “J.A” and “GUEST” in the first place, since they’re so far apart.
  • “The more heavily processed the material, the more abstracted the pathways”, which refers to the way the braindance treats the users’ recorded experiences, is once again translated word-for-word by both Claude and Gemini. Gemini even translates “pathways” as “roads” (as in, the physical kind), introducing further confusion. Overall, this clause is extremely hard to parse, as it follows the same structure as English (complete with the use of a semicolon, which is uncommon in French). Meanwhile, the official translation divided the clause into two sentences, improving legibility, and used more natural-sounding terms like “connexions dissociées” to refer to abstracted pathways.
  • “[…] point of pride in the quality of their production” was translated as “point de fierté dans la qualité de leur production”, which sounds too literal and stilted regardless of context. The official translation reads “a selling point in regard to the quality of their product”. If this sounds awkward when back-translated into English, that’s exactly the point – All languages work differently, which is why it’s important to capture the intent rather than the exact words.

And I’m only pointing out the most glaring issues of the FIRST paragraph, a mere 91 words… Now, imagine having to do that kind of mental gymnastics over millions of words, day after day. Just thinking about it sounds exhausting and demoralizing, right? Well, you now know what it’s like to post-edit AI output.

Of course, I chose French for this example because it’s the language I translate into, but also because it’s semantically close to English. Adapting similar texts for Russian, Japanese or Chinese audiences, with all the cultural nuances involved, is a whole other ball game. If you speak those languages, I encourage you to run the same test and share your results!

And if you think I’m just cherry-picking, here’s a completely unrelated example encountered by a German colleague, which showcases new skills in a fantasy game. This time, the output was supposedly trained on the game’s design documents, as well as a custom style guide/glossary. The text in red was rewritten, while the text in blue changed the word order (Note: Some parts were redacted due to NDA):

Now, let’s assume that you compiled every single key term of your game into a glossary prior to translation, complete with extensive definitions, as well as descriptions of each character’s gender, age, etc. Yes, you could potentially get better results. But without extensive human editing, it would still be mediocre – because LLMs can’t inherently understand what makes a text “flow”. They don’t try to rephrase a question into an exclamation because it better fits the target culture or the context. They can’t cut a long clause into multiple smaller ones and swap them around to improve legibility. They can’t know at which stage characters can start addressing each other informally as their relationship evolves. Simply speaking, they cannot translate, merely convert words, which leads to rigid, uninspired results.

And this is where “AI-powered” localization agencies come in, trotting out a supposed army of expert translators ready to leap into action whenever the output is awkward or flat-out wrong to provide you, the end client, with the best possible quality in record time. However, this is demonstrably false: On top of the factual issues I’ve pointed out above, the practice known as MTPE is far from popular among localization specialists, meaning that those tasks are usually handled by translators who are inexperienced, desperate, or both.

In a recent survey involving 160 translators and editors, 57% of respondents claimed to hate or dislike the practice, while 33% merely feel ambivalent. MTPE is often considered tedious, uninteresting work, which strips translators of their creativity and agency, while forcing them to work as fast as possible to make ends meet. Yet, some companies have made MTPE their entire business model, earning them the label of “translation sweatshops”.

Meanwhile, a few developers have ditched agencies altogether to let ChatGPT take the wheel, complete with hallucinations, leading to hilariously disastrous results:

Looking Towards the Future

You may think that translators who oppose the use of AI are just a bunch of disgruntled luddites, technophobes and what have you. And in a way, you’d be right – personally, I loathe the very idea of outsourcing my critical thinking skills to a probabilistic autocomplete built upon stolen data. This doesn’t mean we don’t use tools at all (as evidenced by the many macros/scripts I wrote for this website), or that we can’t occasionally get a rough idea from DeepL.

But just like any tool, if it can be used as a means of oppression, it most likely will. Translators were just the canaries in the creative coal mine, and unfortunately, many others will follow. This is the real human cost of using AI. This cost cannot be measured in dollars, and you may not feel its ramifications today, tomorrow, or even a couple of years from now. But once every ounce of that precious data has been siphoned and regurgitated a thousand times over, will there be anything palatable left?

Despite all that, our brains have a remarkable capacity to notice oddities and unnatural patterns, which are hallmarks of all things AI. Whenever something is produced hastily or sloppily, we tend to get put off by it, if only subconsciously. Translations are no exception. And considering the amount of slop we’re now constantly exposed to online, I’m convinced that people will gradually become more mindful of what they choose to spend their precious time and money on.

Some of the best games of this generation were 100% written (and therefore translated) by humans, for humans. After all, why should players bother reading something nobody bothered to write?

If you’re a developer looking to localize your game, you either have or will be approached by dozens of companies promising you the moon with all sorts of “AI-assisted” or “human in the loop” solutions. However, I want you to be aware that those companies don’t have your best interests at heart – All they care about is boosting their margins on the back of their subcontractors, and hope that you won’t notice just how much you’re getting ripped off. Unless you have a native level understanding of all the languages you want to support in your game, you must rely on reputation and good faith alone… and in today’s market, that’s a risky move at best.

So, if you find yourself at a crossroads, I’d recommend you reach out to professional entities such as the IGDA LocSig. You can also contact me at me[at]luciledanilov.com, and I promise you that I’ll help you find the best possible partners for your project – including reputable companies I don’t work directly with, as to avoid conflicts of interest.

A big thanks to my colleagues and friends for helping me with this piece: Sarah Deville and Lucie Teulières (The Indie Duet), as well as Aleksander Danilov and pixelhans for the visual assets.

*A few details were changed to protect this person’s anonymity.

4 thoughts on “The Human Cost of AI”

  1. Hi Lucile,

    Thanks for such a great article. You did an incredible job addressing the issue with clear logic and convincing examples. I’m a EnglishChinese translator, but I enjoyed navigating through the Cyberpunk 2077 passage with my extremely limited French; it really showcases the limitations of AI and where/how human translation can shine.

    Bravo!

    1. Hello Hongyi, thanks for your comment! A colleague actually offered to do a similar comparative analysis for Chinese, so I think it’ll turn out very interesting 🙂 he’ll publish the results once they’re ready.
      Thanks again and have a great day!

  2. Thank you for this. It allowed me to realize a few extra things about why AI gen is failing us but why so many people praise it. My conclusion is that it’s simply made it obvious that most people do not have a sufficient level of quality analysis to even notice the issue. It’s some sort of Dunning Krueger effect all over again.
    Thinks ai result is fine, but also thinks human slop is fine -> thinks it feels odd but can’t explain it -> can actually point out what’s wrong.

    1. Hi Moni, thanks for your message. You’re right, since localization has always been a more or less unregulated field, countless sloppy translations have managed to slip under the radar for years, and the rise of MT/AI has only intensified the issue.

      Now that mediocrity is set to become the norm, people are less and less able to notice issues in their own native language, which is alarming to say the least. I firmly believe that the “flattening” of languages caused by the continued use of LLMs is one of the biggest threats we currently face, since it also affects the way we think and form opinions.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top