How Artificial Speech Can Revolutionize the Gaming Industry

Text to speech has some surprising uses in the gaming industry. More and more games are incorporating speech synthesis to give voices to their stories and characters, as well as to improve accessibility for players with visual impairments. In the future, we could see games with thousands upon thousands of lines of dialogue, all generated on demand, saving huge amounts of development time and cost.

Artificial Speech

How Does it Happen in Gaming

Speech synthesis is the process of artificially producing human speech. It’s done either by storing a large number of recorded sounds in a database and stitching them together, or by using a computer-made voice generator that produces all of the sound artificially. In both cases, software first analyzes the text and figures out which sounds correspond to its letters and symbols. Applied to video games, this lets developers automate much of the voice recording process and create as many voiced characters as they want. Another advantage is that they can make quick changes without sinking huge amounts of time and money into bringing actors back. Smaller and indie game devs, who normally wouldn't have the resources to fully voice their games, can also use this technology to bring their worlds to life.
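
The database approach described above can be illustrated with a toy sketch. This is not how any particular game engine or speech product works; the names `UNIT_DATABASE`, `text_to_units`, and `synthesize` are made up for illustration, and each "recording" is just a short list of numbers standing in for audio samples. Real systems work with phonemes and far larger recordings rather than single letters.

```python
# Hypothetical unit database: each text unit maps to a stored "recording".
# The numbers are placeholders for audio samples, not real audio.
UNIT_DATABASE = {
    "h": [0.1, 0.2],
    "e": [0.3],
    "l": [0.4, 0.4],
    "o": [0.5, 0.6],
    " ": [0.0],
}


def text_to_units(text):
    """Text analysis step: normalize the text and keep only known units."""
    return [ch for ch in text.lower() if ch in UNIT_DATABASE]


def synthesize(text):
    """Concatenate the stored recording for each unit into one sample list."""
    samples = []
    for unit in text_to_units(text):
        samples.extend(UNIT_DATABASE[unit])
    return samples


if __name__ == "__main__":
    print(synthesize("Hello"))
```

The two functions mirror the two stages in the description: first work out which sounds the text calls for, then assemble the output from stored pieces.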

How Would it Benefit Developers

While it might be commonplace in this day and age, voice acting in video games is a huge process that involves tons of logistics, planning, and scheduling. For example, a large game like The Witcher 3: Wild Hunt has over 450,000 words of dialogue and over 950 speaking roles. Numbers like these only grow when you take into account downloadable content that adds new dialogue and, especially, dubs in different languages. Games keep getting larger, so text to speech can be an immensely powerful resource that gives developers workarounds to common hurdles. For example, a last-second story change would require the studio to write the new dialogue, bring in the voice actors whenever their schedules allow, implement the new recordings in the game, and test for bugs, all within a sharply reduced timeframe. The potential for speech synthesis in instances like these can’t be overstated, as it could remove the recording and scheduling steps almost entirely.
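
One way to picture the last-second change scenario is a build step that re-synthesizes only the lines whose text has changed. The sketch below is a minimal illustration under stated assumptions: the script format, the line IDs, and the function names `line_fingerprint` and `lines_to_regenerate` are all hypothetical, and the actual call to a text-to-speech engine is deliberately left out.

```python
import hashlib


def line_fingerprint(character, text):
    """Hash the speaker and text so any edit to a line changes its fingerprint."""
    payload = f"{character}\x00{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def lines_to_regenerate(script, cache):
    """Return IDs of lines whose audio is missing or out of date.

    script: dict mapping line ID -> (character, text)   (hypothetical format)
    cache:  dict mapping line ID -> fingerprint of the last synthesized version
    """
    stale = []
    for line_id, (character, text) in script.items():
        if cache.get(line_id) != line_fingerprint(character, text):
            stale.append(line_id)
    return stale


if __name__ == "__main__":
    script = {
        "q1_intro": ("Innkeeper", "Welcome, traveler."),
        "q1_outro": ("Innkeeper", "Safe roads."),
    }
    # Pretend every line was synthesized once already.
    cache = {k: line_fingerprint(*v) for k, v in script.items()}
    # A writer rewrites one line at the last second.
    script["q1_intro"] = ("Innkeeper", "Welcome back, traveler.")
    print(lines_to_regenerate(script, cache))
```

Only the rewritten line is flagged, so a late story change would cost one synthesis call rather than a new recording session.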

By implementing text to speech in games, a developer can create hours of voiced dialogue with unique, natural-sounding characters in a fraction of the time a recording session would take. As text to speech technology progresses, these voices will sound even more realistic and pleasing to players. While generated voices are nowhere near the point where they’ll outright replace human actors, they can be an efficient tool for filling in gaps and saving the costs and resources that would otherwise be spent on even the most minor changes. It’s possible that in the future, we’ll see games with rich stories and numerous unique characters populating their worlds, with thousands of voices generated in a manner almost indistinguishable from human performances.