Testing AI Voice Translation
- Nadine Hegmanns

- Feb 5

AI tools can be very helpful and support us in our work. Many things are already possible, but some are not. AI interpreting in particular is not as mature and glitch-free as the Internet claims. We tested the AI language translation tool LiveVoice at the Best of Events trade fair and noticed that the AI does not follow the natural flow of speech (which is also true of other automated speech-to-speech translation tools).
Interviews and panel discussions are a particular challenge for AI. While human interpreters render the translation simultaneously, i.e. almost at the same time as the original speaker, AI first transcribes every single spoken word and translates it in writing before a computer-generated voice reads out the text. As you can imagine, all this takes time, which can cause delays of up to four sentences that disrupt the natural flow of conversation. When a panellist interjects or interrupts another speaker, the AI simply stops mid-sentence, leading to grammatical errors. It then tries to compensate for such deficits by accelerating its speech output. Moreover, AI is unable to reflect emotional nuances in spoken language: irony, wordplay, feelings such as joy, despair, indignation... all of this is lost, which may lead to misunderstandings and confusion among listeners.
Coincidentally, my (human) ability to pick up a certain mood or atmosphere in a room or on stage is often pointed out during my interpreting assignments. Participants will approach me and ask why I am “gesticulating wildly” in the booth at the back of the room, even though no one can actually see me. But that's just it! Interpreters don't just sit in the booth and read a text off a screen or teleprompter. We often use our hands and arms to give emphasis to a statement. Sometimes we even stand up when the speech on stage is gripping, or to give our voice more power. We deliberately vary our pitch, volume, and speaking speed to underline the message. That's the beauty of authentic communication. And you can actually hear that.





