Artificial intelligence (AI), text to speech (TTS), and audio tours

Frits Polman, Nov 18, 2021

Artificial intelligence (AI) is a concept that lots of people will have heard of by now. Many well-known platforms such as LinkedIn, Facebook, and Just Eat already use AI systems which are self-learning and automatically improve. While this offers many advantages, it also has a rather scary side: the programmer no longer knows exactly what the system has learned. More on that later...

Text to speech

As you know, I’m a huge fan of technology – and therefore also of AI. I’ve been wondering for years how we can use AI on our platform for audio tours. We’ve found some wonderful examples, and one that I’ll be discussing today will soon be available to all our customers worldwide: text to speech (TTS).

All of us at Guide-ID have the same mission: to tell as many stories as possible worldwide. That’s why we’ve created the world’s easiest-to-use audio guide. We use audio to help people to discover more about what they see, without any unnecessary distractions. And that requires audio – a lot of audio.

No more time-consuming processes

This audio is usually recorded in professional studios and narrated by professional voice-over actors. While this produces fantastic audio, it’s also a laborious, time-consuming, and expensive process. This is a hurdle for many museums, and even the richer ones often choose to produce audio in one or two languages only. That means some visitors listen to audio that isn’t in their native language. Many museums can’t manage any production at all and so don’t offer audio tours. This situation is changing thanks to text to speech, which – as the name suggests – automatically converts text into speech.
From unbearable to pleasant to listen to

Until a year or two ago, the quality of TTS was unbearably bad: we’re all traumatised by those tinny, robotic voices. But then came a tipping point in AI. Google, Microsoft, and Amazon had gathered the critical mass of data that allowed their algorithms to improve much faster. From that moment on, the voices quickly became better – much better. So, we decided to run a test with real visitors and assess their experience. Our most recent TTS baptism of fire was in Amsterdam’s Nieuwe Kerk, where English-speaking visitors were given an automatically generated audio tour. And now the data...

A positive test result

Looking at recent years, we know that visitors who use English-language audio guides at Nieuwe Kerk listen to slightly fewer stops, on average, than Dutch-speaking listeners. This is based on audio tours with voices professionally recorded in a studio. Listeners of the English-language studio versions listen to 90% as many stops as listeners of the Dutch-language studio versions.

Visitors who listen to the TTS versions in English listen to just under 80% of the stops, again measured against visitors who listen to the studio versions in Dutch. So, it’s not a huge difference! What’s more, they listen to each stop for just as long as listeners of the Dutch studio versions do. Basically, visitors’ listening behaviour is almost the same for audio recorded in studios and TTS audio.
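
For anyone who wants to compare the two English versions directly, here is a rough back-of-the-envelope sketch. It assumes both percentages are measured against the same Dutch studio baseline, and it uses 0.80 as a stand-in for "just under 80%", so the result is an upper bound rather than a measured figure. The variable and function names are purely illustrative.

```python
# Stops listened to, relative to the Dutch studio baseline (= 1.00).
# ENGLISH_STUDIO is the 90% figure quoted above.
# ENGLISH_TTS is an assumed stand-in for "just under 80%".
DUTCH_STUDIO = 1.00
ENGLISH_STUDIO = 0.90
ENGLISH_TTS = 0.80


def relative_to(value: float, baseline: float) -> float:
    """Express one listening share as a fraction of another."""
    return value / baseline


# How does English TTS compare with English studio audio?
tts_vs_english_studio = relative_to(ENGLISH_TTS, ENGLISH_STUDIO)
print(f"English TTS vs English studio: {tts_vs_english_studio:.0%}")
```

Under these assumptions, English TTS listeners hear a little under nine-tenths as many stops as English studio listeners.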

Audio content for everyone

This is pretty significant, as it enables us to offer all museums low-cost audio content on the spot, in all languages, without weeks of production time. It also takes all the hassle out of making additions and adjustments. Simply change the text, and you’re done. Want to add temporary content for an exhibition? Ta-dah! The algorithm will still make some mistakes from time to time – after all, the programmer no longer knows what the system has learned (or not) – but these errors can be corrected manually. Are the voices all top-quality? No, as that requires more money and more patience. But for 99% of museums and their visitors, this is a fantastic development.
Machine translation: a matter of time

The next step is to add machine translations. The quality of machine translation has also skyrocketed over the last two years, and AI has played a part. We’re not quite there yet, but soon you’ll only need to enter texts in one language. Guide-ID will then automatically make sure that all your visitors can listen to the stories in their own language the next day. It’s our way of making stories accessible for everyone.

Request a demo (or experience it yourself!)

Want to experience it yourself? Drop by Nieuwe Kerk in Amsterdam or send me a personal message, and I’ll send you the demo!
