In March 2019, Neue Zürcher Zeitung (NZZ) launched a new public text-to-speech service, an improved version of the beta audio player it had released the previous October. The company shared some of the key lessons it learned during the process.
- Google WaveNet is not enough for the Swiss German language
NZZ used Google WaveNet to generate its audio files. Although the technology handles languages well (it currently speaks nine with natural-sounding quality), it was not robust enough for the complexities of Swiss German. To solve the problem, NZZ added a middleware layer equipped with a lexicon, and the text passes through it before being converted into audio.
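As a rough illustration of the lexicon idea, the sketch below rewrites known words before the text would be handed to a speech engine. It is not NZZ's implementation: the `LEXICON` entries, the respellings and the function names are all hypothetical, and a production system would likely need a far larger lexicon and possibly phonetic markup rather than plain respelling.

```python
import re

# Hypothetical lexicon (illustrative entries only, not NZZ's data):
# written form -> respelling a TTS engine is more likely to read as intended.
LEXICON = {
    "NZZ": "En Zett Zett",
    "Rösti": "Röschti",
}


def build_pattern(lexicon):
    """Compile one regex that matches any lexicon key as a whole word."""
    # Longest keys first, so a longer entry wins over a shorter one
    # that happens to be its prefix.
    keys = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(key) for key in keys)
    return re.compile(r"\b(" + alternatives + r")\b")


def apply_lexicon(text, lexicon=LEXICON):
    """Return text with every whole-word lexicon match replaced."""
    if not lexicon:
        return text
    pattern = build_pattern(lexicon)
    return pattern.sub(lambda match: lexicon[match.group(1)], text)


if __name__ == "__main__":
    # The result is what would be passed on to the speech engine.
    print(apply_lexicon("Die NZZ mag Rösti."))
```

Matching on whole words keeps compounds such as "Röstigraben" untouched unless they have their own entry, which is one reason a lexicon of this kind tends to grow over time.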
- Architecture must be mix-and-match friendly
In an industry where tools, needs and products keep changing, NZZ needed a service that could be adapted easily to new circumstances. Because the architecture was built to be mix-and-match, the team was able to move the service from Amazon Polly to Google WaveNet at short notice, which brought a dramatic improvement.
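One common way to get this kind of swappability is to hide each speech engine behind the same small interface, so that the rest of the service never depends on a particular vendor. The sketch below shows the pattern under stated assumptions: the class name, the backend names and the stub functions are hypothetical, and the stubs return placeholder bytes instead of calling Amazon Polly or Google WaveNet.

```python
from typing import Callable, Dict

# A backend is anything that turns text into audio bytes.
Synthesizer = Callable[[str], bytes]


def polly_stub(text: str) -> bytes:
    # Placeholder: a real backend would call the vendor's API here.
    return b"polly:" + text.encode("utf-8")


def wavenet_stub(text: str) -> bytes:
    # Placeholder: a real backend would call the vendor's API here.
    return b"wavenet:" + text.encode("utf-8")


class TextToSpeechService:
    """Routes synthesis requests to whichever backend is currently active."""

    def __init__(self, backends: Dict[str, Synthesizer], active: str):
        self._backends = dict(backends)
        self.switch(active)  # validates the initial choice as well

    def switch(self, name: str) -> None:
        """Select another registered backend; raise KeyError if unknown."""
        if name not in self._backends:
            raise KeyError(f"unknown TTS backend: {name}")
        self._active = name

    def synthesize(self, text: str) -> bytes:
        """Convert text to audio using the active backend."""
        return self._backends[self._active](text)


if __name__ == "__main__":
    service = TextToSpeechService(
        {"polly": polly_stub, "wavenet": wavenet_stub}, active="polly"
    )
    print(service.synthesize("Hallo"))
    service.switch("wavenet")  # callers of synthesize() do not change
    print(service.synthesize("Hallo"))
```

With this shape, changing vendors is a one-line configuration change for callers, and adding a new engine means writing one more function that matches the `Synthesizer` signature.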