Google expands “Audio Overviews” to 75 languages using Gemini-based audio production

NotebookLM’s “Audio Overviews” feature is now available in approximately 75 languages, including less commonly spoken ones such as Icelandic, Basque, and Latin.

The audio for each language is generated by AI agents using “metaprompting,” with the Gemini 2.5 Pro language model as the underlying system. At the same time, Google is moving to an audio production technology based entirely on Gemini’s multimodality, a development that does not bode well for providers focused exclusively on audio models.

As with AI-generated text, audio created by language models can contain inaccuracies. The risk is especially pronounced in AI-generated podcasts, where a large amount of audio may be produced from minimal source text and the conversion from text to dialogue significantly alters the original material.

Copyright © 2022 Inventrium Magazine