Google has announced that Gemini, now powered by the Lyria 3 audio model, lets users generate 30-second songs from text descriptions or modify existing audio files. The feature adds to Gemini's existing text, image, and video generation capabilities, and is also coming to YouTube's Dream Track feature, where it can create more sophisticated instrumental backing tracks for short videos.
Like other AI music generators, Gemini can produce passable audio from a simple description, with no detailed specifications required. One example from Google is a request for 'a funny R&B tune at a leisurely pace centered on a sock locating its counterpart.' Follow-up prompts let users fine-tune elements such as tempo or drumming style. Beyond text prompts, Lyria 3 can also generate music from uploaded photos or videos, and the resulting tracks can be paired with artwork produced by Google's Nano Banana image generator.
Google says Lyria 3 improves on earlier audio generation models by producing more realistic and complex music, giving users finer control over individual parts of a song, and writing lyrics automatically. For now, Gemini limits output to 30-second clips, though the company's demo video hints that longer tracks, or wider availability in tools such as Google Messages, could follow.
Like Gemini's other AI-generated output, Lyria 3 compositions carry Google's SynthID watermark so that they cannot easily be mistaken for human-made work. Google began rolling out its SynthID Detector, a tool for identifying AI-generated media, at Google I/O 2025. The sample tracks shared with the announcement are convincing, but their artificial origin may be apparent even without Google's verification software. The backing music in Gemini's samples is often impressive, while the lyrics written by Lyria 3 tend to come across as awkwardly sentimental or odd.
Anyone who wants to try Lyria 3 can start submitting prompts in Gemini now, provided they are at least 18 years old and speak English, Spanish, German, French, Hindi, Japanese, Korean, or Portuguese.