NVIDIA’s generative AI model FUGATTO combines music composition and voice transformation to create novel soundscapes

NVIDIA has unveiled a new generative AI model that takes prompts made of any combination of text and audio files and generates or transforms them into any sound or music. Compared with existing AI models, NVIDIA's model, known as FUGATTO (short for Foundational Generative Audio Transformer Opus 1), is one of the first to showcase emergent properties: the ability to follow open-ended instructions, such as a combination of text and audio prompts, and produce a unique musical output that the model was not directly trained to create.

(Source: NVIDIA Developer)

What sets FUGATTO apart from existing AI models is that none of them can both compose music and modify a voice. FUGATTO gives users artistic control over music creation, letting them combine genres and styles into unique compositions or change the accent and emotion of a voice. Music composers can experiment with different instruments, voices, and styles to create or edit songs and to enhance audio quality. Users can also render a voice in various accents and with a range of emotions, which gives the model potential applications in ad agencies, video games, language learning platforms, and similar settings. It can even create soundscapes it was never trained on, such as a trumpet that barks or a thunderstorm easing into dawn with birds chirping.

FUGATTO was trained on a massive dataset of audio samples. It uses 2.5 billion parameters and was trained on NVIDIA DGX systems packing 32 NVIDIA H100 Tensor Core GPUs. NVIDIA credits the collaboration of its development team, which comprises people from different parts of the world, with strengthening FUGATTO’s multi-accent and multilingual capabilities. During inference, the model uses a technique called ComposableART, which lets it combine instructions that it learned separately into novel combinations based on user prompts.

https://twitter.com/PoniakTimes/status/1863106683699913043

NVIDIA’s FUGATTO is a groundbreaking generative AI model with remarkable creative capabilities. By combining music composition and voice modification in a single platform, it enables users across industries to explore uncharted creative possibilities. Although NVIDIA has not yet confirmed its release plans, FUGATTO could mark a paradigm shift in how music and audio are conceived once it is released.