Meta Platforms, the parent company of Facebook and Instagram, is joining the growing field of AI music generation.
On Tuesday, June 18, Meta’s AI research division unveiled its latest development, JASCO, a tool that transforms chords or beats into full musical tracks.
Developed by Meta’s Fundamental AI Research (FAIR) team, JASCO stands for “Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation.” The tool is designed to give creators greater control over AI-generated music. According to Meta, its output quality is comparable to that of other AI tools, while its controls are significantly more versatile.
To demonstrate JASCO's capabilities, Meta has published a collection of music clips where simple public-domain melodies are transformed into various musical genres. For example, Maurice Ravel’s Bolero is converted into both an '80s driving pop song and a folk tune featuring accordion and acoustic guitar. Similarly, Tchaikovsky’s Swan Lake is reimagined as a traditional Chinese track with guzheng, percussion, and bamboo flute, and as an R&B track with deep bass, electronic drums, and lead trumpet.
Meta has been proactive in sharing its AI research with the public. Alongside JASCO, the company has released a research paper detailing the tool’s development. It also plans to release the inference code under an MIT license and the pre-trained JASCO model under a Creative Commons license later this month, which will allow other AI developers to use and build upon the model.
“As innovation in the field continues to move at a rapid pace, we believe that collaboration with the global AI community is more important than ever,” Meta FAIR stated in a blog post.
This announcement follows last year’s release of MusicGen, a text-to-audio generator capable of creating 12-second tracks from simple text prompts. MusicGen was trained on 20,000 hours of music licensed by Meta, as well as 390,000 instrument-only tracks from Shutterstock and Pond5. It can also use melodies as input, making it one of the first AI tools capable of turning a melody into a fully developed song.
Meta's introduction of JASCO comes amid several recent advancements in the AI music space. On the same day JASCO was unveiled, Google’s AI lab DeepMind introduced a new video-to-audio (V2A) tool that creates soundtracks for videos based on text prompts or the video content itself. This tool is a step towards creating fully AI-generated video content, as most AI video generators currently produce silent videos.
Additionally, last week, Stability AI, known for its AI art generator Stable Diffusion, released Stable Audio Open, a free, open-source model for creating audio clips up to 47 seconds long. While not designed for complete song creation, the model can be fine-tuned on users’ own audio data. A drummer, for example, could fine-tune it on their own recordings to generate new drum beats.
These AI tools differ from platforms like Udio and Suno, which generate entire tracks from text prompts. Such platforms, trained on large datasets, have raised concerns within the music industry over potential unauthorized use of copyrighted music for training purposes.