Showing content from https://huggingface.com/google/magenta-realtime below:

google/magenta-realtime · Hugging Face

Model Card for Magenta RT

Authors: Google DeepMind

Resources:

Terms of Use

Magenta RealTime is offered under a combination of licenses: the codebase is licensed under Apache 2.0, and the model weights under Creative Commons Attribution 4.0 International. In addition, we specify the following usage terms:

Copyright 2025 Google LLC

Use these materials responsibly and do not generate content, including outputs, that infringe or violate the rights of others, including rights in copyrighted content.

Google claims no rights in outputs you generate using Magenta RealTime. You and your users are solely responsible for outputs and their subsequent uses.

Unless required by applicable law or agreed to in writing, all software and materials distributed here under the Apache 2.0 or CC-BY licenses are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses. You are solely responsible for determining the appropriateness of using, reproducing, modifying, performing, displaying or distributing the software and materials, and any outputs, and assume any and all risks associated with your use or distribution of any of the software and materials, and any outputs, and your exercise of rights and permissions under the licenses.

Model Details

Magenta RealTime is an open music generation model from Google built from the same research and technology used to create MusicFX DJ and Lyria RealTime. Magenta RealTime enables the continuous generation of musical audio steered by a text prompt, an audio example, or a weighted combination of multiple text prompts and/or audio examples. Its relatively small size makes it possible to deploy in environments with limited resources, including live performance settings or freely available Colab TPUs.
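The weighted combination of prompts described above can be pictured as a weighted average of style embeddings. The model card does not say how the combination is actually computed, so the sketch below is only an illustration: the function name `blend_embeddings` and the tiny 3-dimensional vectors are hypothetical stand-ins, not part of the Magenta RT API.

```python
def blend_embeddings(embeddings, weights):
    """Return the weighted average of equal-length embedding vectors.

    Assumption: styles are mixed by a normalized weighted average.
    The model card does not confirm this is how Magenta RT does it.
    """
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per embedding")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    return [
        sum(w * e[i] for e, w in zip(embeddings, weights)) / total
        for i in range(dim)
    ]


# Hypothetical embeddings standing in for two text or audio prompts.
funk = [1.0, 0.0, 0.0]
ambient = [0.0, 1.0, 0.0]

# Weight "funk" three times as heavily as "ambient".
style = blend_embeddings([funk, ambient], [3.0, 1.0])
print(style)  # [0.75, 0.25, 0.0]
```

Normalizing by the sum of the weights keeps the blended vector on the same scale as its inputs, so only the ratio between weights matters.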

System Components

Magenta RealTime is composed of three components: SpectroStream, MusicCoCa, and an LLM. A full technical report is forthcoming that will explain each component in more detail.

  1. SpectroStream is a discrete audio codec that converts stereo 48 kHz audio into tokens, building on the SoundStream RVQ codec from Zeghidour+ 21.
  2. MusicCoCa is a contrastive-trained model capable of embedding audio and text into a common embedding space, building on Yu+ 22 and Huang+ 22.
  3. An encoder-decoder Transformer LLM generates audio tokens given context audio tokens and a tokenized MusicCoCa embedding, building on the MusicLM method from Agostinelli+ 23.
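
To give a feel for what "RVQ" (residual vector quantization) means in the codec description above, here is a toy sketch of the general idea: each stage quantizes whatever error the previous stage left behind, so one vector becomes a short list of token indices. This is not SpectroStream's implementation. The codebooks, dimensions, and function names are invented for illustration, and a real codec learns its codebooks and operates on encoded audio frames.

```python
def nearest(codebook, vector):
    """Index of the codebook entry closest to `vector` (squared distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vector)),
    )


def rvq_encode(vector, codebooks):
    """Quantize `vector` into one token index per codebook stage."""
    residual = list(vector)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        # The next stage only sees what this stage failed to capture.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens


def rvq_decode(tokens, codebooks):
    """Reconstruct a vector by summing the selected entry of each stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical two-stage codebooks: a coarse stage and a finer correction.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.25, -0.25]],
]

tokens = rvq_encode([1.2, 0.8], codebooks)
print(tokens)                         # [1, 1]
print(rvq_decode(tokens, codebooks))  # [1.25, 0.75]
```

Adding more stages shrinks the reconstruction error at the cost of more tokens per frame, which is the usual trade-off in codecs of this family.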
Inputs and outputs

Uses

Music generation models, in particular ones targeted for continuous real-time generation and control, have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development.

Out-of-Scope Use

See our Terms of Use above for usage we consider out of scope.

Bias, Risks, and Limitations

Magenta RT supports the real-time generation and steering of instrumental music. The purpose and intention of this capability is to foster the development of new real-time, interactive co-creation workflows that seamlessly integrate with human-centered forms of musical creativity.

Every AI music generation model, including Magenta RT, carries a risk of impacting the economic and cultural landscape of music. We aim to mitigate these risks through the following avenues:

Known limitations

Coverage of broad musical styles. Magenta RT's training data primarily consists of Western instrumental music. As a consequence, Magenta RT has incomplete coverage of both vocal performance and the broader landscape of rich musical traditions worldwide. For real-time generation with broader style coverage, we refer users to our Lyria RealTime API.

Vocals. While the model is capable of generating non-lexical vocalizations and humming, it is not conditioned on lyrics and is unlikely to generate actual words. However, there remains some risk of generating explicit or culturally insensitive lyrical content.

Latency. Because the Magenta RT LLM operates on two-second chunks, user inputs for the style prompt may take two or more seconds to influence the musical output.
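
The latency arithmetic can be sketched with a simple timing model. Only the two-second chunk length comes from the model card. The assumptions that chunk boundaries fall on multiples of two seconds, and that some number of already-generated chunks (`buffered_chunks`, a hypothetical parameter) must play out before the new style is heard, are illustrative and not a description of the actual scheduler.

```python
import math

CHUNK_SECONDS = 2.0  # chunk length stated in the model card


def earliest_audible_time(change_time, buffered_chunks=1):
    """Playback time (seconds) at which a style change can first be heard.

    Assumptions (not from the model card): chunks start at multiples of
    CHUNK_SECONDS, and `buffered_chunks` chunks generated with the old
    style are still queued when the change arrives.
    """
    current_chunk = math.floor(change_time / CHUNK_SECONDS)
    first_new_chunk = current_chunk + 1 + buffered_chunks
    return first_new_chunk * CHUNK_SECONDS


change_time = 3.0
audible = earliest_audible_time(change_time)
print(audible)                # 6.0
print(audible - change_time)  # 3.0 seconds of control latency
```

Under this model with one buffered chunk, the delay always falls between two and four seconds, depending on where in the current chunk the change arrives.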

Limited context. Because the Magenta RT encoder has a maximum audio context window of ten seconds, the model cannot directly reference music it generated more than ten seconds earlier. While this context is sufficient for the model to create melodies, rhythms, and chord progressions, the model is not capable of automatically creating longer-term song structures.
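
Combining the ten-second context window with the two-second chunk length gives a rolling window of five chunks. The loop below is a minimal sketch of that bookkeeping, assuming the context is simply the most recent chunks. `generate_stream` and `fake_generate` are hypothetical placeholders and do not call the real model.

```python
from collections import deque

CONTEXT_SECONDS = 10  # context window stated in the model card
CHUNK_SECONDS = 2     # chunk length stated in the model card
MAX_CONTEXT_CHUNKS = CONTEXT_SECONDS // CHUNK_SECONDS  # 5 chunks


def generate_stream(num_chunks, generate_chunk):
    """Generate chunks one at a time, keeping only the newest as context."""
    context = deque(maxlen=MAX_CONTEXT_CHUNKS)  # oldest chunk drops off
    outputs = []
    for step in range(num_chunks):
        chunk = generate_chunk(list(context), step)
        outputs.append(chunk)
        context.append(chunk)
    return outputs, list(context)


def fake_generate(context, step):
    """Hypothetical stand-in for the model: returns a label, not audio."""
    return f"chunk{step}"


outputs, context = generate_stream(8, fake_generate)
print(len(outputs))  # 8
print(context)       # ['chunk3', 'chunk4', 'chunk5', 'chunk6', 'chunk7']
```

After eight chunks (16 seconds), the first three chunks are no longer visible to the generator, which is why structure spanning more than ten seconds cannot be referenced directly.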

Benefits

At the time of release, Magenta RealTime is the only open-weights model supporting real-time, continuous musical audio generation. It is designed specifically to enable live, interactive musical creation, bringing new capabilities to musical performances, art installations, video games, and many other applications.

How to Get Started with the Model

See our Colab demo and GitHub repository for usage examples.

Training Details

Training Data

Magenta RealTime was trained on ~190k hours of stock music from multiple sources, mostly instrumental.

Hardware

Magenta RealTime was trained using Tensor Processing Unit (TPU) hardware (TPUv6e / Trillium).

Software

Training was done using JAX and T5X, utilizing SeqIO for data pipelines. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models.

Evaluation

Model evaluation metrics and results will be shared in our forthcoming technical report.

Citation

A technical report is forthcoming. For now, please cite our blog post.

BibTeX:

@article{magenta_rt,
    title={Magenta RealTime},
    url={https://g.co/magenta/rt},
    publisher={Google DeepMind},
    author={Lyria Team},
    year={2025}
}
