Data, Licensing, and the Future of Music AI Post-Fair Use
A conversation with Alex Bestall, the founder and CEO of Rightsify and Global Copyright Exchange (GCX), two companies at the forefront of the AI music revolution.
Music licensing is the bedrock of the music industry. It's how music is monetized and how the industry and artists pay their bills. That's why the emergence of AI platforms that put their faith in supposed fair use loopholes to avoid paying for licenses presents an existential threat to the music industry. This has led to the high-profile lawsuits we are seeing today.
So why not just license the music?
Enter Alex Bestall, the founder and CEO of Rightsify and Global Copyright Exchange (GCX), who is pioneering a solution that demonstrates that AI music development and proper licensing are not mutually exclusive. In fact, they can work hand in hand to create a more robust and fair ecosystem for all stakeholders.
Rightsify presents a trailblazing approach to AI music development through their licensable dataset. Instead of relying on questionable fair use claims, Rightsify has built one of the most comprehensive and well-annotated music datasets in the industry. This dataset, carefully curated over years of global operations, forms the foundation for legitimate AI music training and generation.
In this interview, we'll explore how Rightsify's licensable dataset is being leveraged by AI companies for various applications. We'll also explore the challenges of building such a dataset, the importance of annotation, and the role of the Dataset Providers Alliance in setting industry standards.
Thanks for taking the time to talk today. Let's start with a bit of background about yourself. What made you want to get into the Music AI space, and what experiences led you to founding Rightsify?
I've been involved in music my whole career, starting from live music to launching an independent label, and for the past 11 years, it’s been all about Rightsify. Rightsify started as a run-of-the-mill sync licensing library, but over time we evolved into licensing of all kinds, with a focus on direct licensing and now AI.
We've been operating globally for a while, with offices in Asia, handling numerous hotels, airports and airlines. This global presence allowed us to build a vast catalogue. Without it, we wouldn't have accumulated as much data as we have today.
Hotels and gyms require 24-hour playlists that evolve throughout the day, which demands an extensive music library. That need led us to build the catalogue that forms the foundation of our current operations.
We didn’t have a grand strategy from the beginning; it evolved over time. That’s why I emphasise the importance of maintaining good records and data accuracy. For us, that's where the true value lies beyond the music: in having clear and accurate data.
Can you elaborate on how users are leveraging Rightsify, and could you discuss your involvement or connection with the Dataset Providers Alliance?
Yes, it’s been pretty nonstop for us since January 2023 when we first launched GCX, our data licensing service. After some AI content creation tools launched in late 2022 (ChatGPT, Midjourney, Runway, etc.), we saw the potential disruption for music as well. We started researching how datasets are structured and what a music dataset looks like. It turned out we had accidentally built the most comprehensive music dataset through years of manually tagging and annotating our catalogue. In addition to licensing datasets, we also trained two of our own models, Hydra I and Hydra II, exclusively on the Rightsify-owned catalogue.
We can't mention any customer names yet, but we plan to announce several in the next few months. Our customers range widely. For example, stem separation is one use case. Music recommendation algorithms, sample generators, music generators, and text-to-video generators that need music to adapt to those videos are also significant use cases. The most exciting use case is likely text-to-video with music matched to the video. Instead of someone typing, "Make me a jazz song at 80 bpm with saxophone," it’s more like, "Here’s a video; match a song to this."
The Dataset Providers Alliance (DPA) was an idea we had towards the end of 2023. At the time, there weren’t many companies actively licensing datasets, not just in music but in all modalities. Most of the deals were “news outlet does deal with X” or “stock marketplace does deal with Y.” Six months later, we noticed more startups entering the space and traditional rights holders getting more active in the data licensing marketplace. We approached a few companies that we knew were active and aligned with us on building a sustainable marketplace, so we went ahead and launched the DPA in June. The DPA’s mission revolves around setting standards for dataset licensing and giving rights holders a voice in what we see as the largest new content distribution segment in over a decade.
As I mentioned earlier, the idea started last year, but it was very fragmented and still is to some extent. It was about organising the community. I was familiar with companies doing image, video, and text datasets, and we were all aligned on the principles. We are releasing a white paper in the next few weeks that outlines these principles.
Obviously, people who scrape data and resell it can't be DPA members. All the data has to be either licensed or wholly owned. Getting opt-ins is also really important for the DPA. For example, if you're a marketplace with a lot of contributors and you just go and license that whole dataset to a company without contributors opting in, we disagree with that. We think creators should have the opt-in. In our experience, instrumental musicians are very open to it, but a lot of vocalists are not. So having that consent is really important.
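As a rough illustration of the two conditions described in this answer (data must be licensed or wholly owned, and contributors must have opted in), here is a minimal Python sketch. The `Track` record, the `rights` labels, and the `licensable_for_training` helper are hypothetical names invented for this example; they do not describe any actual DPA or Rightsify system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """Hypothetical catalogue entry; field names are assumptions for illustration."""
    track_id: str
    contributor: str
    rights: str      # assumed labels: "owned", "licensed", or "scraped"
    opted_in: bool   # True if the contributor consented to AI-training use


def licensable_for_training(tracks):
    """Keep only tracks that are owned or licensed AND have contributor opt-in."""
    return [t for t in tracks if t.rights in {"owned", "licensed"} and t.opted_in]


catalogue = [
    Track("t1", "instrumentalist_a", "owned", True),
    Track("t2", "vocalist_b", "licensed", False),   # no opt-in, so excluded
    Track("t3", "unknown", "scraped", True),        # scraped, so excluded
]

eligible = licensable_for_training(catalogue)
print([t.track_id for t in eligible])
```

Only `t1` passes both checks here; the other two entries fail on consent and on provenance respectively.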
As a pioneer in AI Music Licensing, what are some of the challenges you've encountered in this emerging field?
Given the nature of AI models needing large volumes of high-quality data, we do a lot of the hard work of manually labelling and annotating all our music datasets. This ensures that all the tracks have clean stems, accurate key and tempo data, chords, moods, and more. As of today, there aren’t any accurate music tagging solutions for large-scale music datasets, so we still do this manually to maintain the integrity of the music data.
Regarding the types of data, this is constantly changing as new models become more sophisticated and existing models get fine-tuned. We are continually updating our taxonomy and datasets to include new pieces of data.
Dealing with a lot of international music, such as Indian and Thai music with different key scales, is a whole different challenge. Annotating these types of music and their instrument taxonomy is complex, so having a diverse and global team has been very helpful.
We have a full-time team of 24, and with contractors, it’s closer to 100. It’s a lot of people’s job to listen to music and annotate it. This is a whole different way of operating compared to our past focus on creating music in our core genres. Now, it’s about making music in every genre and capturing as much detail as possible on every song.
What AI companies want is human data, not AI-generated data. They want it to be ground truth.
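To make the annotation fields mentioned in this answer more concrete (clean stems, key, tempo, chords, moods), here is a minimal Python sketch of what a per-track annotation record and a completeness check could look like. The schema, field names, and file path are assumptions made for illustration only; Rightsify's actual taxonomy is not described in this interview.

```python
from dataclasses import dataclass, field


@dataclass
class TrackAnnotation:
    """Hypothetical per-track annotation record (not Rightsify's real schema)."""
    track_id: str
    key: str
    tempo_bpm: float
    chords: list[str] = field(default_factory=list)
    moods: list[str] = field(default_factory=list)
    stems: dict[str, str] = field(default_factory=dict)  # stem name -> file path
    annotated_by_human: bool = True  # ground-truth labels come from human annotators


def missing_fields(annotation: TrackAnnotation) -> list[str]:
    """Return the names of fields that are empty or implausible."""
    problems = []
    if not annotation.key:
        problems.append("key")
    if annotation.tempo_bpm <= 0:
        problems.append("tempo_bpm")
    for name in ("chords", "moods", "stems"):
        if not getattr(annotation, name):
            problems.append(name)
    return problems


complete = TrackAnnotation(
    track_id="demo-001",
    key="F major",
    tempo_bpm=80.0,
    chords=["Fmaj7", "Gm7", "C7"],
    moods=["relaxed"],
    stems={"saxophone": "stems/demo-001/saxophone.wav"},  # placeholder path
)
incomplete = TrackAnnotation(track_id="demo-002", key="", tempo_bpm=0.0)

print(missing_fields(complete))
print(missing_fields(incomplete))
```

A check like this only catches missing values; it says nothing about whether a label is correct, which is why the manual listening described above still matters.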
In your experience, what does a typical AI music licensing agreement look like currently, and how do you anticipate these agreements might evolve as more prominent rights holders, such as major labels, become involved in the space?
I can’t go into too many details due to NDAs, but there is now a framework for how AI music licensing deals are structured, largely depending on the use case. For example, a stem separation or music recommendation use case will typically be a flat fee, while a generative model may be either a flat fee, revenue share or a hybrid, depending on the company.
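The three deal shapes named here (flat fee, revenue share, hybrid) can be sketched as a small Python function. This is an illustration under assumptions: the function name, the parameters, all the figures, and the reading of "hybrid" as a flat fee plus a revenue share are hypothetical, since the actual terms are covered by NDAs.

```python
def licensing_fee(structure, flat_fee=0.0, revenue=0.0, share_rate=0.0):
    """Compute a licence payment under one of three assumed deal structures.

    structure:  "flat", "revenue_share", or "hybrid" (hypothetical labels)
    flat_fee:   fixed payment agreed up front
    revenue:    licensee revenue attributable to the licensed use
    share_rate: fraction of that revenue paid to the rights holder (0.0 to 1.0)
    """
    if not 0.0 <= share_rate <= 1.0:
        raise ValueError("share_rate must be between 0 and 1")
    if structure == "flat":
        return float(flat_fee)
    if structure == "revenue_share":
        return revenue * share_rate
    if structure == "hybrid":
        # Assumption: hybrid means a flat fee plus a share of revenue.
        return flat_fee + revenue * share_rate
    raise ValueError(f"unknown deal structure: {structure!r}")


# Made-up figures, purely to show how the three structures differ.
print(licensing_fee("flat", flat_fee=10_000))
print(licensing_fee("revenue_share", revenue=200_000, share_rate=0.25))
print(licensing_fee("hybrid", flat_fee=5_000, revenue=200_000, share_rate=0.25))
```

With these invented numbers the three calls give 10,000, 50,000, and 55,000 respectively, which shows why the choice of structure matters more for a generative model with uncertain revenue than for a fixed-scope use case.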
Are there any accepted "truths" about AI's role in music that you disagree with based on your discussions with rights holders? What's your contrasting viewpoint and evidence?
A year ago, people thought we were crazy for willingly licensing our catalogue for AI training, but now rights holders are much more interested in it. From the start, we saw AI as both a threat and an opportunity. We leaned heavily into the opportunity side, and it has worked out well. I think rights holders uniformly agree that their content should not be used for free and then commercialised without consent, though some rights holders disagree that AI music should exist at all. We take the view that AI tools (of all modalities) will be widely accessible and adopted everywhere, so it’s better to have a seat at the table rather than sit on the sidelines.
Artistry is such a human-led endeavour, and many people disagree with the idea of a computer doing everything from start to finish. For many music producers, AI tools are more assistive than anything else. Our view is that the technology is here, it’s going to grow, and more people will use it. We can't rebel from the sidelines and sit it out. For us, it's really about how to survive long-term as a business. It’s also very exciting because it's a whole new distribution method, and that's how we see it.
Many people think AI will either entirely replace traditional methods or not go anywhere at all. I believe this is just a new distribution method, potentially bigger than streaming.
How do you see the future of AI in the music industry evolving in the next 5-10 years? Where do you envision Rightsify in that future?
This is a hard one to predict as things in AI change every quarter! But to take a macro view of the future, I would say this: With regard to music production, there will likely be AI models integrated into every web tool, app, and software that musicians use to create music. It is not hard to imagine that all competitive DAWs and VSTs will have one or more machine learning models under the hood. For consumers, in addition to text-to-music generators, I can see adaptive and personalised AI music streaming services taking off, which would certainly threaten the established streaming model. Then for business use, there is no question that AI is a huge disruption event for sync licensing and UGC. Brands and creators everywhere will be able to create their own music anytime and at scale.
For Rightsify, we have positioned ourselves solidly at the data layer of AI, and we are always creating new data (music). As use cases evolve, we will be there with the largest and highest quality datasets available for clean licensing.
Can you share anything about the next thing you are shipping or working on?
Unfortunately, I can't disclose much beyond annotating large datasets and some ML research.
What advice would you give to aspiring artists & entrepreneurs looking to enter the AI or music tech space?
There are many ways to enter the space, but for rights holders and artists, it’s all about having clean data with clear ownership. Ensuring your catalogue is organised and well-annotated is a must if you are interested in licensing data or fine-tuning a model on your own music.
Good record-keeping is something you have to do persistently. It’s boring grunt work, but it has value. Keeping accurate records is very important.
In terms of opportunities in Music AI, it's hard to say because it's changing so quickly. For example, our Hydra model is mostly being used by businesses and producers. Producers are generating samples, and businesses are generating background music.
So, there’s a dual use case there, but it’s hard to predict. Tagging still has a long way to go. Some aspects, like genre and tempo, can be done, but instrument detection is a huge problem that nobody has solved yet. Chord detection is another challenge.
Is there anything else you'd like to share about Rightsify or your personal journey that we haven’t covered?
The main thing in my experience is to always be open to evolving. Eleven years ago we started in sync, nine years ago we moved into background music licensing, and now we are an AI data company. If history has shown us anything, it’s that business models of the past may not be viable forever, so keeping an open mind and embracing change has been paramount to our success.
Find out more about Rightsify, GCX & Dataset Providers Alliance (DPA)
Great to see a "Fairly Trained"-certified company like Rightsify get attention for their initiatives towards ethical AI!