Written by Eamonn Forde, December 23, 2020
A Metadata with Destiny: The Next Steps for Digital Music Information
From smart speakers and livestreaming to discovery algorithms and contextual playlists, Eamonn Forde examines how the use of metadata is evolving with new technologies and methods of music consumption.
There is a common industry expression that “data is the new oil”: the vast amounts of information around consumption turn those who can both interpret it and act on it into the most powerful people in music.
That expression usually frames data from the perspective of the end user; but for digital music, the whole system depends utterly on clean, precise and broad metadata being created around songs, covering every eventuality and use context. This metadata is wrapped around digital files and ingested into DSPs before the user can even hear a song. If the metadata is incomplete at the ingestion stage, everything that follows is compromised.
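To make the ingestion point concrete, here is a minimal sketch of the kind of completeness check a DSP might run on an incoming record. The field names and rules are illustrative assumptions, not any real DSP’s or DDEX’s schema:

```python
# A minimal sketch of an ingestion-time completeness check.
# Field names and rules are illustrative, not any real DSP or DDEX schema.
REQUIRED_FIELDS = {"isrc", "title", "artist", "label", "duration_ms", "release_date"}

def validate_track(track: dict) -> list[str]:
    """Return a list of problems; an empty list means the record can be ingested."""
    problems = [f"missing field: {name}" for name in sorted(REQUIRED_FIELDS - track.keys())]
    if "isrc" in track and not str(track["isrc"]).strip():
        problems.append("empty ISRC: downstream matching and royalty routing would break")
    return problems

incoming = {"title": "Feel It Still", "artist": "Portugal. The Man", "label": "Atlantic"}
print(validate_track(incoming))  # flags the missing ISRC, duration and release date
```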
Metadata in 2020 is expected to cover far more ground than it was even five or 10 years ago. As new means of consumption arrive and become normalised, metadata has to not only accommodate them but also anticipate the changes coming down the line. It is a moving target, and metadata has to hit the bullseye every time.
If everyone – from content owners to digital platforms – is to sing from the same metadata hymn sheet, consensus is essential. DDEX (Digital Data Exchange) was founded in 2006 to help set international standards for everyone in the digital supply chain, with the end goal of harmonising how this information is exchanged. In October 2019, it launched its latest standard, Media Enrichment & Description (or MEAD for short).
The standard grew out of discussions with DSPs, labels, collection societies and other third-party organisations operating in this space about expanding the categories of data that could be communicated.
“MEAD will allow for over 30 types of additional data such as lyrics, reviews, historic chart positions, and focus track information to be communicated through the supply chain, which will support new service options and marketing opportunities,” it said at the time.
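MEAD itself is defined as an XML message standard; purely to illustrate the kinds of enrichment it is described as carrying, here is a hypothetical record expressed in Python. The field names and values are invented for illustration, not the actual MEAD schema:

```python
# Hypothetical illustration of MEAD-style enrichment data accompanying a
# recording. Field names and values are invented; the real MEAD standard
# defines its own XML schema via DDEX.
enrichment = {
    "isrc": "USABC1700001",  # dummy identifier
    "lyrics": "…",           # full lyric text would travel here
    "reviews": [{"outlet": "Example Weekly", "quote": "A slow-burn hit."}],
    "chart_history": [{"chart": "Example Top 40", "peak": 4, "year": 2017}],
    "focus_track": True,     # flagged as a marketing priority for the release
}
```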
Mark Isherwood, a member of the DDEX secretariat, says that even though it was published over a year ago, the organisation is “still waiting for some implementations” of MEAD. There is, he says, an anticipated lag here as content owners, DSPs and others recalibrate their systems to accommodate it.
“Usually it’s about 18 months between us publishing something and you actually seeing it being used by a reasonable contingent of companies,” he explains. “Unfortunately DDEX itself can’t do anything about that because implementation has to be done on a bilateral basis. Both parties in a bilateral desire to implement it have to get the implementation on their roadmaps.”
“Usually it’s about 18 months between us publishing something and you actually seeing it being used by a reasonable contingent of companies.” – Mark Isherwood, DDEX
Niels Rump, also a member of the DDEX secretariat, says the MEAD framework was defined after a brainstorming session in Toronto, with about 100 stakeholders from around the industry in the room contributing. From there the framework was refined. “Then it’s a chicken and egg problem between those people who provide the metadata and those who want to ingest the metadata,” he says. “If you only have one then it’s pretty useless; you need to have both. But we do have what I would call soft commitment there.”
While there is soft commitment from copyright owners and DSPs – and a general desire to have a system in place that ultimately makes everyone’s life easier – there are still some problems that need to be addressed here.
Matthew Adell, the CEO and co-founder of OnNow.tv, a discovery platform for livestreams of all types, says the world of metadata is “massively improved” today, but notes that anomalies still exist in the system.
“A great example is Universal, which is filled with a tonne of smart people,” he says. “But they can’t consistently decide how they want to spell ‘Tupac Shakur’. It has less now to do with ‘do we have standards?’ and more to do with shitty data going in and shitty data going out.”
He says the major labels and publishers – as well as a number of big indie labels and publishers – treat this with the gravitas it deserves; but smaller companies (perhaps because they don’t have the time or resources to attend to it in intricate detail) sometimes lag behind here.
“I think it’s just a matter of time and prioritisation,” he says. “With the smaller providers, it has a lot to do with, I think, the artists and the creators not really understanding why they need to get the metadata correct and, as a result, not [getting it right].”
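The Tupac example points at a fixable class of problem. As a rough sketch (the alias table here is invented for illustration), a normalisation layer can fold incoming spellings onto one canonical artist record:

```python
# Sketch of alias normalisation: folding the spellings that arrive at
# ingestion onto one canonical artist name. The alias table is invented.
ALIASES = {
    "2pac": "Tupac Shakur",
    "tupac": "Tupac Shakur",
    "tupac shakur": "Tupac Shakur",
}

def canonical_artist(name: str) -> str:
    """Collapse case and whitespace, then look up a canonical name; pass unknowns through."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, name)

print(canonical_artist("TUPAC  Shakur"))  # -> "Tupac Shakur"
```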
Taylor Kirch, manager of content analysis systems and operations at SiriusXM and Pandora, feels that MEAD is an important development here and a sign of how the world of music metadata is evolving. “Everyone’s getting on board with it,” she says. “We are equalising the data offering across services, which is super important.”
Rump suggests that the MEAD format has been agreed upon by the assorted partners and it is now a waiting game as the standard slots into the development cycle of each participant. This could start to happen towards the end of 2020, but he suggests it will more realistically be early 2021.
Isherwood adds that everyone involved in the discussions and scoping here has a collective will to put standards in place.
“There is a desire amongst the people who attend the meetings to get this done,” he says. “All of the people that attend these meetings are basically operational or IT people – so they don’t have any of the commercial baggage that the lawyers or the commercial people would have if you try to get them all in a room. They just want to solve a problem and make their lives easier.
“Although it takes some coordinating and cajoling – and occasionally a bit of arm twisting – basically the will is there amongst the attendees to actually get the job done. It’s all done in a very positive and collegial way.”
Once major rightsholders and services start to implement the new standards, other parties operating in this space will have to bend to the new systems and approaches to metadata and fall in line. “There is definitely a network effect,” says Isherwood of how this spreads.
While we can talk about metadata in a broad sense – the need to tag files properly so that all the information is correct and payment systems are not compromised – there is effectively a two-tier system at play here: there is the metadata – ideally standardised – that rights owners provide at the ingestion stage; and then there is the bespoke metadata that the services themselves layer on top of that to make their discovery and user experiences unique.
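A rough sketch of that two-tier model, with invented field names: tier one is the standardised record every service receives, tier two is the layer each DSP builds on top of it:

```python
# Sketch of the two-tier model: standardised ingestion metadata common to
# all services, plus a bespoke service-side layer. Names are illustrative.
from dataclasses import dataclass, field

@dataclass
class IngestedTrack:
    """Tier one: supplied by rights owners, the same for every DSP."""
    isrc: str
    title: str
    artist: str

@dataclass
class ServiceTrack:
    """Tier two: one DSP's own discovery layer on top of the common record."""
    core: IngestedTrack
    sub_genres: list[str] = field(default_factory=list)
    moods: list[str] = field(default_factory=list)

track = ServiceTrack(
    core=IngestedTrack("USABC1700001", "Feel It Still", "Portugal. The Man"),
    sub_genres=["indie pop"],
    moods=["upbeat"],
)
```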
“When it comes to utilising external metadata sources as DSPs, we’re taking in all of this external data and we’re figuring out how to leverage that for our algorithm,” says Kirch of how Pandora operates here. “What it really comes down to for us is about understanding more about that content.
“We call it ‘content understanding’. Increasing the content understanding means expanding the possibilities of our listener experience. It allows us to make connections across otherwise disparate aspects of our catalogues.”
“When it comes to utilising external metadata sources as DSPs, we’re taking in all of this external data and we’re figuring out how to leverage that for our algorithm.” – Taylor Kirch, SiriusXM / Pandora
She continues, “Being able to make those connections across our catalogue means that we can get new music or different music in front of our listeners at any given time. It’s about finding that right song for the right listener in the right moment. ‘The next song matters’ is what we always say. All inputs and all signals we have to generate what that next song [will be] is super important for DSPs.”
Adell was formerly CEO of Beatport and explains why adding its own extra layer of service-centric metadata was essential to give the DSP a unique position in the market. He also talks about how artificial intelligence (AI) technology can help elevate what is happening here.
“Beatport has its own unique genre and sub-genre system,” he says. “Beatport had an administrative and technology system in place whereby, after music came in, additional unique metadata was added to it so that the content could flow exceptionally well through Beatport’s unique system. I think that AI is going to be helpful in the future for DSPs in getting that done. In fact I think unique metadata at the DSP level is really the only thing DSPs can compete on in terms of uniqueness.”
“I think unique metadata at the DSP level is really the only thing DSPs can compete on in terms of uniqueness.” – Matthew Adell, OnNow.tv
He feels, however, that moves by DDEX or other bodies to start including more of this ‘contextual’ metadata at the ingestion stage are perhaps not such a good idea, as they could lead to a homogenisation at the DSP level.
“That further commoditises the market and makes all DSPs the same,” he argues. “I’m a big fan of DSPs doing that contextual work themselves to try to create a unique experience. Otherwise if I am listening to, for example, Portugal. The Man’s ‘Feel It Still’ and every single DSP recommends the same track after that because they all have the same metadata, that’s a further commodifying of the experience that I think damages the market.”
Naturally this is something Pandora feels strongly about, having been an early pioneer here with its Music Genome Project. It launched at the very start of 2000 and sees musicologists draw from 450 different musical attributes when ranking and categorising songs.
“It is that secret sauce that we’re capturing internally,” says Kirch. “We have a team of musicologists and we have a team of content metadata analysts. They’re constantly looking through the collection and figuring out new ways to describe the content – whether it’s finding synonyms, artist aliases or different spellings of the same terms.
“Those are just some of the ways that we’re trying to fill the gaps and really leverage our in-house experts to understand the content.”
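Pandora does not publish the Genome’s internals, but the general technique – score every track against a fixed set of attributes and recommend nearest neighbours – can be sketched with a toy example (three invented attributes stand in for the real 450):

```python
# Toy sketch of attribute-based similarity in the Music Genome style: each
# track is scored on a fixed attribute set (three invented dimensions here,
# standing in for ~450), and recommendations come from vector distance.
from math import dist

tracks = {
    "Track A": [0.9, 0.2, 0.7],  # e.g. [vocal intensity, acousticness, tempo feel]
    "Track B": [0.8, 0.3, 0.6],
    "Track C": [0.1, 0.9, 0.2],
}

def most_similar(seed: str) -> str:
    """Return the closest other track by Euclidean distance over attribute scores."""
    return min((t for t in tracks if t != seed), key=lambda t: dist(tracks[seed], tracks[t]))

print(most_similar("Track A"))  # -> "Track B"
```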
This proprietary metadata is essential to make services different at the user level while still adhering to the same metadata provided by the labels and publishers. If all services have essentially the same catalogue, and if all subscriptions cost roughly the same each month, it is the creation and deployment of proprietary metadata that allows each service to do something different in the market.
Metadata for music is far from a fixed entity. There are the basics – song title, running time, writer(s), label, publisher – but as the means of delivery and consumption change, so too must the metadata around music.
2020 has seen livestreaming – previously a niche activity that drew in small but dedicated audiences – go mainstream in the absence of real-world concerts. The huge success of acts like BTS, Niall Horan and Dua Lipa here shows that live shows online can draw in enormous audiences.
Adell’s company OnNow.tv aims to put metadata standards in place for what is fast becoming a huge new category for the music business. Livestreaming’s boom may prove short-lived once lockdowns end, or it may evolve into an important ancillary part of the touring business. For now, however, Adell feels the issue of metadata needs to be addressed.
“At a bare minimum, livestreaming is basically where digital music was 25 years ago in terms of metadata,” he says as a means of illustrating the need to fix this issue now. “There isn’t even a standard of how to digitally express what is the start time and start date.
“That’s the most rudimentary thing: when does it start? You’ve got some people using Greenwich Mean Time [GMT], some people using their regional time, plus a sign about what time zone you’re in. Even if we just all agree to use GMT, you need an agreement about how you format that data. That’s how rudimentary livestreaming is right now.”
“At a bare minimum, livestreaming is basically where digital music was 25 years ago in terms of metadata.” – Matthew Adell, OnNow.tv
As such, OnNow.tv has proposed a set of standards for livestreaming metadata. It has over 30 categories listed, marking some as required and others as optional.
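Adell’s start-time complaint is easy to make concrete. Whatever local convention an organiser supplies, a normalisation layer can resolve it to one unambiguous UTC representation; the sketch below assumes invented input conventions and is not OnNow.tv’s actual proposal:

```python
# Sketch of normalising a livestream start time to UTC ISO 8601, whatever
# local convention was supplied. Inputs are invented and this is not
# OnNow.tv's actual proposed standard.
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

def normalise_start(local_time: str, tz_name: str) -> str:
    """Interpret 'YYYY-MM-DD HH:MM' in the named zone and return UTC ISO 8601."""
    local = datetime.strptime(local_time, "%Y-%m-%d %H:%M").replace(tzinfo=ZoneInfo(tz_name))
    return local.astimezone(ZoneInfo("UTC")).isoformat()

print(normalise_start("2020-12-31 21:00", "America/Los_Angeles"))
# -> 2021-01-01T05:00:00+00:00
```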
“I really, really believe what’s going on with livestreaming is the final disruption of broadcast television,” Adell argues. “In order to provide the value we wanted to provide, we initially had to create a normalisation layer for ourselves just so we could launch a product and get everyone’s data into a place where users didn’t have to think about the fact that it all came in differently.”
Podcasting is another hugely important area: DSPs are investing heavily here, buying up podcasting companies and podcasters themselves, building towards a blended listening experience in which music and podcasts increasingly co-exist in playlists and in users’ overall listening diets.
Pandora even launched its Podcast Genome Project in 2018 to bring its discovery and recommendation powers to podcasting.
“It is a similar idea to Music Genome, capturing that ‘aboutness’ of a podcast,” explains Kirch. “It utilises different things like production style, information about the host – literally what it is about and what they are talking about. What artists are being interviewed on this podcast? What artists are doing a live in-studio performance on the Howard Stern Show?”
Here the metadata from one category (music) is intermeshed with the metadata from another (podcasting) to offer much broader recommendations across different content types, taking a more agnostic approach and using the learnings from one in the expansion of another. The metadata is evolving as the range of content evolves, working both in distinct silos and across silos.
Tied to this is the wider contextual use of metadata – especially given the rise of mood-based and activity-oriented playlists (e.g. music to relax to or music to work out at the gym to). That notion of context is expanding and metadata is expected to map this.
“One important thing, as a follow-on from the sort of data that people can communicate with MEAD, is that we’re investigating the development – and we are well down the road here – of location,” says Isherwood. “A lot of this is all about search and discovery – so being able to link bits of data. ‘Who’s the producer on this album? Play me something else they produced. I’m listening to George Harrison – play me something where he was in a different band from The Beatles.’ All of these things are about linking these pieces of data together so that the AI can actually pull this stuff forward. But that data has to be communicated.”
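The linking Isherwood describes is essentially a graph of credits. A toy sketch with invented data: store contributions as (person, role, recording) triples and query across them:

```python
# Toy sketch of the linked-data idea behind "play me something else this
# producer made": contributions held as (person, role, recording) triples.
# All data here is invented for illustration.
credits = [
    ("Producer P", "producer", "Album X"),
    ("Producer P", "producer", "Album Y"),
    ("George Harrison", "artist", "Recording Z"),
]

def works_by(person: str, role: str) -> list[str]:
    """Everything a given person did in a given role."""
    return [work for (p, r, work) in credits if p == person and r == role]

print(works_by("Producer P", "producer"))  # -> ['Album X', 'Album Y']
```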
“A lot of this is all about search and discovery – so being able to link bits of data. ‘Who’s the producer on this album? Play me something else they produced.’” – Mark Isherwood, DDEX
Voice in general marks a new step forward here, as adoption of smart speakers and smart assistants continues to grow rapidly: metadata has to keep pace with changing consumer behaviour, because the way someone asks for music on Alexa is very different to how they might search in the Spotify mobile app. Given that Canalys is forecasting shipments of 163m smart speakers globally in 2021 (up 21% from 2020), this is only going to grow in importance.
“You need good information so that the AI can actually find the music that you want,” says Isherwood.
Kirch says that seeing the first stirrings of all this and how it could profoundly change music consumption is why Pandora was early to market with voice support and integrating with off-platform smart devices.
“We want that natural language – so how people actually interact with those smart devices,” she says. “It’s a little bit more natural. The way that you search for something via text and the way that you search for something on a smart device is totally different. So it’s about understanding the way that people speak and the way that people search via voice.”
“The way that you search for something via text and the way that you search for something on a smart device is totally different.” – Taylor Kirch, SiriusXM / Pandora
She adds that there are distinct generational differences here and service-level metadata has to cater for that. “When you look at the younger demographic, you see people interacting very conversationally with their smart devices because they’re just used to it; they understand how it works,” she says. “I think that the voice technology needs to meet that conversational tone and figure out ways of dissecting those queries in order to provide the best results possible.”
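As a naive illustration of “dissecting those queries” – real voice stacks use trained language-understanding models, and the pattern below is invented – a conversational request can be split into slots a catalogue search understands:

```python
# Naive sketch of dissecting a conversational voice query into searchable
# slots. Real voice assistants use trained NLU models; this regex stands in
# purely to illustrate the idea.
import re

PATTERN = re.compile(
    r"play (?:me )?(?:some(?:thing)? )?(?P<mood>\w+)? ?(?:music )?by (?P<artist>.+)",
    re.IGNORECASE,
)

def parse_query(utterance: str) -> dict:
    match = PATTERN.match(utterance.strip())
    return match.groupdict() if match else {"freetext": utterance}

print(parse_query("play something chill by Khruangbin"))
# -> {'mood': 'chill', 'artist': 'Khruangbin'}
```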
In terms of where metadata can – and will – move next, there are a number of emerging possibilities opening up for those who are active in the space.
Google has already expanded its search tools to let users find songs by humming them to Google Assistant. “After you’re finished humming, our machine learning algorithm helps identify potential song matches,” it says. “And don’t worry, you don’t need perfect pitch to use this feature. We’ll show you the most likely options based on the tune.
“Then you can select the best match and explore information on the song and artist, view any accompanying music videos or listen to the song on your favorite music app, find the lyrics, read analysis and even check out other recordings of the song when available.”
For Kirch, it is about how to bring the listening experience off-platform more.
“It’s really about creating those partnerships and figuring out ways that all of these services can work together – so for example, Pandora and SiriusXM’s relationship with off-platform listening devices,” she says. “That’s a really great example of how these companies and everyone can work together to bring those listening experiences off-platform.
“The way we can track that and the way we can optimise that is definitely a piece of the puzzle – the metadata that we use and share with one another when we can.”
Alongside this, Musimap has developed MusiMe – a “psycho-emotional profiling engine” – that builds an emotional profile for listeners based on their play history and favourites. Spotify, meanwhile, is securing patents for “[m]ethods and systems for personalizing user experience based on [user] personality traits”.
These are areas that platforms are investing in to help their own user experience stand out in the market and this technological arms race has a knock-on effect for the types of metadata that could be needed in the near future.
Adell suggests that the next big area to focus on is in relation to derivative works.
“On TikTok, every video is a derivative work; on Triller, every video is a derivative work,” he says by way of illustration. “The internet is a giant derivative work-generating machine, be it bootleg remixes or dancing to something on TikTok.
“I think all the opportunities in music are around derivative works at this point. Derivative works are where everything interesting and exciting is going on and I’m very anxious for derivative works to generate more income for artists. I think the industry has spent a lot of years with its head in the sand about the value of derivative works and the fact that you’re just never going to stop them.”
“I think all the opportunities in music are around derivative works at this point. Derivative works are where everything interesting and exciting is going on.” – Matthew Adell, OnNow.tv
He suggests that there should be an agreed-upon framework for derivative works – something that DDEX could help scope out. With the boom in UGC on platforms such as TikTok, where audio is reused and can be watched hundreds of millions of times, this is something the industry needs to be tracking fully, with metadata systems in place that are tailored to these use and consumption contexts.
“I’m interested in watching companies like Tracklib and my last company which I sold to Native Instruments, MetaPop, which is all about stems and remixes and how you then embed the root work’s metadata to travel with the derivative work,” he says. “I think that’s really where the industry is headed – consumer-created derivative works or fan remixes. How do we make sure that stuff is not just monetised properly but also have the engagement around it properly tracked? That is the next step that I think is really exciting.”
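A rough sketch of what Adell describes – the root work’s identifiers travelling with the derivative so that usage can still be attributed – with invented field names and dummy identifiers:

```python
# Sketch of root-work metadata travelling with a derivative work so that
# usage of a remix or UGC clip can still be attributed and monetised.
# Field names and identifiers are invented for illustration.
root_work = {"iswc": "T-000000001-0", "title": "Original Song", "writers": ["A. Writer"]}

derivative = {
    "type": "fan_remix",
    "platform": "TikTok",
    "parent": root_work,  # the root work's identifiers ride along with the derivative
    "derived_elements": ["stems: drums, vocals"],
}

def royalty_targets(work: dict) -> list[str]:
    """Walk to the root work to find who should be paid for the underlying song."""
    return work["parent"]["writers"] if "parent" in work else work["writers"]

print(royalty_targets(derivative))  # -> ['A. Writer']
```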
The industry as a whole is moving in unison here as much as it can. The collective will to have standards in place that improve things for everyone – rights owners, services and consumers – is what is driving everything forward.
“I have a library science background, so for me the categorisation and the understanding of our content is of prime importance to the listener experience – whether it’s browse, search and discovery, or personalisation,” says Kirch. “It’s really great to see not only the DSPs really doubling down on the science behind that but also the industry at large, whether it’s livestreaming or DDEX themselves.”
SOURCE: Synchtank