In the AI Song Contest, teams from all over Europe and Australia compete to create the next Eurovision hit with the help of artificial intelligence. COMPUTD took on this challenge as part of its mission to demystify AI, and wants to show how AI and humans can collaborate in the world of music.
Our AI specialists know how to strum a chord or two on the guitar, but taking on this challenge would require a deep understanding of music theory. That’s why COMPUTD decided to partner up with René and Angela Shuman.
René and Angela Shuman are the owners of Always-Online Creative Projects BV, where they combine IT, Marketing, Multimedia, and Events. As artists, they are known as Mr. & Mrs. Rock’n Roll. As producers, they’ve done hundreds of productions for TV, Radio and big brands such as Audi, Tesla, profile, etc. With their IT company, they know how to connect creative products with technological innovations.
René and Angela were very enthusiastic about working with us, saying the following:
“We think it is awesome to work with a young team such as COMPUTD. Together we search the boundaries of new technologies and their applications. It is not our goal to show that AI will create a better feeling than the human does, but discovering where AI and humans can cooperate can lead to an undiscovered path that we would like to investigate with the team of COMPUTD.”
René and Angela Shuman
As data scientists, we did a lot of research into what a good data-driven composer could look like. To start off, we realised that if we want good results, we cannot just train on an unprocessed dataset of songs and expect neural networks to produce a new song for us. We wanted to make sure that the pipeline mimics human composers as closely as possible. Without discussing actual architectures yet, we designed a pipeline to satisfy two conditions: it generates a lead sheet (which we define as the chord progression and the main melody of the song), and the result sounds good in a Eurovision competition. To achieve this, we came up with the following process:
• Find the most common tempo and structure Eurovision songs have
• Generate a chord progression from existing songs in the tempo and structure we defined
• Generate a note pattern given the chord and the previous 3 notes
• Evaluate the generated pattern with respect to Eurovision songs
• Generate arrangements for the accepted song
This way, we would pick a style of music we like from the real world, train a chord progression auto-encoder on songs from the radio, generate the notes based on these songs and the generated chord progression, and then evaluate each pattern on the fly against the Eurovision dataset we received at the beginning of this challenge. This design makes sure that we avoid sampling and copying Eurovision songs, as they are never in the training dataset. It still enables the pipeline to produce song-festival-like songs, because the evaluation function rejects patterns that are not similar to the Eurovision dataset. We believe this approach to song generation is novel and could potentially lead to the creation of AI composers for lead sheets.
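The generate-then-evaluate loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not our actual models: the generator is a random stand-in for the trained network, and the evaluation function compares pitch-interval histograms, which is only one of many possible similarity measures. All function names, the threshold and the retry count are hypothetical.

```python
import random
from collections import Counter

def interval_histogram(melodies):
    """Normalised histogram of pitch steps between consecutive notes."""
    steps = Counter()
    for melody in melodies:
        steps.update(b - a for a, b in zip(melody, melody[1:]))
    total = sum(steps.values())
    return {k: v / total for k, v in steps.items()} if total else {}

def similarity(candidate, reference):
    """Histogram overlap in [0, 1]; higher means more alike."""
    keys = set(candidate) | set(reference)
    return sum(min(candidate.get(k, 0.0), reference.get(k, 0.0)) for k in keys)

def generate_pattern(chord, previous_notes, rng, length=4):
    """Stand-in generator (assumption): random walk snapped to chord tones,
    conditioned on the chord and the previous 3 notes."""
    context = list(previous_notes[-3:])
    notes = list(context)
    for _ in range(length):
        target = notes[-1] + rng.choice([-2, -1, 0, 1, 2])
        notes.append(min(chord, key=lambda tone: abs(tone - target)))
    return notes[len(context):]

def compose(chords, reference_melodies, threshold=0.2, max_tries=100, seed=0):
    """Generate one pattern per chord; keep it only if it resembles the reference set."""
    rng = random.Random(seed)
    reference = interval_histogram(reference_melodies)
    melody = [chords[0][0]]  # start on the root of the first chord
    for chord in chords:
        for _ in range(max_tries):
            pattern = generate_pattern(chord, melody, rng)
            score = similarity(interval_histogram([melody[-1:] + pattern]), reference)
            if score >= threshold:
                break
        melody.extend(pattern)  # falls back to the last attempt if none passed
    return melody

# Chords as lists of MIDI pitches: C major, F major, G major
chords = [[60, 64, 67], [60, 65, 69], [59, 62, 67]]
reference_melodies = [[60, 62, 64, 65, 67], [67, 65, 64, 62, 60]]
melody = compose(chords, reference_melodies)
```

Note that the reference melodies are only ever used for scoring, never for training the generator, which is the property that keeps the pipeline from copying them.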
The first question we had to ask ourselves was how a composer would figure out the standard progression of verses, choruses, pre-choruses and the like. As we could not answer this question directly, we decided to split the data into the different sections of each song and rework the pipeline so that each section is generated separately, with the verse informing the pre-chorus, the pre-chorus informing the chorus, and so on. This is when we ran into our second major challenge: the legal and open-source datasets we managed to find were not well-structured MIDI files, so we could not split these songs into their respective sections without major manual effort. This set the plan back a bit, as we now had to train on the Eurovision dataset we had received at the start of the challenge. In the end, we used only a subset of those songs, selected for the age and style of the song we wanted to generate, and passed them through the pipeline.
Initially, we expected that the melody of the generated song would inform the lyrics, as each syllable of the lyrics corresponds to one or more notes in the song. Therefore we decided to cap the number of syllables in the lyrics at the number of notes in the melody. We used a dataset containing over 300,000 song lyrics in CSV form. We decided to use only English lyrics and to filter out all negative songs: we analysed the sentiments in each song, aggregated them into a dominant sentiment, and assigned it to the song as an additional label. Our first design used an ANN with LSTM nodes to generate words syllable by syllable from the dataset, allowing it to generate at most as many syllables as there are notes. We also wanted to introduce our own topics into the song and reward the generated lyrics based on the topics we chose; for example, lyrics about “writing a song” and “AI” would receive a better score than other lyrics. We would set this up as a pipeline and let it train until we got nice, consistent lyrics.
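To make the two filters concrete, here is a small sketch of song-level sentiment labelling and the syllable budget. It is an illustration only: the word lists are tiny hypothetical stand-ins for a real sentiment model, and the syllable counter is a rough vowel-group heuristic rather than a linguistic one.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons; a real pipeline would use a proper sentiment model.
POSITIVE = {"love", "shine", "happy", "dance", "dream"}
NEGATIVE = {"cry", "hate", "lonely", "pain", "die"}

def line_sentiment(line):
    """Label a single lyric line by counting lexicon hits."""
    words = re.findall(r"[a-z']+", line.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"

def song_sentiment(lyrics):
    """Aggregate line labels into the dominant sentiment of the whole song."""
    labels = Counter(line_sentiment(l) for l in lyrics.splitlines() if l.strip())
    return labels.most_common(1)[0][0] if labels else "neutral"

def estimate_syllables(text):
    """Rough heuristic: one syllable per group of consecutive vowels, at least one per word."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(max(1, len(re.findall(r"[aeiouy]+", w))) for w in words)

def fits_melody(lyric_line, n_notes):
    """A line fits if it has at most as many syllables as the melody has notes."""
    return estimate_syllables(lyric_line) <= n_notes

songs = {
    "a": "I love to dance\nWe shine tonight\nI cry",
    "b": "I cry alone\nSo lonely now\nLove",
}
# Keep only songs whose dominant sentiment is not negative.
kept = [name for name, text in songs.items() if song_sentiment(text) != "negative"]
```

The vowel-group heuristic overcounts words with a silent "e", so in practice a pronunciation dictionary would give a tighter budget.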
Once we had reduced our training set to around 100,000 lyrics with these conditions, we realised that we would face resource and time constraints if we created an action space from all the syllables found in the training set. Unlike, for example, Hungarian, the English language is interesting when it comes to syllables, as more than one spelling can result in the same (or a quite similar) sound. This meant that it was not enough to split the training set into syllables and build the action space from them; we had to account for phonetics to reduce the action space. This resulted in some interesting texts being generated: 'KAHZ', ' ', 'WAHT', 'VER', ' ', 'BIHN', 'DER', 'AH', ' ', 'YUW', ' ', 'IHZ', ' ', 'TUW', ' ', 'AESK', ' ', 'WAHN', 'RAHS', ' ', 'YAE', ' ', 'TAYM', ' ', 'MAY', ' ', 'IY', ' ', 'IHT', 'MAH', ' ', 'DHAH', ' ', 'DAARK', 'SIHZ', ' ', 'NIH', 'LIY', ' ', 'STREY'. While the words make complete sense, it takes a while to realise what they actually mean, and translating them back into plain English spelling is a manual effort, which negates the topic assessment in our pipeline. Therefore we had to abandon syllable-based prediction and return to a more standard letter-based text generation, solving the syllable-count challenge in post-processing instead of as part of the pipeline.
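The trade-off can be illustrated with a toy pronunciation table. Everything below is a hypothetical sketch: the table entries are made up for illustration in the same phonetic style as the output above, and a real system would rely on a full pronunciation dictionary.

```python
from collections import defaultdict

# Hypothetical word -> phonetic syllables table (illustrative entries only).
PRONUNCIATIONS = {
    "to": ["TUW"], "two": ["TUW"], "too": ["TUW"],
    "time": ["TAYM"], "thyme": ["TAYM"],
    "my": ["MAY"], "the": ["DHAH"],
    "it": ["IHT"], "is": ["IHZ"], "you": ["YUW"],
}

def encode(text):
    """Turn text into phonetic syllable tokens, with ' ' marking word boundaries."""
    tokens = []
    for i, word in enumerate(text.lower().split()):
        if i:
            tokens.append(" ")
        tokens.extend(PRONUNCIATIONS[word])
    return tokens

def spelling_candidates():
    """Invert the table: each phonetic form maps to every spelling that produces it."""
    inverse = defaultdict(set)
    for word, syllables in PRONUNCIATIONS.items():
        inverse[tuple(syllables)].add(word)
    return inverse

# The phonetic action space is smaller than the spelled one...
phonetic_actions = {s for syllables in PRONUNCIATIONS.values() for s in syllables}
# ...but decoding is ambiguous: one sound can map back to several spellings.
ambiguous = {k: v for k, v in spelling_candidates().items() if len(v) > 1}
```

Here ten spelled words collapse into seven phonetic tokens, while 'TUW' alone could be "to", "two" or "too". That ambiguity is why converting generated output back to plain spelling needed manual work.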
Our criterion for success was simple: have a catchy song that is written completely by a machine. The COMPUTD & Always-Online bv team considers the task successful, as we managed to compose a song without any human intervention until the performance. We put all our creative efforts into designing the approach and then let the computer figure out the song, which opened up a whole new realm for the data scientists in the team to maybe dabble in music a bit more with the help of AI. We were lucky to have a team whose members complemented each other very well: Dirk with his Deep Learning knowledge, Marcell with his analytical mindset, Pieter with his natural language processing creativity, and René and Angela as producers and songwriters who brought the music into the team.
“AI has taken a giant step forward. There are many possibilities that will lead to new creative avenues in the future. There are still a number of hurdles to take in this early development phase and it should be noted that it is not yet possible for AI to really put a feeling, depth and a plot into a song, since this is still something purely human. That aspect is reflected in many AI applications and in many phases of the development. As a result, AI can become a great assistant in the near future to achieve unforeseen positive results. AI at the service of man. And not the other way around ;) This song was therefore really created in collaboration with AI.”
René and Angela Shuman
This research and its results open up a world of possibilities for AI composers that can assist in composing complete albums, sparing artists burnout and sleepless nights. They can also serve as a great source of inspiration, since every generated piece will be new and unique!
We are very much looking forward to sharing our song with you!
You get to decide who wins! From 10 April you can listen to all songs on this website and leave your evaluation. Which song earns your douze points?
Make sure to leave your vote. The winner will be announced in a live stream on May 12th!