On June 6, the “Development and Challenges” Intelligent Media Forum of the 2021 Global Artificial Intelligence Technology Conference (GAITC 2021), hosted by the Chinese Association for Artificial Intelligence (CAAI) and co-organized by Sina News and Communication University of China, was held in Hangzhou. Wang Zhongyuan, Vice President of Kuaishou Technology and Head of MMU&Y-tech, attended the forum and delivered a keynote speech titled “The Collision and Fusion of Music and Technology: How Art Changes with Time”. He shared Kuaishou’s latest work and progress in AI music, showcasing the company’s leading artificial intelligence technology and the powerful boost that AI music gives to short videos.
As a short-video app with nationwide reach, Kuaishou has a massive volume of content, huge traffic, and highly engaged users. According to the data, Kuaishou users upload more than 1.1 billion short videos per month on average, and overall daily active users exceed 370 million. Users spend nearly 100 minutes a day on average watching short videos and live streams on the Kuaishou platform.
In Kuaishou’s rich community ecosystem, music has become one of the factors that motivate users to create. On Kuaishou, 76% of videos have a soundtrack, and 90% of users expect most short videos to have one.
Why do users depend so heavily on music when creating short videos? Wang Zhongyuan said: “Music has a very important positive effect on the short-video production experience. For example, if the background music is removed from a short video and only the original sound is left, the sense of atmosphere is weakened and viewers are left with a completely different impression.”
Wang Zhongyuan went on to analyze the unique appeal of music. In his view, music connects with people’s thoughts and emotions and can make them feel happy, sad, or expectant. When Chinese people hear the Spring Festival Overture, they often sense spring returning to the earth and everything coming back to life, and the opening words of the CCTV Spring Festival Gala hosts come to mind unbidden.
As a remarkable form of artistic expression, music has also taken on new forms as technology has developed. In the industrial age, improvements in manufacturing techniques made the sounds that musical instruments can produce richer and more layered. In the electronic age, electronic technology created sounds that cannot be produced by mechanical means, and the expressive power of music became more diverse.
Now, in the era of artificial intelligence, AI technology is helping music become thoroughly personalized and intelligent, and is opening up new room for growth for both music and short videos. According to Wang Zhongyuan, “Ant Hey”, which went viral on the Internet some time ago, is a creative fusion of music and visual AI technology. From a single photo, users can automatically generate a humorous, playful animated singing video. Paired with its catchy BGM, it quickly became a popular template for short-video creators.
AI technology brings music creation to the masses; Kuaishou’s self-built models reproduce professional-level singing
With the support of technology, music production has entered an era of mass participation. To help more users create personalized music, Kuaishou has independently developed AI music creation models and AI singers.
In terms of process, most music production today tends to be pipelined, engineered, and modular. First the creator captures a creative motif, then writes the lyrics and melody, then arranges the music, and finally records and mixes. With the AI models built by Kuaishou, every one of these steps can be completed with the help of AI.
Wang Zhongyuan said: “In the AI era, finding a motif has become very simple. Enter arbitrary keywords into the Kuaishou AI music model, and the model can convert the words into a representation of a motif, and even into various initial pieces of music.”
Once the motif is settled, the Kuaishou AI module can be used to generate lyrics. For AI lyrics, Kuaishou trained its model on millions of existing songs to ensure that the AI understands the meaning of words well. Users only need to enter a theme, an emotion, and a style to generate dozens of lyrics in a few seconds.
For AI melody creation, Kuaishou took a similar approach: it trained a model on hundreds of thousands of music scores and the audio of millions of songs, and then let the model learn the internal relationships within songs in a self-supervised way through a mini database, thereby developing the AI’s ability to generate melodies.
It is understood that Kuaishou has invited musicians on the platform to use the AI models to create new songs. Popular songs created this way include “Sweet Taste”, “Night”, and “Go Forward”.
Recording a high-quality finished song demands a very high level of singing ability. To address the problems some users face, such as singing out of tune or having an unappealing timbre, Kuaishou has launched an AI singer-assisted creation feature and has continuously improved the accuracy of its models.
According to reports, in May 2020 the AI singers were still singing at KTV level. By December 2020, ordinary listeners found it hard to spot flaws in their singing. Now the model automatically adjusts pitch, beat, and lyrics according to the music score, and the AI singers can realistically reproduce the singing level of professional singers.
At the end of the speech, Wang Zhongyuan said: “In the future, Kuaishou will explore more new technologies to enrich the creative features available on the production side. With the help of voice recognition technology, Kuaishou hopes that AI singers will be able to imitate an individual’s timbre, and that AI music will be able to turn what users say directly into songs, meeting users’ more personalized music creation needs and continuing to support short-video creation.”