YouTube’s AI Evolution
GEOFF VAN DER MEER
VP Engineering, YouTube
Human brains are fast. They send electrical signals at over 400 kilometers per hour and generate enough energy at any given second to light a bulb, which makes them one of nature’s biggest wonders. Ironically, we still lack a detailed understanding of how they work.
Still, we have started creating new brains, artificial ones. That is an opportunity and a threat at the same time, and you’d be hard pressed to find a more competent person to speak on both than Geoff van der Meer, Vice President of Engineering at YouTube, who holds degrees in mathematics, physics and software engineering and had a big hand in building the music streaming giant Spotify. Here are four theses derived from his keynote at GMPLN24 to give you a better feel for one of the most dominant topics in the news cycle in recent years.
“AI is changing the way we’re interacting with sports.”
In 2023, videos related to generative AI tools drew over 1.7 billion views on YouTube, an internal analysis by the platform revealed. Sounds like a lot? Geoff believes that number is a severe undercount and considers double or triple the figure plausible. But AI does not exist only in content. It also has real-world applications in businesses that predate virtual content by more than 100 years.
Take football: Google’s AI lab DeepMind partnered with Liverpool FC in 2019 and created TacticAI, which combines graph neural networks with generative AI in a model that predicts the most likely outcomes of different corner kick scenarios. The club’s coaches are themselves leaders in this field: Jürgen Klopp, the club’s former coach, was the first to employ a throw-in expert, Thomas Gronnemark, whose sole task was to perfect this overlooked part of the game, helping make Liverpool one of the best clubs in Europe when it comes to set pieces. Their feedback was that TacticAI’s scenarios were on par with, or in most cases better than, those they had been working on themselves. With one small caveat: the coaches had spent days and weeks on theirs, while TacticAI generated its scenarios in seconds.
How cool would it be to have these tools at your disposal during the broadcast of a sports event, giving you near-live insights you would not have had at all before? More on that later…
AI is well worth your time!
In 2023, ChatGPT and other LLMs (large language models) took the world by storm in an explosion of technological development that Geoff calls “the biggest since the mobile phone”. That was text-to-text interaction: all the rage in 2023 but, according to Geoff, old news in 2024. In experiments run by YouTube, you can now interact with all types of media. How about asking a video to summarize itself? Did you like an item shown in a video that is not even the topic of that video? Just use AI to help you find out more.
According to the VP, a significant share of respondents in a survey run by YouTube are open to watching content created with the help of AI as well as synthetic (fully AI-created) content. Think of synthetic content less as pictures or videos of humans with three hands and other unintentionally funny results, and more as content that is indistinguishable from human-made work. But better content brings a whole new set of problems.
The barrier to content creation is low(er).
Let us introduce you to RuiCovery. It’s a YouTube account run by a studio in South Korea, featuring an AI-generated persona that sings, dances and vlogs just like other creators, generating millions of views and heaps of subscribers in the process. According to Geoff, this will be the standard rather than an outlier in the near future. Content is becoming active, not passive: users want to influence content instead of simply having it presented to them.
Speaking of active, here is another number for you: 82% of people aged 18 to 44 created video content online in 2023, across platforms. Multimodal AI, in contrast to LLMs (text to text, text to images or vice versa), allows you to move freely between text, video, audio and images in any direction you please, unlocking countless possibilities in the coming years. In turn, more people will create content, more content will exist and the best content will get better and better.
Dream Track, an AI tool built in partnership with the music industry, helps paint a picture: artists like Sia and Charlie Puth volunteered to help build a machine learning model that creates art. Ever thought about Sia leaving chandeliers aside and singing a unique birthday song, complete with a name drop for your significant other, that is indistinguishable from a song Sia produced herself? Or a more pragmatic idea: you can’t reach new audiences because your home market speaks neither English nor Spanish. With “Aloud” you can create voice-overs for your videos in other languages, spoken perfectly. With your voice. Your tonality.
AI brings simple questions with very hard answers.
AI-generated content that is indistinguishable from “real” content sounds exciting, but it brings a set of problems that need tackling: How do you mark AI-generated content to prevent misuse? Who owns, and by extension gets paid for, the data and resources used in the process? How do you copyright a video that uses a song that did not exist before? Who does an AI-generated script belong to?
There is no single answer to that, says Geoff. Google has committed to marking all AI-generated content. Watermarks are being developed to make detection, flagging and transparency possible. Putting the onus on creators to declare synthetic content as such is another idea, and cooperation with industry bodies is paramount. AI is a responsibility. AI is a new age of expression that requires protection both for participants and for those opting out. AI needs collaboration with leaders in safety, politics and engineering to scale this protection. Whether you’re a fan or a sceptic (depending on the news outlet, AI is either going to end humanity or propel us into the next age), AI is here to stay.