Reddit Strikes $60 Million Licensing Deal to Reportedly Train Google's AI Models


Martina Bretous

Published: February 27, 2024

About a week ago, Bloomberg reported that Reddit had signed a major licensing deal ahead of its IPO, allowing an unnamed company to train its AI models on Reddit's data.


A new report says that company is Google, although neither party has confirmed it. If true, this would be Reddit's first content deal.

Why is every AI company looking for licensing deals?

Since the AI race started, getting access to large, quality datasets has been a top priority.

AI models are trained on data, and the more data a model is trained on, the better its output. Quantity isn’t the only factor, though: quality matters too. AI companies want access to high-quality data that their competitors ideally don’t have.

This is where publishers like Reddit come in.

For a long time, OpenAI and other AI companies were freely roaming through publishers’ data. That was until publishers like The New York Times and Reddit caught on.

Last April, Reddit said, “If you want access to an 18-year deep well of data, you’re going to have to pay up.”

The NYT, on the other hand, just said, “no.” (And they’re suing OpenAI for allegedly still doing it.)

Now, close to a year later, Google, Apple, and OpenAI have all signed licensing agreements worth $100+ million with huge publishers.

The latest to join is Reddit, which reportedly signed with Google in a deal worth $60 million annually. The deal likely has an exclusivity clause ensuring that only Google has access to this data, though that hasn’t been confirmed.

Ahead of the company’s upcoming IPO, Reddit CEO Steve Huffman shared that it had earned over $200 million in licensing deals.

“Reddit’s vast and unmatched archive of real, timely, and relevant human conversation on literally any topic is an invaluable dataset for a variety of purposes, including search, AI training, and research,” wrote Huffman in the company’s S-1 filing.

This would also be a huge win for Google, which has been trying to dethrone OpenAI for years.

Should AI licensing deals come with guardrails?

Some see licensing deals as a win-win: Publishers get paid for their data while AI companies get access to large, quality datasets.

However, these deals also come with some drawbacks.

Social media platforms like Reddit and X are community forums where people can write just about anything, including conspiracy theories, misinformation, and hateful rhetoric.

[Image: X user disapproves of Reddit's AI licensing deal with Google]

And although Reddit does have content moderators and policies, the company only introduced a ban on hate speech 15 years after the site was founded.

Is that what AI models should be trained on?

AI companies can clean their data to filter out this type of content, but there’s no clear standard that every model is built on. So, as a consumer, I won’t know what data a model was trained on or how well that data has been “cleaned.”

This raises the question: Should some websites be off the table when it comes to training AI models? And what guardrails are in place to ensure AI companies’ models aren’t regurgitating the darkest content on the internet?

These questions are still up in the air.

Topics: Artificial Intelligence


