Date: 2023-04-20

In a first, Reddit will start charging third parties for access to its application programming interface (API) data. Reddit's API data has reportedly been used to train some of the most popular chatbots on the market, including Google's Bard, OpenAI's ChatGPT, and Microsoft's Bing Chat.

While it is widely known that the new chatbots based on large language models have been trained using data from social media sites, Reddit has become the first social media platform to charge these companies for its data.

In an interview with the New York Times, Reddit's co-founder and CEO Steve Huffman explained the importance of data on the social media site, saying, "More than any other place on the internet, Reddit is a home for authentic conversation… There’s a lot of stuff on the site that you’d only ever say in therapy, or A.A., or never at all."

According to Huffman, the key to producing the best results for large language models is using new and relevant data. As of 2019, Reddit had over 430 million monthly active users, who participated in more than 1.2 million special-interest communities.

The New York Times reported that both ChatGPT and Bard have been trained using Reddit data in one form or another. Reportedly, large language models are trained by downloading and processing user data from Reddit through its API, which allows developers to access Reddit data in a structured and organized way.

So far, Reddit has had a mutually beneficial relationship with Microsoft and Google, which scrape data from Reddit to provide accurate search results. In turn, Reddit benefits from appearing higher in search rankings and attracting more visitors to its platform.

With the rise of chatbots based on large language models, Reddit has little to gain from letting these companies use its data. Huffman explains, "The Reddit corpus of data is really valuable. But we don’t need to give all of that value to some of the largest companies in the world for free."

Huffman says that Reddit's API data will be available for free to developers who create applications that enhance user experience, though the platform has not yet revealed pricing for other third-party access.