A tool to catch students cheating with ChatGPT. OpenAI hasn’t released it
Summary
The technology, mired in internal debate, can detect text written by artificial intelligence with 99.9% certainty.

OpenAI has a method to reliably detect when someone uses ChatGPT to write an essay or research paper. The company hasn’t released it despite widespread concerns about students using artificial intelligence to cheat.
The project has been mired in internal debate at OpenAI for roughly two years and has been ready to be released for about a year, according to people familiar with the matter and internal documents viewed by The Wall Street Journal. “It’s just a matter of pressing a button,” one of the people said.
In trying to decide what to do, OpenAI employees have wavered between the startup’s stated commitment to transparency and their desire to attract and retain users. One survey the company conducted of loyal ChatGPT users found nearly a third would be turned off by the anticheating technology.
An OpenAI spokeswoman said the company is concerned the tool could disproportionately affect groups such as non-native English speakers. “The text watermarking method we’re developing is technically promising but has important risks we’re weighing while we research alternatives,” she said. “We believe the deliberate approach we’ve taken is necessary given the complexities involved and its likely impact on the broader ecosystem beyond OpenAI.”
Employees who support the tool’s release, including those who helped develop it, have said internally that those arguments pale in comparison with the good such technology could do.
Generative AI can create an entire essay or research paper in a matter of seconds from a single prompt, at no cost. Teachers and professors say they are desperate for help to crack down on its misuse.
“It’s a huge issue,” said Alexa Gutterman, a high school English and journalism teacher in New York City. “It’s something that every teacher I work with has talked about.”
A recent survey by the Center for Democracy & Technology, a technology policy nonprofit, found 59% of middle- and high-school teachers were sure some students had used AI to help with schoolwork, up 17 points from the prior school year.
OpenAI Chief Executive Sam Altman and Chief Technology Officer Mira Murati have been involved in discussions about the anticheating tool. Altman has encouraged the project but hasn’t pushed for it to be released, some people familiar with the matter said.
Wall Street Journal owner News Corp has a content-licensing partnership with OpenAI.
99.9% effective
ChatGPT is powered by an AI system that predicts what word or word fragment, known as a token, should come next in a sentence. The anticheating tool under discussion at OpenAI would slightly change how the tokens are selected. Those changes would leave a pattern called a watermark.
The watermarks would be unnoticeable to the human eye but could be found with OpenAI’s detection technology. The detector provides a score indicating how likely it is that the entire document, or a portion of it, was written by ChatGPT.
The watermarks are 99.9% effective when enough new text is created by ChatGPT, according to the internal documents.
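OpenAI hasn’t published how its watermark works, so the following is only a rough sketch of the general technique the article describes, based on publicly documented “green-list” text-watermarking schemes (such as Kirchenbauer et al., 2023) rather than OpenAI’s actual method. The toy vocabulary, secret key and bias values below are invented for illustration: a keyed pseudorandom function favors a subset of tokens at each sampling step, and the detector scores a document by how often its tokens land in those favored subsets.

```python
# Illustrative sketch only: OpenAI has not disclosed its watermarking method.
# This follows the publicly described "green-list" style of text watermarking:
# the sampler is nudged toward a pseudorandom subset of the vocabulary chosen
# from the previous token, and the detector counts how often tokens land there.

import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(1000)]   # toy vocabulary stand-in
GREEN_FRACTION = 0.5                       # share of vocab favored at each step
BIAS = 2.0                                 # logit boost added to "green" tokens
SECRET_KEY = "example-key"                 # shared by generator and detector


def green_list(prev_token: str) -> set:
    """Pseudorandomly pick the favored vocab subset from the previous token."""
    seed = int(hashlib.sha256((SECRET_KEY + prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * GREEN_FRACTION)))


def sample_next(logits: dict, prev_token: str) -> str:
    """Sample the next token after boosting logits of green-listed tokens."""
    greens = green_list(prev_token)
    boosted = {t: l + (BIAS if t in greens else 0.0) for t, l in logits.items()}
    total = sum(math.exp(l) for l in boosted.values())
    probs = {t: math.exp(l) / total for t, l in boosted.items()}
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]


def detect(tokens: list) -> float:
    """Return a z-score: how far the green-token count exceeds chance."""
    hits = sum(1 for prev, cur in zip(tokens, tokens[1:]) if cur in green_list(prev))
    n = len(tokens) - 1
    expected = n * GREEN_FRACTION
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std


if __name__ == "__main__":
    random.seed(0)
    # Stand-in for a language model: random logits at every step.
    text = ["tok0"]
    for _ in range(200):
        logits = {t: random.gauss(0, 1) for t in VOCAB}
        text.append(sample_next(logits, text[-1]))
    print(f"z-score, watermarked text: {detect(text):.1f}")     # large and positive
    unmarked = [random.choice(VOCAB) for _ in range(200)]
    print(f"z-score, unmarked text:    {detect(unmarked):.1f}")  # near zero
```

Because the per-token nudge is small, any individual word choice looks ordinary, but the detector’s z-score grows roughly with the square root of the document’s length, which is consistent with the internal documents describing the watermark as highly reliable only when enough new text is created.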
“It is more likely that the sun evaporates tomorrow than this term paper wasn’t watermarked,” said John Thickstun, a Stanford researcher who is part of a team that has developed a similar watermarking method for AI text.
Still, staffers have raised concerns the watermarks could be erased through simple techniques like having Google translate the text into another language and then back, or having ChatGPT add emojis to the text and then manually deleting them, an OpenAI employee familiar with the matter said.
There is broad agreement within the company that determining who can use this detector would be a challenge. If too few people have it, the tool wouldn’t be useful. If too many get access, bad actors might decipher the company’s watermarking technique.
OpenAI employees have discussed providing the detector directly to educators or to outside companies that help schools identify AI-written papers and plagiarized work.
Google has developed a watermarking tool that can detect text generated by its Gemini AI. Called SynthID, it is in beta testing and isn’t widely available.
OpenAI released a tool for testing this past spring that can determine whether an image was created with its text-to-image generator, DALL-E 3. The company has given priority to audio and visual watermarking over text because the harms are more significant, particularly in a busy election year in the U.S., the employee familiar with the matter said.
Essays about Batman
In January 2023, OpenAI released an algorithm intended to detect text written by several AI models, including its own. But it succeeded just 26% of the time, and OpenAI pulled it seven months later.
There are other tools developed by outside companies and researchers to detect text created with AI, and many teachers say they have used them. But they sometimes fail to detect text written by advanced large language models and can produce false positives.
At first, students “thought we had all these magic wizard tricks to figure out whether they were using AI,” said Mike Kentz, an AI consultant for educators who recently taught at a private high school in Georgia. “By the end of the year…they were like, hold on a second, my teacher has no idea.”
Some teachers encourage students to use AI to help with research or to provide feedback on ideas. The problem is when students have an app like ChatGPT do all the work and don’t even know what they are handing in.
Last year, Josh McCrain, a political-science professor at the University of Utah, gave students a writing assignment that included, in tiny, easily overlooked text, instructions to include a reference to Batman. If they copied and pasted the assignment into an AI tool, the instructions would be incorporated into its output.
Sure enough, a handful of students turned in papers with nonsensical references to Batman. Going forward, McCrain is tweaking writing assignments to focus more on current events AI is less familiar with and beseeching students not to outsource their work to AI. “That’s where I try to really hammer that point home to students: You need to learn this stuff,” he said.
Years of debate
Discussions about the watermarking tool started before OpenAI launched ChatGPT in November of 2022 and have been a persistent source of tension, the people familiar with the matter said. It was developed by Scott Aaronson, a computer-science professor who has been working on safety at OpenAI while on leave from the University of Texas for the past two years.
In early 2023, one of OpenAI’s co-founders, John Schulman, outlined the pros and cons of the tool in a shared Google Doc. OpenAI executives then decided they would seek input from a range of people before acting further.
Over the next year and a half, OpenAI executives repeatedly discussed the technology and sought fresh data to help decide whether to release it.
In April 2023, OpenAI commissioned a survey that showed people worldwide supported the idea of an AI detection tool by a margin of four to one, the internal documents show.
That same month, OpenAI surveyed ChatGPT users and found that 69% believed cheating-detection technology would lead to false accusations of using AI. Nearly 30% said they would use ChatGPT less if it deployed watermarks and a rival didn’t.
A recurring internal concern has been that the anticheating tool could hurt the quality of ChatGPT’s writing. OpenAI conducted a test earlier this year that found watermarking didn’t impair ChatGPT’s performance, people familiar with the matter said.
“Our ability to defend our lack of text watermarking is weak now that we know it doesn’t degrade outputs,” employees involved in the testing concluded, according to the internal documents.
In early June, OpenAI senior employees and researchers met again to discuss the project. The group agreed the watermarking technology worked well, but the results of the ChatGPT user survey from last year still loomed large. Staffers said the company should look into other approaches that were potentially less controversial among users but unproven, according to people with knowledge of the meeting.
They also said OpenAI needed a plan by this fall to sway public opinion around AI transparency as well as potential new laws on the subject, the internal documents show.
“Without this, we risk credibility as responsible actors,” a summary of the June meeting said.
Write to Deepa Seetharaman at deepa.seetharaman@wsj.com and Matt Barnum at matt.barnum@wsj.com