OpenAI's widely used AI model GPT-3 has been found to reason about as well as college undergraduate students, according to new scientific research.
During the study, the artificial intelligence large language model (LLM) was presented with reasoning problems commonly found in intelligence tests and in standardized exams such as the SAT, which plays a central role in college and university admissions in the United States and other countries, PTI reported. The findings highlight GPT-3's capability to tackle complex cognitive tasks.
As per the report, researchers at the University of California, Los Angeles (UCLA) conducted an experiment with GPT-3. The AI was tasked with predicting the next shape in intricate arrangements of shapes and with answering SAT analogy questions, both of which were entirely new to it.
Additionally, the researchers invited 40 UCLA undergraduate students to attempt the same problems. In the shape prediction test, GPT-3 achieved an accuracy of 80%, surpassing the human participants' average score, which was slightly below 60%, and even outperforming their highest scores.
"Surprisingly, not only did GPT-3 do about as well as humans but it made similar mistakes as well," said UCLA psychology professor Hongjing Lu, senior author of the study published in the journal Nature Human Behaviour.
Regarding the SAT analogies, GPT-3 exhibited superior performance compared to the average score of the human participants. Analogical reasoning involves tackling novel problems by drawing comparisons to familiar ones and applying similar solutions to the new scenarios. The test questions required participants to identify pairs of words that shared analogous relationships. For instance, in the given problem "'Love' is to 'hate' as 'rich' is to which word?," the correct response would be "poor."
Nevertheless, when faced with analogies based on short stories, the AI's performance was not as strong as that of the students. These particular problems required reading a passage and then discerning another story that conveyed a similar meaning.
"Language learning models are just trying to do word prediction so we're surprised they can do reasoning. Over the past two years, the technology has taken a big jump from its previous incarnations," Lu said.
Because GPT-3's internal mechanisms are closely held by its creator, OpenAI, the researchers acknowledged that they cannot be sure how its reasoning abilities work. They do not know whether large language models (LLMs) are genuinely starting to exhibit human-like "thinking" or are merely mimicking human thought through a different process, a question they hope to pursue in future investigations.
"We'd like to know if it's really doing it the way people do, or if it's something brand new - a real artificial intelligence - which would be amazing in its own right," said UCLA psychology professor Keith Holyoak, a co-author of the study.
(With inputs from PTI)