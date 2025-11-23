OpenAI’s ChatGPT started the generative AI chatbot craze when it debuted to the public in late 2022. Since then, the chatbot has retained a large share of the market despite powerful competitors such as Gemini, Grok, Claude, Qwen, DeepSeek and Mistral.

However, a study by British company Prolific has ranked ChatGPT eighth among AI models, behind a couple of Gemini models, Grok models, DeepSeek models and even a model by French company Mistral. The company created its own benchmark called “Humaine,” which it says is “built to understand AI performance through the lens of natural human interaction.”

“Current evaluation is heavily skewed towards metrics that are meaningful to researchers but opaque to everyday users, such as accuracy on specialised datasets and performance on esoteric reasoning tasks. This has created a disconnect between what gets optimised for and what people actually value,” the company says in its blog post.

The company also noted that even human-preference leaderboards can fall short if they are not designed with scientific rigour. It added that platforms requiring everyone to vote for their favourite model can be susceptible to sample bias and likely overrepresent tech-savvy users.

The new leaderboard aims to address this issue with automated quality monitoring to ensure participants are engaging thoughtfully with the task.

ChatGPT ranks below these AI models

As per the Humaine study, these were the top 10 AI models:

1. Gemini 2.5 Pro (Google)

2. DeepSeek v3 (DeepSeek)

3. Magistral Medium (Mistral)

6. Gemini 2.5 Flash (Google)

7. DeepSeek R1 (DeepSeek)

8. ChatGPT-4.1 (OpenAI)

9. Gemma (Google)

10. Gemini 2.0 Flash (Google)

Notably, the study was published in September, when Google had not yet released its Gemini 3 Pro model and xAI had not rolled out its Grok 4.1 and Grok 4.1 Thinking models.

Gemini 2.5 Pro being at the top of a benchmark isn’t exactly surprising at this point, given that the model has continuously topped various leaderboards since its launch. However, an OpenAI model not ranking in the top 5, and even falling behind the likes of DeepSeek, Grok and Mistral, is a surprising development if the results are to be believed.