Melbourne, Mar 13 (The Conversation) When OpenAI launched ChatGPT-5 in August of last year, many academics scoffed at the tech company's claim that its new artificial intelligence (AI) model possessed "PhD-level" intelligence.

After all, how could systems so prone to hallucination, flawed reasoning, and sycophancy compete with the world's brightest young minds?

Yet academics are now routinely using tools such as ChatGPT to assist them in their research in much the same way they might once have relied on PhD students. Perhaps the most famous example is the world's best-known mathematician, Terence Tao, who reports using generative AI as a mathematical collaborator.

I myself was recently turned from a skeptic into a believer when, over the course of a few months, I carried out a research project using a range of generative AI tools to perform tasks which I'd normally carry out in collaboration with my PhD students.

But this experience also highlighted a hidden danger of AI – one that shows why it would be unwise for anybody, whether they're an academic or not, to substitute AI for actual apprentices.

The engine-room of research production

PhD students are the engine-room of research production that underpins much scientific progress. Under guidance from their supervisors, they devise hypotheses and experiments, theorise mathematical models, write proofs, and draft research papers.

But a PhD is much more than a source of cheap research labour. A PhD degree is an apprenticeship in research. Today's students are tomorrow's research leaders. To get there, students learn how to ask the right questions, how to critique findings and, ultimately, how to take responsibility for the science they produce.

Much of my own research develops mathematical models to explain why computer systems and programs are and – just as often – are not secure. This involves developing mathematical definitions and theories, stating theorems and writing logical proofs, but also implementing defensive programs that embody and validate the ideas behind that mathematics.

Normally, each of these steps would be done in collaboration with a PhD student, with the student carrying out the bulk of the proof-writing and programming work under my supervision.

Using AI as a tool

In my recent project, however, I developed the key mathematical ideas in conversation with ChatGPT.

I used ChatGPT to state and refine the key mathematical definitions and theorems at the level of "pen-and-paper" mathematics, which I carefully checked for errors. I used Anthropic's Claude Code tool to translate all the pen-and-paper mathematics into a computerised format, in which a very old-fashioned proof-checking program could check the consistency of each logical reasoning step.

I even used Claude Code and ChatGPT to implement Python programs to show that the mathematical ideas could be applied in practice. I checked these programs to make sure they did indeed conform to the mathematics, just as I would if they had been written by a new PhD student.

All up, I'd estimate that in the equivalent of six weeks of full-time work with the help of generative AI, I was able to produce the kind of research that might otherwise have taken at least a year of work by a PhD student working under my supervision.

This is remarkable progress. And perhaps a little frightening.

A hidden danger

Many people have warned that overuse of generative AI might reduce research quality. I agree that AI needs to be used with care and that, ultimately, authors need to remain responsible for the research they produce at every stage of the research process.

However, there is another danger.

In my research project, the productivity gains came at a significant cost to learning. Not my own; I think I learned just as much working with generative AI as I would have had I been working with a PhD student on this project.

I'm talking about the student learning that was sacrificed.

At no stage during this project did I teach a student how to discern a worthwhile research problem from a mere intellectual curiosity, what it means to rigorously test a hypothesis, how to place a piece of research within its proper scientific context, or even how to convey their thoughts cogently or manage their time efficiently.

Good researchers are those who are aware of the gaps in their knowledge, who view their own hypotheses with appropriate suspicion, and who take intellectual responsibility for the work they carry out.

AI is poorly placed to do any of these things well.

