Watch out for generative AI trained by generative AI


Summary

Self-reinforcing errors and biases are a serious threat, and we must not lose time on AI regulation

It takes a massive amount of data to train AI systems to perform specific tasks accurately and reliably. Using human labour to label training data is a crucial part of building machine learning models for applications such as driverless cars. Workers in countries like India and Sri Lanka, and workers on Mechanical Turk, a crowdsourcing platform created by Amazon, are two major sources of labour for this task.

Many of today’s ‘machine learning’ or ‘deep learning’ programs, including image recognition for self-driving vehicles, rely on thousands of humans in India or Sri Lanka labelling every picture, so that an AI program can refer to this human labelling each time it attempts a task like recognizing a traffic sign or telling a pedestrian apart from a bicyclist. The primary advantage of using Indian or Sri Lankan labellers lies in cost-efficiency: you can potentially hire more workers for the same money and get the labelling job done cost-effectively. One consideration is the potential for cultural and language differences, a phenomenon I call ‘English to English’ translation. While many Indians and Sri Lankans are proficient in English, subtle linguistic nuances or culturally specific contexts could be missed.

Amazon’s Mechanical Turk (MTurk), on the other hand, is a crowdsourcing marketplace that connects ‘requesters’ (those who need tasks done) with ‘workers’ willing to perform them. MTurk boasts global reach, with a vast pool of workers from various backgrounds. This diversity can be especially useful for tasks requiring multilingual and multicultural knowledge.

The flexible nature of MTurk also provides a significant advantage. Workers can choose tasks that suit their skill-sets and work on them at their convenience. As a result, requesters can usually get their tasks completed relatively quickly. Furthermore, MTurk’s integrated quality control mechanisms help ensure that work output is of a reasonable standard. The cost-effectiveness of using it varies with task complexity. Simple tasks can be cost-efficient, but complex tasks that require highly skilled workers might be more expensive than sourcing labour from countries with lower wages. The anonymous and impersonal nature of the platform could also mean variable quality and a lack of accountability. Workers on the platform are paid per task, which may lead to rushed jobs done without due care for quality, especially if payments are low.
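For readers curious about the mechanics, one common quality-control technique on crowdsourcing platforms is redundant labelling: the same item is shown to several workers and the answers are aggregated by majority vote, with the level of agreement serving as a quality signal. The sketch below is a minimal, hypothetical illustration of that idea, not MTurk’s actual implementation.

```python
from collections import Counter

def majority_label(labels):
    """Aggregate redundant annotations of one item: return the most
    common label and the fraction of workers who agreed on it."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)

# Hypothetical annotations of one street-scene image by three workers
annotations = ["pedestrian", "pedestrian", "cyclist"]
label, agreement = majority_label(annotations)
print(label, agreement)
```

A requester might accept the majority label only when agreement clears some threshold, sending low-agreement items back for further review.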

Some new generative AI models, though, can be primed for specific tasks using just a few examples, as opposed to the thousands of examples and several hours of additional training required by their ‘deep learning’ predecessors. Computer scientists call this ‘few-shot learning’, and believe that GPT-3 was the first real example of a powerful change in the way humankind trains machines. Systems architects have been able to provide just a few simple instructions to have GPT-3 write its own programs.
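To make the idea concrete: in few-shot learning, the ‘training’ is nothing more than a handful of worked examples placed in the prompt itself; the model completes the pattern without any retraining. The prompt below is an illustrative sketch with made-up reviews, not drawn from any real dataset or API.

```python
# A few-shot prompt primes a large language model with a handful of
# worked examples instead of thousands of labelled ones. The model is
# expected to continue the pattern for the final, unanswered item;
# no gradient updates or additional training are involved.
few_shot_prompt = """Classify the sentiment of each review as positive or negative.

Review: The battery lasts all day.
Sentiment: positive

Review: The screen cracked within a week.
Sentiment: negative

Review: Setup was quick and painless.
Sentiment:"""
print(few_shot_prompt)
```

Contrast this with the labelling pipelines described above, where every training example must be annotated by a human before the model ever sees it.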

This throws generative AI systems into a different orbit altogether, but does not reduce their fallibility. In fact, they are known to have kinks, including biases and dispositions to profanity, and firms like OpenAI are actively seeking inputs from even lay users so that they can improve their training models for better output.

That said, dangers still lurk. The first is from places such as the dark web. I wrote recently about the phenomenon of ‘jailbreaking’ generative AI systems, where an ‘ethical hacking’ firm named Adversa.AI has shown significant success in breaking into a whole slew of large-language generative AI offerings, including GPT-4, Google’s Bard, Anthropic’s Claude and Microsoft’s Bing Chat system. The efficiency with which a single set of commands can flummox all these models is a surprise, an object lesson in the vulnerability of these systems (rb.gy/ovhdz).

But now there is news of more vulnerability, according to MIT’s Technology Review (rb.gy/yrsox). While both offshore workers and MTurk workers offer unique advantages in labelling data for AI programs, it is crucial to establish proper quality control mechanisms to ensure high-quality data labelling. This is because the quality of AI models is highly dependent on the quality of the input data—and that starts with the human labour that goes in.

It appears that gig workers on platforms such as MTurk may be using generative AI to complete their tasks, says the magazine. It reports that a team of researchers from the Swiss Federal Institute of Technology hired 44 people on the gig-work platform to summarize 16 extracts from medical research papers. The responses were then analysed using an AI model that looks for tell-tale signals of ChatGPT output. The team also examined the workers’ keystrokes for other indicators that the generated responses had been copied from elsewhere.

According to the magazine, the team estimated that somewhere between 33% and 46% of the workers had used AI models like OpenAI’s ChatGPT. It goes on to quote a researcher who says, “Using AI-generated data to train AI could introduce further errors into already error-prone models. Large language models regularly present false information as fact. If they generate incorrect output that is itself used to train other AI models, the errors can be absorbed by those models and amplified over time.”
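The amplification the researcher warns of can be illustrated with a toy calculation. Suppose each new model generation trains on a mix of fresh human data (with a fixed error rate) and the previous generation’s outputs (carrying that generation’s errors plus some new ones). The numbers below are purely illustrative assumptions, not empirical estimates; only the 40% AI-data share echoes the 33–46% range the researchers reported.

```python
def amplified_error(initial_error, ai_fraction, generations, noise=0.02):
    """Toy model of self-reinforcing errors: each generation's error
    rate blends fresh human data (fixed error) with AI-generated data
    (previous generation's error plus new noise). All parameters are
    illustrative assumptions."""
    err = initial_error
    history = [err]
    for _ in range(generations):
        err = (1 - ai_fraction) * initial_error + ai_fraction * (err + noise)
        history.append(err)
    return history

# With 40% of training data AI-generated, the error rate creeps
# upward generation after generation instead of staying flat.
print([round(e, 3) for e in amplified_error(0.05, 0.4, 5)])
```

In this simple model the error rate rises monotonically toward a higher equilibrium; with a larger AI-generated share or noisier outputs, the drift is steeper—which is precisely the researcher’s concern.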

It sounds to me like it’s time for governments around the world to step in and regulate what may soon be dangerous trends. But in all honesty, I have no idea where they would even start.

Siddharth Pai is co-founder of Siana Capital, a venture fund manager.
