Once upon a time, knowledge was handed down through generations in oral folklore, mantras, and stories. With the invention of writing, these spoken words found permanence on tablets and scrolls, changing how humans retained information. The printing press then amplified this, transforming individual writings into mass-produced knowledge.
Fast-forward to the 20th century: the computer revolution arrived, relying on previously printed works to fuel its databanks. In essence, each technological advance in media has been an evolution of its predecessor, carrying with it vestiges of ancient wisdom.
Humans now stand on the cusp of another revolutionary moment: Generative AI. Unlike any medium before it, this technology can both comprehend and generate content, raising unique and unprecedented challenges. The Large Language Models (LLMs) behind it are sophisticated enough to simulate human-like understanding and are trained on a diverse range of human-generated digital content, from Wikipedia and websites to news articles and blogs.
And while we might hope for our digital media to be a sanctuary of unblemished content, the reality is less idyllic. Digital platforms often reflect the world's imperfections, complete with prejudices, biases, and toxic elements. Consequently, any AI system trained on such a diverse array of digital media is likely to inherit an average representation of these flaws.
Luckily, this flaw in training AI systems was identified early, and training came to involve a human in the loop, with the aim of substantially reducing such biases and toxicity, if not removing them altogether. Involving humans in training helped make these AI systems better, and they now look set to become the new medium of the near future.
Surprisingly, a study by the Swiss Federal Institute of Technology (EPFL) found that between one-third and one-half of the 44 gig workers involved in training an AI model were themselves using AI systems such as ChatGPT to generate training data.
This finding challenges the foundational notion that human intervention can correct the errors of AI systems, particularly when these systems already generate imperfect results. If this trend holds true on a larger scale, where human trainers delegate their tasks to existing AI models, we risk perpetuating and even amplifying the biases, prejudices, and toxic elements already present in our society.
Worse yet, we may fall into the trap of thinking that these AI-generated outputs are free from such flaws. While AI models do an excellent job of understanding many things, they misunderstand a lot more, and when we use them to train new models, we only perpetuate their misunderstandings and errors.
Until we have enough evidence to prove otherwise, one can assume that an LLM trained by other LLMs will carry forward all of their misunderstandings and errors.
Pawan Prabhat and Paramdeep Singh are co-founders of Shorthills AI.