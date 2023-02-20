GPT-3 was trained on billions of words from an array of sources, including 7,000 unpublished books, Wikipedia entries and news articles, which left it vulnerable to picking up biased or hateful material as well. OpenAI has used human moderators to strip much of that out of its model, but that work isn’t foolproof. Bias is also almost impossible to detect when it’s deeply buried in an LLM, a complex, layered network of billions of parameters that acts like a black box even to its own creators.

Misinformation also afflicts these models. Tech news site CNET generated 77 articles on financial advice last November using an LLM. It had to issue corrections on 41 of them. OpenAI doesn’t disclose what it calls the “hallucination rate” of its language models, but a January 2022 report by tech news site Protocol cited researchers as saying it was between 21% and 41%. My own experience of using ChatGPT puts misinformation at between 5% and 10%. Even if the rate is that low, companies using LLMs need to take everything the programs say with a huge grain of salt.