Dodgy aides: What can we do about AI models that defy humans?
Summary: Models exist that do not shut down when explicitly asked, with an OpenAI model in the spotlight for it. AI users should watch out, while developers and regulators must do their utmost to ensure that human welfare is served, not subverted.
Artificial intelligence (AI) going rogue has long been the stuff of dystopian science fiction. Could fiction be giving way to fact, now that several AI models have reportedly disobeyed explicit instructions to shut down when a third-party tester asked them to? In a recent test by Palisade Research, the most glaring refusenik belonged to OpenAI, with some models from Google and Anthropic also showing a tendency to evade shutdown.