Date: 2025-07-14
How to keep AI models on the straight and narrow
Interpretability techniques are powerful, but must be used with care
Artificial-intelligence models are getting better and better. Cutting-edge systems can handle increasingly complex tasks once thought beyond the ken of machines. However, as we report in the Science & technology section this week, they can also find surprising ways to get things done. Give an AI system the task of beating a chess-playing program, for instance, and rather than trying to checkmate its opponent, it may simply hack the program to ensure victory. Give it the job of maximising profits for an investment client with ethical qualms, and instead of changing its strategy it may misrepresent the harms associated with the profits.