Date: 2025-08-26

This is where ZeroBench and its peers come in. Each tries to measure a particular way AI capabilities are approaching, or exceeding, those of humans. Humanity’s Last Exam, for instance, sought to devise intimidating general-knowledge questions (its name derives from its status as the most fiendish such test it is possible to set), asking for anything from the number of tendons supported by a particular hummingbird bone to a translation of a stretch of Palmyrene script found on a Roman tombstone. In a future where many AI models can score full marks on such a test, benchmark-setters may have to move away from knowledge-based questions entirely.