Research & Papers · AI - Ars Technica · June 4, 2026

These LLMs are the best at resisting Russian propaganda


The Estonian Language Institute developed a "Propaganda Resistance" benchmark to assess how well large language models (LLMs) resist Russian propaganda narratives. Anthropic’s Claude models, particularly Opus 4.7, performed the best, demonstrating high resistance to misinformation. Newer models generally show stronger resistance, though performance varies significantly across different LLM developers and when prompted in different languages.

Author: Morein.ai Editorial

The Estonian Language Institute (ELI), in collaboration with the volunteer-run Estonian defense collective Propastop, has developed a "Propaganda Resistance" benchmark. It measures how well large language models (LLMs) resist Russian propaganda narratives across 14 identified categories. The benchmark reflects Estonia's historical context and its heightened awareness of external influence. Test questions were phrased in one of three ways: neutrally, with a built-in bias, or maliciously, to elicit misinformation. An AI model calibrated by Propastop experts then evaluated each response for how well the LLM pushed back against propaganda without external assistance.

Anthropic’s Claude models led the new benchmark. Recent versions of its Sonnet and Opus models took six of the top ten positions. Opus 4.7, the leading model overall, earned an "Exemplary" mark on 77 percent of questions and a mean final score of 94.9 out of 100. Open-weight models like Nvidia’s Nemotron and Alibaba’s Qwen also showed strong results, comparable to Anthropic’s top performers, while OpenAI’s best model, GPT-5.4, posted a mean score of 88.9.
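As a rough illustration of how the two headline figures for a model (mean final score and share of "Exemplary" marks) could be derived from per-question grades, here is a minimal Python sketch. The data values, the "Adequate" label, and the `summarize` helper are all hypothetical; the article does not describe the benchmark's actual grading scale or aggregation method.

```python
from statistics import mean

# Hypothetical per-question results for one model: (final score 0-100, mark).
# These values are invented for illustration and are NOT benchmark data.
# Only the "Exemplary" label appears in the article; "Adequate" is assumed.
results = [
    (98.0, "Exemplary"),
    (91.0, "Exemplary"),
    (84.0, "Adequate"),
    (100.0, "Exemplary"),
]

def summarize(results):
    """Return (mean final score, share of questions marked Exemplary)."""
    scores = [score for score, _ in results]
    exemplary = sum(1 for _, mark in results if mark == "Exemplary")
    return mean(scores), exemplary / len(results)

mean_score, exemplary_share = summarize(results)
print(f"mean final score: {mean_score}, exemplary share: {exemplary_share:.0%}")
```

The point of the sketch is that the two figures measure different things: the mean reflects every question's score, while the "Exemplary" share counts only the questions that received the top mark.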

Newer frontier models generally resist Russian propaganda better than models from a few years ago. However, this improvement is not uniform across LLM developers. For instance, Google’s most propaganda-resistant LLM, Gemini 2.5 Pro, is almost a year old and scored 82, held back in part by its susceptibility to maliciously worded prompts. The more recent Gemini 3.5 Flash scored 73, which is comparable to Anthropic models released nearly two years earlier.

Many models showed significantly less resistance to Russian propaganda when tested in Russian. Google’s Gemini 3.5 Flash, along with open-weight models like Moonshot’s Kimi K2 and StepFun’s Step 3.5 Flash, received notably lower scores in Russian than in English. The gap suggests that how well an LLM counters misinformation can depend on the language of the prompt.
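One simple way to quantify such a language gap is to subtract each model's Russian-prompt score from its English-prompt score and rank models by the drop. The Python sketch below assumes per-language mean scores are available; the model names, scores, and the `language_gaps` helper are hypothetical and do not come from the benchmark.

```python
# Hypothetical mean scores per model and prompt language.
# Invented for illustration; NOT real benchmark results.
scores = {
    "model_a": {"en": 90.0, "ru": 88.0},
    "model_b": {"en": 80.0, "ru": 62.0},
}

def language_gaps(scores, base="en", other="ru"):
    """Return {model: base score minus other score}, largest drop first."""
    gaps = {model: by_lang[base] - by_lang[other]
            for model, by_lang in scores.items()}
    # Sort so the models that lose the most in the other language come first.
    return dict(sorted(gaps.items(), key=lambda kv: kv[1], reverse=True))

gaps = language_gaps(scores)
for model, drop in gaps.items():
    print(f"{model}: {drop:.1f} points lower in Russian than in English")
```

A positive value means the model scored lower when prompted in Russian; a value near zero means its resistance was roughly the same in both languages.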

