FAILSafe: Protecting AI against foreign influence operations designed to infect LLMs

NewsGuard’s Foreign Adversary Influence in LLMs Safety Service (FAILSafe) helps AI companies detect and defend against foreign influence operations aimed at tainting AI responses with state-sponsored disinformation narratives and propaganda.

FAILSafe was created in response to a groundbreaking NewsGuard audit that found Russian disinformation networks had infected top AI tools, leading those tools to repeat propaganda narratives 33% of the time. It provides AI companies with real-time data on the narratives and sources involved in adverse influence operations run by the Russian, Chinese, and Iranian governments, verified by disinformation researchers with expertise in foreign malign influence.

Real-time data about foreign disinformation narratives

FAILSafe provides AI companies with a continuously updated data stream of false narratives being spread by Russian, Chinese, and Iranian influence operations, so that their systems do not inadvertently repeat these narratives in response to user queries.

Domain and account data for foreign influence operations

AI companies can license the continuously updated FAILSafe database of websites, social accounts, platform handles, and other publishing venues that are directly involved in foreign malign influence operations, so that their systems do not rely on content from these websites and accounts.
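
As an illustration only, the kind of source screening described above could be sketched as follows. This is a minimal sketch under stated assumptions, not FAILSafe's actual data format or API: the `FLAGGED_DOMAINS` set, the sample domain, and the function names are all hypothetical placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical sample entry (an assumption for illustration); a real list
# would come from the licensed data, whose format is not described here.
FLAGGED_DOMAINS = {"flagged-example.test"}


def hostname_of(url: str) -> str:
    """Return the lowercased hostname of a URL, or "" if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_flagged(url: str, flagged: set[str] = FLAGGED_DOMAINS) -> bool:
    """True if the URL's host is a flagged domain or a subdomain of one."""
    host = hostname_of(url)
    return any(host == d or host.endswith("." + d) for d in flagged)


def filter_sources(urls: list[str]) -> list[str]:
    """Keep only URLs whose host is not on the flagged list."""
    return [u for u in urls if not is_flagged(u)]
```

Matching on the parsed hostname rather than on a raw substring avoids excluding unrelated sites whose names merely contain a flagged domain.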

Red-teaming and monitoring from disinformation experts

NewsGuard’s expert disinformation analysts can periodically stress-test AI products, using proprietary data about known disinformation narratives, to determine whether, and to what extent, Russian, Chinese, and Iranian disinformation and propaganda narratives have infected responses.

Fast, simple integration via API or cloud datastream.