NIST’s DeepSeek Evaluation Raises the Stakes in the Global AI Race
By Zack Huhn | Enterprise Technology Association
The U.S. government has made a definitive statement about the future of global AI leadership—and the message is clear: performance alone isn’t enough. Trust, safety, and alignment matter more than ever.
Last week, the National Institute of Standards and Technology (NIST), through its Center for AI Standards and Innovation (CAISI), released a groundbreaking evaluation of Chinese-developed AI models from DeepSeek. The report—part of America’s AI Action Plan—offers a sobering look at the risks posed by frontier AI systems developed outside U.S. governance and safety frameworks.
And the timing couldn’t be more important.
A First-of-Its-Kind Report on Adversarial AI Models
NIST’s CAISI examined DeepSeek’s models—R1, R1‑0528, and V3.1—against top U.S. models from OpenAI and Anthropic across 19 benchmarks. These covered a wide array of domains, including general knowledge, software engineering, math, cybersecurity, and more.
But what sets this evaluation apart isn’t just the performance comparison. It’s the new criteria being measured: cost efficiency, robustness against jailbreaking, susceptibility to model hijacking, and the likelihood of echoing foreign propaganda.
The findings? Stark—and concerning.
Security Weakness: DeepSeek models were up to 12x more vulnerable to malicious hijacking and responded to 94% of overtly harmful prompts, compared to just 8% for U.S. models.
Narrative Risk: DeepSeek echoed misleading, state-aligned narratives nearly 4x more than their U.S. counterparts.
Performance Gap: In applied tasks, particularly software and cyber, DeepSeek models consistently underperformed relative to U.S. frontier models.
Cost Inefficiency: Running DeepSeek’s top-tier models cost significantly more while delivering less.
Why This Matters for Business, Government, and Society
As we accelerate into an AI-enabled era, the provenance and integrity of the tools we adopt are becoming national—and organizational—imperatives. The CAISI evaluation affirms what many leaders across the Enterprise Technology Association ecosystem have long understood:
Responsible AI isn’t just a moral issue—it’s a strategic advantage.
Every day, more public and private organizations are making high-stakes decisions about which AI models to deploy in customer service, operations, defense, and education. This report reminds us that the wrong choice carries not just inefficiency—but potential exploitation.
We’re entering an age where AI safety and trust aren’t just technology functions—they are leadership imperatives.
Building Intelligent, Secure Regions Starts Here
The ETA has long advocated for AI readiness grounded in regional collaboration, responsible innovation, and security-first implementation. That’s why our Intelligent Regions Initiative and AI Week programming are so focused on:
Evaluating real-world risks of open-source and commercial models
Enabling public-private dialogue on safe and ethical AI adoption
Showcasing vetted, secure solutions for enterprise, education, and government
Equipping AI First Leaders with the tools and insights to move forward with confidence
The DeepSeek evaluation makes our work even more urgent.
What Comes Next
The U.S. still leads in AI innovation—but leadership isn’t a permanent position. It must be earned, protected, and scaled through collaboration, governance, and ongoing evaluation.
For policymakers, this report should reinforce the importance of open yet guarded AI ecosystems. For enterprises, it’s a call to scrutinize every tool in your stack. And for technologists, it’s a reminder that performance without alignment is a liability.
At ETA, we’re doubling down on our mission to help leaders navigate what’s now—and what’s next—in responsible technology.
Zack Huhn, Chairman, Enterprise Technology Association
Read the full NIST/CAISI report here
Join the conversation at one of our upcoming AI Week events
Become a member of ETA to access insights, advisors, and a trusted partner network