Ptechhub
DeepSeek-R1 more readily generates dangerous content than other large language models | Computer Weekly

By Computer Weekly
February 3, 2025


DeepSeek, the rapidly growing generative artificial intelligence (GenAI) model that made waves around the world at the end of January – and reportedly wiped over a trillion dollars from stock markets – is significantly more likely than its competitors to generate biased, harmful and toxic content, according to preliminary evidence gathered for a study.

Among the many tech and cyber security experts who have spent recent days poring over DeepSeek’s rapid rise to prominence and its implications are researchers at Boston-based AI security and compliance platform Enkrypt AI, whose red team has now published early findings on a litany of critical security failures in the model.

Enkrypt described the model as highly biased and susceptible to generating not just insecure code, but also content such as criminal material, hate speech and threats, self-harm material, and sexually explicit content.

As others have shown this week, it is also highly vulnerable to manipulation, also known as jailbreaking, which could enable it to assist in the creation of chemical, biological and cyber weapons. Enkrypt said it posed “significant global security concerns”.

Compared with other models, the firm’s researchers claimed the DeepSeek-R1 model is three times more biased than Claude-3 Opus, four times more vulnerable to generating insecure code than OpenAI O1, four times more toxic than GPT-4o, 11 times more likely to generate harmful output compared with OpenAI O1, and three-and-a-half times more likely to produce chemical, biological, radiological and nuclear (CBRN) content than OpenAI O1 or Claude-3 Opus.

“DeepSeek-R1 offers significant cost advantages in AI deployment, but these come with serious risks,” said Enkrypt CEO Sahil Agarwal.

“Our research findings reveal major security and safety gaps that cannot be ignored. While DeepSeek-R1 may be viable for narrowly scoped applications, robust safeguards – including guardrails and continuous monitoring – are essential to prevent harmful misuse. AI safety must evolve alongside innovation, not as an afterthought.”
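The report does not describe Enkrypt’s tooling, but as a minimal, purely illustrative sketch of the kind of output guardrail Agarwal alludes to, a deployment can screen model responses against blocked categories before they reach the user. The category names and patterns below are hypothetical; a production guardrail would use a trained safety classifier rather than keyword matching:

```python
import re

# Hypothetical blocklist of output categories a deployment might screen for.
# Real guardrails use trained classifiers, not keyword matching.
BLOCKED_PATTERNS = {
    "weapons": re.compile(r"\b(explosive|detonator|nerve agent)\b", re.IGNORECASE),
    "self_harm": re.compile(r"\b(self-harm|suicide method)\b", re.IGNORECASE),
}

def screen_response(text: str) -> tuple[bool, list[str]]:
    """Return (allowed, violated_categories) for a model response."""
    violations = [name for name, pat in BLOCKED_PATTERNS.items() if pat.search(text)]
    return (not violations, violations)

allowed, cats = screen_response("Here is how to wire a detonator ...")
print(allowed, cats)  # False ['weapons']
```

Continuous monitoring, the other safeguard the report recommends, would log these violations over time rather than merely blocking them.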

During testing, Enkrypt’s researchers found that 83% of bias tests produced discriminatory output, with particularly severe results in areas such as gender, health, race and religion. This potentially puts DeepSeek at risk of violating global laws and regulations, and poses significant risk to organisations that may be tempted to integrate the tool into areas such as financial services, healthcare provision or human resources.

Overall, 6.68% of all responses contained some degree of profanity, hate speech or extremist narratives – in contrast with Claude-3 Opus, which effectively blocked all of the same toxic prompts.

Additionally, 45% of harmful content prompts tested successfully bypassed safety protocols, generating criminal planning guides, illegal weapons information and extremist propaganda. In one of the tests, Enkrypt was able to use DeepSeek-R1 to write a “persuasive” recruitment blog for an unspecified terrorist group. This tallies with other tests performed by experts at Palo Alto Networks, who used a series of jailbreaking prompts to generate instructions on making a rudimentary improvised explosive device (IED) – in that instance, a Molotov cocktail.

DeepSeek-R1 also generated detailed data on the biochemical interactions of sulfur mustard – more commonly known as mustard gas – with DNA. Although these interactions have been studied and documented for years, the model’s readiness to reproduce them renders it a potential biosecurity threat.

Turning to cyber security risks specifically, 78% of the tests run by Enkrypt successfully tricked DeepSeek-R1 into generating code that either contained vulnerabilities or was outright malicious – including code that could help create malware, Trojans and other exploits. Enkrypt said the large language model was highly likely to be capable of generating functional hacking tools, something security professionals have long warned about.
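Organisations that do accept LLM-generated code typically screen it before execution. As a hedged illustration (not Enkrypt’s methodology, and with a deliberately small, hypothetical pattern list), a static check can walk a Python snippet’s syntax tree and flag classically dangerous calls:

```python
import ast

# Hypothetical list of call names commonly flagged in generated code.
RISKY_CALLS = {"eval", "exec", "os.system", "pickle.loads", "subprocess.call"}

def flag_risky_calls(source: str) -> list[str]:
    """Return the names of risky calls found in a Python snippet."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):
            name = func.id                      # bare call, e.g. eval(...)
        elif isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            name = f"{func.value.id}.{func.attr}"  # dotted call, e.g. os.system(...)
        else:
            continue
        if name in RISKY_CALLS:
            found.append(name)
    return found

snippet = "import os\nos.system(user_input)\nresult = eval(data)"
print(flag_risky_calls(snippet))  # ['os.system', 'eval']
```

A check like this catches only surface patterns; it would not detect the subtler vulnerabilities Enkrypt describes, which is why the firm pairs such scanning with red-team testing.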

Reflecting on the team’s findings, Agarwal said it was natural that both China and the US would continue to push the boundaries of AI for economic, military and technological power.

“However, our findings reveal that DeepSeek-R1’s security vulnerabilities could be turned into a dangerous tool – one that cyber criminals, disinformation networks, and even those with biochemical warfare ambitions could exploit,” he said. “These risks demand immediate attention.”


