DeepSeek: Purposefulness and trustworthiness

Authors:

Farzad Kamrani
Linus Kanestad
Christoffer Limér
Edward Tjörnhammar
Erik Wachtmeister
Ulrika Wickenberg Bolin

Publish date: 2026-03-17

Report number: FOI-R--5787--SE

Pages: 87

Written in: Swedish

Keywords:

purposefulness
trustworthiness
Secure-AI
model comparison
benchmarking


Abstract

In this report, we have evaluated large AI models from a security and intent perspective, with a focus on their use in Swedish public administration. The aim has been to investigate whether there are signs of hidden agendas, improper influence, or other security risks in the responses generated by the models. The main subject of study has been the DeepSeek family of language models, developed in China, and its variants. Our analysis shows that DeepSeek-AI's model artifacts and associated code appear to be purposefully published in line with current research practices in AI and machine learning, where open auditing, reproducibility, and model weight accessibility are norms. We have not been able to identify any technical backdoors or misleading functions beyond the so-called "boilerplate responses" that the model tends to provide when asked about foreign policy, security policy, or the Communist Party of China. These responses are often evasive or entirely absent. The reliability analysis shows that when the models actually provide factual answers, they perform competitively and are technically capable. At the same time, practical limitations exist, especially in larger reasoning models, which may exhibit unstable or inefficient reasoning behavior. We also note that the occurrence of boilerplate responses can negatively affect subsequent dialogues. The released versions of the DeepSeek family lack fully developed tool and agent support, meaning that secure system integration requires such functionality to be developed and contained locally under controlled conditions. During integration, it is recommended to implement clear toolset controls, limited network connections, secure execution environments, and domain-specific tests for both purpose and reliability. We assess that the models can be used in closed governmental contexts provided they undergo local security analyses and are adapted for the intended domain.
However, integration with DeepSeek's public web services or mobile applications in governmental operations is not recommended. Regardless of the model's origin, every new model, or fine-tuned version, should undergo renewed security analysis and reliability testing before deployment.
