arXiv: Security of LLM-generated Code: A Comparative Analysis

AI_SAFETY AI Security & Safety · 21 May 2026 · arxiv_cscr

AI Analysis

This publication is a research paper, not a regulatory change, but it provides evidence relevant to compliance teams assessing AI risk under frameworks like the EU AI Act. The study systematically compares the security of code generated by large language models (LLMs) from different providers and finds significant variation in vulnerability rates, including common flaws such as injection vulnerabilities and insecure cryptographic practices. It highlights that even state-of-the-art models produce code with exploitable weaknesses, and that security outcomes depend heavily on model selection and prompt engineering.

Organizations developing or deploying AI-assisted coding tools are directly affected, particularly those in regulated sectors such as finance, healthcare, and critical infrastructure, as well as software vendors subject to liability for defective products. Compliance teams in these sectors should consider that LLM-generated code may introduce systemic risks that require additional validation, documentation, and human oversight to meet safety and robustness obligations under the AI Act.

Compliance teams should review their AI risk management frameworks and add specific security testing for LLM-generated code, such as static analysis and penetration testing. They should also update procurement and vendor assessment processes to require model providers to disclose vulnerability benchmarks. Finally, teams should document these findings in their conformity assessments and consider implementing mandatory human review for high-risk code generation use cases.
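
As a rough illustration of what static analysis of LLM-generated code can look like, the sketch below uses Python's standard `ast` module to flag a few well-known risky patterns before code reaches human review. The function name `scan_generated_code` and the rule set are hypothetical and chosen for illustration; they are not taken from the paper, and a production pipeline would more likely rely on an established scanner.

```python
import ast

# Hypothetical, deliberately small rule set (an assumption, not from the paper).
WEAK_HASHES = {"md5", "sha1"}
DANGEROUS_BUILTINS = {"eval", "exec"}


def scan_generated_code(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, message) findings for a few risky call patterns."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Direct calls such as eval(user_input)
        if isinstance(func, ast.Name) and func.id in DANGEROUS_BUILTINS:
            findings.append((node.lineno, f"call to {func.id}()"))
        # Attribute calls such as hashlib.md5(...)
        if isinstance(func, ast.Attribute) and func.attr in WEAK_HASHES:
            findings.append((node.lineno, f"weak hash {func.attr}()"))
        # Any call passing shell=True, e.g. subprocess.run(cmd, shell=True)
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append((node.lineno, "shell=True in call"))
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "import hashlib, subprocess\n"
        "digest = hashlib.md5(data)\n"
        "subprocess.run(cmd, shell=True)\n"
        "result = eval(expr)\n"
    )
    for line, message in scan_generated_code(sample):
        print(f"line {line}: {message}")
```

The snippet only parses the code and never executes it, so it is safe to run on untrusted output. A check like this catches only the patterns it is told about, which is why it would complement, not replace, penetration testing and human review.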
