AI_SAFETY

EU Regulatory Changes

396 changes tracked across 24 compliance frameworks, including DORA, NIS2, GDPR, the EU AI Act, and the Cyber Resilience Act.

AI_SAFETY 30 May 2026 arxiv_cscr

arXiv: Secure Distributed Hypothesis Testing

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Code as a Weapon: A Consensus-Labeled Prompt Bank for Measuring Coding-Model Compliance with Malicious-Code Re...

This paper, published on arXiv, introduces a new benchmark called "Code as a Weapon," which is a curated set of prompts designed to test whether large language models (LLMs) that generate code will...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Efficient and Quantum-safe Internet Key Exchange Protocols for Satellite Communications

This publication from May 2026 introduces a new technical framework for Internet Key Exchange (IKE) protocols designed to be resistant to quantum computing attacks, specifically tailored for satell...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: MaskClaw: Edge-Side Personalized Privacy Arbitration for GUI Agents with Behavior-Driven Skill Evolution

This paper, published on arXiv, introduces MaskClaw, a technical framework designed to enhance privacy for graphical user interface (GUI) agents—AI systems that interact with software interfaces on...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: GraphSteal: Structural Knowledge Stealing from Graph RAG via Traversal Reconstruction

A new research paper, GraphSteal, published on arXiv, demonstrates a novel method for extracting the structural knowledge embedded within Graph-based Retrieval-Augmented Generation (RAG) systems. T...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Blind PRNG Hijacking: An Undetectable Integrity-Preserving Attack Against LLM Watermarking

A new academic paper published on arXiv, titled "Blind PRNG Hijacking: An Undetectable Integrity-Preserving Attack Against LLM Watermarking," presents a novel method to remove or bypass watermarkin...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Position: Retire the "Positive Backdoor" Label -- Secret Alignment Requires Strict and Systematic Evaluation

A new position paper published on arXiv, titled "Retire the 'Positive Backdoor' Label -- Secret Alignment Requires Strict and Systematic Evaluation," argues that the AI safety community should aban...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Technical Report: Exploring the Emerging Threats of the Agent Skill Ecosystem

This technical report, published on arXiv on May 27, 2026, identifies emerging security and safety risks within the rapidly growing ecosystem of AI agent skills—modular capabilities that can be dow...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

This paper, published on arXiv, introduces a novel method for detecting and exploiting refusal signals in large language models (LLMs) by analyzing their internal activations before a final output ...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Do you dare to try Test-Driven Forensics? Increasing Trust in Desktop Forensics with ADARE

This publication introduces the ADARE framework, which applies test-driven forensics to desktop investigations. It proposes a structured methodology for validating forensic tools and processes by u...

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Towards Cybersecurity SuperIntelligence (CSI): What's the best harness for cybersecurity?

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: ISAC Privacy: Challenges and Solutions for 6G

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Out of Sight, Not Out of Mind: Unveiling Latent Attack in Latent-based Multi-Agent Systems

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: Cybersecurity AI (CAI) Dataset

AI_SAFETY 28 May 2026 arxiv_cscr

arXiv: SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

AI_SAFETY 28 May 2026 arxiv_cscr