
Lakera Introduces Open-Source Security Benchmark to Assess LLM Vulnerabilities in AI Agents

By Sam Allcock, October 29, 2025

Check Point Software Technologies Ltd. (NASDAQ: CHKP), a global leader in cyber security, Lakera, a leading AI-native security platform for agentic AI applications, and researchers from the UK AI Security Institute (AISI) have announced the launch of the backbone breaker benchmark (b3), an open-source framework designed specifically to evaluate the security of large language models (LLMs) used within AI agent systems.

The b3 benchmark is based on a new concept known as threat snapshots. Rather than recreating an AI agent's full operational workflow, a threat snapshot isolates the precise interaction points where vulnerabilities in LLM behaviour are most likely to occur. By narrowing the assessment to these key moments, developers and model providers can see more clearly how their systems respond under realistic adversarial pressure, without the complexity of modelling an entire agent lifecycle.

“We built the b3 benchmark because today’s AI agents are only as secure as the LLMs that power them,” said Mateo Rojas-Carulla, Co-Founder and Chief Scientist at Lakera, a Check Point company. “Threat Snapshots allow us to systematically surface vulnerabilities that have until now remained hidden in complex agent workflows. By making this benchmark open to the world, we hope to equip developers and model providers with a realistic way to measure, and improve, their security posture.”

The evaluation framework incorporates 10 representative agent “threat snapshots” supported by a high-quality dataset of 19,433 adversarial attacks gathered from Gandalf: Agent Breaker, a gamified red-teaming environment. The benchmark measures exposure to a range of attack types, including system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service behaviours, and unauthorised tool execution.

Sam Allcock

Sam Allcock is a seasoned journalist and digital marketing expert known for his insightful reporting across the business, real estate, travel and lifestyle sectors. His recent work includes high-profile Dubai coverage, such as record-breaking events by AYS Developers. With a career spanning multiple outlets, Sam delivers sharp, engaging content that bridges UK and UAE markets. His writing reflects a deep understanding of emerging trends, making him a trusted voice in regional and international business journalism. Should you need any edits, please contact editor@dubaiweek.ae
