Close Menu
  • Home
  • News
  • Business
  • Lifestyle
    • Entertainment
    • Sport
    • Art & Entertainment
  • Travel
  • Tech
  • Others
    • Real Estate
      • Housing
      • Investment
      • Tourism
      • Property
        • Home & Interior
    • Jobs
    • Education
    • Community
  • Hot News
  • Abu Dhabi Week
  • Submit Your Story
X (Twitter)
  • Editorial Policy
  • About Us
  • Contact
X (Twitter) Instagram
Dubai Week
Subscribe
  • Home
  • News
  • Business
  • Lifestyle
    • Entertainment
    • Sport
    • Art & Entertainment
  • Travel
  • Tech
  • Others
    • Real Estate
      • Housing
      • Investment
      • Tourism
      • Property
        • Home & Interior
    • Jobs
    • Education
    • Community
  • Hot News
  • Abu Dhabi Week
  • Submit Your Story
Dubai Week
  • Home
  • News
  • Business
  • Lifestyle
  • Travel
  • Tech
  • Others
  • Hot News
  • Abu Dhabi Week
  • Submit Your Story
Home»Tech»Lakera Introduces Open-Source Security Benchmark to Assess LLM Vulnerabilities in AI Agents
Tech

Lakera Introduces Open-Source Security Benchmark to Assess LLM Vulnerabilities in AI Agents

By Sam AllcockOctober 29, 2025No Comments2 Mins Read
Share
Facebook Twitter LinkedIn Pinterest Email

Check Point Software Technologies Ltd. (NASDAQ: CHKP), a global leader in cyber security, together with Lakera, a leading AI-native security platform for Agentic AI applications, and researchers from The UK AI Security Institute (AISI), have announced the launch of the backbone breaker benchmark (b3). This open-source framework is designed specifically to evaluate the security of large language models (LLMs) used within AI agent systems.

The b3 benchmark is based on a new concept known as threat snapshots. Rather than requiring the recreation of an AI agent’s full operational workflow, threat snapshots focus on the precise interaction points where vulnerabilities in LLM behaviour are most likely to occur. By narrowing the assessment to these key moments, developers and model providers can gain clearer insight into how their systems respond under realistic adversarial pressures, without the complexity of modelling an entire agent lifecycle.

“We built the b3 benchmark because today’s AI agents are only as secure as the LLMs that power them,” said Mateo Rojas-Carulla, Co-Founder and Chief Scientist at Lakera, a Check Point company. “Threat Snapshots allow us to systematically surface vulnerabilities that have until now remained hidden in complex agent workflows. By making this benchmark open to the world, we hope to equip developers and model providers with a realistic way to measure, and improve, their security posture.”

The evaluation framework incorporates 10 representative agent “threat snapshots” supported by a high-quality dataset of 19,433 adversarial attacks gathered from Gandalf: Agent Breaker, a gamified red-teaming environment. The benchmark measures exposure to a range of attack types, including system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service behaviours, and unauthorised tool execution.

Share. Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp Email
Previous ArticleECOS Dubai Al Furjan Leads the Way in Smart, Sustainable Hospitality Aligned with the UAE Green Agenda 2030
Next Article The Salt Road Doha Unveils Weekly Signature Culinary Experiences
Sam Allcock
  • Website
  • X (Twitter)
  • Instagram
  • LinkedIn

Sam Allcock is a seasoned journalist and digital marketing expert known for his insightful reporting across business, real estate, travel and lifestyle sectors. His recent work includes high-profile Dubai coverage, such as record-breaking events by AYS Developers. With a career spanning multiple outlets. Sam delivers sharp, engaging content that bridges UK and UAE markets. His writing reflects a deep understanding of emerging trends, making him a trusted voice in regional and international business journalism. Should you need any edits please contact editor@dubaiweek.ae

Related Posts

BenQ Expands MA Series with New Flagship and 4K Nano Gloss Monitors for Mac Users

February 26, 2026

The Power of Webdesign in the Digital Era

February 20, 2026

Technology and Lab Standards Used by Leading IVF Clinics in Dubai

February 20, 2026

Closing the Guidance Gap: AI Technology Empowers Students in Navigating University Applications

February 2, 2026
Business

What Leopoldo Alejandro Betancourt López Learned About Risk From Building Power Plants at 24

By StuartMarch 3, 20260 Business

Before he became the president of Hawkers or built a nine-figure investment portfolio, Leopoldo Alejandro…

Bitget’s Women’s Day Campaign Asks Web3 the Uncomfortable Question

March 3, 2026

Layers of Raspberry and Basil: How Blume Dubai Plans to Mark Women’s Day

March 2, 2026

Syrian Singer Bessan Ismail Lands Max Fashion’s Ramadan Campaign as Retailer Courts Budget-Conscious Families

February 27, 2026
X (Twitter)
  • About Us
  • Privacy Policy
  • DMCA Policy for Dubai Week
  • Editorial Policy
  • Contact
© 2026 Dubai Week

Type above and press Enter to search. Press Esc to cancel.