Tenzai Hits #1 on HackerOne - In Under 90 Days

Daniel Goldberg

Security Researcher

We gave the Tenzai AI hacker a HackerOne account and let it run. In under 90 days — our first full quarter on the platform — it reached #1 among all AI security companies, with findings ranging from a new CVE to a one-click RCE chain to database access covering trillions of records. Here's what it found, and what we learned.

In March 2026 we created a HackerOne account for the Tenzai AI hacker and let it run. In less than 90 days, it ranked #1 among all AI security companies on the platform. We got there at an overall cost of ~$20,000, which, to the best of our knowledge, is the most efficient path to #1 yet.

The HackerOne Leaderboard for Q2 2026, Tenzai at #1

What we found

We found that production applications with open bug bounties were vulnerable to a wide range of issues, exploitable by everything from simple to sophisticated attacks. The vulnerabilities spanned many bug classes and all levels of complexity, from a simple unauthenticated SQL injection against a public government service to a one-click exploit chain that combined with an XSS to reach full server code execution on a CMS.

Our hacker's findings covered injection, broken authentication, sensitive data exposure, and insecure deserialization across different organizations, industries, and tech stacks. One report chained WAF bypasses to reach read access to a distributed database deployment containing trillions of records. Another exposed a SQL Server deployment through a db_owner account with access to 858 database tables in production. A third chained an authentication bypass, a WAF bypass, and a server hardening bypass to reach RCE on a GIS server.

To convert these findings into HackerOne results, our agent also had to establish business impact and prepare human-readable reports, both automatically. In some cases, such as remote code execution or SQL injection, this is simple. In others, the agent has to work out which information is already publicly available, and what is by-design behavior versus a real vulnerability.

Leaderboard as a test

AI agents help raise the floor of security over time. We're not the only ones running on HackerOne programs for internal testing and external validation. Easy vulnerabilities in large programs are found and submitted rapidly, and we've run into our fair share of cases where we submitted something that someone else had reported a few days earlier.

This means our accepted findings came after many other agents, run by individuals, small groups, and large companies, had already tested the same programs. That shows we can extract unique, high-quality findings despite the busy landscape.

Of the more than 100 reports submitted by the Tenzai AI hacker this quarter, nearly half were confirmed as high or critical findings. Another 40% were medium, and a small fraction were informational or not applicable. These are important signals for improving our agent, particularly its ability to assess impact and to judge whether a finding is by design.

This property, feedback on our agent's capabilities that is unbiased by commercial relationships, is what makes public bug bounty programs valuable for evaluating offensive AI. Unlike with static benchmarks, we can’t overfit to them, because the landscape changes every day.

On being #1

Honestly, it’s our first time using the platform. We signed up on March 30th, 2026, and submitted our first report on April 14th. Before the end of the quarter, we were ranked #1 on the business leaderboard among all AI security companies on the platform.

We did this on a budget. We didn’t “spray and pray”; we were deliberate about which tests we ran. It is well known that you can spend tokens and find vulnerabilities; we are focused on scaling elite capabilities into something every one of our customers can benefit from.

We are also focused on category-level performance, broken down by OWASP category. We use signals, such as the categories where we are still not the best, as data to improve the agent. Lastly, we worked strictly within program guidelines and experimented with different agents to see what works and what doesn't on real-world targets.

Being number one is not the headline in itself. What it does mean is that the same agent performing engagements for enterprise customers on their production environments can turn around and also perform at the highest caliber on HackerOne engagements.

External validation, on a platform designed to be hard to fool, is worth something. And now back to our customers!
