
Researchers Hack Source Code from Google Gemini

Security researchers found a way to exfiltrate internal binaries and proto files from Google Gemini's Python sandbox—without breaking out of it.

How a carefully orchestrated vulnerability disclosure uncovered internals of Google's AI sandbox and led to leaked code from inside the Gemini ecosystem.

When most people think about hacking AI, they imagine clever prompts or weird jailbreak tricks. But what if you could go way deeper?

That’s exactly what happened when security researchers Roni “Lupin” Carta, Justin “Rhynorater” Gardner, and Joseph “rez0” Thacker returned to Google’s bugSWAT event in Las Vegas. What began as a security challenge turned into a full-on exploration of Google Gemini’s Python sandbox—and ended with a surprising twist: the extraction of internal binaries and traces of Google’s own source code.

Here’s how it unfolded.

Google's AI Sandbox

Google’s Gemini platform allows users to interact with its LLMs in real time, including a feature called the Python Sandbox. It was designed to let users safely run code without jeopardizing system integrity. The sandbox is built on Google’s gVisor, a hardened user-space kernel that isolates code execution through syscall interception and tailored permissions.

The gVisor sandbox is so tightly protected that Google offers a $100,000 bounty for any successful escape.

But escaping wasn’t the researchers’ goal.

Instead, they asked: What if the danger wasn’t in breaking out—but in what was already inside?

Google’s AI sandbox falls under the scope of its broader Vulnerability Reward Program, which paid out over $140K for GenAI bugs in 2024 alone. With live-hacking events like bugSWAT in Vegas and new bounties for AI systems, it’s clear that the company is preparing for a future where LLM security becomes as important as browser or mobile hardening.

The First Clue: Arbitrary Python Execution

Thanks to early access to a preview version of Gemini, the team quickly realized they could run arbitrary Python code inside the sandbox. They discovered that despite being a “safe” space, the environment allowed full use of the os library—meaning they could start mapping the internal filesystem.

A custom recursive function helped them explore directory structures, check file permissions, and identify file sizes. One file stood out immediately:

/usr/bin/entry/entry_point — a binary over 579MB in size.
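
The researchers’ exact helper isn’t public, but a minimal sketch of that kind of recursive filesystem mapper, using only the standard os module available inside the sandbox, might look like the following (the function name, depth limit, and starting path are illustrative):

```python
import os

def map_tree(path, depth=0, max_depth=6):
    """Recursively walk a directory, printing permissions and sizes as it goes."""
    try:
        entries = sorted(os.listdir(path))
    except (PermissionError, FileNotFoundError):
        print("  " * depth + f"[unreadable] {path}")
        return
    for name in entries:
        full = os.path.join(path, name)
        try:
            st = os.lstat(full)
            print("  " * depth + f"{oct(st.st_mode & 0o777)} {st.st_size:>12} {full}")
        except OSError:
            print("  " * depth + f"[stat failed] {full}")
            continue
        # Recurse into real directories only; skip symlinks to avoid loops.
        if os.path.isdir(full) and not os.path.islink(full) and depth < max_depth:
            map_tree(full, depth + 1, max_depth)

map_tree("/usr/bin/entry")
```

Running something along these lines across the sandbox filesystem is what surfaces unusually large files like entry_point.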

The Binary That Shouldn’t Be There

The team attempted to exfiltrate the binary directly, but outputting the full file into the front-end interface caused timeout issues. Network calls via TCP, HTTP, and DNS were also blocked, cutting off traditional exfiltration paths.

Instead, the team split the file into 10MB base64 chunks and used the sandbox's output to reconstruct it outside the environment.
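
The write-up doesn’t include the exact exfiltration script, but the mechanics are simple; a rough sketch under those constraints (chunk size and function names are invented for illustration) could look like this:

```python
import base64

CHUNK_SIZE = 10 * 1024 * 1024  # ~10MB of raw bytes per chunk

def dump_chunk(path, index):
    """Inside the sandbox: print one base64-encoded slice of the target file."""
    with open(path, "rb") as f:
        f.seek(index * CHUNK_SIZE)
        data = f.read(CHUNK_SIZE)
    print(base64.b64encode(data).decode("ascii"))

def rebuild(chunks, out_path):
    """Outside the sandbox: concatenate the captured chunks and decode them."""
    with open(out_path, "wb") as f:
        for chunk in chunks:
            f.write(base64.b64decode(chunk))
```

Each dump_chunk call stays under the front end’s output limits; once every slice has been copied out of the chat interface, rebuild stitches the binary back together for offline analysis.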

Once reconstructed, the binary was analyzed using several tools:

  • file: Confirmed it as an ELF 64-bit shared object

  • strings: Revealed internal paths and references to google3 — Google’s proprietary source code repository

  • binwalk: The breakthrough tool that extracted an entire file structure, including full directories of Python code

That’s when things got real.

Inside the Binary: Leaked Source Code

The binwalk extraction revealed a folder structure containing multiple google3 components, including:

  • assistant/boq/lamda/execution_box

  • sandbox_interface/

  • And files referencing Google services like Flights, Maps, and YouTube

Most notably, they found a file:
google3/assistant/boq/lamda/execution_box/images/py_interpreter.py
This file included string signatures designed to detect script dumping—ironic, given what had just occurred.

At this point, the researchers had not only retrieved an internal Gemini binary, but also uncovered parts of the underlying Python logic used to interface with Google services.

This kind of asset exposure isn't limited to AI platforms. In another case, some of these same bug bounty hunters uncovered a $50K software supply chain vulnerability through exposed DockerHub images—complete with private tokens and production pipeline access. These missteps often happen not from exotic zero-days, but from overlooked components in complex systems.

RPC Pipes, File Descriptors, and Tool Access

Diving deeper into the recovered code, they discovered the sandbox wasn’t as sealed as it appeared. While network isolation was tight, Gemini’s internal tools communicated through file descriptors using protobuf-based RPC.

Functions like _set_reader_and_writer() and run_tool() allowed code running in the sandbox to talk to Google’s own back-end systems by writing to specific file descriptors (like /dev/fd/5 and /dev/fd/7). These channels are what let Gemini run external tools during agent operations, like fetching flight info or chart data.
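
None of the recovered code is reproduced here, but the mechanism described above boils down to ordinary writes and reads on inherited file descriptors. A heavily simplified sketch follows; the descriptor numbers come from the paths above, while the direction of each pipe and the raw-bytes payload are assumptions, since the real protocol is an internal protobuf-based format:

```python
import os

# Two of the descriptors mentioned above; which one carries requests and
# which carries responses is an assumption in this sketch.
WRITER_FD = 5
READER_FD = 7

def call_tool(request_bytes: bytes, max_reply: int = 1 << 16) -> bytes:
    """Send a serialized request over the outbound pipe and read back the reply."""
    os.write(WRITER_FD, request_bytes)   # request leaves the sandbox
    return os.read(READER_FD, max_reply)  # tool orchestrator replies on the other pipe
```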

This led to an important question:
Could different prompts or execution contexts give researchers access to more privileged sandboxes?

Prompt Injection Meets Sandbox Behavior

Based on a review of the ReAct paper (the reasoning-and-acting framework Gemini’s agent planning builds on), the team hypothesized that they could trigger higher-privilege sandboxes through indirect prompt injection during the model’s “planning” phase.

In testing, they occasionally succeeded.

These sandboxes offered different levels of tool access depending on whether the user input was processed via Gemini’s front-end or back-end agent logic. That subtle difference could potentially unlock extended capabilities.

Though the researchers didn’t fully exploit this pathway, the mechanism shows how AI-driven sandboxing decisions could expose unpredictable security behaviors.

The Leak That Mattered Most: Proto Files

The final discovery came through a simple trick: dumping string output from the extracted binary and searching for keywords like “Dogfood” (Google’s term for internal testing environments). That revealed embedded Protocol Buffer (.proto) files—essential schema definitions that dictate how Google systems structure and transmit internal data.
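
Reproducing that step doesn’t require anything exotic; it’s essentially the classic strings-plus-grep workflow. A small Python equivalent (the file name and keyword list are illustrative) might be:

```python
import re

def grep_strings(path, keywords, min_len=8):
    """Pull printable ASCII runs out of a binary and keep the interesting ones."""
    # Reads the whole file into memory; fine for a one-off offline analysis.
    with open(path, "rb") as f:
        data = f.read()
    pattern = rb"[ -~]{%d,}" % min_len
    for match in re.finditer(pattern, data):
        text = match.group().decode("ascii")
        if any(keyword in text for keyword in keywords):
            yield text

# Surface internal-only markers and embedded schema references.
for hit in grep_strings("entry_point", ["Dogfood", ".proto"]):
    print(hit)
```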

Files exposed included:

  • security/credentials/proto/authenticator.proto

  • security/loas/l2/proto/identity_types.proto

  • privacy/data_governance/attributes/proto/classification.proto

These files don’t contain user data themselves but define how sensitive data is handled, classified, and transmitted inside Google’s internal infrastructure.

This unintentional inclusion happened because of an automated step in Google’s build pipeline that added unnecessary proto definitions to the sandbox binary.

Google’s Response and Security Takeaways

Google’s security team worked directly with the researchers throughout the process and confirmed that:

  • The sandbox never exposed user data

  • The inclusion of proto files was unintentional but didn’t violate user privacy

  • Public disclosure was approved by Google after review

Still, the implications are serious.

This incident underscores how even well-designed isolation environments can leak critical internal logic, especially when model behavior, file bundling, and sandbox privileges interact in unpredictable ways.

This isn't the first time this trio has landed major wins in AI security. Justin, Joe, and Lupin previously won $50,000 in bug bounties at a Google live-hacking event, including $20K for a vulnerability that allowed Bard to return images of user emails. That campaign showed how indirect prompt injection could trigger unexpected data leaks—even when sandboxes are supposedly locked down.

Why This Matters

As the LLM arms race accelerates, the security of AI agents, model execution sandboxes, and developer-facing tools will become central to enterprise risk. Gemini isn’t alone—every major model vendor is deploying interfaces that let users run code, manage sessions, or access tools like search, charts, maps, and more.

What this research highlights is simple:
There’s more to securing AI than prompt filtering or sandboxing.

From build pipelines to API communication layers, every moving part is a potential attack surface. And when those systems touch proprietary code and cloud-scale infrastructure, the stakes couldn’t be higher.

This isn’t the first time Google’s internal tooling has created unexpected exposure. In a recent finding, a vulnerability in Google Cloud Build allowed attackers to destroy data across multiple projects, showing how deeply integrated systems can unintentionally become high-value attack surfaces.

Stay tuned to VulnU for deeper dives into AI security, risks, and the vulnerabilities hiding inside the tools we trust.