Researchers Hack Source Code from Google Gemini
Security researchers found a way to exfiltrate internal binaries and proto files from Google Gemini's Python sandbox—without breaking out of it.

How a carefully orchestrated vulnerability disclosure uncovered internals of Google's AI sandbox and led to leaked code from inside the Gemini ecosystem.
When most people think about hacking AI, they imagine clever prompts or weird jailbreak tricks. But what if you could go way deeper?
That’s exactly what happened when security researchers Roni “Lupin” Carta, Justin “Rhynorater” Gardner, and Joseph “rez0” Thacker returned to Google’s bugSWAT event in Las Vegas. What began as a security challenge turned into a full-on exploration of Google Gemini’s Python sandbox—and ended with a surprising twist: the extraction of internal binaries and traces of Google’s own source code.
Here’s how it unfolded.
Google's AI Sandbox
Google’s Gemini platform allows users to interact with its LLMs in real time, including a feature called the Python Sandbox. It was designed to let users safely run code without jeopardizing system integrity. The sandbox is built on Google’s gVisor, a hardened user-space kernel that isolates code execution through syscall interception and tailored permissions.
The gVisor sandbox is so tightly protected that Google offers a $100,000 bounty for any successful escape.
But escaping wasn’t the researchers’ goal.
Instead, they asked: What if the danger wasn’t in breaking out—but in what was already inside?
Google’s AI sandbox is part of their broader Vulnerability Reward Program, which paid out over $140K for GenAI bugs in 2024 alone. With live-hacking events like bugSWAT in Vegas and new bounties for AI systems, it’s clear that the company is preparing for a future where LLM security becomes as important as browser or mobile hardening.
The First Clue: Arbitrary Python Execution
Thanks to early access to a preview version of Gemini, the team quickly realized they could run arbitrary Python code inside the sandbox. They discovered that despite being a “safe” space, the environment allowed full use of the os library, meaning they could start mapping the internal filesystem.
A custom recursive function helped them explore directory structures, check file permissions, and identify file sizes. One file stood out immediately: /usr/bin/entry/entry_point, a binary over 579MB in size.
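The write-up doesn’t reproduce the researchers’ exact helper, but a minimal sketch of that kind of recursive mapper, using nothing beyond the standard os and stat modules, might look like this:

```python
import os
import stat

def map_tree(path, depth=0, max_depth=6):
    """Recursively walk a directory, printing mode bits and sizes for each entry."""
    if depth > max_depth:
        return
    try:
        entries = sorted(os.listdir(path))
    except (PermissionError, NotADirectoryError):
        return
    for name in entries:
        full = os.path.join(path, name)
        try:
            st = os.lstat(full)
        except OSError:
            continue
        print("  " * depth + f"{stat.filemode(st.st_mode)} {st.st_size:>12} {full}")
        if stat.S_ISDIR(st.st_mode):
            map_tree(full, depth + 1, max_depth)

# Inside the sandbox, walking from / is what surfaces outliers
# like the 579MB /usr/bin/entry/entry_point binary.
map_tree("/")
```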
The Binary That Shouldn’t Be There
The team attempted to exfiltrate the binary directly, but outputting the full file into the front-end interface caused timeout issues. Network calls via TCP, HTTP, and DNS were also blocked, cutting off traditional exfiltration paths.
Instead, the team split the file into 10MB base64 chunks and used the sandbox's output to reconstruct it outside the environment.
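The chunking code itself isn’t published; a rough sketch of the idea, with the 10MB figure from the article and everything else assumed, could be:

```python
import base64

CHUNK = 10 * 1024 * 1024  # 10MB of raw bytes per chunk, per the researchers' description
TARGET = "/usr/bin/entry/entry_point"

def dump_chunk(path, index):
    """Print one base64-encoded chunk of a file so it can be copied out of the UI."""
    with open(path, "rb") as f:
        f.seek(index * CHUNK)
        data = f.read(CHUNK)
    print(base64.b64encode(data).decode() if data else "EOF")

# Run with index 0, 1, 2, ... across separate executions, then concatenate
# the copied chunks and base64-decode them locally to rebuild the binary.
dump_chunk(TARGET, 0)
```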
Once reconstructed, the binary was analyzed using several tools:
file: confirmed it as an ELF 64-bit shared object
strings: revealed internal paths and references to google3, Google’s proprietary source code repository
binwalk: the breakthrough tool that extracted an entire file structure, including full directories of Python code
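Reproducing that triage locally takes only a few commands. A hedged Python wrapper, assuming file, strings, and binwalk are installed on the analysis machine, might look like:

```python
import subprocess

BINARY = "entry_point"  # the reconstructed file, renamed locally

def run(cmd):
    """Run a command and return its stdout as text."""
    return subprocess.run(cmd, capture_output=True, text=True, errors="replace").stdout

print(run(["file", BINARY]))  # identifies the ELF 64-bit shared object
hits = [line for line in run(["strings", BINARY]).splitlines() if "google3" in line]
print("\n".join(hits[:20]))   # sample of internal google3 paths
run(["binwalk", "--extract", BINARY])  # carves embedded files into _entry_point.extracted/
```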
That’s when things got real.
Inside the Binary: Leaked Source Code
The binwalk extraction revealed a folder structure containing multiple google3 components, including:
assistant/boq/lamda/execution_box
sandbox_interface/
And files referencing Google services like Flights, Maps, and YouTube
Most notably, they found a file: google3/assistant/boq/lamda/execution_box/images/py_interpreter.py
This file included string signatures designed to detect script dumping—ironic, given what had just occurred.
At this point, the researchers had not only retrieved an internal Gemini binary, but also uncovered parts of the underlying Python logic used to interface with Google services.
This kind of asset exposure isn't limited to AI platforms. In another case, some of these same bug bounty hunters uncovered a $50K software supply chain vulnerability through exposed DockerHub images—complete with private tokens and production pipeline access. These missteps often happen not from exotic zero-days, but from overlooked components in complex systems.
RPC Pipes, File Descriptors, and Tool Access
Diving deeper into the recovered code, they discovered the sandbox wasn’t as sealed as it appeared. While network isolation was tight, Gemini’s internal tools communicated through file descriptors using protobuf-based RPC.
Functions like _set_reader_and_writer() and run_tool() allowed code running in the sandbox to talk to Google’s own back-end systems by writing to specific file descriptors (like /dev/fd/5 and /dev/fd/7). These channels are what let Gemini run external tools during agent operations, like fetching flight info or chart data.
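The recovered implementation wasn’t published here, so the exact message framing is unknown. The sketch below only illustrates the general pattern of length-prefixed protobuf messages on pre-opened descriptors; apart from the /dev/fd/5 and /dev/fd/7 numbers mentioned above, everything in it is an assumption:

```python
import os
import struct

# Descriptors 5 and 7 are the channels named in the recovered code; the
# length-prefixed framing below is an illustrative guess, not Google's actual protocol.
READER_FD = 5
WRITER_FD = 7

def _read_exact(fd, n):
    """Read exactly n bytes from a file descriptor."""
    buf = b""
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:
            raise EOFError("RPC pipe closed")
        buf += chunk
    return buf

def send_request(payload: bytes) -> None:
    """Write one serialized protobuf request to the tool pipe."""
    os.write(WRITER_FD, struct.pack(">I", len(payload)) + payload)

def recv_response() -> bytes:
    """Read one length-prefixed response back from the pipe."""
    (size,) = struct.unpack(">I", _read_exact(READER_FD, 4))
    return _read_exact(READER_FD, size)
```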
This led to an important question:
Could different prompts or execution contexts give researchers access to more privileged sandboxes?
Prompt Injection Meets Sandbox Behavior
After reviewing the ReAct paper, on which Gemini’s agent behavior is based, the team hypothesized that indirect prompt injections during the model’s “planning” phase could trigger higher-privilege sandboxes.
In testing, they occasionally succeeded.
These sandboxes offered different levels of tool access depending on whether the user input was processed via Gemini’s front-end or backend agent logic. That subtle difference could potentially unlock extended capabilities.
Though the researchers didn’t fully exploit this pathway, the mechanism shows how AI-driven sandboxing decisions could expose unpredictable security behaviors.
The Leak That Mattered Most: Proto Files
The final discovery came through a simple trick: dumping string output from the extracted binary and searching for keywords like “Dogfood” (Google’s term for internal testing environments). That revealed embedded Protocol Buffer (.proto) files, essential schema definitions that dictate how Google systems structure and transmit internal data.
Files exposed included:
security/credentials/proto/authenticator.proto
security/loas/l2/proto/identity_types.proto
privacy/data_governance/attributes/proto/classification.proto
These files don’t contain user data themselves but define how sensitive data is handled, classified, and transmitted inside Google’s internal infrastructure.
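A minimal stand-in for that string-dumping trick, with “Dogfood” and .proto as the only keywords taken from the write-up, could be written in a few lines of Python:

```python
import re

# Keywords drawn from the write-up; the scan itself is a rough stand-in for `strings | grep`.
KEYWORDS = (b"Dogfood", b".proto")

def printable_strings(path, min_len=8):
    """Yield ASCII runs from a binary, roughly what the strings tool prints."""
    with open(path, "rb") as f:
        data = f.read()  # note: loads the whole ~579MB file into memory
    for match in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data):
        yield match.group()

for s in printable_strings("entry_point"):
    if any(k in s for k in KEYWORDS):
        print(s.decode())
```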
This unintentional inclusion happened because of an automated step in Google’s build pipeline that added unnecessary proto definitions to the sandbox binary.
Google’s Response and Security Takeaways
Google’s security team worked directly with the researchers throughout the process and confirmed that:
The sandbox never exposed user data
The inclusion of proto files was unintentional but didn’t violate user privacy
Public disclosure was approved by Google after review
Still, the implications are serious.
This incident underscores how even well-designed isolation environments can leak critical internal logic, especially when model behavior, file bundling, and sandbox privileges interact in unpredictable ways.
This isn't the first time this trio has landed major wins in AI security. Justin, Joe, and Lupin previously won $50,000 in bug bounties at a Google live-hacking event, including $20K for a vulnerability that allowed Bard to return images of user emails. That campaign showed how indirect prompt injection could trigger unexpected data leaks—even when sandboxes are supposedly locked down.
Why This Matters
As the LLM arms race accelerates, the security of AI agents, model execution sandboxes, and developer-facing tools will become central to enterprise risk. Gemini isn’t alone—every major model vendor is deploying interfaces that let users run code, manage sessions, or access tools like search, charts, maps, and more.
What this research highlights is simple:
There’s more to securing AI than prompt filtering or sandboxing.
From build pipelines to API communication layers, every moving part is a potential attack surface. And when those systems touch proprietary code and cloud-scale infrastructure, the stakes couldn’t be higher.
This isn’t the first time Google’s internal tooling has created unexpected exposure. In a recent finding, a vulnerability in Google Cloud Build allowed attackers to destroy data across multiple projects, showing how deeply integrated systems can unintentionally become high-value attack surfaces.

Stay tuned to VulnU for deeper dives into AI security, risks, and the vulnerabilities hiding inside the tools we trust.