hemanshu baviskar


8 march 2026

building a close-to-perfect chromium fork for stealth

The first time I ran my "stealth" browser against a real anti-bot, it died in 90 milliseconds. Not blocked. Fingerprinted, scored, redirected to a honeypot, and logged. I had spent a weekend layering puppeteer-stealth tricks on top of vanilla headless Chrome and felt pretty clever about it. The detection script ran one line:

Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').get.toString()

My override returned a string that didn't say [native code]. Game over.

That was the moment I stopped trying to patch JavaScript and started reading C++. This post is the trail of breadcrumbs from there to a fork that survives in the wild. I'm writing it as the curious idiot I was a month ago, not as someone pretending to be an expert now.


Lesson one: lies have to be told in the right language

Every stealth tutorial online wants you to inject a preload script. Override navigator.webdriver. Fake navigator.plugins. Wrap chrome.runtime. They all share one fatal property: they happen too late, in the wrong place, and they leave forensics.

A real navigator.webdriver getter is a pointer into Blink's compiled binary. Yours is a JavaScript function. The descriptor object has a different prototype, different length, different name, different stringified body. You can patch Function.prototype.toString to lie about it, and now toString.toString() rats you out. You can patch that, and now the V8 stack trace from (()=>{throw 0})() shows your wrapper. There is no fixed point. You are playing whack-a-mole against a system that has more moles than you have hands.
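To make the asymmetry concrete, here is a minimal sketch of the detector's side. It runs in plain Node with no browser, so Map.prototype.size stands in for a real native getter and fakeNavigator is a hypothetical object playing the role of a patched navigator. looksNative is an illustrative helper name, not a library API, and real detection scripts check far more than this.

```javascript
// Matches the source text V8 reports for built-in functions,
// e.g. "function get size() { [native code] }".
const NATIVE_BODY = /\{\s*\[native code\]\s*\}\s*$/;

// Hypothetical helper: does this property look like an untouched built-in getter?
function looksNative(proto, prop) {
  const desc = Object.getOwnPropertyDescriptor(proto, prop);
  if (!desc || typeof desc.get !== "function") return false;
  // Go through Function.prototype.toString so that a .toString property
  // attached to the getter itself is bypassed.
  const source = Function.prototype.toString.call(desc.get);
  return NATIVE_BODY.test(source) && desc.get.name === `get ${prop}`;
}

// Stand-in for a page-level patch of navigator.webdriver.
const fakeNavigator = {};
Object.defineProperty(fakeNavigator, "webdriver", { get: () => false });

console.log(looksNative(Map.prototype, "size"));      // a real built-in getter
console.log(looksNative(fakeNavigator, "webdriver")); // a JavaScript override
```

The override fails on both counts here: its stringified body is ordinary source text, and its name is not the "get webdriver" a built-in accessor would carry.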

So I did what I should have done on day one and opened third_party/blink/renderer/core/frame/navigator.cc:101:

bool Navigator::webdriver() const {
  if (RuntimeEnabledFeatures::AutomationControlledEnabled())
    return true;
  ...
}

There it was. The lie is told in C++, before any JavaScript runs. The flag is set in content/child/runtime_features.cc:459, where three command-line switches each turn it on:

{wrf::EnableAutomationControlled, switches::kEnableAutomation,    true},
{wrf::EnableAutomationControlled, switches::kHeadless,            true},
{wrf::EnableAutomationControlled, switches::kRemoteDebuggingPipe, true},

The browser was snitching on itself the moment I asked for headless mode or a CDP pipe. Removing those three lines was the most satisfying three-line patch I have ever written. The getter still exists, it still has its real [native code] body, the descriptor is byte-identical to a non-headless Chrome. It just answers false instead of true.

That moment reframed the whole project. Stealth is not about adding lies. Stealth is about not telling the truth in places where real Chrome doesn't tell the truth either.


Lesson two: the leaks are everywhere, but they all have addresses

Once you accept the C++ approach, the work becomes a treasure hunt. Each detection signal has an owning file. You grep, you read, you patch. The leaks I found, roughly in the order they bit me:

The User-Agent confessing. components/embedder_support/user_agent_utils.cc:230 literally checks for --headless and rewrites the product token to HeadlessChrome/<version>. Why does this exist? Probably so Google's own infrastructure can tell its automation apart from real users. Helpful for them, fatal for me. Ten lines deleted. But there is also a brand list (GetUserAgentBrandFullVersionList) used for Sec-CH-UA hints, and I learned this the painful way: if you scrub the UA string but forget the brand list, you have created a new fingerprint: the user whose UA says Chrome but whose Sec-CH-UA says HeadlessChrome. That is a more memorable lie than telling the truth would have been.
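A server-side check for that mismatch could look roughly like the sketch below. The function names are hypothetical and the sample header values in the usage lines are illustrative strings, not captured traffic; the only assumption about the formats is that the UA carries a Chrome/<major> product token and that Sec-CH-UA is a list of "brand";v="major" pairs.

```javascript
// Parse a Sec-CH-UA style list such as: "Chromium";v="122", "Google Chrome";v="122"
function parseBrands(header) {
  const brands = [];
  const entry = /"((?:[^"\\]|\\.)*)"\s*;\s*v="(\d+)"/g;
  let m;
  while ((m = entry.exec(header)) !== null) {
    brands.push({ brand: m[1], major: Number(m[2]) });
  }
  return brands;
}

// Return the reasons the two signals disagree (an empty list means consistent).
function uaMismatches(userAgent, secChUa) {
  const product = /(Headless)?Chrome\/(\d+)/.exec(userAgent);
  const brands = parseBrands(secChUa);
  if (!product || brands.length === 0) return ["unparseable input"];

  const reasons = [];
  const uaHeadless = product[1] !== undefined;
  const brandHeadless = brands.some((b) => /headless/i.test(b.brand));
  if (uaHeadless !== brandHeadless) reasons.push("headless token in only one signal");

  const chromium = brands.find((b) => b.brand === "Chromium");
  if (chromium && chromium.major !== Number(product[2])) {
    reasons.push("major version differs");
  }
  return reasons;
}

// Illustrative inputs: a scrubbed UA string next to an unscrubbed brand list.
const scrubbedUA =
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36";
const leakyBrands = '"Chromium";v="122", "Not(A:Brand";v="24", "HeadlessChrome";v="122"';
console.log(uaMismatches(scrubbedUA, leakyBrands));
```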

The case of the missing chrome object. Real Chrome puts a populated window.chrome on every page: chrome.app, chrome.runtime, chrome.loadTimes, chrome.csi. Headless ships a stripped version. Detection scripts check three things in order: does the object exist, does chrome.runtime.OnInstalledReason enum exist even with no extension installed, does chrome.loadTimes() return a populated struct. The construction site is content/public/renderer/chrome_object_extensions_utils.cc. The fix is to construct the same object regardless of mode. The depressing realization is that a lot of these leaks exist because someone decided headless doesn't need this code path. Every one of those decisions is now a fingerprint.

The permissions contradiction. This one is famous in puppeteer circles and it is real. Type this into headless:

Notification.permission                                             // 'default'
(await navigator.permissions.query({name:'notifications'})).state   // 'denied'

Real Chrome gives you default and prompt. Headless gives you default and denied, and the contradiction is the tell. Two files own it: content/browser/permissions/permission_controller_impl.cc returns DENIED when no platform notification service is bound, and third_party/blink/renderer/modules/permissions/permissions.cc passes it through. The fix is a stub service that says ASK, which becomes prompt. The same shape of bug exists for clipboard-read, geolocation, and camera. Once you know the pattern, audit them all.
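The contradiction is easy to test for, which is why it is so popular. A minimal sketch of the page-side check follows; the helper names are made up for illustration, and the mapping table only encodes what is described above (the Notification API says 'default' where the Permissions API says 'prompt').

```javascript
// Notification.permission and the Permissions API use different words for the
// "not decided yet" state: 'default' versus 'prompt'.
const EXPECTED_STATE = { default: "prompt", granted: "granted", denied: "denied" };

// Pure comparison, kept separate so it can be exercised outside a browser.
function permissionsContradict(notificationPermission, queryState) {
  return EXPECTED_STATE[notificationPermission] !== queryState;
}

// Browser-only wrapper; resolves to null where the APIs are unavailable (e.g. Node).
async function probeNotifications() {
  if (typeof Notification === "undefined" || !globalThis.navigator?.permissions) {
    return null;
  }
  const status = await navigator.permissions.query({ name: "notifications" });
  return permissionsContradict(Notification.permission, status.state);
}

console.log(permissionsContradict("default", "denied")); // the headless pair
console.log(permissionsContradict("default", "prompt")); // the real Chrome pair
```

The same comparison shape should carry over to the other permissions mentioned, though each has its own API for reading the "other side" of the pair.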

Plugins that are not there. navigator.plugins.length === 0 on Linux headless. Real Chrome reports a populated list; current builds expose the five fixed PDF viewer entries defined by the HTML Standard. They live in third_party/blink/renderer/core/page/plugin_data.cc and come from the embedder, which in headless returns nothing. Add the PDF Viewer pseudo-plugin back. It is a built-in extension and the metadata is static, so this is a clean patch.

WebGL telling on the GPU. getParameter(UNMASKED_RENDERER_WEBGL) returns something like ANGLE (Google, Vulkan 1.3.0 (SwiftShader Device...), SwiftShader driver) on a server. No human's GPU produces that string. I keep a per-session profile sampled from a small set of real Intel/NVIDIA strings and override the unmasked extension's return values in webgl_rendering_context_base.cc. The follow-on rabbit hole: getSupportedExtensions() order, MAX_TEXTURE_SIZE, ALIASED_LINE_WIDTH_RANGE all leak too, and you have to keep them mutually consistent. Lying inconsistently is worse than not lying at all.

The audio fingerprint. OfflineAudioContext.startRendering() produces a buffer whose last few floats hash to a stable value, deterministic per CPU and build. There is a famous fingerprinting library that does exactly this. The fix is one line in offline_audio_context.cc::HandlePostRenderTasks: add ulp-scale noise to the final mixdown, salted per session. The math is actually fun here. You cannot just add white noise because the buffer must round-trip through getChannelData, but ulp perturbations survive cleanly.
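What "ulp-scale" means is easier to see in code than in prose. The sketch below is a JavaScript illustration of the arithmetic only, not the C++ patch: it moves each float32 sample by at most one unit in the last place, driven by a seeded generator. nudgeUlps and perturb are hypothetical names, and mulberry32 is a small, widely copied PRNG used here as a stand-in for whatever per-session salt the real code derives.

```javascript
// Shared scratch views for reinterpreting a float32 as its raw bit pattern.
const f32 = new Float32Array(1);
const u32 = new Uint32Array(f32.buffer);

// Move x by `steps` float32 ULPs: positive steps grow the magnitude,
// negative steps shrink it. Infinity and NaN are returned unchanged.
function nudgeUlps(x, steps) {
  f32[0] = x;
  const sign = u32[0] & 0x80000000;
  const mag = u32[0] & 0x7fffffff;
  if (mag >= 0x7f800000) return f32[0];
  const moved = Math.min(Math.max(mag + steps, 0), 0x7f7fffff);
  u32[0] = (sign | moved) >>> 0;
  return f32[0];
}

// Small seeded PRNG (mulberry32), standing in for a per-session salt.
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Return a copy of `samples` with every value moved by -1, 0, or +1 ULP.
function perturb(samples, seed) {
  const rand = mulberry32(seed);
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const step = Math.floor(rand() * 3) - 1;
    out[i] = nudgeUlps(samples[i], step);
  }
  return out;
}
```

Because every output is itself an exact float32, it survives being copied through a Float32Array unchanged, and the same seed always produces the same perturbation, so a session stays self-consistent across repeated renders.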

Sec-Fetch-User and userActivation. This one took me longest. Real human navigation sets Sec-Fetch-User: ?1 on the request and flips navigator.userActivation.hasBeenActive to true. CDP-driven navigation sets neither. A detection script can then correlate: page says no user activity, but request claims to be top-level navigation. Bot. The fix wants two patch sites: navigation_request.cc::ComputeFetchMetadataHeaders to forge the header when synthetic input is driving, and user_activation.cc to call LocalFrame::NotifyUserActivation from the synthetic input pipeline. This couples to the synthetic mouse system I built, which is its own essay.

document.hasFocus() lying flat. Headless reports hasFocus() === false and visibilityState === 'visible' as a contradictory pair. Real Chrome only contradicts itself when you alt-tab, briefly. A bot that idles in that state for minutes is visibly different from a user. One-line fix in web_contents_impl.cc::IsFocused.
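One way a page could turn that observation into a signal is to sample both values over time and measure how long the contradictory pair persists. This is a sketch under assumptions: the sample shape, the function names, and the 60-second threshold are all invented for illustration, not taken from any real detection script.

```javascript
// samples: [{ t: msSinceStart, hasFocus: boolean, visibilityState: string }, ...]
// in chronological order. Returns the longest stretch, in ms, between samples
// during which the page stayed visible but unfocused.
function longestVisibleUnfocused(samples) {
  let longest = 0;
  let runStart = null;
  for (const s of samples) {
    const contradictory = s.visibilityState === "visible" && !s.hasFocus;
    if (contradictory) {
      if (runStart === null) runStart = s.t;
      longest = Math.max(longest, s.t - runStart);
    } else {
      runStart = null;
    }
  }
  return longest;
}

const STUCK_AFTER_MS = 60_000; // assumed threshold

function focusLooksStuck(samples) {
  return longestVisibleUnfocused(samples) >= STUCK_AFTER_MS;
}
```

A brief alt-tab produces a short run and stays under the threshold; a session that never reports focus crosses it as soon as enough samples accumulate.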

Codec support. canPlayType('video/mp4; codecs="avc1.42E01E"') returns '' on open-source Chromium because H.264 is not built in. Real Chrome returns 'probably'. The check is in media/base/supported_types.cc::IsSupportedVideoType. If your fork can ship the proprietary codecs, build them. If not, the next-best lie is to claim support at the Blink layer for known-fingerprinted strings and accept that pages which actually try to play those streams will fail. There is no clean answer. You pick your detection profile and live with it.

The small ones, in a cluster. navigator.getBattery() returns 100% charging forever in headless; real laptops jitter, so I randomize per session. navigator.hardwareConcurrency reports the actual server core count, often 32 or 64; clamp it to a believable profile (4, 8, 12, 16). navigator.deviceMemory defaults to 8; match the profile. navigator.languages returns just ["en-US"]; match the Accept-Language header order. None of these are clever. They are just places nobody walks to until they are running into a wall in production.
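Two of these reduce to small pure functions, sketched here with hypothetical names. The first picks a believable core count from the list above; rounding down to the largest allowed value is an assumption about what "clamp" should mean. The second compares navigator.languages against an Accept-Language header by tag order only and makes no claim about which q-values Chrome itself emits.

```javascript
// Pick the largest believable core count that does not exceed the real one,
// falling back to the smallest allowed value on tiny machines.
function clampCores(actual, allowed = [4, 8, 12, 16]) {
  const sorted = [...allowed].sort((a, b) => a - b);
  let chosen = sorted[0];
  for (const n of sorted) {
    if (n <= actual) chosen = n;
  }
  return chosen;
}

// Extract language tags from an Accept-Language header, highest q first,
// keeping header order among entries with equal q.
function acceptLanguageOrder(header) {
  return header
    .split(",")
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(";");
      const q = params.map((p) => /^\s*q=([\d.]+)\s*$/.exec(p)).find(Boolean);
      return { tag: tag.trim(), q: q ? Number(q[1]) : 1, index };
    })
    .filter((e) => e.tag.length > 0)
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map((e) => e.tag);
}

// True when navigator.languages and the header list the same tags in the same order.
function languagesMatchHeader(languages, header) {
  const fromHeader = acceptLanguageOrder(header);
  return (
    languages.length === fromHeader.length &&
    languages.every((tag, i) => tag.toLowerCase() === fromHeader[i].toLowerCase())
  );
}

console.log(clampCores(64));
console.log(languagesMatchHeader(["en-US"], "en-US,en;q=0.9"));
```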

The geometry tell. outerWidth - innerWidth is the width of Chrome's window chrome (the UI chrome, not the product). On Linux it is around 14px, on macOS 0, on Windows 16. Headless returns 0 always. So a UA that claims Windows but reports 0px is screaming. I synthesize fake outer geometry per session, matched to the spoofed UA platform.
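The cross-check a detector might run is short. In this sketch the expected widths are just the rough figures quoted above, the 2px tolerance is an assumption, and the platform sniffing is deliberately crude (an Android UA also contains "Linux", for example). All names are hypothetical.

```javascript
// Expected outerWidth - innerWidth per platform, using the rough figures above.
const EXPECTED_FRAME_PX = { linux: 14, macos: 0, windows: 16 };
const TOLERANCE_PX = 2; // assumed slack

// Crude platform guess from the UA string.
function platformFromUA(userAgent) {
  if (/Windows NT/.test(userAgent)) return "windows";
  if (/Mac OS X/.test(userAgent)) return "macos";
  if (/Linux/.test(userAgent)) return "linux";
  return null;
}

// true = geometry fits the claimed platform, false = it does not,
// null = the platform could not be determined.
function geometryMatchesUA(userAgent, outerWidth, innerWidth) {
  const platform = platformFromUA(userAgent);
  if (platform === null) return null;
  const frame = outerWidth - innerWidth;
  return Math.abs(frame - EXPECTED_FRAME_PX[platform]) <= TOLERANCE_PX;
}

const windowsUA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36";
console.log(geometryMatchesUA(windowsUA, 1920, 1920)); // Windows UA with a 0px frame
```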

The CDP residue. Even after neutering Browser.getBrowserCommandLine, attaching CDP leaves footprints. Runtime.enable makes every console.debug call emit Runtime.consoleAPICalled, which serializes the logged arguments; a clever page detects that by logging an object with a property getter and watching whether the getter fires. Page.addScriptToEvaluateOnNewDocument runs in isolated world id 1, creating a phantom execution context observable through Performance API entries. For agent use I tag CDP sessions with a "stealth" flag in page_handler.cc::AddScriptToEvaluateOnNewDocument and route injection through the main world. You give up isolation, you stop creating a phantom.

The pattern across all of these: someone, at some point, decided headless doesn't need this code path. Each of those decisions, accumulated over years, is the fingerprint surface.


The other half: making it fast

Stealth was the hard half intellectually. Cold start was the hard half mechanically. Less detective work, more going through every millisecond and asking why it exists.

Spare renderers, plural, pre-JITted. Chromium's SpareRenderProcessHostManager keeps one warm process. For an agent fleet, one is laughable. I keep a per-BrowserContext pool sized to expected concurrency. After fork I run a tiny prelude in the spare's V8 isolate that touches Promise, fetch, microtask drain, and the parser entry point. By the time a real navigation arrives, the hot paths are already JIT-compiled and cached.
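The pooling pattern itself is language-agnostic. The sketch below shows only its shape, in JavaScript rather than Chromium's C++: WarmPool and its factory callback are hypothetical names, the factory stands in for "fork a renderer and run the prelude", and error handling is left out.

```javascript
class WarmPool {
  // factory: async () => a resource that has already done its warm-up work.
  constructor(factory, size) {
    this.factory = factory;
    this.size = size;
    this.ready = [];
    // Start warming every slot up front; entries are promises.
    for (let i = 0; i < size; i++) this.ready.push(factory());
  }

  // Hand out the oldest warm resource and immediately start its replacement,
  // so the pool is back to full strength before the next request arrives.
  async acquire() {
    const next = this.ready.shift() ?? this.factory();
    if (this.ready.length < this.size) this.ready.push(this.factory());
    return next;
  }
}

// Usage with a toy factory that just numbers the resources it creates.
let created = 0;
const pool = new WarmPool(async () => ({ id: ++created }), 3);
const first = pool.acquire();
```

The point of the design is that the caller never waits on warm-up unless demand outruns the pool size, in which case acquire falls back to building a resource on the spot.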

Same-site WebContents reuse. A pool of warm WebContents is faster than warm renderers because you skip BrowsingInstance creation entirely. A SessionContext per session (own StoragePartition) makes this safe. Session A cannot read session B's storage even though both might recycle the same renderer.

Stripping what ephemeral profiles do not need. Origin trials, Privacy Sandbox, Bluetooth, Geolocation, prefs persistence, field trials. None of it matters for a tab that lives 30 seconds. Removing their initialization saves roughly 150ms of synchronous work. The catch is that a few Blink paths assume those services exist and will null-deref. You keep the interface alive and stub the implementation. I learned this by crashing the browser six different ways.

Skipping prefs/policy disk probes for incognito. chrome/browser/profiles/profile_impl.cc::CreateProfileForKeyedServices does several synchronous file probes that are pointless for ephemeral in-memory partitions. Short-circuit when the profile is incognito-only.

Compositor throttling. --headless-render-fps=10 issues BeginFrame at a fixed cadence from the browser process and lets the GPU/Viz stack idle between ticks. Mostly memory and CPU savings rather than latency, but at scale it lets me pack roughly 3x more sessions per host.


What I would tell a month-ago me

Read the file. Every stealth and perf trick I learned was already named in a comment somewhere in content/ or third_party/blink/. The codebase is unfriendly but it is honest. The headless tells are right there, in plain English, in files you can grep. The only real skill is being willing to follow a flag through runtime_features.cc into Blink and out the other side, three or four hops, before you write a single line of patch.

The stealth game is not a battle of cleverness. It is a battle of attention. The detection scripts are written by people who read the same source code I did. Whoever reads more of it, more carefully, wins.

I am still wrong about a lot of this. If you find a leak I missed, tell me and I will go look it up.