JA4: fingerprinting the handshake, not the headers

Enerik Sina · 6 min read

Every anti-bot rule that reads the User-Agent header has the same weakness: the attacker wrote that header. Request headers are free-form text supplied by the client, which means they are exactly as trustworthy as the client itself. A curl script can claim to be Chrome 137 on Windows in one line. The interesting question for defenders is: what does a client reveal about itself that it did not consciously choose to send?

One of the best answers lives below HTTP entirely. Before a single header crosses the wire, the client and server negotiate TLS. The first message of that negotiation, the ClientHello, is a structured list of everything the client's TLS stack supports: protocol versions, cipher suites in preference order, extensions, supported groups (elliptic curves), ALPN protocols. None of it is secret, and application code rarely gets to configure most of it. It is a byproduct of which TLS library you compiled against and how it was built.

From JA3 to JA4

JA3, introduced by Salesforce researchers in 2017, hashed those ClientHello fields into a single MD5 string. It worked well until it didn't: Chrome started randomizing its extension order, a change aimed at keeping servers from depending on a fixed order, and as a side effect almost every connection began producing a different JA3. A raw hash also gives you no partial information: change one value and the whole fingerprint changes.
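To make the fragility concrete, here is a minimal sketch of a JA3-style computation. It assumes the ClientHello has already been parsed into lists of integers and, for brevity, skips the GREASE filtering a real implementation performs; the function name and the sample values are illustrative only.

```python
import hashlib

def ja3(version, ciphers, extensions, curves, point_formats):
    """JA3-style digest: five comma-separated fields, values dash-joined in wire order."""
    fields = [str(version)] + [
        "-".join(str(v) for v in values)
        for values in (ciphers, extensions, curves, point_formats)
    ]
    # MD5 is used as a fingerprint here, not for security.
    return hashlib.md5(",".join(fields).encode(), usedforsecurity=False).hexdigest()

# Same client capabilities, extensions sent in a different order.
first = ja3(771, [4865, 4866, 4867], [0, 10, 11, 16, 43], [29, 23, 24], [0])
second = ja3(771, [4865, 4866, 4867], [43, 16, 11, 10, 0], [29, 23, 24], [0])
print(first == second)  # False
```

Because extension order is part of the hashed string, a client that shuffles its extensions yields an unrelated digest each time, and nothing in the two digests shows that the underlying capabilities were identical.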

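JA4, part of John Althouse's JA4+ suite, fixes both problems. It sorts cipher suites and extensions before hashing (so randomization no longer matters) and is built from three underscore-separated segments instead of one opaque hash: a human-readable prefix, a truncated hash of the cipher list, and a truncated hash of the extensions and signature algorithms. The prefix of a JA4 like t13d1516h2_8daaf6152771_b0da82dd1658 tells you at a glance: TLS over TCP, TLS 1.3, SNI present (the d stands for a domain name, not a desktop client), 15 cipher suites, 16 extensions, ALPN h2. The segments degrade gracefully: two clients that offer the same ciphers and extensions but differ in ALPN will match on the last two segments and differ only in the first.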
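The construction can be sketched in a few lines. This is a simplified reading of the published JA4 description, not a reference implementation: the ClientHello dataclass and its field names are hypothetical, parsing is assumed to have happened already, and edge cases such as an empty cipher list or non-alphanumeric ALPN values are left out.

```python
import hashlib
from dataclasses import dataclass, field

TLS_VERSIONS = {0x0304: "13", 0x0303: "12", 0x0302: "11", 0x0301: "10"}

def is_grease(value: int) -> bool:
    # GREASE values look like 0x0a0a, 0x1a1a, ..., 0xfafa.
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)

@dataclass
class ClientHello:  # hypothetical container for already-parsed fields
    version: int                      # highest TLS version offered, e.g. 0x0304
    ciphers: list
    extensions: list
    sig_algs: list = field(default_factory=list)
    alpn: list = field(default_factory=list)
    has_sni: bool = True
    quic: bool = False

def _hash12(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()[:12]

def ja4(hello: ClientHello) -> str:
    ciphers = [c for c in hello.ciphers if not is_grease(c)]
    exts = [e for e in hello.extensions if not is_grease(e)]
    sig_algs = [s for s in hello.sig_algs if not is_grease(s)]

    # Segment a: readable prefix (transport, version, SNI, counts, ALPN).
    first_alpn = hello.alpn[0] if hello.alpn else ""
    alpn = first_alpn[0] + first_alpn[-1] if first_alpn else "00"
    a = "{}{}{}{:02d}{:02d}{}".format(
        "q" if hello.quic else "t",
        TLS_VERSIONS.get(hello.version, "00"),
        "d" if hello.has_sni else "i",
        min(len(ciphers), 99),
        min(len(exts), 99),
        alpn,
    )

    # Segment b: truncated hash of the sorted cipher list.
    b = _hash12(",".join(f"{c:04x}" for c in sorted(ciphers)))

    # Segment c: sorted extensions without SNI (0x0000) and ALPN (0x0010),
    # followed by signature algorithms in their original order.
    c_input = ",".join(f"{e:04x}" for e in sorted(exts) if e not in (0x0000, 0x0010))
    if sig_algs:
        c_input += "_" + ",".join(f"{s:04x}" for s in sig_algs)
    return f"{a}_{b}_{_hash12(c_input)}"

# Made-up values: 15 ciphers and 16 extensions plus one GREASE entry in each list.
hello = ClientHello(
    version=0x0304,
    ciphers=[0x0A0A] + list(range(0x1301, 0x1310)),
    extensions=[0x1A1A, 0x0000, 0x0010] + list(range(20, 34)),
    sig_algs=[0x0403, 0x0804],
    alpn=["h2", "http/1.1"],
)
print(ja4(hello).split("_")[0])  # t13d1516h2
```

Sorting before hashing is what makes the result stable under extension shuffling, and keeping the counts and ALPN in the readable prefix is what lets two fingerprints be compared segment by segment.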

Why bots fail it

The practical power of JA4 is the mismatch test. A request whose User-Agent claims Chrome-on-Windows should produce the JA4 of Chrome's BoringSSL build. If instead it produces the fingerprint of Python's ssl module, Go's crypto/tls, or OpenSSL-via-curl, the headers are lying — and that one signal is worth more than a hundred header heuristics. Most off-the-shelf scraping stacks fail exactly here, because faking the ClientHello means replacing your TLS library, not editing a string.

  • requests/httpx (Python): distinctive OpenSSL-derived fingerprints, trivially separable from browsers
  • Go HTTP clients: crypto/tls has its own recognizable ClientHello shape
  • Headless Chrome: matches real Chrome — which is why TLS alone is never the whole answer
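The mismatch test itself is little more than a lookup and a comparison. The sketch below assumes the defender maintains a table mapping observed JA4 values to client families; the table entries here are placeholders rather than real fingerprints, and the User-Agent parsing is deliberately crude.

```python
from typing import Optional

# Placeholder fingerprints for illustration only. A real table would be
# built from traffic you have observed and labelled yourself.
KNOWN_JA4 = {
    "t13d1516h2_aaaaaaaaaaaa_bbbbbbbbbbbb": "chrome",
    "t13d1812h1_cccccccccccc_dddddddddddd": "python",
}

def claimed_family(user_agent: str) -> Optional[str]:
    """Very rough guess at which client family a User-Agent string claims to be."""
    ua = user_agent.lower()
    if "python-requests" in ua or "python-httpx" in ua:
        return "python"
    if "go-http-client" in ua:
        return "go"
    if "curl/" in ua:
        return "curl"
    if "firefox/" in ua:
        return "firefox"
    if "chrome/" in ua:
        return "chrome"
    return None

def layers_disagree(user_agent: str, ja4: str, known=KNOWN_JA4) -> Optional[bool]:
    """True if headers and TLS name different families, None if either is unknown."""
    claimed = claimed_family(user_agent)
    observed = known.get(ja4)
    if claimed is None or observed is None:
        return None  # not enough information to judge
    return claimed != observed

chrome_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"
print(layers_disagree(chrome_ua, "t13d1812h1_cccccccccccc_dddddddddddd"))  # True
```

Returning None for unknown fingerprints is a deliberate choice in this sketch: an unrecognized JA4 is missing information, not evidence of lying.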

The honest limits

JA4 is one signal, not a verdict. Impersonation libraries (curl-impersonate, uTLS and friends) can replay a browser's ClientHello byte-for-byte, and headless browsers pass by construction because they run the browser's real TLS stack. Fingerprints also collide by design: every Chrome 137 on every machine looks alike, so JA4 can tell you what is talking, never who. In production you treat it as one column in a wider matrix: combine it with header order, IP reputation, and behavioral signals, and weight disagreements between layers heavily. A client whose layers disagree about what it is rarely turns out to be human.
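One way to picture that matrix is a weighted score in which cross-layer disagreements count for more than any single reputation signal. Everything in this sketch is an assumption made for illustration: the Signals fields, the weights, and the 0-to-1 scales would all need tuning against real traffic.

```python
from dataclasses import dataclass

@dataclass
class Signals:  # hypothetical per-request signal bundle
    tls_ua_mismatch: bool        # JA4 family differs from the User-Agent claim
    header_order_mismatch: bool  # header order differs from the claimed browser
    ip_reputation: float         # 0.0 (clean) to 1.0 (known bad)
    behavior_score: float        # 0.0 (human-like) to 1.0 (automated)

# Illustrative weights: layer disagreements are weighted most heavily.
WEIGHTS = {
    "tls_ua_mismatch": 3.0,
    "header_order_mismatch": 2.0,
    "ip_reputation": 1.0,
    "behavior_score": 1.0,
}

def bot_score(s: Signals) -> float:
    """Weighted average of the signals, normalized to the range 0.0 to 1.0."""
    total = (
        WEIGHTS["tls_ua_mismatch"] * float(s.tls_ua_mismatch)
        + WEIGHTS["header_order_mismatch"] * float(s.header_order_mismatch)
        + WEIGHTS["ip_reputation"] * s.ip_reputation
        + WEIGHTS["behavior_score"] * s.behavior_score
    )
    return total / sum(WEIGHTS.values())

mismatch_only = bot_score(Signals(True, False, 0.0, 0.0))
bad_ip_only = bot_score(Signals(False, False, 1.0, 0.0))
print(mismatch_only > bad_ip_only)  # True
```

A linear score is the simplest possible combiner; the point is only that a TLS mismatch moves the result further than a bad IP does, not that these particular numbers are right.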

If you want to see your own JA4, this site's bot-check page shows the digest Vercel's edge computes for your connection — the same one a defender would see.