HeadlinesBriefing.com

Nginx logs expose AI model fetches vs. human clicks

Hacker News

Ali Khallad set up a custom nginx log format, then asked ChatGPT, Claude, Perplexity, Gemini and Google AI Mode about a domain he controls. By tailing the access log he could see whether each model fetched the page live or simply cited an indexed copy. The test separates provider-side retrieval, identified by a dedicated user-agent token and no referrer, from a human clicking through from the answer.
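The distinction can be sketched in a few lines of Python. This is only an illustration: the article does not show Khallad's custom log format, so the sketch assumes nginx's stock "combined" format, and the `classify` function and its labels are hypothetical names. The three user-agent tokens are the ones the article reports.

```python
import re

# Assumption: nginx's stock "combined" log format. The article's custom
# format is not shown, so the field layout here is a stand-in.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# User-agent tokens reported in the article.
PROVIDER_TOKENS = ("ChatGPT-User", "Claude-User", "Perplexity-User")


def classify(line: str) -> str:
    """Label one access-log line (hypothetical labels, for illustration)."""
    m = COMBINED.match(line)
    if m is None:
        return "unparsed"
    referer = m.group("referer")
    agent = m.group("agent")
    # nginx writes "-" when the request carried no Referer header.
    has_referer = referer not in ("", "-")
    if any(token in agent for token in PROVIDER_TOKENS):
        # Provider-side retrieval: dedicated token and no referrer.
        return "provider_fetch" if not has_referer else "other"
    if has_referer:
        # An ordinary client arriving via a link; possibly a human
        # clicking through from an answer.
        return "clickthrough_candidate"
    return "other"
```

Piping `tail -f` output through `classify` line by line would give a live view of which kind of request each prompt triggers, under the format assumption above.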

Logs revealed that ChatGPT issued bursts of GET requests with a “ChatGPT-User/1.0” agent, no referrer, and rotating IP addresses, matching OpenAI’s bot documentation. Claude behaved similarly, first requesting /robots.txt and then following redirects using a “Claude-User/1.0” token. Perplexity also fetched a page directly with its own user-agent, though the sample was too small to characterize its behavior overall.
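The per-provider traits described above (request bursts, rotating IPs, a robots.txt check first) could be tallied roughly as follows. This is a sketch under assumptions: the `summarize` helper and its output keys are hypothetical, and the input is taken to be already-parsed `(ip, path, user_agent)` tuples in log order.

```python
from collections import defaultdict

# User-agent tokens reported in the article.
TOKENS = ("ChatGPT-User", "Claude-User", "Perplexity-User")


def summarize(hits):
    """Tally per-token behavior from (ip, path, user_agent) tuples.

    `hits` must be in log order so that the first path seen for each
    token is meaningful.
    """
    stats = defaultdict(
        lambda: {"requests": 0, "ips": set(), "first_path": None}
    )
    for ip, path, agent in hits:
        token = next((t for t in TOKENS if t in agent), None)
        if token is None:
            continue  # not one of the provider fetchers
        entry = stats[token]
        entry["requests"] += 1
        entry["ips"].add(ip)
        if entry["first_path"] is None:
            entry["first_path"] = path
    return {
        token: {
            "requests": entry["requests"],
            "distinct_ips": len(entry["ips"]),
            "robots_first": entry["first_path"] == "/robots.txt",
        }
        for token, entry in stats.items()
    }
```

A small request count for one token, as the article notes for Perplexity, is a reminder that such a summary describes the sample and not the provider's behavior in general.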

Attempts to capture a distinct provider fetch for Google’s Gemini or AI Mode came up empty; only ordinary browser visits arrived, with standard Chrome agents and a Google referrer. Because Google does not emit a unique AI user-agent, logs cannot separate Gemini traffic from regular Search. Consequently, only the three verified bots (ChatGPT-User, Claude-User and Perplexity-User) provide a reliable, observable signal of origin retrieval.
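The Google case can be illustrated with a referrer check. The sketch below is hypothetical (the function name is made up, and matching only `google.com` and its subdomains is a simplifying assumption that ignores country domains). It shows why the signal is weak: the most a log line reveals is that a browser arrived from a Google property, not which Google product sent it.

```python
from urllib.parse import urlsplit


def is_google_referred_browser(referer: str, agent: str) -> bool:
    """True for a browser-like visit whose referrer is a Google host.

    Assumption: only google.com and its subdomains are matched.
    """
    if referer in ("", "-"):
        return False  # no referrer recorded
    host = urlsplit(referer).hostname or ""
    from_google = host == "google.com" or host.endswith(".google.com")
    # Standard browsers identify themselves with a "Mozilla/" prefix.
    looks_like_browser = agent.startswith("Mozilla/")
    return from_google and looks_like_browser
```

A `True` result here is consistent with a regular Search click and with a click from an AI answer alike, which is the ambiguity the article describes.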