Block bot traffic at the tagging layer

A handful of patterns in the request stream account for most bot noise in your reports, and they are easy to filter at the container.

If your GA4 dashboard suddenly shows a spike in traffic from a country you do not advertise in, or your Meta CAPI match rate falls without explanation, scraper bots are usually the cause. Filtering at the tagging layer means the noise never reaches your destinations, which is cleaner than filtering downstream.

User agent patterns to drop

In a Custom Template or a server-side variable, check the incoming user agent. Drop the request if it matches any of these patterns:

  • bot, crawler, spider, or scraper as substrings, matched case-insensitively.
  • Empty user agent. Mainstream browsers send one by default.
  • User agent strings under 30 characters. Almost always automation.
  • Specific known scrapers: SemrushBot, AhrefsBot, DotBot, facebookexternalhit when not coming from a known referer.
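
The rules above can be sketched as one function that also reports which rule fired, which makes the filter easier to audit. This is a plain JavaScript sketch, not code from any particular template: the name classifyUserAgent, the reason labels, and the trustedReferers allowlist are all hypothetical, and "known referer" is interpreted here as a simple prefix match. In an sGTM template the two header values would come from getRequestHeader.

```javascript
// Hypothetical helper: returns a short reason string when the request
// should be dropped, or null when it passes the user agent rules.
const GENERIC_TOKENS = ['bot', 'crawler', 'spider', 'scraper'];
const MIN_UA_LENGTH = 30;

function classifyUserAgent(userAgent, referer, trustedReferers) {
  const ua = userAgent || '';
  const lower = ua.toLowerCase();

  if (ua.length === 0) return 'empty-user-agent';

  // facebookexternalhit is dropped only when the referer is not on the
  // caller-supplied allowlist (an assumption about what "known" means).
  if (lower.indexOf('facebookexternalhit') !== -1) {
    const ref = referer || '';
    const trusted = (trustedReferers || []).some(function (prefix) {
      return ref.indexOf(prefix) === 0;
    });
    return trusted ? null : 'facebookexternalhit-unknown-referer';
  }

  // SemrushBot, AhrefsBot and DotBot all contain "bot" once lowercased,
  // so the generic substring check covers them as well.
  for (let i = 0; i < GENERIC_TOKENS.length; i++) {
    if (lower.indexOf(GENERIC_TOKENS[i]) !== -1) {
      return 'token:' + GENERIC_TOKENS[i];
    }
  }

  if (ua.length < MIN_UA_LENGTH) return 'short-user-agent';
  return null;
}
```

Returning a reason instead of a bare boolean lets you log why each request was dropped; a caller that only needs a yes/no answer can test the result against null.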

The single most effective filter

If you can only add one check, drop landing-page hits that have neither a document.referrer nor a _ga cookie. In a real first-time visit the browser sets the cookie within milliseconds; bots usually do not. Requiring both signals to be missing catches most of the volume that would otherwise pollute your reports.
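
A minimal sketch of that combined check, written as plain JavaScript so it can be run outside the container. The function names readCookie and isLikelyBotLanding are hypothetical, and the sketch assumes the referrer and the raw Cookie header have already been extracted from the request. In an sGTM template those values would more likely come from the event data and from the getCookieValues API.

```javascript
// Hypothetical helper: reads one cookie value out of a raw Cookie header.
// Returns an empty string when the cookie is absent.
function readCookie(cookieHeader, name) {
  const parts = (cookieHeader || '').split(';');
  for (let i = 0; i < parts.length; i++) {
    const pair = parts[i].trim();
    const eq = pair.indexOf('=');
    if (eq !== -1 && pair.slice(0, eq) === name) {
      return pair.slice(eq + 1);
    }
  }
  return '';
}

// True only when BOTH signals are missing: no referrer and no _ga cookie.
function isLikelyBotLanding(referrer, cookieHeader) {
  const hasReferrer = (referrer || '').length > 0;
  const hasGaCookie = readCookie(cookieHeader, '_ga').length > 0;
  return !hasReferrer && !hasGaCookie;
}
```

Because both conditions must hold, a returning direct visitor who still carries a _ga cookie passes, and so does a cookieless hit that arrives with a referrer.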

Implementation in sGTM

Add a custom variable that returns true when the request looks like a bot, and reference it in the exception trigger of every tag. Each tag then fires only when the variable returns false.

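const getRequestHeader = require('getRequestHeader');
const ua = getRequestHeader('user-agent') || '';
const lower = ua.toLowerCase();
// Substring checks instead of a regex literal, which sandboxed JavaScript may reject.
const hasBotToken = ['bot', 'crawler', 'spider', 'scraper'].some((token) => lower.indexOf(token) !== -1);
return hasBotToken || ua.length < 30; // the length check also catches an empty user agent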

For more aggressive filtering, combine these checks with IP-based rules. Traffic from hosting providers that scrapers commonly run on (DigitalOcean, Hetzner Cloud IP ranges, AWS requests without a legitimate referer) accounts for another meaningful slice. The trade-off is that you may also block headless testing tools your team uses, so document what you filter and why.
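
An IP-based rule usually comes down to testing the client address against a list of CIDR ranges. The sketch below shows one way to do that for IPv4 in plain JavaScript. The function names are hypothetical, and the two ranges are placeholders taken from the reserved documentation blocks, not real provider ranges; real ones would have to come from each provider's published lists. In an sGTM template the client address would presumably come from the getRemoteAddress API.

```javascript
// Converts a dotted IPv4 string to an unsigned 32-bit number,
// or returns null when the input is malformed.
function ipv4ToInt(ip) {
  const parts = String(ip).split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (let i = 0; i < 4; i++) {
    const octet = Number(parts[i]);
    const valid = parts[i] !== '' && octet >= 0 && octet <= 255 &&
      Math.floor(octet) === octet;
    if (!valid) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when ip falls inside the CIDR block, e.g. '192.0.2.0/24'.
function inCidr(ip, cidr) {
  const pieces = cidr.split('/');
  const base = ipv4ToInt(pieces[0]);
  const addr = ipv4ToInt(ip);
  const bits = Number(pieces[1]);
  if (base === null || addr === null || !(bits >= 0 && bits <= 32)) {
    return false;
  }
  // Two addresses share a block when they agree after dropping the host bits.
  const blockSize = Math.pow(2, 32 - bits);
  return Math.floor(addr / blockSize) === Math.floor(base / blockSize);
}

// Placeholder ranges (reserved documentation blocks), NOT real provider ranges.
const BLOCKED_RANGES = ['192.0.2.0/24', '198.51.100.0/24'];

function isBlockedIp(ip) {
  return BLOCKED_RANGES.some(function (cidr) {
    return inCidr(ip, cidr);
  });
}
```

Keeping the ranges in one named list gives you a single place to record what is filtered and why, and to exempt the addresses your own testing tools use.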

If your bot problem is large enough that filtering hurts performance, consider a layered approach with simple heuristics first and rate limiting after that.
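
The layering can be sketched as a cheap user agent check that runs first, followed by a per-IP rate limit that only sees requests the first layer let through. This is an illustration in plain JavaScript under several assumptions: createRateLimiter and createLayeredFilter are hypothetical names, the limiter is a simple fixed-window counter, and its state lives in memory, so where that state is kept in a real deployment is left open.

```javascript
// Hypothetical fixed-window counter: allows up to `limit` hits per key
// within each window of `windowMs` milliseconds.
function createRateLimiter(limit, windowMs) {
  const windows = new Map(); // key -> { start, count }
  return function allow(key, nowMs) {
    const entry = windows.get(key);
    if (!entry || nowMs - entry.start >= windowMs) {
      windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// Layer 1: cheap user agent heuristics. Layer 2: per-IP rate limit.
function createLayeredFilter(limit, windowMs) {
  const allow = createRateLimiter(limit, windowMs);
  return function shouldDrop(userAgent, ip, nowMs) {
    const ua = (userAgent || '').toLowerCase();
    // Requests rejected here never touch the rate limiter's state.
    if (ua.length < 30 || ua.indexOf('bot') !== -1) return true;
    return !allow(ip, nowMs);
  };
}

// Usage: at most 2 hits per IP per second after the heuristic layer.
const shouldDrop = createLayeredFilter(2, 1000);
```

Putting the string checks first means the most obvious bots are rejected without any bookkeeping, and the counter only tracks traffic that already looks plausible.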