Three things to check the week before, and one thing to do on the day if events start dropping.
Server-side containers usually handle traffic spikes well, but Black Friday traffic is large enough to expose the configurations that have been quietly running too close to their limits. A short pre-flight checklist saves panicked debugging during the spike itself.
If you self-host on App Engine, check your app.yaml. The default max_instances is 5, which is fine for normal traffic but throttles almost immediately during a spike. Raise it to 25 or higher for Black Friday week. The cost increase is real but small relative to the revenue lost from dropped events.
On SprTags, scaling is automatic and transparent, so there is nothing to bump.
Meta CAPI, TikTok, and LinkedIn all have rate limits per dataset. If your traffic doubles or triples on Black Friday, you can hit those limits and start seeing 429 responses. The destinations queue events for retry, but if the queue fills, events are dropped.
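To make the "queue fills, events drop" failure mode concrete, here is a toy model of a bounded retry queue. It is a sketch, not how any destination or tagging server actually implements retries: the class, the function, the tick-based timing, and the capacity are all hypothetical names and numbers chosen for illustration.

```python
from collections import deque


class BoundedRetryQueue:
    """Toy retry queue: holds rate-limited events, drops them once full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()
        self.dropped = 0

    def enqueue(self, event):
        # A full queue has nowhere to put the event, so it is lost.
        if len(self.pending) >= self.capacity:
            self.dropped += 1
            return False
        self.pending.append(event)
        return True

    def drain(self, budget):
        # Retry the oldest events first, up to the remaining send budget.
        sent = []
        while self.pending and len(sent) < budget:
            sent.append(self.pending.popleft())
        return sent


def simulate(arrivals_per_tick, limit_per_tick, capacity):
    """Return (delivered, still_queued, dropped) for a sequence of ticks."""
    queue = BoundedRetryQueue(capacity)
    delivered = 0
    for arrivals in arrivals_per_tick:
        budget = limit_per_tick
        retried = queue.drain(budget)
        delivered += len(retried)
        budget -= len(retried)
        for event_id in range(arrivals):
            if budget > 0:
                delivered += 1
                budget -= 1
            else:
                # Over the limit for this tick: the hypothetical 429 case.
                queue.enqueue(event_id)
    return delivered, len(queue.pending), queue.dropped


# One tick at double the limit is enough to overflow a small queue.
print(simulate([5, 5, 20, 5], limit_per_tick=10, capacity=5))
```

In this run a single tick of 20 events against a limit of 10 leaves 10 events waiting for retry, but the queue only holds 5, so the other 5 are gone for good even though traffic returns to normal on the next tick.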
Check the rate limits in each destination's API docs, since they change. For Meta, the limit is around 1000 events per second per pixel. TikTok's is similar, and LinkedIn's is 200 per second. If your projected peak event rate exceeds 60 percent of a destination's limit, batch events on your tagging server before forwarding.
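The 60 percent rule is easy to check with a few lines of Python. This is a minimal sketch: the limits are the approximate figures quoted above (TikTok is assumed to be 1000 because the post only says "similar"), and the function name and the projected peak rates are made up for illustration. Substitute your own projections and the current numbers from each destination's docs.

```python
# Approximate limits from this post, in events per second.
# The TikTok value is an assumption; verify all three against current docs.
RATE_LIMITS = {"meta": 1000, "tiktok": 1000, "linkedin": 200}

# Batch when projected peak exceeds this share of the limit.
BATCH_THRESHOLD = 0.60


def destinations_needing_batching(peak_rates, limits=RATE_LIMITS,
                                  threshold=BATCH_THRESHOLD):
    """Return {destination: share_of_limit} for destinations over the threshold."""
    flagged = {}
    for name, peak in peak_rates.items():
        share = peak / limits[name]
        if share > threshold:
            flagged[name] = round(share, 2)
    return flagged


# Hypothetical projected peaks, in events per second.
projected = {"meta": 450, "tiktok": 700, "linkedin": 150}
print(destinations_needing_batching(projected))
```

With these example numbers, Meta sits at 45 percent of its limit and needs nothing, while TikTok (70 percent) and LinkedIn (75 percent) are both candidates for batching.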
If you have verbose logging enabled (Always Log to Console in your client templates), the volume during a spike can produce surprising bills the following month. Switch to "Errors only" for the duration of the spike. You lose visibility into successful events but keep the alerting on failures.
First, check whether events are actually dropping or whether your reporting is just lagging. Real-time reports in GA4 lag by 30-60 seconds normally; during a spike, 5-10 minutes is not unusual. Wait before assuming the worst.
If events are genuinely dropping, look at container logs. The most common spike-triggered failures are a timeout from a slow destination (Meta or TikTok during their own peak hours), an exceeded instance count, or a CSP error if your custom domain SSL is failing under load.
In the days after the spike, compare your container's request count against your shop's checkout count. The two should be in the same ballpark. A material gap means you lost events somewhere; the timing of the gap will tell you whether it was a destination issue, a container issue, or a client-side issue.
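One way to run that comparison is hour by hour, so the timing of any gap is visible. The sketch below assumes you can export two hourly series: checkouts from your shop, and purchase events received by the container (a narrower proxy for raw request count, chosen here so the two numbers are directly comparable). The function name, the 10 percent tolerance, and the sample data are all hypothetical.

```python
def find_gap_hours(purchase_events, checkouts, tolerance=0.10):
    """Return hours where the container saw materially fewer purchases than the shop.

    Both arguments map an hour label to a count. An hour is flagged when the
    shortfall, as a fraction of shop checkouts, exceeds `tolerance`.
    """
    gaps = []
    for hour in sorted(checkouts):
        expected = checkouts[hour]
        if expected == 0:
            continue  # nothing to compare against
        seen = purchase_events.get(hour, 0)
        shortfall = (expected - seen) / expected
        if shortfall > tolerance:
            gaps.append((hour, expected, seen, round(shortfall, 2)))
    return gaps


# Made-up sample data for illustration.
shop = {"09:00": 100, "10:00": 400, "11:00": 380, "12:00": 0}
container = {"09:00": 98, "10:00": 300, "11:00": 370}

for hour, expected, seen, shortfall in find_gap_hours(container, shop):
    print(f"{hour}: shop={expected} container={seen} shortfall={shortfall:.0%}")
```

A gap confined to one or two hours lines up with the kind of spike-triggered failure described earlier; a steady shortfall across the whole day points more toward a client-side cause.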
If you saw 403 errors during the spike, diagnosing those is covered in a separate post.