Many of you have experienced it – a sudden surge in web traffic accompanied by an unnaturally high bounce rate. What gives? We get to the bottom of it and provide a workaround for Google Analytics.
Recently we were approached by a prospective client looking for insight into sporadic traffic spikes in their analytics data. Being fairly unfamiliar with the intricacies of Google Analytics reporting, they turned to us for advice.
Unfortunately for the client, the traffic turned out to be fake visits generated by referrer spam. If you’re not aware of this type of traffic then you should be – it’s skewing your analytics data.
What is Referrer Spam and Why Does it Exist?
There are two types of referrer spam – Crawler and Ghost.
Crawler bots browse your site with one aim: to drive traffic back to the referrer website. Ghost referrers don’t even visit your site (hence the name); instead they target arbitrary Google Analytics account IDs to create these ‘fake’ visits.
Crawlers use a valid hostname within your analytics reports (seen below – 4webmasters.org), and Ghost referrers use invalid hostnames as can be seen below with the (not set) notation.
These visits usually appear in your analytics data with a 0% or 100% bounce rate and an average session duration of 0 seconds. Unfortunately, even robots.txt doesn’t help, since spam referrers ignore it.
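To make those telltale signs concrete, here is a minimal Python sketch that applies them to rows exported from a referral report. The `ReferralRow` fields, the `classify` function and the sample hostnames are our own illustrative assumptions, not part of Google Analytics, and the check is only a rough heuristic: a genuine one-page visit can also show a 100% bounce rate and a 0 second duration.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    # Hypothetical shape of one exported report row (not a Google Analytics API).
    source: str                 # referrer shown in the report
    hostname: str               # hostname recorded for the hit, or "(not set)"
    bounce_rate: float          # percentage, 0-100
    avg_session_seconds: float  # average session duration in seconds


def classify(row, valid_hostnames):
    """Label a row 'ghost', 'possible crawler' or 'ok' using the traits above."""
    # Ghost referrers never load the site, so the hostname is not one of ours.
    if row.hostname not in valid_hostnames:
        return "ghost"
    # Crawlers do hit the site, but leave a 0% or 100% bounce rate and 0s duration.
    if row.bounce_rate in (0.0, 100.0) and row.avg_session_seconds == 0:
        return "possible crawler"
    return "ok"


if __name__ == "__main__":
    valid = {"www.yourdomain.com"}  # assumption: your own hostname(s)
    rows = [
        ReferralRow("4webmasters.org", "www.yourdomain.com", 100.0, 0.0),
        ReferralRow("some-referrer.example", "(not set)", 100.0, 0.0),
        ReferralRow("google.com", "www.yourdomain.com", 42.0, 95.0),
    ]
    for row in rows:
        print(row.source, "->", classify(row, valid))
```

Treat anything flagged as ‘possible crawler’ as a prompt to look closer rather than as proof of spam.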
Here’s a list of some of the more common referrer spam that has been affecting websites:
and more are being created all the time!
The Best Option for Spam-Free Analytics – Create a Valid Hostname Filter
By far the best option, and the one requiring the least upkeep, is creating a Valid Hostname INCLUDE filter in your analytics account. This seems the most efficient way to rid yourself of the evil spam referrer, since only your known hostnames will be included in your reports.
Log into your analytics account
Select a long date range
Go to Audience
From this report you’ll identify all your valid hostnames. Remember to include every variation of your site, i.e. anything that uses your tracking ID, as well as translate.googleusercontent.com and webcache.googleusercontent.com. Anything else that appears in the hostname column but isn’t related to ‘yourdomain’ is spam.
Now create your hostname INCLUDE list as a regular expression, separating each hostname with a | symbol, something like this:
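As a rough illustration, the Python sketch below joins a list of hostnames into a single pattern. The function name `build_hostname_pattern` and the placeholder `yourdomain.com` are our own assumptions; swap in the hostnames you identified in your report. Dots are escaped because an unescaped `.` in a regular expression matches any character.

```python
import re


def build_hostname_pattern(hostnames):
    """Join hostnames into one regular expression separated by | symbols."""
    # Hostnames only contain letters, digits, hyphens and dots,
    # so the dot is the one character that needs escaping here.
    escaped = [name.replace(".", r"\.") for name in hostnames]
    return "|".join(escaped)


if __name__ == "__main__":
    pattern = build_hostname_pattern([
        "yourdomain.com",  # placeholder for your own domain
        "translate.googleusercontent.com",
        "webcache.googleusercontent.com",
    ])
    print(pattern)
    # prints: yourdomain\.com|translate\.googleusercontent\.com|webcache\.googleusercontent\.com

    # Quick local sanity check with Python's regex engine.
    print(bool(re.search(pattern, "www.yourdomain.com")))  # prints: True
    print(bool(re.search(pattern, "(not set)")))           # prints: False
```

The pattern is left unanchored on purpose in this sketch, so `yourdomain\.com` also covers subdomains such as www.yourdomain.com. That assumes the filter matches the pattern anywhere in the hostname, which is worth confirming when you verify the filter.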
In Google Analytics go to
- Select New Filter
- Name Your Filter: e.g. Valid Hostname
- Select Custom
- Select Include
- Select Hostname
- Enter Filter Pattern – that’s the Regular expression list you created
- Verify Filter
You’ll see a table similar to that below showing you the before and after effect the filter has on your site. Confirm that you’ve not missed any of your own hostnames, then save your filter and exit the filter set-up page.
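If you’d like to double-check the pattern outside Google Analytics first, a small Python sketch along these lines can mimic the before and after comparison on an exported hostname report. The function `preview_include_filter` and the session counts are hypothetical, and Python’s regex engine may not behave identically to the one Google Analytics uses, so treat the result as a sanity check only.

```python
import re


def preview_include_filter(pattern, sessions_by_hostname):
    """Split hostnames into those an INCLUDE filter would keep and those it would drop."""
    compiled = re.compile(pattern)
    kept = {host: count for host, count in sessions_by_hostname.items()
            if compiled.search(host)}
    dropped = {host: count for host, count in sessions_by_hostname.items()
               if host not in kept}
    return kept, dropped


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    report = {
        "www.yourdomain.com": 120,
        "translate.googleusercontent.com": 3,
        "(not set)": 45,
        "spam-host.example": 10,
    }
    pattern = r"yourdomain\.com|translate\.googleusercontent\.com"
    kept, dropped = preview_include_filter(pattern, report)
    print("Before:", sum(report.values()), "sessions")
    print("After: ", sum(kept.values()), "sessions")
    print("Dropped hostnames:", sorted(dropped))
```

If any of your own hostnames turn up in the dropped list, add them to the pattern before saving the filter.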
Hopefully this post will be helpful to those of you who want to get your hands dirty and delve into your analytics data to filter out that referrer spam. Feel free to comment on your experiences with spam filtering, and if you need assistance just get in touch and we’ll be happy to help!
We also recommend setting up Exclude Filters for all those evil spam referrers as a second level of protection. We’ll be addressing how to set those up in a future post.
Good luck keeping a lid on referral spam.