Removing Referral Spam From Google Analytics Reports

Referral spam is something we’re being asked about more and more – what is it, and how do we get rid of it? While there’s no perfect solution, there are some great ways to minimise the impact of referral spam on your Google Analytics data quality. Get started with our quick guide to removing referral spam.

What is referral spam?

Referral spam is any traffic that shows up in your Google Analytics account which does not represent genuine users browsing your website. Some examples you might have seen in your accounts include semalt.com, buttons-for-website.com and 100dollars-seo.com; but there are hundreds of others. Individually, most of them don’t generate too many sessions but taken together they can add up:

[Image: Removing referral spam from Google Analytics reports 1]

How does referral spam work?

Most legitimate web bots (e.g. the bots written by search engines to crawl your website) don’t run the Google Analytics JavaScript code, and therefore do NOT end up in your Google Analytics reports. However, a number of websites have taken to writing bots and other technologies whose hits do end up in your Google Analytics reports (whether intentionally or otherwise). While the motivation behind spam is varied, for many it seems to be the generation of curiosity-based traffic back to the spammer’s website.

How can referral spam be blocked?

At the view level, Google Analytics lets you exclude all hits from known bots and spiders.

[Image: Removing referral spam from Google Analytics reports 4]

While this will remove some of the spam automatically, it currently does not remove everything. However, you can also create view filters to exclude traffic from certain referrers. Below is an example of a filter that has been created to exclude traffic from semalt.com (or any website with ‘semalt.com’ in the website name):

[Images: Removing referral spam from Google Analytics reports 2 and 3]

A few important things to remember:

  • View filters do NOT apply to historical data.
  • Filters permanently change the data that a view collects, so a mistake could cost you data that cannot be recovered. For this reason, it’s crucial to use the filter verification option when creating a filter, and to apply the filter to a test view of your Google Analytics data first.
  • The filter pattern used here is a regular expression. To learn more about regular expressions, check out our regex cheat sheet.
  • Because Google does not have a completely automated solution, every spam referrer has to be specified explicitly by name.
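
Before creating a filter, it can help to try the pattern against a handful of source names offline. The sketch below is only an approximation: it uses Python's `re` module rather than Google Analytics itself, and it assumes partial, case-insensitive matching. The sample source names are made up for illustration.

```python
import re

# Pattern for the semalt.com example; the backslash makes the dot literal.
pattern = re.compile(r"semalt\.com", re.IGNORECASE)

# Hypothetical source names, including one that an unescaped dot would also match.
sample_sources = [
    "semalt.com",
    "semalt.semalt.com",
    "google",
    "example.com",
    "semaltXcom",
]

# search() matches anywhere in the string, so subdomains are caught too.
excluded = [s for s in sample_sources if pattern.search(s)]
kept = [s for s in sample_sources if not pattern.search(s)]

print("Would be excluded:", excluded)
print("Would be kept:", kept)
```

Because the dot is escaped, `semaltXcom` stays in the kept list; with a plain `semalt.com` pattern it would be excluded as well. This is a quick sanity check, not a substitute for the filter verification option or a test view.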

Does this mean I might need to filter out dozens of referrers?

Unfortunately, yes. Luckily, this is a collective problem, so there are a few resources out there. One great resource is at Lone Goat, which has a list of 77 spam referrals, including the regular expressions that let you remove them using 7 filters.

That said, new spam referrers are popping up all the time, and lots of new ones have appeared since the Lone Goat post was published in May. There is a community-contributed list on GitHub which, at the time of our publication, listed 271 spam referrals. You can click through to the list from the GitHub project page. (Note that many of the websites, and even their domain names, are NSFW!)

To create a Google Analytics filter for these, string a bunch of the spam referrers together, separated by a pipe (|), and put a backslash (\) before every dot (.). This gives you a regular expression such as the following, which you can use as the pattern for a Campaign Source exclusion filter:

0n-line\.tv|100dollars-seo\.com|12masterov\.com|1pamm\.ru|4webmasters\.org

Note that because the regex for a filter needs to be 255 characters or fewer, you may need to create multiple filters.
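
If you are working from a long list of domains, the escaping and splitting can be scripted. Here is a minimal sketch in Python, assuming you have the spam domains as a plain list; the function name `build_filter_patterns` is our own invention for illustration and is not part of any Google Analytics tool.

```python
MAX_PATTERN_LENGTH = 255  # the filter pattern limit mentioned above


def build_filter_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Escape dots, join domains with pipes, and split into patterns that fit the limit."""
    patterns = []
    current = ""
    for domain in domains:
        escaped = domain.replace(".", r"\.")  # backslash before every dot
        if len(escaped) > max_length:
            raise ValueError(f"{domain!r} is too long for a single filter pattern")
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) > max_length:
            # Adding this domain would exceed the limit, so start a new pattern.
            patterns.append(current)
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


spam_domains = [
    "0n-line.tv",
    "100dollars-seo.com",
    "12masterov.com",
    "1pamm.ru",
    "4webmasters.org",
]

patterns = build_filter_patterns(spam_domains)
for number, filter_pattern in enumerate(patterns, start=1):
    print(f"Filter {number}: {filter_pattern}")
```

Each printed line is one pattern to paste into its own exclusion filter. The five domains here fit into a single pattern; a list of a few hundred would be split across several.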

If this sounds laborious, that’s because it is. Fortunately, Simo Ahava’s come up with a great script to add these filters to your account automatically – you’ll just need to authorise the script. You can run it from simoahava.com/spamfilter/. Note that the script is based on the shorter Lone Goat list of 77 spam referrals, not the longer GitHub list.

What can I expect to see in my Google Analytics account?

Although referral spam seems to account for less than 2% of all traffic on average, this depends on your website profile. For smaller websites, traffic may drop by more than 2% once the filters are in place, but the data that remains will be more meaningful. This is another reason to set up these filters in a test view first, so you can see what impact they have on your reporting.
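
To get a rough feel for the impact before applying anything, you could estimate the spam share from an export of sessions by source. The sketch below assumes a simple mapping of source name to session count; the numbers are invented purely for illustration, and `spam_share` is a hypothetical helper, not a Google Analytics feature.

```python
import re


def spam_share(sessions_by_source, pattern):
    """Return the fraction of sessions whose source matches the spam pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    total = sum(sessions_by_source.values())
    if total == 0:
        return 0.0
    spam = sum(
        sessions
        for source, sessions in sessions_by_source.items()
        if regex.search(source)
    )
    return spam / total


# Made-up figures, not real report data.
example_sessions = {
    "google": 900,
    "(direct)": 80,
    "semalt.semalt.com": 12,
    "buttons-for-website.com": 8,
}

share = spam_share(example_sessions, r"semalt\.com|buttons-for-website\.com")
print(f"Estimated spam share: {share:.1%}")
```

With these invented numbers, 20 of 1,000 sessions match the pattern. Your own figures will differ, which is why comparing a filtered test view against the unfiltered view is still the more reliable check.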

Have you seen other spam referral sites in your Google Analytics account? Let us know in the comments.