Question Your Analytics

Now, more than ever, small business owners need to understand how to make sense of their web analytics before acting on the data. Increasingly, bots and spammers add noise that you need to filter out before jumping to conclusions.

Lately, I have gotten a lot of questions from clients along these lines:

  • Why are there so many hits from Russia? Am I being attacked?
  • Why am I getting big traffic spikes on days I didn’t publish a new blog post?
  • Is ɢoogle.com real?

It is really important to configure filters on your analytics account that identify known bots and spam sources. Only then can you compare the overall totals to a pool of much more likely “real” traffic. Without that comparison, you are going to draw conclusions from flawed data.

Warning Signs

Many small businesses have a website mainly to serve as a place for people to find information about their company. They don’t do business through their website directly, or offer any website-exclusive content beyond the occasional blog post. Their website therefore gets relatively little traffic, which makes false hits even more problematic, since they represent a huge percentage of the total.

There are a lot of ways to identify likely spam in your analytics data. For many of my clients, it is not uncommon for more than 40% of incoming traffic in a given month to be traceable to fake sources. The screenshot above shows real data for a small business client of mine. From January 1st of this year through today, just over 30% of their total traffic can be filtered out based on a few factors:

  • Filter Out Known Bots: There are groups who catalog known bots of many varieties. By teaching your analytics tools to ignore traffic from those sources, you can eliminate a whole category of bad traffic very quickly. These lists are updated regularly, which means that your analytics filter needs to be kept up-to-date as well.
  • Limit Hostname to your Domain(s): This is a confusing one, but some of the traffic in your analytics never actually reached your site. There are exploits of Google Analytics that register hits for your account without any visit taking place. By filtering out traffic to anything but valid Hostnames for your company, you immediately remove a chunk of the problematic data.
  • Language Source: For most small businesses located in the U.S., the only language source you care about is “en-us” for United States English. Below is an example of data from the same client, over the same period of time, broken down by language.
Analytics by Language
“All Users” is the total data while “All Sessions (No Spam)” is the filtered data which removes known bots and only allows valid Hostnames.
  1. en-us: For this client, only 68.54% of the total traffic was “en-us” compared to 97.49% of the filtered traffic.
  2. (not set): Language “not set” is a typical attribute for bots, and made up 21.51% of all traffic to the site, with 0.00% making it through the spam filter.
  3. Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!: Obviously this is not a real language. Entries like this began showing up frequently in October as the election coverage reached peak interest. It is spam that traces back to Russia and former Soviet countries. The targets of these spam links are the analysts reading the analytics report. The attackers are hoping you will be curious enough to follow the link. This is an example of malicious intent that is actually harmless to your site if left alone.
  4. pt-br (Portuguese – Brazil): Again, if traffic is coming from a large country where you don’t do business (such as Brazil), the odds are it is fake. The two hits that made it through the spam filter are likely also fake and should be filtered out in the future as I improve my detection methods and update from the lists of known bots.
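
For readers who export their analytics data and want to experiment, the three checks above can be sketched in a few lines of Python. This is only an illustration, not how Google Analytics applies filters internally: the Session fields, the hostnames in VALID_HOSTNAMES, the entry in KNOWN_BOT_SOURCES, and the language-code pattern are all assumptions made up for this example. In practice you would substitute your own domains and a maintained list of known bots.

```python
import re
from dataclasses import dataclass

# Hypothetical values for illustration; replace with your own domains
# and a regularly updated list of known bot/spam sources.
VALID_HOSTNAMES = {"example.com", "www.example.com"}
KNOWN_BOT_SOURCES = {"spam-referrer.example"}

# Assumed shape of a plausible language code, such as "en" or "en-us".
# Values like "(not set)" or a whole spam sentence will not match.
LANGUAGE_CODE = re.compile(r"[a-z]{2,3}(-[a-z0-9]{2,4})?")


@dataclass
class Session:
    hostname: str  # hostname the hit claims to have visited
    source: str    # referrer or traffic source
    language: str  # browser language reported with the hit


def is_likely_real(session: Session) -> bool:
    """Return True if the session passes all three spam checks."""
    if session.hostname.lower() not in VALID_HOSTNAMES:
        return False  # hit was recorded for a hostname that is not ours
    if session.source.lower() in KNOWN_BOT_SOURCES:
        return False  # source appears on the known-bot list
    if LANGUAGE_CODE.fullmatch(session.language.lower()) is None:
        return False  # language value is not shaped like a language code
    return True


def filter_sessions(sessions: list[Session]) -> list[Session]:
    """Keep only the sessions that look like real visitors."""
    return [s for s in sessions if is_likely_real(s)]


def language_share(sessions: list[Session], language: str) -> float:
    """Percentage of sessions reporting the given language."""
    if not sessions:
        return 0.0
    matches = sum(1 for s in sessions if s.language.lower() == language)
    return 100.0 * matches / len(sessions)
```

Comparing language_share(all_sessions, "en-us") with language_share(filter_sessions(all_sessions), "en-us") gives the same kind of before-and-after view as the two columns in the report above. Note that a well-formed code such as “pt-br” still passes this sketch, which matches what happened with the two Brazilian hits: catching those requires better bot lists, not just a format check.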

Use the Right Data

You should be using analytics to determine where to focus your website and social media efforts. Knowing the ratio of desktop to mobile users or Android vs. iOS traffic can determine how you update your website, which social media strategies will lead to the most growth, or which platform you roll out an app for first. Bad data can skew your conclusions and lead you to make the wrong decisions.

Good analytics are an essential tool and many marketing and core business decisions should no longer be made without justification from the data. Make sure you understand how to read and make sense of that information and never rely solely upon generic totals without filters.