This post assumes you know a little about bots and how they might influence your website. If you need to brush up on your understanding, go back to 'What is a Bot?'. Bot traffic on your site is usually innocuous, but because bots behave similarly to humans, it might skew your Google Analytics traffic statistics if it isn't properly filtered out. While screening out bot traffic won't guarantee that your data is totally clean, it will get you started.
How to exclude existing bot traffic
Fortunately, Google is aware of the majority of bots that your site is likely to encounter, so you can easily filter this traffic out of your Google Analytics account. First, go to Google Analytics and click the 'Admin' cog in the bottom left corner. Then go to 'View Settings' and tick the box labeled "Exclude all hits from known bots and spiders."
It's worth remembering that this option only affects traffic recorded after you turn it on, not retrospectively, so make sure you do it as soon as possible.
How to identify and filter unknown bots
However, even with this option set, Google Analytics' known-bot filter isn't foolproof, and bot traffic can still enter your account. When you start digging, you'll typically see a group of users (from the same location, using the same device, using the same network provider, etc.) that behaves very differently from the rest of your customers. Bot traffic is usually labeled as 'Direct,' so it often shows up in your Google Analytics account as a rise in direct traffic. Abrupt surges in traffic for specific dimensions can sometimes reveal bot hits, but this is not always the case. The most straightforward way to hunt for bot traffic is to play around with different combinations of dimensions and secondary dimensions and see if anything looks out of the ordinary.
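If you export your daily 'Direct' session counts, a rough way to spot the abrupt surges described above is to compare each day against the median. This is only a minimal sketch: the dates and numbers are made up, `find_traffic_spikes` is a hypothetical helper, and the 3x threshold is an assumption you would tune to your own site.

```python
from statistics import median


def find_traffic_spikes(daily_sessions, ratio=3.0):
    """Return the days whose session count is more than `ratio` times the median."""
    if not daily_sessions:
        return []
    baseline = median(daily_sessions.values())
    if baseline == 0:
        return []  # no meaningful baseline to compare against
    return [day for day, sessions in daily_sessions.items()
            if sessions > ratio * baseline]


# Hypothetical export: sessions per day for the 'Direct' channel.
direct_sessions = {
    "2023-03-01": 120,
    "2023-03-02": 131,
    "2023-03-03": 118,
    "2023-03-04": 940,
    "2023-03-05": 125,
}

print(find_traffic_spikes(direct_sessions))  # prints ['2023-03-04']
```

A spike on its own doesn't prove bot activity (a campaign or a press mention can look the same), so treat flagged days as a starting point for digging into dimensions.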
Here are some bot traffic red flags to keep an eye out for:
- A large number of users whose location is '(not set)'
- Users in groups (e.g., from the same location, network, service provider, etc.) that have high bounce rates or short session durations (a few seconds)
- Users in groups with an unusually high number of new users
- Users that have a relatively high bounce rate
- Sudden surges in traffic for specific dimensions
- Referral sources that are unusual or questionable
- Hostnames that aren't associated with your website (Hostname is a secondary dimension)
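Several of the red flags above can be checked together once you've exported a report grouped by, say, network and city. Here's a minimal sketch that flags groups showing two or more warning signs at once. The field names (`segment`, `sessions`, `bounce_rate`, `avg_session_duration`, `new_sessions`), the sample rows, and every threshold are assumptions for illustration, not values taken from Google Analytics.

```python
def flag_suspicious_segments(rows, min_sessions=50, min_bounce=0.9,
                             max_duration=5.0, min_new_share=0.95):
    """Return (segment, reasons) pairs for groups showing two or more red flags."""
    flagged = []
    for row in rows:
        if row["sessions"] < min_sessions:
            continue  # too little traffic to judge either way
        reasons = []
        if row["bounce_rate"] >= min_bounce:
            reasons.append("high bounce rate")
        if row["avg_session_duration"] <= max_duration:
            reasons.append("very short sessions")
        if row["new_sessions"] / row["sessions"] >= min_new_share:
            reasons.append("almost all new users")
        if len(reasons) >= 2:
            flagged.append((row["segment"], reasons))
    return flagged


# Hypothetical rows; durations are in seconds, bounce rate is a fraction.
report = [
    {"segment": "examplenet isp / Ashburn", "sessions": 400,
     "bounce_rate": 0.99, "avg_session_duration": 1.2, "new_sessions": 398},
    {"segment": "hometown broadband / Leeds", "sessions": 350,
     "bounce_rate": 0.42, "avg_session_duration": 95.0, "new_sessions": 120},
    {"segment": "(not set) / (not set)", "sessions": 12,
     "bounce_rate": 1.0, "avg_session_duration": 0.0, "new_sessions": 12},
]

for segment, reasons in flag_suspicious_segments(report):
    print(segment, "->", ", ".join(reasons))
```

Requiring at least two signals is a deliberate choice here: a high bounce rate alone is common for perfectly real visitors, so a single flag would produce a lot of false alarms.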
Once you've discovered a bot, you can set up a filter to exclude its traffic from future Google Analytics data.
How to create a filter for unknown bots in 10 steps
- Create a New Google Analytics View - This is the most crucial step, since once you've filtered data out of a view in Google Analytics, you'll never be able to get it back. Start by creating a new view to test your changes and ensure everything works as expected. Keep an unfiltered view in your Google Analytics account at all times as a fallback in case something goes wrong.
- Examine Your Bot - What are the common threads in your bot traffic? (For example, is it the hostname, city, or IP address?)
- Go to the Admin panel and click 'Filters' in the 'View' column on the right for your new view.
- Click 'Add Filter'.
- Give your filter a name that will make it easier to remember in the future.
- Choose a filter type. You may need to use a Custom or Predefined filter depending on the criteria you'll use to filter out the bot. Look over each to find which one best fits your situation.
- Make sure 'Exclude' is chosen, then use the 'Filter Field' drop-down to pick the type of dimension you want to exclude, and in the 'Filter Pattern' box, enter the text you're using to identify bot traffic. (For example, if you're getting bot hits from nastybots.com, choose 'Hostname' from the 'Filter Field' drop-down and type 'nastybots.com' into the 'Filter Pattern' field.)
- Hit 'Save'!
- It's now time to double-check that your filter worked as expected. It usually works right away, but it might take up to 24 hours, so give it a day or two. To make sure it's filtering the right traffic, compare data from your new testing view with data from your preferred view. The bot traffic should still be shown in your preferred view, but it shouldn't be visible in the testing view. (Make sure that the bot traffic is the only thing you've filtered out! Check that none of your actual user data has been removed.) Check back a couple of times over the next few days to be sure nothing has fallen through the cracks.
- If your testing view is performing as expected, it's time to add your new filter to your preferred view. Simply follow the steps above, making sure everything is identical to your testing view.
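If it helps to picture what the 'Exclude' filter and the testing-view comparison in the steps above are doing, here's a rough Python sketch. It assumes the filter pattern is treated as a case-insensitive regular expression that can match anywhere in the field, which is an assumption about the filter's behavior rather than something confirmed here. The hit data and the `apply_exclude_filter` helper are hypothetical.

```python
import re


def apply_exclude_filter(hits, field, pattern):
    """Keep only the hits whose `field` does NOT match the pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [hit for hit in hits if not regex.search(hit.get(field, ""))]


# Hypothetical hits as they would appear in the unfiltered view.
unfiltered_view = [
    {"hostname": "www.example.com", "page": "/"},
    {"hostname": "nastybots.com", "page": "/"},
    {"hostname": "www.example.com", "page": "/pricing"},
    {"hostname": "spam.nastybots.com", "page": "/"},
]

# Escaping the dot keeps the pattern from matching any character in that spot.
testing_view = apply_exclude_filter(unfiltered_view, "hostname", r"nastybots\.com")

# Compare the two views to confirm that only bot hits were removed.
removed = [hit for hit in unfiltered_view if hit not in testing_view]
print(len(testing_view), "hits kept")
print(sorted({hit["hostname"] for hit in removed}))
```

Listing the removed hostnames is the scripted version of the manual check: if anything from your own domain shows up in that list, the pattern is too broad.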
That's all there is to it! You've filtered bot traffic out of your valuable Google Analytics data.
Please contact us if your filter hasn't functioned as planned or if you believe your Google Analytics account may benefit from an audit. We'd love to have a peek.