We have measures in place to prevent robot traffic from polluting your data. An organisation that records all known robots provides us with a list (IAB/ABC), and we base our traffic exclusions on it so that bot traffic is not included alongside your other data in the interface.
Some organisations monitor your site's availability with bots: Microsoft System Center Operations Manager, Gomez Agent, Observer, Nagios, etc. These bots are excluded by default, but you can choose to include them by going to Configuration > Parameters > Monitoring/Excluding traffic > Robots.
Off-line bots are not excluded by default, but you can choose to exclude them from the same menu as above. Here are some of the more notable ones: Download Ninja, Heritrix, Webcopier, PageNest Pro, WebZip, etc.
When traffic is excluded, the exclusion is based on the IP addresses or User Agents of robots in the above list.
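As an illustration only, matching a hit against an exclusion list by IP address or User Agent could be sketched as follows. The list entries and the function name `is_excluded` are hypothetical; the actual IAB/ABC list and the platform's internal matching logic are not reproduced here.

```python
import ipaddress

# Hypothetical exclusion entries, for illustration only.
EXCLUDED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
EXCLUDED_UA_TOKENS = ["nagios", "heritrix"]


def is_excluded(ip: str, user_agent: str) -> bool:
    """Return True if the hit matches an excluded IP range or User-Agent token."""
    address = ipaddress.ip_address(ip)
    if any(address in network for network in EXCLUDED_NETWORKS):
        return True
    # Case-insensitive substring match on the User-Agent string.
    ua = user_agent.lower()
    return any(token in ua for token in EXCLUDED_UA_TOKENS)


print(is_excluded("192.0.2.15", "Mozilla/5.0"))
print(is_excluded("198.51.100.7", "check_http/v2.3 (nagios-plugins 2.3)"))
```

In this sketch a hit is excluded as soon as either criterion matches, which mirrors the "IP addresses or User Agents" wording above.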
However, the IAB cannot record every robot in existence. Some robots, especially recent ones that have not yet been listed, may bypass our exclusions and cause unusual spikes in your traffic. When this happens, you can often recognise robots by their abnormal behaviour. Using various indicators, such as the following six, you can check for this behaviour amongst your visitors:
- Time spent / pages: a low value may indicate a crawler
- Page views / visits: a high value may also indicate a crawler
- URLs (Referrer sites): a spike in visits from an unknown domain is suspicious
- Countries (Geolocation): a spike in visits from a country that doesn't usually feature in your stats is suspicious
- Towns (Geolocation): a spike in visits from a town that doesn't usually feature in your stats is suspicious
- Models (OS): a spike in visits from the same device model is suspicious
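If you export your visit data, the indicators above could be checked with a short script. The following is a minimal sketch, not a feature of the interface: the `Visit` structure, the function names and all thresholds are assumptions to be tuned against your own traffic.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Visit:
    page_views: int
    duration_seconds: float


# Hypothetical thresholds; adjust them to what is normal for your site.
MIN_SECONDS_PER_PAGE = 2.0
MAX_PAGES_PER_VISIT = 50
SPIKE_FACTOR = 5.0


def looks_like_crawler(visit: Visit) -> bool:
    """Flag visits with very little time per page or very many page views."""
    if visit.page_views == 0:
        return False
    seconds_per_page = visit.duration_seconds / visit.page_views
    return (seconds_per_page < MIN_SECONDS_PER_PAGE
            or visit.page_views > MAX_PAGES_PER_VISIT)


def spiking_values(history: dict[str, list[int]], today: dict[str, int]) -> list[str]:
    """Return values (countries, towns, referrers, device models) whose
    visits today far exceed their usual daily level."""
    flagged = []
    for value, count in today.items():
        past = history.get(value, [])
        baseline = mean(past) if past else 0.0
        # Use a floor of 1 so that previously unseen values can still be flagged.
        if count > SPIKE_FACTOR * max(baseline, 1.0):
            flagged.append(value)
    return flagged


print(looks_like_crawler(Visit(page_views=120, duration_seconds=60)))
print(spiking_values({"FR": [100, 110, 90]}, {"FR": 105, "XX": 40}))
```

The same `spiking_values` check can be run once per dimension (referrer domain, country, town, device model), since each of those indicators is a spike relative to the usual level.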
If you suspect abnormal traffic, please contact the Support Centre and provide context for your suspicions. You can also exclude (or monitor) traffic based on IP addresses yourself via Configuration > Parameters > Monitoring/Excluding traffic > Monitored/Excluded IP addresses. We may, in certain exceptional cases, be able to provide you with the IP addresses linked to traffic that you consider suspect but were unable to conclusively attribute to robots.
To request these IP addresses, the administrator of the contract will need to send us a written request via the Support Centre with the following elements:
- Client ID and Site ID
- Domain name of the site where there is suspicious traffic
- Explicit request for the IP addresses to be shared in order to exclude them from your traffic
/!\ Please note that if you use IP anonymisation (e.g. with the CNIL Exemption), we will not be able to provide you with a precise IP address, only the associated range. It is up to you whether to exclude or monitor this range.
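Before excluding a range, it can help to see how many addresses it covers, since legitimate visitors may share it. Here is a minimal sketch, assuming purely for illustration that the range is given in CIDR notation as a /24 block; the actual size and format of the range provided may differ, and `range_summary` is a hypothetical helper.

```python
import ipaddress


def range_summary(cidr: str) -> tuple[str, str, int]:
    """Return the first address, last address and size of an IP range."""
    network = ipaddress.ip_network(cidr)
    return str(network[0]), str(network[-1]), network.num_addresses


# Example range taken from the documentation-reserved block 203.0.113.0/24.
first, last, size = range_summary("203.0.113.0/24")
print(first, last, size)
```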
It is also possible to exclude traffic through the Exclusions available in Data Management. You can then base your exclusion on one or more criteria such as city, organisation, ISP, User-Agent, etc.