Criteo crawler

What is Criteo crawler?

Criteo crawler is software that visits web pages and analyzes their content so that relevant ads can be served on them.

Criteo crawler is identified by the following user agent:

CriteoBot/0.1 (+https://www.criteo.com/criteo-crawler/)

The Criteo crawler is associated with the following Autonomous System Numbers (ASNs): 44788, 19750, and 55569, and with the following IP address ranges:

  • 178.250.0.0/21
  • 185.235.84.0/22
  • 91.212.98.0/24
  • 91.199.242.0/24
  • 2a02:2638::/32
  • 74.119.116.0/22
  • 199.204.168.0/22
  • 177.73.128.0/21
  • 2620:100:a000::/44
  • 116.213.20.0/22
  • 182.161.72.0/22
  • 2406:2600::/32
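As an illustration of how the user agent and the ranges above might be used together, here is a minimal Python sketch that checks whether a request appears to come from Criteo crawler. The function name `is_criteo_crawler` and the constant `CRITEO_NETWORKS` are hypothetical names chosen for this example, and the substring match on "CriteoBot" is an assumption about how you may want to match the user agent. The ranges are copied from the list above and may change over time.

```python
import ipaddress

# Ranges copied from the list above; re-check them before relying on this.
CRITEO_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "178.250.0.0/21",
        "185.235.84.0/22",
        "91.212.98.0/24",
        "91.199.242.0/24",
        "2a02:2638::/32",
        "74.119.116.0/22",
        "199.204.168.0/22",
        "177.73.128.0/21",
        "2620:100:a000::/44",
        "116.213.20.0/22",
        "182.161.72.0/22",
        "2406:2600::/32",
    )
]


def is_criteo_crawler(user_agent: str, remote_ip: str) -> bool:
    """Return True when the User-Agent names CriteoBot and the IP is in a listed range."""
    # Assumption: a simple substring match on the product name is sufficient.
    if "CriteoBot" not in user_agent:
        return False
    address = ipaddress.ip_address(remote_ip)
    # An IPv4 address is never contained in an IPv6 network, and vice versa.
    return any(address in network for network in CRITEO_NETWORKS)
```

Checking the IP address in addition to the user agent helps because the user agent string alone can be set by any client.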

Why does Criteo crawler visit my site?

Criteo is a leading global technology company powering the world’s marketers with trusted and impactful advertising. Criteo empowers companies of all sizes with technology to better know and serve their customers. Criteo has a contextual advertising offering to help its publisher partners better monetize their content and support advertisers by better aligning their ads to relevant web pages.

To support its contextual offering, Criteo analyzes public web content by crawling web pages. Criteo’s technology identifies the content categories of a given web page.
For example, an article about sport and running shoes would be classified in the category “sport” and the sub-category “running”.

When does Criteo crawler visit my site?

Criteo crawler attempts to access URLs only when your website sends a request to Criteo to deliver an ad on your domain. Criteo crawler limits its visits to your website: it requests access only if the compiled categories are no longer available or no longer up to date.

What data is crawled on my site?

Criteo crawler is a privacy-compliant system. The crawler does not access the data of users navigating your website. It only accesses published data that is publicly available on the internet.

How can I authorize the crawler?

Many premium publishers explicitly allow Criteo crawler to access their sites. Publishers benefit from Criteo’s categorization of their inventory to optimize targeted campaigns.

To approve Criteo crawler, please add a separate paragraph to the robots.txt as follows:

User-agent: CriteoBot/0.1
Disallow:

How can I exclude the crawler?

If you prefer to prevent Criteo crawler from visiting specific sections of your site, please add a separate paragraph to the robots.txt and specify the path you’d like to exclude as follows:

User-agent: CriteoBot/0.1
Disallow: /path/

If you prefer to prevent Criteo crawler from visiting your site entirely, please add a separate paragraph to the robots.txt as follows:

User-agent: CriteoBot/0.1
Disallow: /

Note
Criteo crawler respects the crawl-delay directive (up to 30 seconds; decimal values such as 0.1 are accepted).
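The robots.txt paragraphs described above can also be generated programmatically. The following is a minimal sketch, not an official tool: the function name `criteo_robots_group` and its parameters are hypothetical, and rejecting delays above 30 seconds is an assumption based on a conservative reading of the note above (the source does not say how larger values are treated).

```python
from typing import Optional, Sequence

MAX_CRAWL_DELAY_SECONDS = 30.0  # upper bound stated in the note above


def criteo_robots_group(
    disallow: Sequence[str] = (),
    crawl_delay: Optional[float] = None,
) -> str:
    """Build a robots.txt paragraph for Criteo crawler.

    An empty `disallow` produces an empty Disallow line, which allows everything.
    """
    # Assumption: values outside 0..30 are rejected rather than clamped.
    if crawl_delay is not None and not 0 <= crawl_delay <= MAX_CRAWL_DELAY_SECONDS:
        raise ValueError("crawl_delay must be between 0 and 30 seconds")

    lines = ["User-agent: CriteoBot/0.1"]
    lines += [f"Disallow: {path}" for path in disallow] or ["Disallow:"]
    if crawl_delay is not None:
        # The 'g' format keeps decimals such as 0.1 and drops a trailing ".0".
        lines.append(f"Crawl-delay: {crawl_delay:g}")
    return "\n".join(lines) + "\n"
```

For example, `criteo_robots_group(["/path/"], crawl_delay=0.1)` returns a paragraph that excludes /path/ and asks for a 0.1 second delay, while `criteo_robots_group(["/"])` excludes the whole site.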

More Information

If you need to know more about the crawler, please contact your Criteo representative; if you are a Criteo direct partner, please email us at crawler@criteo.com.