Criteo Crawler is software that visits web pages and analyzes their content so that relevant ads can be served on them.
The Criteo crawler identifies itself with the user agent “CriteoBot/0.1”.
Criteo is a leading global technology company powering the world’s marketers with trusted and impactful advertising. Criteo empowers companies of all sizes with technology to better know and serve their customers. Criteo is in the process of building a contextual advertising offering to help its publisher partners better monetize their content and support advertisers by better aligning their ads to relevant web pages.
To support its contextual offering, Criteo analyzes public web content by crawling web pages. Criteo’s technology identifies the content categories of a given web page.
For example, an article about sport and running shoes would be classified in the category “sport” and the sub-category “running”.
The Criteo crawler attempts to access a URL only when your website sends Criteo a request to deliver an ad on your domain. It also limits the number of visits to your website: it requests a page again only if the categories compiled for that page are no longer available or no longer up to date.
The crawler does not extract or store any source code; it only provides data about the publicly available content of the page, such as the language and the categories of the content (e.g. sports > running).
Criteo Crawler is a privacy-compliant system. The crawler does not access any data about the users navigating your website. It only accesses published content that is publicly available on the internet.
Many premium publishers explicitly allow Criteo Crawler to access their sites. Publishers benefit from Criteo’s categorization of their inventory, which helps optimize targeted campaigns.
To allow the Criteo crawler, please add a separate paragraph to your robots.txt as follows:
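A minimal sketch of such a paragraph, assuming the user-agent token is the “CriteoBot/0.1” string quoted in the exclusion instructions on this page (please confirm the exact token with Criteo before relying on it):

```text
# Assumed token: "CriteoBot/0.1", as quoted elsewhere on this page
User-agent: CriteoBot/0.1
# Allow the crawler to visit every path on the site
Allow: /
```

Leave a blank line between this paragraph and any other User-agent groups already in the file.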
If you prefer to keep the Criteo crawler out of specific sections of your site, please add a separate paragraph to your robots.txt and specify the path you’d like to exclude, as follows:
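A minimal sketch of a partial exclusion, assuming the user-agent token “CriteoBot/0.1” quoted on this page; the path /private/ is a hypothetical placeholder to replace with the section you want to exclude:

```text
# Assumed token: "CriteoBot/0.1", as quoted elsewhere on this page
User-agent: CriteoBot/0.1
# Hypothetical path: replace /private/ with the section to exclude
Disallow: /private/
```

Add one Disallow line per section if several parts of the site should be excluded.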
If you prefer to exclude the Criteo crawler from your site entirely, please add a separate paragraph to your robots.txt as follows:
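A minimal sketch of a full exclusion, again assuming the user-agent token “CriteoBot/0.1” quoted on this page:

```text
# Assumed token: "CriteoBot/0.1", as quoted elsewhere on this page
User-agent: CriteoBot/0.1
# Disallow every path on the site for this crawler
Disallow: /
```

This group applies only to the named crawler; rules for other user agents in the same file are not affected.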
You can also exclude the crawler if you are not yet able to use the robots.txt process.
If you are a direct Criteo partner, please contact your Criteo representative. Crawling of your domains will stop within 24 hours.
If you are not a direct Criteo partner, please add the user agent “CriteoBot/0.1” to your robots.txt and contact email@example.com, listing the domains that you would like to exclude from crawling. Criteo will stop crawling your website within 24 hours.
If you need to know more about the crawler, please contact your Criteo representative if you are a direct Criteo partner, or email us at firstname.lastname@example.org.