Blocking Price Scraping & Content Scraping Bots
Oct 7, 2021
If you operate a retail website, at some point you’ll run into price and content scraping bots. Typically sent by your competitors, these bots harvest your website data to give those competitors an edge: for example, a view of your full product catalog and pricing, plus an alert whenever a price changes.
Because bots are automated, it’s easy to make them greedy. Consequently, most Price / Content Scraping bots cycle through your website almost continually, watching to see if you make changes to your product line, like introducing a sale price or launching a new product.
Price scraping and content scraping bots don’t help you, and they waste your server resources. For many retail websites, content scraping bots can make up 75% or more of the requests to your server! In some cases they practically DDoS your website as they harvest data from you.
The Turnstil.Cloud Dashboard image above shows a content scraping bot as it routinely pulls data from a website. Note the even intervals at which it scans the website (every 5 minutes) and the large number of requests launched in every scan (63% of total server requests over the last hour). For a website with limited resources, this could be enough to slow down your server or even crash the database.
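If you want to look for the same two signals in your own access logs, here is a minimal Python sketch. It is not how Turnstil.Cloud works internally; it simply assumes you have already parsed your log into (timestamp in seconds, user agent) pairs. The function names, the 60-second burst gap, the 50% share threshold, and the 10% tolerance are all illustrative assumptions you would tune for your own traffic.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def request_share(requests):
    """Return {user_agent: fraction of all requests} for (timestamp, agent) pairs."""
    counts = Counter(agent for _, agent in requests)
    total = sum(counts.values())
    return {agent: n / total for agent, n in counts.items()} if total else {}


def burst_starts(timestamps, gap=60):
    """Collapse timestamps into burst start times.

    A new burst begins whenever more than `gap` seconds (an assumed value)
    have passed since the previous request.
    """
    starts = []
    previous = None
    for t in sorted(timestamps):
        if previous is None or t - previous > gap:
            starts.append(t)
        previous = t
    return starts


def looks_scheduled(timestamps, gap=60, tolerance=0.1, min_bursts=4):
    """True if bursts recur at nearly constant intervals.

    Regularity is measured as the standard deviation of the intervals
    divided by their mean; values at or below `tolerance` count as regular.
    """
    starts = burst_starts(timestamps, gap)
    if len(starts) < min_bursts:
        return False
    intervals = [b - a for a, b in zip(starts, starts[1:])]
    avg = mean(intervals)
    return avg > 0 and pstdev(intervals) / avg <= tolerance


def flag_scrapers(requests, share_threshold=0.5):
    """Return user agents that are both high-volume and evenly scheduled."""
    by_agent = defaultdict(list)
    for timestamp, agent in requests:
        by_agent[agent].append(timestamp)
    shares = request_share(requests)
    return sorted(
        agent
        for agent, stamps in by_agent.items()
        if shares[agent] >= share_threshold and looks_scheduled(stamps)
    )
```

Requiring both signals is a deliberate choice in this sketch: a busy but legitimate crawler may be high-volume, and a monitoring ping may be perfectly regular, but a user agent that is both is worth a closer look.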
Price and content scraping bots like the one pictured above are easy to stop with Turnstil.Cloud: just click the “Block” button beside the user agent in the Top 10 list (or the corresponding IP address in the IP Top 10 list). Just like that, the scraping stops …
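For readers curious what a block rule amounts to conceptually, the sketch below shows one simple way to check a request against a list of blocked user agents and IP addresses. It is an illustration only, not Turnstil.Cloud's implementation; the `BlockList` class and its method names are hypothetical, and the addresses used are reserved documentation ranges.

```python
import ipaddress


class BlockList:
    """Hypothetical in-memory block list keyed by user agent and IP range."""

    def __init__(self):
        self.agents = set()
        self.networks = []

    def block_agent(self, user_agent):
        # Compare case-insensitively so trivial case changes do not slip through.
        self.agents.add(user_agent.strip().lower())

    def block_ip(self, address):
        # Accepts a single address ("203.0.113.9") or a CIDR range ("203.0.113.0/24").
        self.networks.append(ipaddress.ip_network(address, strict=False))

    def is_blocked(self, ip, user_agent):
        if user_agent.strip().lower() in self.agents:
            return True
        addr = ipaddress.ip_address(ip)
        return any(addr in network for network in self.networks)


blocklist = BlockList()
blocklist.block_agent("ExampleScraper/1.0")
blocklist.block_ip("203.0.113.0/24")
print(blocklist.is_blocked("198.51.100.7", "ExampleScraper/1.0"))
print(blocklist.is_blocked("198.51.100.7", "Mozilla/5.0"))
```

Keep in mind that a bot can change its user agent string at will, which is why having the IP address as a second handle is useful.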