I noticed that my blog becomes inaccessible for about a minute at different times of the day. When I checked my logs, it seemed that two extremely aggressive search engines were spidering the site and using up my server resources: the Chinese search engine Baidu and the Russian search engine Yandex.
I’m on shared hosting, so resources are precious to me and I wanted to find out what I could do. I couldn’t see any benefit in these two crawling my site, so I first tried to block them using robots.txt. Apparently both have been classed as bad bots because they do not obey robots.txt. I then tried to block them using the .htaccess file instead, but still they came.
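For anyone who wants to sanity-check their own robots.txt rules before blaming the bots, here is a minimal sketch in Python. It uses the standard library's `urllib.robotparser` to show what a well-behaved crawler would conclude from a given file. The robots.txt content below is an assumption for illustration, not a copy of my actual file, and `is_allowed` is just a hypothetical helper name. `Baiduspider` and `Yandex` are the user-agent tokens those two crawlers identify themselves with.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption, not my real file): ask Baidu's and
# Yandex's crawlers to stay away and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: Yandex
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str = "/") -> bool:
    """Return True if a crawler that obeys robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Baiduspider", "YandexBot", "Googlebot"):
        verdict = "allowed" if is_allowed(agent) else "blocked"
        print(f"{agent}: {verdict}")
```

Bear in mind that robots.txt is only a request. This check tells you whether the rules are written correctly, not whether a particular bot will honour them.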
Some forums have suggested that the misbehaviour isn’t coming from those search engines at all, but from fake bots posing as them, and those definitely do not obey the rules. With the ongoing brute force attacks on WordPress sites that started earlier in the year, I’m a bit manic about security. So although I already use a security plugin on my blog, I’ve added extra protection by signing up with CloudFlare. Not only do I get a free caching service that speeds up my website, I’m also protected from these aggressive bots. I’m looking forward to seeing a decrease in spam as well, because spammers get on my last nerve.
I’ve not seen any downsides as yet but I’ll let you know if I do.