
Site load: reduced by blocking robots

Lecture





Many people have run into the problem of a slow blog or site. The usual advice is to optimize scripts so the site runs faster, but it is easy to forget that even with low traffic a site can generate a lot of work, and the hosting will not always cope with it.

How to reduce the load on WordPress or any other engine?

You can enable page caching or database query caching, and these measures are certainly necessary, but they only deal with the consequences of the load.

The load itself, meanwhile, is created not only by visitors but also by search engine robots and plain spammers, each contributing its share.

Let's look at the statistics collected by the StatPress plugin for WordPress. It does not recognize every bot, but it reliably identifies the major search engines and their crawlers:

  [Screenshot: StatPress statistics of visits by users and robots]

As we can see, the load is created not only by users; a huge share of it comes from robots.

The bots are very diverse, and this is only the tip of the iceberg:

  [Screenshot: a sample of the bots recorded by StatPress]

The plugin provides a lot of useful data, including the number of requests to the RSS feed.

  [Screenshot: StatPress data on RSS feed requests]

We can accurately determine which search robots are useful to us:

  [Screenshot: useful search robots in the StatPress report]

From the data on the screen we can tell that the main Yandex indexer has visited (Yandex/1.01.001 (compatible; Win16; I)) along with the robot that indexes multimedia content (Yandex/1.01.001 (compatible; Win16; m)).

When promoting a site, you should analyze this data to understand what the search engines index, how, and how often; forecasts made after any change to the site will then be more accurate.

We also see robots that are of little interest to us: they create load on the site while sending us little or no traffic, and what traffic they do send is poorly targeted:

  [Screenshot: non-targeted robots in the StatPress report]

And here comes the most interesting part, which many people miss.

ALL robots that bring no targeted traffic need to be blocked!

This report shows the exact user agent, which of course can be disallowed in the robots.txt file, but that measure is not reliable: search engines have repeatedly proved that they can ignore this file and its directives.

We need to repel robots with the .htaccess file, that is, block them at the server level so they never even reach the site. This can be done either by user agent or by IP address.

  [Screenshot: blocking rules in the .htaccess file]
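Besides the Order/Allow/Deny approach used in the practice section below, the same blocking can be done with mod_rewrite. This is a minimal sketch, assuming mod_rewrite is available and using the same ia_archiver and msnbot bots that appear in the example further down; matching user agents get a 403 response:

# refuse requests whose User-Agent matches the listed bots
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (ia_archiver|msnbot) [NC]
RewriteRule .* - [F,L]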

Naturally, you need to be absolutely sure that you are blocking only what is unnecessary, otherwise out of ignorance you can do serious damage. Google alone crawls under at least 6 different user agents.

Analyzing the site logs also lets you check whether the hosting is working correctly.

  [Screenshot: server log entries]

As you can see, after a manual request with that user agent, the Twiceller-0.9 search robot should in theory be cut off, because the corresponding directive is written in .htaccess, but the server logs show a different picture:

  [Screenshot: server log showing the Twiceller-0.9 request returning code 200]

With this in hand you can reasonably write to the hosting support service and ask them to fix the problem: the robot is not being blocked, since we see status code 200 instead of the 403 it should have received.
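The status code is exactly what the access log records. For reference, in the standard combined log format (this LogFormat line lives in the main Apache configuration, not in .htaccess, and is shown here only to point out the field to check) the %>s token is the final status code, and for a blocked bot it should read 403 rather than 200:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog logs/access_log combined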

My approach is simple: if a robot is of no use to me, I block it. If I do not know why a robot is visiting my site, I block it.

By blocking robots in .htaccess, you reduce the load on the server.

A user agent does not always correspond to reality: anyone can substitute anything there and pose as, say, Yandex, so the safer option is to block the IP address. If you need to block someone with 100% certainty, it is better to duplicate both directives.
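As a sketch of that duplication (the bot name BadBot and the address from the 203.0.113.0/24 documentation range are made up purely for illustration; the full <Limit> wrapper is shown in the practice section below), the pair of directives for one bot looks like this:

# hypothetical bot pinned down both by user agent and by IP
SetEnvIfNoCase User-Agent "BadBot" search_bot
Deny from env=search_bot
Deny from 203.0.113.10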

The same can be done not only with search robots but also with spammers. Let's see who is spamming us.

  [Screenshot: spam requests in the site logs]

Identify the patterns and IP addresses and block them. If the spam comes from Chinese servers, for example, you can even block entire subnets, as sketched below.
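A sketch of subnet blocking in .htaccess (the ranges here come from the reserved documentation blocks and are purely illustrative): a partial address covers every host whose IP starts with those octets, and CIDR notation is also accepted:

# all of 198.51.100.0 - 198.51.100.255
Deny from 198.51.100
# the same idea in CIDR notation
Deny from 203.0.113.0/24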

For example, spam blogs can even be useful to you. How?

You collect statistics on the sites that promote themselves and inflate their metrics this way, and then confidently add them to the blacklist on the link exchange (one more way to grow that list): no matter how authoritative such sites may look, their link weight will not be passed on, and sooner or later they will be hit with sanctions. Nor is it worth overpaying for artificially inflated metrics, since it is those metrics that determine the price of a link.

At the moment my .htaccess file contains more than 300 lines of restrictions.

Everything above was a lyrical digression; now let's get to practice and see how it is done.

SetEnvIfNoCase User-Agent "^ia_archiver" search_bot
SetEnvIfNoCase User-Agent "^msnbot" search_bot

<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=search_bot
Deny from 87.118.86.131
Deny from 89.149.241
Deny from 89.149.242
Deny from 89.149.243
Deny from 89.149.244
</Limit>

To explain: these lines block the ia_archiver robot (the Web Archive) and msnbot (the MSN/Bing search engine), as well as several IP addresses belonging to confirmed spammers.

Naturally, I am not going to publish my entire .htaccess file, but by collecting your own statistics you can not only reduce the load but also keep unwanted visitors off the site, the ones that scrape it and send spam.

Analyze your site logs regularly: you can find many useful things there, and there are many ways to put that data to work.




Terms: seo, smo, monetization, basics of internet marketing