If you go to the WordPress admin and then Settings -> Privacy, you will find two options asking whether you want your blog to be visible to search engines. One of the options is:

I would like to block search engines, but allow normal visitors

How exactly does WordPress actually block search bots/spiders from crawling the site once it is live?

According to the Codex, it is simply the robots meta tag, robots.txt and suppression of pingbacks:

Causes <meta name='robots' content='noindex,nofollow' /> to be generated into the <head> section (if wp_head is used) of your site's source, causing search engine spiders to ignore your site.
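
Conceptually, the hook-up looks something like the sketch below. This is an approximation rather than core's exact code: my_noindex is an illustrative name, while wp_head and the blog_public option are the real WordPress hook and setting names.

    <?php
    // Sketch (not WordPress's exact source): a callback hooked onto wp_head
    // prints the robots meta tag whenever the privacy option is set.
    // blog_public is the option WordPress uses to store this setting.
    add_action('wp_head', 'my_noindex', 1);

    function my_noindex() {
        if ('0' == get_option('blog_public')) {
            echo "<meta name='robots' content='noindex,nofollow' />\n";
        }
    }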

Causes hits to robots.txt to send back:

    User-agent: *
    Disallow: /

Note: The above only works if WordPress is installed in the site root and no physical robots.txt file exists.
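
That is why the physical file must be absent: the request is rewritten to WordPress itself, which generates the response. A rough sketch of how that output could be customized, using WordPress's robots_txt filter (the filter is real; the callback below is illustrative, not core code):

    <?php
    // Sketch: do_robots() builds the robots.txt response and passes it
    // through the 'robots_txt' filter, receiving the output and the value
    // of the blog_public option.
    add_filter('robots_txt', 'my_privacy_robots_txt', 10, 2);

    function my_privacy_robots_txt($output, $public) {
        if ('0' == $public) {
            // Blog marked private: ask every crawler to stay away.
            return "User-agent: *\nDisallow: /\n";
        }
        return $output;
    }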

These are "guidelines" that friendly bots will follow. A malicious spider scanning for e-mail addresses or forms to spam will not be affected by these settings.

With a robots.txt (if WordPress is installed at the root):

    User-agent: *
    Disallow: /

or (from here)

I would like to block search engines, but allow normal visitors - take a look at these results:

  • Causes "<meta name='robots' content='noindex,nofollow' />" to be generated into the <head> section (if wp_head is used) of your site's source, causing search engine spiders to ignore your site.

  • Causes hits to robots.txt to send back:

        User-agent: * 
        Disallow: / 
    

    Note: The above only works if WordPress is installed in the site root and no physical robots.txt file exists.

  • Stops pings to Ping-O-Matic and any other RPC ping services specified under Update Services in Administration > Settings > Writing. This works by having the function privacy_ping_filter() remove the sites to ping from the list. The filter is added by the call add_filter('option_ping_sites', 'privacy_ping_filter') in the default filters. When the generic_ping function attempts to fetch the "ping_sites" option, the filter blocks it from returning anything (see the sketch after this list).

  • Hides the Update Services option on the Administration > Settings > Writing panel, replacing it with the message "WordPress is not notifying any Update Services because of your blog's privacy settings."
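
A sketch of privacy_ping_filter() as described above (reconstructed from the description, not copied from core): when the blog is private, reading the ping_sites option yields an empty string, so generic_ping() has nothing to notify.

    <?php
    // Intercept reads of the ping_sites option. WordPress applies the
    // 'option_{name}' filter whenever get_option('ping_sites') is called.
    add_filter('option_ping_sites', 'privacy_ping_filter');

    function privacy_ping_filter($sites) {
        if ('0' == get_option('blog_public')) {
            return ''; // private blog: empty ping list, nothing gets pinged
        }
        return $sites;
    }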

You cannot really block bots and spiders from crawling a publicly available site: if a person with a browser can see it, a bot or crawler can see it (caveat below).

However, there is something called the Robots Exclusion Standard (or robots.txt standard), which allows you to indicate to well behaved bots and spiders that they should not index your site. This site, as well as Wikipedia, provide more details.
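
To make "well behaved" concrete, here is a rough sketch of the check a polite crawler performs before fetching a page. It handles only plain User-agent groups and Disallow prefix rules (no wildcards, Allow lines, or multi-agent groups); real crawlers use a much more complete parser.

    <?php
    // Minimal, illustrative robots.txt check for a polite crawler.
    function is_path_allowed(string $robotsTxt, string $userAgent, string $path): bool {
        $applies = false;   // inside a User-agent group that matches us?
        $disallowed = [];
        foreach (preg_split('/\R/', $robotsTxt) as $line) {
            $line = trim(preg_replace('/#.*/', '', $line)); // strip comments
            if ($line === '') continue;
            if (stripos($line, 'User-agent:') === 0) {
                $agent = trim(substr($line, 11));
                $applies = ($agent === '*' || stripos($userAgent, $agent) !== false);
            } elseif ($applies && stripos($line, 'Disallow:') === 0) {
                $rule = trim(substr($line, 9));
                if ($rule !== '') {
                    $disallowed[] = $rule;
                }
            }
        }
        foreach ($disallowed as $prefix) {
            if (strpos($path, $prefix) === 0) {
                return false; // rule prefix matches the path: blocked
            }
        }
        return true;
    }

    // WordPress's privacy setting serves "User-agent: *\nDisallow: /",
    // which blocks every path:
    var_dump(is_path_allowed("User-agent: *\nDisallow: /", 'MyCrawler', '/about/')); // bool(false)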

The caveat to the above comment (that what you see in your browser, a bot can see) is this: most simple bots do not include a Javascript engine, so anything the browser renders as a result of Javascript code will probably not be seen by a bot. I would suggest you do not use this as a way to avoid indexing, since the robots.txt standard does not rely on the presence of Javascript to ensure correct rendering of your page.

One last comment: bots are free to ignore this standard. Those bots are badly behaved. The bottom line is that anything that can read your HTML can do whatever it likes with it.

I don't know for sure, but it probably creates a robots.txt file which specifies rules for search engines.

Using a Robots Exclusion file.

Example:

    User-agent: Googlebot
    Disallow: /private/