Clockwatchers Web Hosting - robots.txt Tutorial

robots.txt Tutorial - Usage

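The robots.txt file asks robots to stay out of sections of your web site, so that compliant robots won't read files in those areas. It is a request rather than an access control: a robot can choose to ignore it.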

1. What are these robots?

Robots are automated programs that fetch content from many web sites for a variety of purposes.

Search engines often call these spiders and send them out to look for pages to include in their search results.

Some spammers also use this technology to harvest email addresses for their junk mail. Other bots look for illegal files or content.

2. How do I create a robots.txt file?

The syntax is very limited and easy to understand. The first part specifies the robot we are referring to.

User-agent: BotName

Replace BotName with the robot name in question. To address all of them, simply use an asterisk.

User-agent: *

The second part tells the robot in question not to enter certain parts of your web site.

Disallow: /cgi-bin/

In this example, any path on our site starting with the string /cgi-bin/ is declared off limits. Multiple paths can be excluded per robot by using several Disallow lines.

User-agent: *
Disallow: /cgi-bin/
Disallow: /temp/
Disallow: /private

This robots.txt file applies to all bots and instructs them to stay out of the directories /cgi-bin/ and /temp/.

It also tells them that any path on your site starting with /private is off limits. Because there is no trailing slash, this covers files such as /private.html as well as everything inside the directory /private/.
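
To see how a rule set like the one above is interpreted, it can be fed to a parser and queried path by path. The following is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the agent name ExampleBot are made up for illustration, and real robots may apply their own matching rules.

```python
from urllib.robotparser import RobotFileParser

# The rule set from the example above.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /temp/
Disallow: /private
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under RULES.

    `ExampleBot` is a hypothetical robot name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())  # parse the rules from memory, no network access
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/cgi-bin/form.pl", "/temp/a.txt",
                 "/private.html", "/private/notes.txt"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

In this sketch, /private.html and /private/notes.txt are both reported as blocked because the rule /private is compared against the start of the path.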

To declare your entire website off limits to BotName, use the example shown below.

User-agent: BotName
Disallow: /

To have a generic robots.txt file which welcomes every robot and does not restrict them, use this sample.

User-agent: *
Disallow:
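
The last two samples can be compared side by side with the same standard-library parser. This is only a sketch: the helper can_visit and the robot name OtherBot are hypothetical, and BotName is the placeholder used throughout this tutorial.

```python
from urllib.robotparser import RobotFileParser

# Sample that declares the entire site off limits to BotName.
BLOCK_BOTNAME = """\
User-agent: BotName
Disallow: /
"""

# Generic sample that welcomes every robot.
WELCOME_ALL = """\
User-agent: *
Disallow:
"""


def can_visit(rules, agent, path):
    """Return True if `agent` may fetch `path` under the given rules text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # BotName is shut out of the whole site; a robot not named in the file is not.
    print(can_visit(BLOCK_BOTNAME, "BotName", "/index.html"))
    print(can_visit(BLOCK_BOTNAME, "OtherBot", "/index.html"))
    # An empty Disallow line restricts nothing.
    print(can_visit(WELCOME_ALL, "BotName", "/cgi-bin/form.pl"))
```

Note that the first sample says nothing about other robots, so this parser treats them as unrestricted.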

This beginner's tutorial includes a list of common robot names to get you started. Many others exist.

Copyright© 1996 - 2025 Clockwatchers, Inc.