app/soc/content/robots.txt
author Sverre Rabbelier <srabbelier@gmail.com>
Sat, 15 Nov 2008 16:17:11 +0000
changeset 482 839740b061ad
parent 73 211a3eeacf27
permissions -rw-r--r--
Factor out direct use of the page object

Instead of directly using the page object in the html, pass around page_name. This will make it easier to remove Page in favor of a simpler implementation.

# Directions for web crawlers.
# See http://www.robotstxt.org/wc/norobots.html.

User-agent: HTTrack
User-agent: puf
User-agent: MSIECrawler
User-agent: Nutch
Disallow: /
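
The rules above can be checked with a short script. This is a minimal sketch, not part of the repository. It assumes Python 3 and its standard-library urllib.robotparser module, and it shows that the single "Disallow: /" line applies to all four listed user agents while other crawlers remain unrestricted. The helper name is_allowed and the agent name ExampleBot are hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the file above, reproduced verbatim.
ROBOTS_TXT = """\
# Directions for web crawlers.
# See http://www.robotstxt.org/wc/norobots.html.

User-agent: HTTrack
User-agent: puf
User-agent: MSIECrawler
User-agent: Nutch
Disallow: /
"""


def is_allowed(user_agent: str, path: str = "/") -> bool:
    """Return True if user_agent may fetch path under ROBOTS_TXT.

    is_allowed is a hypothetical helper for this sketch only.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines; no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # ExampleBot is a made-up agent that the file does not mention.
    for agent in ("HTTrack", "puf", "MSIECrawler", "Nutch", "ExampleBot"):
        print(agent, is_allowed(agent))
```

Consecutive User-agent lines form one group, so the Disallow rule that follows covers every agent in that group. Because the file has no "User-agent: *" entry, this parser treats agents outside the group as allowed.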