Clean document references
Cleaning the references has the advantage that we can inform the user
if a document could not be found.
Patch by: Sverre Rabbelier
# Directions for web crawlers.
# See http://www.robotstxt.org/wc/norobots.html.

User-agent: HTTrack
User-agent: puf
User-agent: MSIECrawler
User-agent: Nutch
Disallow: /
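
As a quick sanity check, the rules above can be fed to Python's standard `urllib.robotparser`. This is only a sketch: the `build_parser` helper, the `ExampleBot` agent name, and the `/some/page` path are hypothetical and not part of this patch.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from this file, one directive per line.
ROBOTS_TXT = """\
# Directions for web crawlers.
# See http://www.robotstxt.org/wc/norobots.html.
User-agent: HTTrack
User-agent: puf
User-agent: MSIECrawler
User-agent: Nutch
Disallow: /
"""


def build_parser(text):
    """Hypothetical helper: parse robots.txt text without fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" and "/some/page" are made-up values for illustration only.
for agent in ("HTTrack", "puf", "MSIECrawler", "Nutch", "ExampleBot"):
    print(agent, parser.can_fetch(agent, "/some/page"))
```

The four listed crawlers share a single `Disallow: /` rule, so the parser should refuse them everywhere, while an agent that is not listed falls through and is allowed.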