FindinSite-MS: Search engine for an ASP.NET website

 

findinsite-ms FindInSiteBot crawler robot


  1. If you are looking at this page, then your site has probably been indexed by the findinsite-ms search engine.

    findinsite-ms is a search engine, usually used to search a single web site. To be able to run searches, it has to see what information is on the web site. It does this by finding all the pages available on the site, usually starting at the site's home page, and following HREF and SRC links to find new files to index. When findinsite-ms starts indexing a site, it retrieves pages serially until the whole site has been indexed. The findinsite-ms administrator will usually set up regular indexing runs, e.g. hourly, daily, weekly or monthly.
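
    The crawl described above can be pictured with a short Python sketch. It is not findinsite-ms code, only an illustration under stated assumptions: the names LinkCollector, crawl and fetch are hypothetical, pages are supplied by a caller-provided fetch function rather than real HTTP requests, and the crawl is limited to the host of the start URL.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the value of every href and src attribute on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(start_url, fetch):
    """Visit pages one at a time, starting at start_url, staying on one host.

    fetch(url) is a caller-supplied function (an assumption of this sketch)
    that returns the page's HTML, or None if the URL cannot be retrieved.
    Returns the list of URLs visited, in the order they were retrieved.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, link))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

    A single queue that is drained one URL at a time mirrors the serial retrieval described above; the set of seen URLs stops a page from being fetched twice within one run.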

    findinsite-ms is software that can be run by many people on their web sites. Usually findinsite-ms is run by a web master to provide a search facility on their own site. If someone you don't know is indexing your site using findinsite-ms then read on:

    The user-agent field in your web access logs indicates the type of browser or robot visiting your site. findinsite-ms uses a user-agent like this:

    FindInSiteBot/1.3.1865.38599 (http://www.phdcc.com/findinsite/bot.htm http://www.example.org/findinsite/)

    This string indicates the findinsite-ms version*, a link to this information page, and finally the URL of the instance of findinsite-ms that is crawling your site. In the above example, visit http://www.example.org/, find a contact point, and ask why they are indexing your site.
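
    If you want to pull these three parts out of your access logs automatically, a small Python sketch follows. The layout it expects is inferred only from the single example above, so treat the pattern as an assumption; the function name parse_bot_user_agent is hypothetical, and the robot name is not hard-coded because it can be configured to be different.

```python
import re

# Layout assumed from the one example above:
#   <name>/<version> (<information page URL> <crawling instance URL>)
UA_PATTERN = re.compile(
    r"^(?P<name>[^/\s]+)/(?P<version>\S+)\s+"
    r"\((?P<info_url>\S+)\s+(?P<instance_url>\S+)\)$"
)


def parse_bot_user_agent(user_agent):
    """Return a dict of the user-agent parts, or None if the layout differs."""
    match = UA_PATTERN.match(user_agent.strip())
    return match.groupdict() if match else None


example = (
    "FindInSiteBot/1.3.1865.38599 "
    "(http://www.phdcc.com/findinsite/bot.htm http://www.example.org/findinsite/)"
)
parts = parse_bot_user_agent(example)
```

    Here parts["instance_url"] holds the address of the findinsite-ms installation that is doing the crawling, which is the site to contact.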

  2. findinsite-ms supports the robots.txt exclusion file - see the Robot Exclusion Standard for details. findinsite-ms looks for the FindInSiteBot user-agent; if this is not present it honours the commands for the * user-agent.

    findinsite-ms supports the Crawl-Delay option in robots.txt. The Crawl-Delay number indicates the number of seconds between accesses; values greater than 60 are reduced to 60. The default value is zero.

    The findinsite-ms web master can tell findinsite-ms to ignore the robots.txt exclusion file, i.e. to index the whole site.
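
    As an illustration of the rules above, the following Python sketch uses the standard library's urllib.robotparser to read a sample robots.txt. It is not how findinsite-ms itself is implemented; the sample file contents, the helper names load_rules and effective_delay, and the constant MAX_CRAWL_DELAY are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

MAX_CRAWL_DELAY = 60  # seconds; larger Crawl-Delay values are reduced to this

# Hypothetical robots.txt with a section for FindInSiteBot and one for "*".
ROBOTS_TXT = """\
User-agent: FindInSiteBot
Crawl-delay: 120
Disallow: /private/

User-agent: *
Crawl-delay: 5
Disallow: /tmp/
"""


def load_rules(text):
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def effective_delay(parser, agent="FindInSiteBot"):
    """Seconds to wait between accesses: 0 by default, capped at 60."""
    delay = parser.crawl_delay(agent)
    if delay is None:
        return 0
    return min(delay, MAX_CRAWL_DELAY)


rules = load_rules(ROBOTS_TXT)
delay = effective_delay(rules)
private_allowed = rules.can_fetch(
    "FindInSiteBot", "http://www.example.org/private/page.htm"
)
```

    Because the sample file has a FindInSiteBot section, only that section applies to the robot: /private/ is excluded and the delay of 120 is reduced to 60. If that section were removed, the commands for the * user-agent would apply instead.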

  3. The META robots tag is supported, including noindex and nofollow options.

    The rel="nofollow" attribute for A tags is supported.
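
    The effect of these two page-level controls can be sketched in Python as follows. This is only an illustration, not findinsite-ms code: the class name RobotsDirectiveParser and its attributes are hypothetical, and the sketch assumes that a page-wide nofollow in the META robots tag suppresses every link on the page.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Reads the META robots tag and per-link rel="nofollow" on one page."""

    def __init__(self):
        super().__init__()
        self.index_page = True    # cleared by a META robots "noindex"
        self.follow_links = True  # cleared by a META robots "nofollow"
        self.links = []           # hrefs of A tags not marked rel="nofollow"

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = (attributes.get("content") or "").lower()
            directives = {part.strip() for part in content.split(",")}
            if "noindex" in directives:
                self.index_page = False
            if "nofollow" in directives:
                self.follow_links = False
        elif tag == "a" and attributes.get("href"):
            rel = (attributes.get("rel") or "").lower().split()
            if "nofollow" not in rel:
                self.links.append(attributes["href"])

    def followable_links(self):
        """Links a crawler may follow from this page."""
        return self.links if self.follow_links else []
```

    A page owner would typically write something like <meta name="robots" content="noindex, nofollow"> in the page head to keep a page out of the index and stop its links being followed.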

  4. The findinsite-ms indexer maintains cookies and so preserves session state.

* The robot name can be configured to be different.
All site Copyright © 1996-2014 PHD Computer Consultants Ltd, PHDCC

Last modified: 30 October 2005.