Web Crawler
Web crawlers are programs that traverse the web and collect information about websites. They are the lifeblood of search engines: they supply the raw input that a search engine analyzes and adds to its index. What a crawler visits and collects is governed by a set of policies (see the Wikipedia article linked above for more on these policies). You can also build a web crawler of your own. The IBM developerWorks article “Build a Web spider on Linux” walks you through creating simple crawlers.
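To make the idea concrete, here is a minimal sketch of a crawler in Python, using only the standard library. It is not the approach from the IBM article, just one simple way to do it: a breadth-first walk that stays on the starting site's host. The names `LinkParser`, `extract_links`, `fetch_page`, and `crawl` are made up for this illustration, and so is the `max_pages` limit. The sketch ignores the politeness and revisit policies mentioned above (it does not read `robots.txt` or pause between requests), so a real crawler would need more than this.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) URLs found in html, with fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    found = []
    for href in parser.links:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme in ("http", "https"):
            found.append(absolute)
    return found


def fetch_page(url):
    """Download a page over the network (illustrative default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=10):
    """Breadth-first crawl limited to the start URL's host.

    Returns a dict mapping each visited URL to the links found on it.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    site_map = {}
    while queue and len(site_map) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # skip pages that fail to load
        links = extract_links(url, html)
        site_map[url] = links
        for link in links:
            # Assumed policy: only follow links on the same host, once each.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return site_map
```

Calling `crawl("https://example.com/", max_pages=5)` would fetch up to five pages from that host and return the links found on each. The `fetch` parameter lets you swap in your own download function, for example one that reads from a local cache while you experiment.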










