A Web crawler is an automated program that recursively indexes Web pages, discovering new pages by following the hyperlinks it parses from pages it has already indexed. Given the World Wide Web's exponential growth, however, standard Web crawlers no longer suffice when in-depth coverage of a specific topic is required. A topical, or focused, Web crawler traverses the Web looking only for sites related to a predefined topic while avoiding unrelated ones. This thesis explores and critically analyzes various topical crawling algorithms, details the methods they use, introduces our own topical Web crawler, called WooSpider, and presents the results of experiments that measure the effectiveness of several versions of WooSpider.
Mathematics; Computer Science
Radkoff, Evan, "Topical Web Crawlers" (2012). Senior Independent Study Theses. Paper 935.
Bachelor of Arts
Senior Independent Study Thesis
© Copyright 2012 Evan Radkoff