Abstract
A Web crawler is an automated program that recursively indexes Web pages, discovering new pages by following the hyperlinks it parses from pages already indexed during its crawl. With the World Wide Web's exponential growth, however, standard Web crawlers no longer suffice when in-depth coverage of a specific topic is required. A topical, or focused, Web crawler traverses the Web looking only for sites related to some predefined topic while avoiding unrelated ones. This thesis explores and critically analyzes various topical crawling algorithms, details the methods they use, introduces our own topical Web crawler, called WooSpider, and presents the results of experiments that measure the effectiveness of different versions of WooSpider.
Advisor
Daehn, James
Second Advisor
Pierce, Pamela
Department
Mathematics; Computer Science
Recommended Citation
Radkoff, Evan, "Topical Web Crawlers" (2012). Senior Independent Study Theses. Paper 935.
https://openworks.wooster.edu/independentstudy/935
Disciplines
Applied Mathematics
Publication Date
2012
Degree Granted
Bachelor of Arts
Document Type
Senior Independent Study Thesis
© Copyright 2012 Evan Radkoff