Abstract

A Web crawler is an automated program that recursively indexes Web pages, discovering new pages by following the hyperlinks parsed from pages it has already indexed during its crawl. With the World Wide Web's exponential growth, however, standard Web crawlers no longer suffice when in-depth coverage of a specific topic is required. A topical, or focused, Web crawler traverses the Web looking only for sites related to some predefined topic while avoiding unrelated ones. This thesis explores and critically analyzes various topical crawling algorithms, details the methods they use, introduces our own topical Web crawler, called WooSpider, and presents the results of experiments that measure the effectiveness of different versions of WooSpider.

Advisor

Daehn, James

Second Advisor

Pierce, Pamela

Department

Mathematics; Computer Science

Disciplines

Applied Mathematics

Publication Date

2012

Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis

© Copyright 2012 Evan Radkoff