Abstract

A Web crawler is an automated program that recursively indexes Web pages, discovering new pages by following the hyperlinks it parses from pages already indexed during its crawl. Given the World Wide Web's exponential growth, however, standard Web crawlers no longer suffice when in-depth coverage of a specific topic is required. A topical, or focused, Web crawler traverses the Web looking only for sites related to a predefined topic while avoiding unrelated ones. This thesis explores and critically analyzes various topical crawling algorithms, details the methods they use, introduces our own topical Web crawler, called WooSpider, and presents the results of experiments that measure the effectiveness of several versions of WooSpider.

Advisor

Daehn, James

Second Advisor

Pierce, Pamela

Department

Mathematics; Computer Science

Recommended Citation

Radkoff, Evan, "Topical Web Crawlers" (2012). Senior Independent Study Theses. Paper 935.
https://openworks.wooster.edu/independentstudy/935

Disciplines

Applied Mathematics

Publication Date

2012

Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis
