Evergreen Classifier
Keywords:
Machine Learning, Webpage Classification, Naïve Bayes, Logistic Regression, Random Forest, k-Nearest Neighbors, Term Frequency-Inverse Document Frequency, Evergreen, Ephemeral
Abstract
Classification of webpages is essential to a wide variety of tasks, such as focused crawling, web link analysis, content
prioritization, contextual advertising, and sentiment analysis of web content. In this paper, we analyze a data set of websites (with
various relevant features) that have been classified as evergreen (interesting in the long run) or ephemeral (relevant for only a short
period of time). The goal of the analysis is to develop machine learning models that predict whether users would rate
a website as evergreen or ephemeral.


