Movie Review Sentiment Classifier

Project information

Suppose we are building an app that recommends movies. We have scraped a large set of reviews from the web, and we want to recommend only movies whose reviews are positive, so each review must first be labeled as positive or negative.

  • Implemented a binary sentiment classifier in Python using the Naive Bayes algorithm, trained on a dataset of movie reviews to label each review as positive or negative based on a bigram bag-of-words model.
  • Enhanced the classifier with a mixture of unigram and bigram models with Laplace smoothing, tuning the balance between unigram and bigram contributions to maximize classification accuracy.
  • Experimented with text preprocessing techniques such as stemming, lowercasing, and stop-word removal, and tuned classifier performance on development and hidden test datasets.
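
The approach described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `preprocess`, `bigrams`, `MixtureNaiveBayes`, `laplace`, and `bigram_lambda` are hypothetical, the stop-word list is a tiny placeholder, and the mixture is assumed to be a weighted sum of the unigram and bigram log-posteriors (one common way to combine the two models). Stemming is left out to keep the sketch dependency-free.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the project's list).
STOP_WORDS = {"the", "a", "an", "is", "was", "of", "and"}

def preprocess(text, lowercase=True, remove_stop_words=False):
    """Split a review on whitespace, optionally lowercasing and dropping stop words."""
    tokens = text.lower().split() if lowercase else text.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens

def bigrams(tokens):
    """Return adjacent token pairs, e.g. ['a', 'b', 'c'] -> [('a', 'b'), ('b', 'c')]."""
    return list(zip(tokens, tokens[1:]))

class MixtureNaiveBayes:
    """Naive Bayes over unigrams and bigrams, each with Laplace smoothing."""

    def __init__(self, laplace=1.0, bigram_lambda=0.5):
        self.laplace = laplace              # smoothing constant added to every count
        self.bigram_lambda = bigram_lambda  # weight of the bigram model in [0, 1]

    def fit(self, docs, labels):
        self.labels = sorted(set(labels))
        # Log prior of each class, estimated from label frequencies.
        self.prior = {c: math.log(labels.count(c) / len(labels)) for c in self.labels}
        self.uni = {c: Counter() for c in self.labels}
        self.bi = {c: Counter() for c in self.labels}
        for tokens, c in zip(docs, labels):
            self.uni[c].update(tokens)
            self.bi[c].update(bigrams(tokens))
        # Vocabulary sizes across all classes, used in the smoothing denominator.
        self.uni_vocab = len(set().union(*self.uni.values()))
        self.bi_vocab = len(set().union(*self.bi.values()))
        return self

    def _log_likelihood(self, counts, vocab_size, items):
        total = sum(counts.values())
        # The extra +1 reserves probability mass for items never seen in training.
        denom = total + self.laplace * (vocab_size + 1)
        return sum(math.log((counts[x] + self.laplace) / denom) for x in items)

    def score(self, tokens, c):
        uni = self.prior[c] + self._log_likelihood(self.uni[c], self.uni_vocab, tokens)
        bi = self.prior[c] + self._log_likelihood(self.bi[c], self.bi_vocab, bigrams(tokens))
        # Weighted combination of the two log-posteriors.
        return (1 - self.bigram_lambda) * uni + self.bigram_lambda * bi

    def predict(self, tokens):
        return max(self.labels, key=lambda c: self.score(tokens, c))

# Toy training data, invented purely for illustration.
train_docs = [
    preprocess("A great and wonderful movie"),
    preprocess("great acting wonderful story"),
    preprocess("terrible boring movie"),
    preprocess("boring plot terrible acting"),
]
train_labels = ["pos", "pos", "neg", "neg"]
model = MixtureNaiveBayes(laplace=1.0, bigram_lambda=0.5).fit(train_docs, train_labels)
print(model.predict(preprocess("wonderful great story")))
```

In a setup like this, `laplace` and `bigram_lambda` would typically be chosen by sweeping a few values and keeping the pair with the best accuracy on the development set, and the preprocessing flags can be toggled the same way to compare their effect.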
