Hidden Markov Model Part-Of-Speech Tagging System

Project information

A Hidden Markov Model (HMM) captures lexical and contextual information for POS tagging. It models the relationship between words (the observations) and their corresponding POS tags (the hidden states).
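
As a rough illustration of what this means, and assuming a first-order (bigram) HMM, which the project description does not state explicitly, the tagger would look for the tag sequence that maximizes the product of transition and emission probabilities. Here w_i is the i-th word, t_i its tag, and t_0 a hypothetical start-of-sentence marker:

```latex
\hat{t}_{1:n} = \arg\max_{t_{1:n}} \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i)
```

The factor P(t_i | t_{i-1}) carries the contextual information (which tags tend to follow which), and P(w_i | t_i) carries the lexical information (which words each tag tends to emit).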

  • Developed a POS tagging system using an HMM on a given dataset, focusing on estimating probabilities from training data and applying them to infer tags for test data.
  • Created two distinct tagging functions: a Baseline tagger for simple frequency-based tagging, and a Viterbi tagger that uses the HMM trellis decoding algorithm with transition and emission probabilities.
  • Tested and validated tagging accuracy on the Brown corpus and other datasets, with attention to runtime efficiency and to correct handling of unseen words and words with multiple possible tags.
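
The two taggers described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`train`, `baseline_tag`, `viterbi_tag`), the `START` marker, the bigram assumption, and the add-alpha smoothing used for unseen words are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sentence marker (an assumption)


def train(sentences, alpha=1.0):
    """Count tag transitions, word emissions, and per-word tag frequencies."""
    trans = defaultdict(Counter)      # trans[prev_tag][tag]
    emit = defaultdict(Counter)       # emit[tag][word]
    word_tags = defaultdict(Counter)  # word_tags[word][tag]
    for sent in sentences:
        prev = START
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            word_tags[word][tag] += 1
            prev = tag
    return {"trans": trans, "emit": emit, "word_tags": word_tags,
            "tags": sorted(emit), "vocab_size": len(word_tags), "alpha": alpha}


def log_prob(counter, key, n_outcomes, alpha):
    """Add-alpha smoothed log probability (smoothing scheme is an assumption)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))


def baseline_tag(model, words):
    """Give each word its most frequent training tag; unseen words get
    the most frequent tag overall."""
    totals = Counter({t: sum(c.values()) for t, c in model["emit"].items()})
    default = totals.most_common(1)[0][0]
    seen = model["word_tags"]
    return [seen[w].most_common(1)[0][0] if w in seen else default
            for w in words]


def viterbi_tag(model, words):
    """Decode the most probable tag sequence over the HMM trellis."""
    if not words:
        return []
    tags, alpha = model["tags"], model["alpha"]
    n_tags = len(tags)
    n_words = model["vocab_size"] + 1  # one extra slot for unseen words

    def t(prev, tag):
        return log_prob(model["trans"][prev], tag, n_tags, alpha)

    def e(tag, word):
        return log_prob(model["emit"][tag], word, n_words, alpha)

    # First trellis column: transition from START plus emission of word 0.
    best = {tag: t(START, tag) + e(tag, words[0]) for tag in tags}
    backptrs = []
    for word in words[1:]:
        new_best, ptr = {}, {}
        for tag in tags:
            prev = max(tags, key=lambda p: best[p] + t(p, tag))
            new_best[tag] = best[prev] + t(prev, tag) + e(tag, word)
            ptr[tag] = prev
        best = new_best
        backptrs.append(ptr)

    # Follow back-pointers from the best final tag to recover the path.
    path = [max(tags, key=best.get)]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]
```

Training on a list of sentences, each a list of `(word, tag)` pairs, yields a model dictionary that both taggers accept. The baseline ignores context entirely, while the Viterbi tagger can still assign a sensible tag to an unseen word because the transition scores from neighbouring tags dominate when the smoothed emission score is the same for every tag.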
