Bigram Markov Model

About

This loose collection of scripts is a bigram transition probability model and Markov chain library. It's rather low-quality, and was originally designed to load a subsample of the Brown Corpus as its training data. If you're unable to get that corpus, you should be able to adapt it to load others.

Currently the script loads POS-tagged words into the transition matrix, and then uses the matrix to:

It was used as the basis for 'CorpusBot', an IRC bot that could POS-tag and analyse sentences.
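
For readers unfamiliar with the idea, here is a minimal sketch of a bigram transition matrix and a single Markov chain step in Ruby. It is not taken from the scripts themselves: the names build_transition_matrix and sample_next are hypothetical, and modelling tag-to-tag transitions (rather than word-to-word ones) is an assumption made for illustration.

```ruby
# Hypothetical helper, not part of the original scripts.
# Takes an array of [word, tag] pairs and returns a nested hash:
#   matrix[previous_tag][next_tag] => probability
# Assumption: transitions are counted between tags, not between words.
def build_transition_matrix(tagged_words)
  counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }

  # Walk over every adjacent pair (bigram) and count tag -> tag transitions.
  tagged_words.each_cons(2) do |(_prev_word, prev_tag), (_next_word, next_tag)|
    counts[prev_tag][next_tag] += 1
  end

  # Normalise each row so that its probabilities sum to 1.
  counts.each_with_object({}) do |(prev_tag, row), matrix|
    total = row.values.sum.to_f
    matrix[prev_tag] = row.transform_values { |count| count / total }
  end
end

# Hypothetical helper: take one Markov chain step by sampling the next
# state in proportion to its transition probability.
# Returns nil when the state has no outgoing transitions.
def sample_next(matrix, state, rng = Random.new)
  row = matrix[state]
  return nil if row.nil?

  threshold = rng.rand
  cumulative = 0.0
  row.each do |next_state, probability|
    cumulative += probability
    return next_state if threshold < cumulative
  end
  row.keys.last # guard against floating-point rounding
end
```

Calling build_transition_matrix on a list of [word, tag] pairs gives one row per tag, and repeatedly feeding the result of sample_next back in as the next state walks the chain.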

Download

Download BigramMM.tar.gz.

Usage

Typically, one would simply run brown_bigram_source.rb, which has a quick test script written into it and assumes that some files from the Brown Corpus are in brown/. If you want to use this as a library, you're going to have to poke around in that file and extract what you need. Sorry, but it was never intended for release.
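
If you are adapting the loader, the following is a rough sketch of how tagged text in the Brown Corpus style (whitespace-separated word/tag tokens) could be read into [word, tag] pairs. The function names parse_brown_line and load_brown_dir are hypothetical and do not come from brown_bigram_source.rb, and reading every regular file under brown/ is an assumption about the layout.

```ruby
# Hypothetical helper, not part of the original scripts.
# Splits one line of word/tag tokens into [word, tag] pairs.
# The tag is taken after the LAST slash, so a token like "1/2/cd"
# yields the word "1/2" and the tag "cd".
def parse_brown_line(line)
  pairs = line.split.map do |token|
    word, _slash, tag = token.rpartition("/")
    # Skip tokens that have no slash or an empty word or tag.
    next if word.empty? || tag.empty?
    [word, tag]
  end
  pairs.compact
end

# Hypothetical helper: read every regular file in a directory and
# return all [word, tag] pairs in file-name order.
# Assumption: the directory holds only tagged corpus files.
def load_brown_dir(dir = "brown")
  paths = Dir.glob(File.join(dir, "*")).select { |path| File.file?(path) }.sort
  paths.flat_map do |path|
    File.foreach(path).flat_map { |line| parse_brown_line(line) }
  end
end
```

A corpus in another format would only need a different parse function, as long as it still returns [word, tag] pairs.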