You'll want at least a naive stemming algorithm (consider the Porter stemmer; free code is available in many languages) to process the text first. Keep the processed text and the original, unprocessed text in two separate arrays, each split on spaces.
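As a rough sketch of that idea in Python: the suffix stripper below is a deliberately crude stand-in, not the Porter algorithm, and the names `naive_stem`, `preprocess` and `SUFFIXES` are made up for illustration. It only shows how the two space-split arrays stay aligned index by index.

```python
# Crude suffix stripper used as a placeholder for a real stemmer.
# This is NOT the Porter algorithm; swap in a proper implementation for real use.
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical, minimal suffix list


def naive_stem(word):
    """Lowercase the word and strip the first matching suffix,
    as long as at least three characters remain."""
    lower = word.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)]
    return lower


def preprocess(text):
    """Return two parallel arrays: the original tokens and their stems."""
    original = text.split()  # split on whitespace
    stemmed = [naive_stem(token) for token in original]
    return original, stemmed


original, stemmed = preprocess("Jumping cats jumped")
print(original)  # ['Jumping', 'cats', 'jumped']
print(stemmed)   # ['jump', 'cat', 'jump']
```

Because both arrays have the same length, you can match against `stemmed[i]` and still show `original[i]` to the user.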