Differences between revisions 2 and 5 (spanning 3 versions)
Revision 2 as of 2004-04-08 01:34:37
Size: 264
Editor: yakko
Revision 5 as of 2004-04-08 16:10:57
Size: 525
Editor: yakko
A list of words that, for reasons of volume or ["Precision"] and ["Recall"], will not be included in the index and hence are not searchable, e.g. "and", "or", "not".

There are two ways to filter stoplist words from an input token stream:

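   a. Examine the lexical analyzer's output and remove any stopwords.
   a. Remove stopwords as part of lexical analysis. This is generally the more efficient way to implement a StopList, since stopwords are discarded as they are recognized rather than in a separate pass over the token output.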
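
The two approaches can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the contents of `STOPLIST`, the regular-expression tokenizer, and the function names `tokenize`, `filter_after_lexing` and `tokenize_without_stopwords` are all hypothetical and not taken from any particular indexing system. A production lexer would more likely build stopword recognition into its scanning automaton than do a set lookup.

```python
import re

# Hypothetical stoplist for illustration only; real lists are much longer.
STOPLIST = frozenset({"and", "or", "not", "the", "a", "of"})

# Assumed token definition: runs of lowercase letters and digits.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Plain lexical analysis: lowercase the text and yield every token."""
    for match in TOKEN_RE.finditer(text.lower()):
        yield match.group()


def filter_after_lexing(text, stoplist=STOPLIST):
    """First approach: run the lexer, then drop stopwords from its output."""
    return [tok for tok in tokenize(text) if tok not in stoplist]


def tokenize_without_stopwords(text, stoplist=STOPLIST):
    """Second approach: the lexer skips stopwords as it recognizes them,
    so they never appear in the token stream at all."""
    for match in TOKEN_RE.finditer(text.lower()):
        tok = match.group()
        if tok in stoplist:
            continue  # discard the stopword inside the lexer
        yield tok
```

Both functions produce the same index terms for a given input; they differ only in where the filtering happens. With the first, the unfiltered token stream stays available to other consumers, while the second avoids emitting tokens that would be thrown away immediately.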

Back to ComputerTerms, InformationRetrieval

StopWords (last edited 2004-04-08 16:24:35 by yakko)