Improving search engines

The basic principle of a search engine is to find entries in the search index that match the search query. If matching is done simply by looking for exact matches, the search will fail whenever the query or the entries use different inflected forms of the same word. Natural language processing provides a solution to this problem: if the search index entries are listed in their base forms and a similar lemmatization process is applied to the query string prior to matching, then the search will return the expected matches.
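This idea can be illustrated with a minimal sketch in Python. The `LEMMAS` table below is a toy stand-in for a real lemmatizer (such as the tools described on this site); both the index and the query are reduced to base forms before matching, so inflected forms like "cars" and "running" still find their documents.

```python
# Toy lemma table standing in for a real lemmatizer; unknown words pass through.
LEMMAS = {"cars": "car", "car": "car", "ran": "run", "running": "run", "run": "run"}

def lemmatize(word):
    """Map a surface form to its base form."""
    return LEMMAS.get(word.lower(), word.lower())

def build_index(documents):
    """Map each lemma to the set of document ids containing any of its forms."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.split():
            index.setdefault(lemmatize(word), set()).add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents matching every lemmatized query word."""
    results = None
    for word in query.split():
        hits = index.get(lemmatize(word), set())
        results = hits if results is None else results & hits
    return results or set()

docs = {1: "the cars ran fast", 2: "a car parked here"}
idx = build_index(docs)
print(search(idx, "running car"))  # "running" and "car" reduce to lemmas "run" and "car"
```

With exact-match indexing, the query "running car" would miss both documents; with lemmatized matching it finds document 1, which contains the inflected forms "cars" and "ran".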

Of the tools provided on this site, Machinese Tokenizer is the best fit for this kind of task: despite the limited amount of linguistic information it provides, that information is sufficient for lemmatized matching, and its processing speed is very fast.