The Commons

Patent Title: Light weight document matcher

Assignee: IBM
Patent Number: US6286000
Issue Date: 09-04-2001
Application Number:
File Date: 12-01-1998


Abstract: A lightweight document matcher employs minimal processing and storage. The lightweight document matcher matches new documents to those stored in a database. The matcher lists, in order, those stored documents that are most similar to the new document. The new documents are typically problem statements or queries, and the stored documents are potential solutions such as FAQs (Frequently Asked Questions). Given a set of documents, titles, and possibly keywords, an automatic back-end process constructs a global dictionary of unique keywords and local dictionaries of relevant words for each document. The application front-end uses this information to score the relevance of stored documents to new documents. The scoring algorithm uses the count of matched words as a base score, and then assigns bonuses to words that have high predictive value. It optionally assigns an extra bonus for a match of words in special sections, e.g., titles. The method uses minimal data structures and lightweight scoring algorithms to compute efficiently even in restricted environments, such as mobile or small desktop computers.
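
The scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the stopword list, the function names (tokenize, build_index, score), the bonus formula (a word earns a larger bonus the fewer stored documents contain it), and the title_bonus value are all assumptions made for illustration. Only the overall shape comes from the abstract: a global dictionary, per-document local dictionaries, a base score equal to the count of matched words, a bonus for predictive words, and an optional extra bonus for title matches.

```python
import re
from collections import Counter

# Assumed stopword list; the abstract does not say how relevant words are chosen.
STOPWORDS = {"a", "an", "and", "the", "to", "is", "of", "in", "on", "my"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric words that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def build_index(docs):
    """Back-end step: docs is a list of (title, body) pairs.

    Returns a global dictionary (word -> number of documents containing it)
    plus one local dictionary and one title dictionary per document.
    """
    local_dicts = [set(tokenize(title + " " + body)) for title, body in docs]
    title_dicts = [set(tokenize(title)) for title, _ in docs]
    doc_freq = Counter(w for words in local_dicts for w in words)
    return doc_freq, local_dicts, title_dicts


def score(query, doc_freq, local_dicts, title_dicts, title_bonus=1.0):
    """Front-end step: rank stored documents against a new document (query).

    Each matched word adds 1 (base score), plus an assumed predictive bonus
    that grows as the word gets rarer, plus title_bonus if it is in the title.
    """
    query_words = {w for w in tokenize(query) if w in doc_freq}
    n_docs = len(local_dicts)
    results = []
    for doc_id, (local, title) in enumerate(zip(local_dicts, title_dicts)):
        matched = query_words & local
        if not matched:
            continue  # documents with no matched words are not listed
        total = 0.0
        for word in matched:
            total += 1.0                              # base: count of matched words
            total += 1.0 - doc_freq[word] / n_docs    # assumed rarity bonus
            if word in title:
                total += title_bonus                  # optional special-section bonus
        results.append((doc_id, total))
    # Highest score first; ties broken by document order.
    results.sort(key=lambda pair: (-pair[1], pair[0]))
    return results
```

With three hypothetical FAQ entries titled "Printer will not print", "Reset your password", and "Printer paper jam", calling score("printer paper jam", *build_index(docs)) lists the paper-jam entry first and the other printer entry second, and omits the password entry because it shares no words with the query. Storing only word sets and document counts keeps the data structures small, in the spirit of the restricted environments the abstract mentions.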

Notes:

IBM Pledge dated 1/11/2005