Latent Semantic Analysis
Latent Semantic Analysis (LSA) is a natural language processing technique used in information retrieval. It applies a statistical method, singular value decomposition (SVD) of a term-document matrix, to analyze the relationships between a set of documents and the terms they contain, producing a set of concepts related to both the documents and the terms. Latent Semantic Analysis is also referred to as Latent Semantic Indexing (LSI), especially when applied to information retrieval.
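As a rough illustration of the analysis described above, the following Python sketch builds a term-document count matrix for a small made-up corpus and reduces it to two concepts with a truncated singular value decomposition. The corpus, the function names `term_document_matrix` and `lsa`, and the choice of k = 2 are assumptions made for this example only. Real systems typically weight the counts (for example with tf-idf) and keep a few hundred concepts.

```python
import numpy as np

def term_document_matrix(docs):
    """Return (vocab, A), where A[i, j] counts term i in document j."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return vocab, A

def lsa(A, k):
    """Truncated SVD: keep only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

# Hypothetical toy corpus: three vehicle documents, two food documents.
docs = [
    "car engine repair",
    "automobile engine repair",
    "automobile motor",
    "fruit apple pie",
    "apple pie recipe",
]

vocab, A = term_document_matrix(docs)
U_k, s_k, Vt_k = lsa(A, k=2)

# Rank-k approximation of the original matrix.
A_k = U_k @ np.diag(s_k) @ Vt_k

# Rows of U_k place terms in concept space; columns of Vt_k place documents.
for term, coords in zip(vocab, U_k):
    print(f"{term:12s} {np.round(coords, 2)}")
```

In this sketch each retained column of `U_k` plays the role of one "concept": terms that occur in similar documents receive similar coordinates, even when they never appear in the same document.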
LSA was patented by a group of researchers at Bell Communications Research, Inc. (US Patent 4,839,853, filed in 1988 and granted in 1989).
Latent Semantic Analysis allows a search engine to approximate the meaning of a term or group of terms by associating the terms with the concepts they tend to occur with, so that the engine in effect "understands" those concepts. As a result, an LSA search engine can return documents that share the same "meaning" as the query terms even if they do not contain the terms themselves.
Because such a search can surface documents that do not contain the exact word you searched for, it may help you get around inconsistent vocabulary, or descriptions written before a field's jargon had settled, and discover valuable documents in seemingly unrelated fields.
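The retrieval behavior described above can be sketched in a few lines of Python. This is a toy example under stated assumptions, not how any particular search product works: the corpus is invented, the helper names `fold_in` and `cosine` are hypothetical, and the number of concepts is fixed at k = 2. The query is projected ("folded in") to the same concept space as the documents and compared by cosine similarity, alongside a plain keyword match for contrast.

```python
import numpy as np

# Hypothetical toy corpus: documents 0-2 are about vehicles, 3-4 about food.
docs = [
    "car engine repair",
    "automobile engine repair",
    "automobile motor",
    "fruit apple pie",
    "apple pie recipe",
]
tokens = [d.split() for d in docs]
vocab = sorted({w for t in tokens for w in t})
row = {w: i for i, w in enumerate(vocab)}

# Term-document count matrix: A[i, j] = occurrences of term i in document j.
A = np.zeros((len(vocab), len(docs)))
for j, t in enumerate(tokens):
    for w in t:
        A[row[w], j] += 1.0

# Keep the k strongest concepts from the SVD (k = 2 is an assumption).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k, docs_k = U[:, :k], s[:k], Vt[:k, :].T  # docs_k: one row per document

def fold_in(query):
    """Project a query into concept space: q_k = q^T U_k S_k^{-1}."""
    q = np.zeros(len(vocab))
    for w in query.split():
        if w in row:
            q[row[w]] += 1.0
    return (q @ U_k) / s_k

def cosine(a, b):
    """Cosine similarity between two concept-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = "car"
q_k = fold_in(query)

# Plain keyword match versus similarity in concept space.
keyword_hits = [j for j, t in enumerate(tokens) if query in t]
lsa_scores = [cosine(q_k, d) for d in docs_k]

print("keyword hits:", keyword_hits)
for j, score in enumerate(lsa_scores):
    print(f"{score:+.2f}  {docs[j]}")
```

In this corpus the keyword match finds only the first document, while the concept-space scores also rank "automobile motor" highly for the query "car", because "automobile" and "car" both co-occur with "engine" and "repair". The food documents score near zero. With a corpus this small the separation is artificially clean; on real collections the scores are graded and depend heavily on the choice of k and on term weighting.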
Latent Semantic Analysis in Patent Searching
In theory, this technique could prove very valuable in patent searching, since patents often use synonyms that prevent discovery by less advanced Boolean search strategies. A searcher using an LSA-based search tool does not have to anticipate the exact words used by the author, nor rely heavily on truncation and Boolean operators, because Latent Semantic Analysis will, in theory, search for the concepts rather than the actual words.
Some patent search tool providers have attempted to integrate Latent Semantic Analysis into patent searching. For example, as of 2008, PatentCafe advertised an exclusive Latent Semantic Analysis patent search engine. PatentCafe claimed that using LSA technology in searching would dramatically reduce the time required for a complete search and significantly increase the quality of results.
- "Semetric® Conceptual Search and Discovery: Technology White Paper." PatentCafe, http://www.patentcafe.com/library/whitepapers/semantic_engine_whitepaper.pdf. Accessed November 13, 2008.