Truncation Operators

From Intellogist




Introduction

Truncation refers to the ability of a search system to broaden the scope of results by including basic variants of an initial search term, such as plurals and forms with different prefixes or suffixes. An operator is typically necessary to invoke this function. For example, if the truncation operator in a given system is “*” (asterisk), a search string for the term “processor” using truncation would commonly be constructed thus:

process*

The results from this search would include the desired search term “processor,” as well as additional variants such as “processing,” “processed,” “process,” “processes,” and “processors.” The search would also return possibly unrelated or undesired terms, such as “procession,” “processional,” “processionally,” and “processioner.”
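As a rough illustration of the behavior described above, the following Python sketch emulates unlimited right truncation against a small word list. The function name right_truncation, the sample word list, and the use of a regular expression are assumptions made for this example; real search systems apply truncation against their own indexes and may behave differently.

```python
import re

def right_truncation(query, vocabulary):
    """Return the terms that begin with the stem of a query such as 'process*'."""
    # Assumed convention for this sketch: a trailing '*' marks unlimited truncation.
    stem = query.rstrip("*")
    # \w* accepts zero or more word characters after the stem.
    pattern = re.compile(re.escape(stem) + r"\w*", re.IGNORECASE)
    return [term for term in vocabulary if pattern.fullmatch(term)]

# Hypothetical index terms, chosen only for illustration.
sample_terms = ["process", "processor", "processing", "procession", "produce", "reprocess"]
right_hits = right_truncation("process*", sample_terms)
print(right_hits)
```

In this sketch “procession” is matched alongside the intended variants, while “reprocess” is not, because the operator only extends the term to the right.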


Unlimited vs. Limited Truncation

A common form of truncation is "unlimited" truncation, which places no limit on the number of characters that may stand in for the operator, as long as the hit terms contain the designated word stem. The example above illustrates the use of an unlimited truncation operator. Some advanced search systems allow the user to specify the maximum number of characters that can stand in place of the operator. For example, the search system MicroPatent PatentWeb allows the following truncation model:

process*3

…where the digit "3" restricts the search to hit terms with at most 3 characters after the stem. This would allow the variants "processing," "processed," "processes," "processors," and "procession," while eliminating the longer variants "processional," "processionally," and "processioner."
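The effect of the character limit can be sketched in Python as follows. This is an emulation for illustration only, not MicroPatent PatentWeb's implementation; the function name limited_truncation is hypothetical, and the sketch assumes that the limit permits anywhere from zero up to the stated number of characters after the stem.

```python
import re

def limited_truncation(query, vocabulary):
    """Emulate a query such as 'process*3': the stem plus at most 3 more characters."""
    # Split 'process*3' into the stem ('process') and the character limit ('3').
    stem, _, limit = query.partition("*")
    max_chars = int(limit)
    # Assumption: zero characters after the stem is also acceptable.
    pattern = re.compile(re.escape(stem) + r"\w{0," + str(max_chars) + "}", re.IGNORECASE)
    return [term for term in vocabulary if pattern.fullmatch(term)]

# Hypothetical index terms, chosen only for illustration.
limited_terms = ["processing", "processed", "processes", "processors",
                 "procession", "processional", "processionally", "processioner"]
limited_hits = limited_truncation("process*3", limited_terms)
print(limited_hits)
```

The three longest terms are dropped because they add five or more characters to the stem, while “procession” adds exactly three and is still returned.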


Left vs. Right Truncation

Truncation at the end of a term, as illustrated above, is sometimes called “right truncation.” In contrast, the use of a truncation operator at the beginning of a term is sometimes called "left truncation." Left truncation works the same way as right truncation, but retrieves the base term with various prefixes. Therefore, the following search string:

*ethane

…could return search results including "fluoroethane," "chloroethane," "dichloroethane," and "trichloroethane."

Left truncation is especially useful in chemistry and biology searching, where prefixes are often used in terms of art to represent chemical structures or biological concepts.
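A minimal Python sketch of left truncation is given below. The function name left_truncation and the sample terms are assumptions for this example, and the leading “*” is treated as an unlimited operator.

```python
import re

def left_truncation(query, vocabulary):
    """Return the terms that end with the stem of a query such as '*ethane'."""
    # Assumed convention for this sketch: a leading '*' marks unlimited truncation.
    stem = query.lstrip("*")
    # \w* accepts zero or more word characters before the stem.
    pattern = re.compile(r"\w*" + re.escape(stem), re.IGNORECASE)
    return [term for term in vocabulary if pattern.fullmatch(term)]

# Hypothetical index terms, chosen only for illustration.
chemical_terms = ["ethane", "fluoroethane", "chloroethane", "trichloroethane",
                  "methane", "ethanol"]
left_hits = left_truncation("*ethane", chemical_terms)
print(left_hits)
```

In this sketch “methane” is also returned, since it ends in the same character string; left truncation matches characters, not chemical meaning.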


Simultaneous Left and Right Truncation

Often abbreviated to “SLART,” the phrase "Simultaneous Left And Right Truncation" refers to the ability of some search engines to accept two unlimited truncation operators per term. This kind of truncation is particularly useful when both prefix and suffix variants of a keyword term are of interest. For example, the string:

*capsul*

…would produce search results including “capsules, capsulation, encapsulation, microcapsules, microencapsulated, re-capsulation, decapsulate” etc. Note that any combination of prefix and suffix would be supported.
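Because both ends of the term are open, SLART amounts to a substring test on each indexed term. The Python sketch below shows this under the assumption that every entry in the list is a single indexed term; the function name slart and the sample terms are hypothetical.

```python
def slart(query, vocabulary):
    """Return the terms that contain the stem of a query such as '*capsul*'."""
    # Strip the operators from both ends to obtain the stem.
    stem = query.strip("*").lower()
    # With both ends open, a term is a hit if the stem occurs anywhere inside it.
    return [term for term in vocabulary if stem in term.lower()]

# Hypothetical index terms, chosen only for illustration.
capsule_terms = ["capsules", "encapsulation", "microencapsulated",
                 "re-capsulation", "decapsulate", "capstan", "capital"]
slart_hits = slart("*capsul*", capsule_terms)
print(slart_hits)
```

A substring test of this kind has to examine the inside of every term rather than only its beginning, which gives some intuition for why the feature is demanding on large collections.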

Due to the high computational demands placed on a search system by SLART, it is only available from certain search engines, and for searching certain collections. It can come in particularly handy when searching in the chemistry and biology/biotechnology fields.


Single Truncation

Search systems may also offer a single-character truncation operator, of which there are several kinds. A “single-character-only” operator requires exactly one character in the operator's position, so every hit has the same number of characters as the term plus the operator or operators. For example, if the truncation operator is “?” (question mark), the search string “robot?” would return only “robots,” not “robot” or “robotic.” (It cannot return "robotic" because only a single character is allowed to appear after the stem, and it cannot return "robot" because the operator requires the sixth character position to be filled.)

The second type, sometimes called a “0-or-1 character” operator, also accepts results that have no character in the operator's position. Thus, if “?” functioned as a 0-or-1 character operator in the above example, the “robot?” search query would return both “robot” and “robots,” but it still would not return “robotic” or “robotics.”
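The difference between the two operator types can be sketched in Python as follows. The function name single_char_match and its allow_zero flag are hypothetical and exist only to switch between the two interpretations of “?” described above.

```python
import re

def single_char_match(query, vocabulary, allow_zero=False):
    """Match a query in which each '?' stands for exactly one character,
    or for zero or one character when allow_zero is True."""
    wildcard = r"\w?" if allow_zero else r"\w"
    # Translate the query character by character into a regular expression.
    parts = [wildcard if ch == "?" else re.escape(ch) for ch in query]
    pattern = re.compile("".join(parts), re.IGNORECASE)
    return [term for term in vocabulary if pattern.fullmatch(term)]

# Hypothetical index terms, chosen only for illustration.
robot_terms = ["robot", "robots", "robotic", "robotics"]
exactly_one = single_char_match("robot?", robot_terms)
zero_or_one = single_char_match("robot?", robot_terms, allow_zero=True)
print(exactly_one)
print(zero_or_one)
```

Under the single-character-only reading the sketch returns only “robots”; under the 0-or-1 reading it returns “robot” and “robots.”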


Internal Truncation

A search system may also include “internal truncation,” a function that allows operators to be placed in the middle of a word to include alternate spellings. For example, instead of searching “aluminum” and “aluminium” separately, the term “alumin??m” would include both spellings, where “?” functions as a 0-or-1 operator: the two operators together stand for the “u” in one spelling and the “iu” in the other.
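The “alumin??m” example can be checked with a short Python sketch. The function name internal_truncation and the sample terms are assumptions for this example, and each “?” is treated as a 0-or-1 character operator as in the text.

```python
import re

def internal_truncation(query, vocabulary):
    """Match a query such as 'alumin??m', where each '?' is a 0-or-1 character operator."""
    # Each '?' becomes an optional word character; all other characters match literally.
    pattern_text = "".join(r"\w?" if ch == "?" else re.escape(ch) for ch in query)
    pattern = re.compile(pattern_text, re.IGNORECASE)
    return [term for term in vocabulary if pattern.fullmatch(term)]

# Hypothetical index terms, chosen only for illustration.
aluminum_terms = ["aluminum", "aluminium", "alumina", "aluminate"]
internal_hits = internal_truncation("alumin??m", aluminum_terms)
print(internal_hits)
```

Both spellings are returned, while “alumina” and “aluminate” are not, because the query still requires the term to end in “m.”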

Some systems can even support unlimited internal truncation. This feature can be especially useful when searching for chemical names, which consist of prefixes, stems and suffixes that represent components of the structure as a whole.


Stemming

Stemming is a form of truncation, often carried out automatically by the search system, which allows the engine to determine the base or "stem" of a keyword and to accept variants of the stem as keyword hits. Stemming is generally accomplished by programming a search engine to recognize common word prefixes and suffixes. For more details, please see Stemming.
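To make the idea concrete, the following Python sketch strips a few common English suffixes so that variants collapse to a shared stem. It is a deliberately naive illustration, not the algorithm used by any particular search engine; the suffix list, the minimum stem length, and the function name naive_stem are all assumptions for this example.

```python
# Assumed suffix list for this sketch, ordered so longer suffixes are tried first.
SUFFIXES = ("ing", "ors", "or", "es", "ed", "s")

def naive_stem(word):
    """Strip one common suffix, keeping a stem of at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Do not treat the final 's' of a word ending in 'ss' as a plural marker.
        if suffix == "s" and word.endswith("ss"):
            continue
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

variants = ["process", "processes", "processed", "processing", "processor", "processors"]
stems = {variant: naive_stem(variant) for variant in variants}
print(stems)
print(naive_stem("procession"))
```

In this sketch all six variants reduce to “process,” whereas “procession” is left unchanged and so would not be treated as a variant, in contrast to the “process*” truncation example.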


Usage

Ideal usage of the truncation function will include as many different forms of the desired term as possible while still truncating at the proper point of the term to eliminate as many unrelated terms as possible. Used well, truncation is a powerful tool for patent searchers. The ability to broaden the search can allow the collection of a range of results that would have otherwise required multiple time-intensive queries. Additionally, truncation can decrease the chance of human error or oversight by relying on the search engine to determine variants on a term, including some that a searcher may have otherwise overlooked.

Unfortunately, truncation also introduces complications. First, even with precise use, truncation will often increase the number of undesired or unrelated results. For example, when using limited truncation as in the keyword term "process*3," it is impossible to specify that the system should return “processors,” but not “procession.” Also, some systems will function slowly when truncation is used, and overuse can crash the search engine altogether.
