General Searching Best Practices
The following article addresses the major elements that are necessary to any prior art search, including the essential information any searcher needs to know before beginning a search, as well as an overview of the general types of searching and strategies that are commonly applied to them. For information related to specific technology fields, see the Best Practices directory for a list of technology-focused best practices articles. In addition, for information on each major type of patent search, see the patentability, infringement and validity best practices articles.
Before Beginning the Search
There are three necessary elements to understand before beginning any search. A healthy conversation with the search recipient can help the searcher to gain an understanding of these essential points.
Understand the legal reasons for the search. Whether it is conducted to spur product development, gain competitive intelligence, or to defeat impending litigation, all technology searching will have legal motivations and ramifications. The legal aspects of the patent system will always influence not only why, but also how searches are conducted. The more a searcher knows about the reasons motivating a search request, the more intelligently the search can be adjusted to suit the recipient’s ultimate needs. To understand special types of search requests motivated by legal needs, see the articles on patentability, infringement and validity searching.
Understand the “state of the art” in the field of search. Essentially, searchers need to be aware of what ideas are cutting edge, what is known as protected technology and what is commonly used in the public domain. Ideally, searchers should have a technical background closely related to the search subject matter, and should already understand these aspects of their field. However, in reality, searchers are often asked to adapt to subject areas that are only tangentially related to their areas of expertise. Obviously, it is best to avoid spending time searching on features of an invention that are actually widely known/used in the industry.
To gain a basic level of background competency, textbooks are good resources for understanding known concepts, and entire libraries of up-to-date textbook resources are now available on the web, such as Knovel.com. Searchers can also look for seminal articles in the field by using the citation impact feature available from major non-patent search providers like Google Scholar, ISI Web of Knowledge, and/or Scopus. Browsing new issues of leading journals, or even surfing web sources like technology blogs, can likewise get users oriented as to the direction current research is taking.
Understand search scope to select sources/collections accordingly. If the motivations behind conducting the search are properly understood, this is usually fairly straightforward. Selecting sources requires an understanding of what kind of documents are “of interest”; for example, an infringement investigation might be directed only towards live patent documents, while a patentability search would encompass any publicly available information, in patents, scientific journals, or even available on the Web. In addition, the appropriate search sources may vary given the search budget limitations. Validity investigations usually justify more cost and extensive searching than a quick-and-dirty patentability study, and thus it might be feasible to include more expensive, highly curated collections in the search, as opposed to a simple low-cost search in major online sources.
When Conducting the Search
After the searcher understands the three points above, the searching can begin. The following search methodology advice consists of two conflicting points, and the art of resolving the conflict between the two points is the art of conducting a “complete” search.
- Be prepared to iteratively adjust the search strategy as you proceed. The large volume of available prior art information means that searchers must constantly adjust their strategy in pursuit of the most relevant information. Searchers must stay alert to discover alternative keyword terminology and analogous technology fields that may yield results of interest, and such discoveries may justify adjusting the strategy mid-search.
- On the other hand, be prepared to conduct an exhaustive review of the best search strings. If a classification or keyword search seems reasonably relevant, it is best to investigate ALL the results of that search—time permitting. Use discretion to decide which search strings should be completely reviewed – perform narrow, targeted searches at first to help create a search strategy, and then follow through with structured queries and examine the best text strings and classification areas fully.
Types of Searching
Patents have been traditionally searched by national and international classification codes. The appropriate classification system will vary by issuing patent authority; all major patent offices will use the International Patent Classification (IPC) system to classify their documents, although examiners in the US are likely to be less well trained to assign international classes and better versed in their own national classification system. It is probably beneficial to use a national classification system on specific collections (such as the US classification system) in addition to international classifications, wherever possible. Other major classification systems include the ECLA, DEKLA and Japanese F-term classifications, which are all extensions of the IPC system used regionally to divide the IPC classes into smaller sub-sections. Some non-patent files are also indexed by classification codes. This is almost always performed by a human indexing staff. One example of this type of file is the Ei Compendex file, in which records are tagged with controlled vocabulary and special classifications when they are added to the collection. The collection can then be searched using these data points.
One major benefit to using a classification system is that classification “concepts” are applied to documents no matter what language or unusual terminologies they may rely upon to describe their content. This can help users find documents using obscure language or unusual wording that would not be identified in a straightforward keyword search.
Strategies for Classification Searching
Identify relevant classes through an initial search: to identify highly relevant classifications, it can be helpful to try the “jugular search” technique. This is accomplished by doing a highly targeted title/abstract keyword search to identify at least one or two documents related to the search subject matter, and then scanning their classification codes for starting points. If a document is provided at the beginning of the search as an example of relevant content, performing a citation search on that document could also lead to the discovery of other relevant classification areas.
Search the classification schedule: take the time to fully read the definitions of the classes found through initial searching efforts. Oftentimes, the class definition will include other recommended classifications which pertain to analogous or related subject matter not covered under that class.
A few classification schedule resources are available at the following links:
- The US Classification Schedule can be browsed by subject area, or searched by keyword
- IPC (version 8) classifications can be browsed and/or searched
- ECLA subdivisions to the IPC classification can be browsed/searched
Contact a USPTO examiner to confirm a US class search. This strategy is specific to US classification searching, but it is very effective. No one understands the US patent classification system better than the examiners who use it every day, and as part of their job they help public searchers understand whether selected classes are appropriate to search a given invention. To find the appropriate examiner, the “jugular” search is again useful to identify highly relevant patents and the examiners named on them. Limiting such a search to the most recent publications makes it more likely that the examiner is still employed by the USPTO. An employee directory is also available from the patent office website to identify a work phone number.
A note on searching with multiple class codes: Many documents are indexed using multiple codes from the same classification scheme. For example, both primary and secondary US classes of interest can be assigned to the same patent document. For precision retrieval, it is sometimes useful to search on two or more codes simultaneously (using the Boolean AND operator between codes). However, users should be aware that this step is even more likely to exclude documents of interest that bear one relevant class mark, but not the other. It should generally only be used to produce a very targeted set of search results, usually when time is limited.
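The trade-off described above can be illustrated with a small sketch. The document records and class codes below are invented for illustration and do not correspond to real classification entries. The point is only that combining two codes with AND returns a subset of what either code returns alone, so documents bearing just one of the codes are dropped.

```python
# Toy records: document id -> set of assigned class codes (all hypothetical).
DOCS = {
    "doc1": {"A/1", "B/2"},
    "doc2": {"A/1"},
    "doc3": {"B/2"},
    "doc4": {"C/3"},
}


def search_all_codes(docs, codes):
    """Boolean AND: keep documents bearing every requested code."""
    wanted = set(codes)
    return sorted(d for d, assigned in docs.items() if wanted <= assigned)


def search_any_code(docs, codes):
    """Boolean OR: keep documents bearing at least one requested code."""
    wanted = set(codes)
    return sorted(d for d, assigned in docs.items() if wanted & assigned)


precise = search_all_codes(DOCS, ["A/1", "B/2"])
broad = search_any_code(DOCS, ["A/1", "B/2"])

# Documents that the AND search excludes even though they carry one code.
missed = sorted(set(broad) - set(precise))
```

Here `missed` lists the documents that would only be reviewed if each code were searched separately, which is why the AND combination is best reserved for time-limited, highly targeted searches.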
A text search is performed by using one or more search keywords to query bibliographic data, indexed data, and sometimes abstract and even full text data in an electronic database. Text searching is often aided by special operators. Widely known operators include the Boolean AND, OR, and NOT operators, but may also include proximity operators that specify the order between two words and the maximum distance that should exist between them. Allowed operators may vary depending on the search engine that is selected. One benefit to a text search is that it can find “outlying” documents that have been improperly classified. A global text searching strategy should be used independently of classification limitations whenever possible to ensure that these misclassified documents have a higher chance of being examined during the search.
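To make the idea of a proximity operator concrete, the following is a minimal sketch of how such an operator could be evaluated against a single piece of text. The function name `within`, the trailing `*` truncation handling, and the simple word tokenizer are assumptions made for this illustration; they do not reproduce the syntax or behavior of any particular search engine, and multi-word phrases are not handled.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(term, token):
    """Match a term against a token; a trailing '*' means prefix truncation."""
    if term.endswith("*"):
        return token.startswith(term[:-1])
    return token == term


def within(text, left_terms, right_terms, max_distance, ordered=False):
    """Return True if any left term occurs within max_distance words of any
    right term. With ordered=True the left term must come first."""
    tokens = tokenize(text)
    left_pos = [i for i, t in enumerate(tokens)
                if any(matches(term, t) for term in left_terms)]
    right_pos = [i for i, t in enumerate(tokens)
                 if any(matches(term, t) for term in right_terms)]
    for i in left_pos:
        for j in right_pos:
            gap = j - i
            if ordered and gap <= 0:
                continue
            if 0 < abs(gap) <= max_distance:
                return True
    return False


sample = "A relay disables the fuel pump when the signal is received."

near_hit = within(sample, ["disabl*"], ["fuel", "ignition"], 3)
ordered_miss = within(sample, ["fuel"], ["disabl*"], 3, ordered=True)
too_far = within(sample, ["signal"], ["disabl*"], 3)
```

In this sample, `near_hit` is true because "disables" and "fuel" are two words apart, `ordered_miss` is false because "fuel" does not precede "disables", and `too_far` is false because the two words are more than three words apart.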
Strategies for Text Searching
Identifying Keywords. One of the biggest obstacles to an effective text search is the need to identify all potential keyword combinations that could describe the search subject matter. When identifying keywords related to the search subject, it is vital to consider the function of the invention/product as well as its component parts. To identify keywords related to both structural and functional elements of the invention, it is sometimes useful to ask three initial questions:
- What problem does the invention solve?
- What is the invention (what are its physical components)?
- What does the invention do?
These questions encourage the searcher to take a Problem/Need/Solution/Function/Structure approach to identifying keywords. Searchers should remember to think of keywords relating to the abstract or high-level problem being solved, as well as the physical and structural components of the invention and how they act. Consider the following invention disclosure as an example of this approach:
- High speed police chases are a danger to civilians and property. The amount of time a high speed chase continues will increase the chances of civilian or material harm. To prevent these chases from occurring, the need has arisen for a remote car disabling mechanism used by police officers to impede the progress of the getaway car.
- The mechanism would involve a tamper-proof receiver installed by default in every automobile upon manufacture that responds to the signal from a transmitter to cut the fuel supply and ignition to the engine. The receiver is connected to a relay that may cut off power to a vehicle’s electric fuel pump and ignition pack. Transmitters and control modules are installed in all police vehicles. The officer may use the control module to select the vehicle that requires disablement by identification (e.g. license plate number) and transmit a fuel-cut-off signal to the appropriate vehicle. The officer may be in a police vehicle such as a patrol car or helicopter.
Based on the invention disclosure above, a searcher might use the four facets of Problem/Need/Solution/Function to construct this hypothetical table of related keywords:
Another obstacle to identifying keywords is that the searcher must rely on his/her technical background and knowledge to ensure that all alternate terminologies that could apply to the technology are being searched. The following examples (taken from the field of communications) illustrate how technical knowledge can expand the search by including alternate terminologies:
- Voice over Internet Protocol (VoIP) is a protocol optimized for the transmission of voice through the Internet or other packet switched networks. This term is often used abstractly to refer to the actual transmission of voice (rather than the protocol implementing it), but VoIP is also known as IP Telephony, Internet telephony, Broadband telephony, Broadband Phone and Voice over Broadband. Someone proficient in this technical area would have known about these alternate terminologies, whereas a layperson or inexperienced searcher may not have included these possibilities in their search strategy, possibly leading to missed references.
- Imagine an invention disclosure describing an invention that could “transmit data via a microwave signal.” A searcher proficient in wireless technology would understand that the word “microwave” in this context represents a range of wave frequencies. A search on the word “microwave” would not yield thorough results; other wireless technologies, such as GSM, Bluetooth, satellites, 802.11 (used by wireless computer networks), are all technically “microwave” signals, and would fulfill the search criteria if used to transmit data.
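One way to keep alternate terminology from being forgotten is to store the synonyms for each concept in one place and generate the OR group from that list. The sketch below uses the VoIP terms from the example above. The helper name `or_group` and the convention of quoting multi-word phrases are assumptions for illustration; actual phrase syntax varies by search engine.

```python
# Alternate terms for one concept, taken from the VoIP example above.
VOIP_TERMS = [
    "VoIP",
    "Voice over Internet Protocol",
    "IP telephony",
    "Internet telephony",
    "broadband telephony",
    "broadband phone",
    "voice over broadband",
]


def or_group(terms):
    """Join alternate terms into one parenthesized OR group.

    Multi-word terms are wrapped in double quotes so that they read as
    phrases (an assumed convention, not a universal syntax).
    """
    parts = []
    for term in terms:
        parts.append(f'"{term}"' if " " in term else term)
    return "(" + " OR ".join(parts) + ")"


voip_query = or_group(VOIP_TERMS)
print(voip_query)
```

Keeping the list separate from the query string makes it easy to add a newly discovered term mid-search and regenerate every query that depends on it.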
Using a methodical search progression. Text searching is only ultimately effective if the searcher plans the approach to cover as much ground as possible. To construct a careful and methodical text search, use the Problem/Need/Solution/Function/Structure approach described above to combine multiple keywords from each of these groups.
The approach proposed here uses both broad-to-narrow and narrow-to-broad progressions, where “broad” terms are directed to the problem and “narrow” terms are directed to either the structure or the function. For completeness, searchers should do this for both the structure and function elements. Examples of this approach might include:
- Start with the generic structure or function (broad) and combine text queries gradually to include the problem (narrow).
- Start with the generic problem (broad) and combine text queries gradually to include the structure or function (narrow).
- Start with the structure and function combined (narrow) and subtract limitations gradually (broad).
Example of a Broad-to-Narrow progression:
- 1: (chase or pursu*) and (disabl* or imped* or block* or prevent* or inhibit*) and (car or automobile or truck or vehicle) (Results 22425)
- 2: (police or law enforcement) and (chase or pursu*) and (disabl* or imped* or block* or prevent* or inhibit*) and (car or automobile or truck or vehicle) (Results 1081)
- 3: ((police or law enforcement) and (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) and (ignition or fuel)) and (car or automobile or truck or vehicle) (Results 390)
- 4: ((police or law enforcement) and (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) w20 (ignition or fuel)) and (car or automobile or truck or vehicle) (Results 108)
- 5: ((police or law enforcement) w20 (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) w20 (ignition or fuel)) and (car or automobile or truck or vehicle) (Results 66)
- 6: ((police or law enforcement) near (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) near (ignition or fuel)) and (car or automobile or truck or vehicle) (Results 32)
- 7: ((police or law enforcement) near (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) near (ignition or fuel)) and (car or automobile or truck or vehicle) and (RF or radio frequenc* or infrared or IR) (Results 22)
Example of a Narrow-to-Broad progression:
- 1: ((police or law enforcement) near (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) near (ignition or fuel)) and (car or automobile or truck or vehicle) and (RF or radio frequenc* or infrared or IR) (Results 22)
- 2: ((police or law enforcement) near (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) near (ignition or fuel)) and (car or automobile or truck or vehicle) (Results 32)
- 3: ((police or law enforcement) w20 (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) w20 (ignition or fuel)) and (car or automobile or truck or vehicle) (Results 66)
- 4: ((police or law enforcement) and (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) w20 (ignition or fuel)) and (car or automobile or truck or vehicle) (Results 108)
- 5: ((police or law enforcement) and (chase or pursu*)) and ((disabl* or imped* or block* or prevent* or inhibit*) and (ignition or fuel)) and (car or automobile or truck or vehicle) (Results 390)
- 6: (police or law enforcement) and (chase or pursu*) and (disabl* or imped* or block* or prevent* or inhibit*) and (car or automobile or truck or vehicle) (Results 1081)
- 7: (chase or pursu*) and (disabl* or imped* or block* or prevent* or inhibit*) and (car or automobile or truck or vehicle) (Results 22425)
As a general rule of thumb, searchers should proceed with this progression until they hit, and then surpass, the number of hits they would normally want to review. The searcher should investigate how many limiting terms can be added or subtracted before the results become too broad or narrow for the string to be worthwhile.
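The progressions above can also be generated programmatically, which helps keep the keyword groups consistent from one step to the next. The following is a rough sketch that reuses the keyword groups from the example queries. The facet names and the functions `build_query` and `broad_to_narrow` are hypothetical, and the sketch only models adding or removing AND-ed groups; it does not model the proximity steps (w20, near), and hit counts can only be obtained by running the strings in a real database.

```python
# Keyword groups taken from the example queries above; the facet names
# used as dictionary keys are labels invented for this sketch.
FACETS = {
    "chase": ["chase", "pursu*"],
    "disable": ["disabl*", "imped*", "block*", "prevent*", "inhibit*"],
    "vehicle": ["car", "automobile", "truck", "vehicle"],
    "police": ["police", "law enforcement"],
    "fuel": ["ignition", "fuel"],
    "signal": ["RF", "radio frequenc*", "infrared", "IR"],
}


def group(terms):
    """Wrap alternate terms in one parenthesized 'or' group."""
    return "(" + " or ".join(terms) + ")"


def build_query(facet_names):
    """Combine the chosen facets with 'and'."""
    return " and ".join(group(FACETS[name]) for name in facet_names)


def broad_to_narrow(base, additions):
    """Return successively narrower queries, adding one facet per step."""
    names = list(base)
    queries = [build_query(names)]
    for extra in additions:
        names.append(extra)
        queries.append(build_query(names))
    return queries


steps = broad_to_narrow(["chase", "disable", "vehicle"],
                        ["police", "fuel", "signal"])

# The narrow-to-broad progression is the same list in reverse order.
narrow_to_broad = list(reversed(steps))

for number, query in enumerate(steps, start=1):
    print(f"{number}: {query}")
```

The first generated string matches the broadest query in the example above; later strings differ in grouping order from the hand-written versions but combine the same keyword groups.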
A combination classification/text search can be a powerful way to target a search even further, but should be approached with caution as it could exclude relevant results. Searchers should bear in mind that the benefit of using classification and text searching independently is that each method can be somewhat relied upon to find documents that the other could not have found (e.g. documents with obscure keywords, or misclassified documents). Combining the two strategies is necessary when a concept can only be described by the most commonplace keywords, such that hit counts on text searches are impossible to narrow, and when the best classifications for a given concept contain thousands of patents, such that the search strings cannot be exhaustively reviewed without further restrictions.
When using the two strategies together, it is no longer necessary to use keyword terms that are already effective as limitations due to the classification definition. For example, if the searcher was looking for a document about a red hat, and was looking in a classification area designated for “hats,” only the keyword “red” should be used as a text search.
As an extension of the argument above, it sometimes follows that a classification-and-text combination search is necessary when a keyword has a special meaning in the context of a certain technology, but another commonplace meaning outside of it. As a simple example, a search for an “optical mouse” could be limited to classification areas pertaining to computer hardware to prevent a flood of biotechnology hits related to genetic material from mice.
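The “optical mouse” situation can be sketched with a toy collection. The records, class labels, and function names below are invented for illustration; real classification areas are identified by codes rather than plain-language labels.

```python
# Toy collection: each record has class labels and a title (all invented).
RECORDS = [
    {"id": 1, "classes": {"computer hardware"},
     "title": "Optical mouse with laser sensor"},
    {"id": 2, "classes": {"biotechnology"},
     "title": "Optical imaging of mouse genetic material"},
    {"id": 3, "classes": {"computer hardware"},
     "title": "Keyboard with backlight"},
]


def text_search(records, words):
    """Keep records whose title contains every word (case-insensitive)."""
    return [r for r in records
            if all(w.lower() in r["title"].lower().split() for w in words)]


def class_and_text_search(records, class_label, words):
    """Restrict to one classification area, then apply the text search."""
    in_class = [r for r in records if class_label in r["classes"]]
    return text_search(in_class, words)


text_only = [r["id"] for r in text_search(RECORDS, ["optical", "mouse"])]
combined = [r["id"] for r in class_and_text_search(
    RECORDS, "computer hardware", ["optical", "mouse"])]
```

The text-only search returns both the hardware and the biotechnology record, while the combined search returns only the hardware record. The same restriction would also hide a relevant document that had been misclassified, which is the caution raised above.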
Citation searching is the act of investigating other documents related to a result document. Patent documents often list related documents on their front page; these related documents were provided by the applicant or examiner as examples of earlier art. Looking back in time to investigate these earlier documents is often called “backward citation searching.” In addition, electronic databases often establish electronic links between older documents and the newer publications which cite them. Looking forward in time to investigate later documents which may have cited a relevant hit is called “forward citation searching.”
The cited/citing documents on a relevant search hit are likely to have closely related content, and should always be investigated. A true advantage to citation searching often arises when an interesting document is found that was not turned up by any earlier classification or text searching. Such a document should always be examined to determine the reason that it was not found. Any new keyword terms or classifications of interest should then be used in the next iteration of the search strategy.
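Backward and forward citation searching can be pictured as two lookups on the same citation data. The sketch below uses a made-up set of documents labeled A through E; the data and function names are hypothetical and stand in for whatever citation links a given database provides.

```python
# Backward citations: document -> earlier documents it cites (toy data).
CITES = {
    "D": ["B", "C"],
    "C": ["A"],
    "B": ["A"],
    "E": ["C"],
}


def backward_citations(doc):
    """Earlier documents listed as citations on the given document."""
    return sorted(CITES.get(doc, []))


def forward_citations(doc):
    """Later documents that cite the given document."""
    return sorted(citing for citing, cited in CITES.items() if doc in cited)


def citation_neighbors(doc):
    """All documents one citation step away, in either direction."""
    return sorted(set(backward_citations(doc)) | set(forward_citations(doc)))


# Suppose earlier text/class searching had already surfaced A and D.
already_found = {"A", "D"}
new_leads = sorted(set(citation_neighbors("C")) - already_found)
```

Any document in `new_leads` is one that the earlier text and classification strings missed, so its keywords and classes are worth checking before the next search iteration.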
Related Entities Searching
Patent and non-patent documents alike are always identified by various entities like the author/inventor, and patent applicant. Other entities might include the patent agent/representative, or sponsor(s) of the research, such as the US government.
Performing an entity search is so simple that it should almost always be attempted in any type of prior art search situation. Learning about the research endeavors of an inventor can lead to the discovery of frequently cited authors or common coauthors, who may have published related papers or patents. Searching for early iterations of an inventor’s ideas may even lead to the discovery of novelty-destroying early disclosures of the patented work.
One obstacle to performing an entity search is that proper names are often misspelled on documents and in electronic databases. Multiple alternate versions of a name are also commonly in use. Also common is the accidental incorporation of an entity location or address into the indexed data, so that two separate entries, “Biomedical Labs” and “Biomedical Labs, Houston TX,” actually represent the same entity (this is a fictional example). Names that originated from countries with national languages that are not written in the Roman alphabet can be transformed into multiple variations thanks to multiple transliteration conventions: for example, a certain “ch” sound from Mandarin Chinese could be transliterated as either a “Q” or “Ts,” leading to a name that could contain “Qing” or “Tsing”. Another obstacle to entity searching is sometimes presented by the format of a name: some search engines may have trouble indexing names that contain punctuation marks, such as "O'Brian" or "Jones-Smith", because punctuation marks are often ignored by electronic databases.
When searching related entities, special care should be taken to identify these alternative versions and spellings of entity names. Search services which offer index browsing features can allow users to view various entries for a given name, and to search on all of them if necessary. Such a browsing feature will normally let users enter the first few letters of an entity name, and then display related entries indexed by the search engine by matching them to the entered characters. When a name is from a non-Roman alphabet, using a little time to perform a background investigation on transliteration conventions might lead to more complete search results.
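The index-browsing idea can be approximated with a small normalization routine. The index entries below are hypothetical and modeled on the fictional example above, and the functions `normalize` and `browse` are assumptions for this sketch rather than features of any named search service. The normalization shown only handles unaccented Roman letters and digits, so transliteration variants such as “Qing” and “Tsing” would still need to be entered separately.

```python
import re

# Hypothetical index entries, modeled on the fictional example above.
INDEX = [
    "Biomedical Labs",
    "Biomedical Labs, Houston TX",
    "BIOMEDICAL LABS",
    "O'Brian, J.",
    "OBrian, J.",
    "Jones-Smith, A.",
]


def normalize(name):
    """Lowercase, drop punctuation, and collapse whitespace."""
    name = name.lower()
    name = re.sub(r"[^a-z0-9\s]", "", name)
    return re.sub(r"\s+", " ", name).strip()


def browse(index, prefix):
    """Mimic an index-browsing feature: list entries whose normalized
    form starts with the normalized prefix."""
    key = normalize(prefix)
    return [entry for entry in index if normalize(entry).startswith(key)]


lab_entries = browse(INDEX, "Biomedical Labs")
obrian_entries = browse(INDEX, "O'Br")
```

Here `lab_entries` gathers all three spellings of the laboratory name, including the one with the location appended, and `obrian_entries` gathers the name both with and without the apostrophe.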
Date Range Searching
Date range searching can be combined with any of the strategies described above to limit a result set. Sometimes date searching becomes necessary due to the legal reasons behind a search request; for example, an infringement search may take place only on unexpired patents, requiring the searcher to limit the search set to approximately the last 20-25 years.
Patent documents have multiple important dates associated with them, each with a different legal significance. (NOTE: as with the rest of the information contained in this article, the following definitions are informal and do not constitute legal advice.)
- Publication Date – The date that the document became available to the public.
- Issue Date – The date that a patent was granted/went into force.
- Application or Filing Date – The date that the patent office received the initial paperwork for an application.
- Priority Date – The filing date of the first application disclosing the invention under the Paris Convention. The date can only be used to claim legal priority for 12 months after filing.
The date a searcher chooses to limit any search should be carefully chosen based on the legal requirements, and should be agreed upon with the search requester prior to the search, whenever possible.
Date limitations can be very helpful in limiting the size of a results set. In certain cases, publications about a given technology may rapidly increase after a certain breakthrough discovery. If these early documents are definitely not of interest, it may be more efficient to limit a search to begin a few years after this initial peak. However, any decision about which references to include or exclude from a search should always be vetted by a conference with the search requester.
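As a rough illustration of a date limitation, the sketch below filters a toy record set to documents filed within a chosen number of years. The records, the choice of filing date as the limiting field, and the function name `filed_within_years` are all assumptions for this example; which date to use, and what window is appropriate, depends on the legal requirements agreed with the search requester.

```python
from datetime import date

# Toy records with hypothetical filing dates.
PATENTS = [
    {"id": "P1", "filing_date": date(1985, 6, 1)},
    {"id": "P2", "filing_date": date(2001, 3, 15)},
    {"id": "P3", "filing_date": date(2015, 11, 30)},
]


def filed_within_years(records, years, today):
    """Keep records whose filing date falls within the last `years` years.

    Note: date.replace raises ValueError if `today` is 29 February and the
    target year is not a leap year; this sketch does not handle that case.
    """
    cutoff = today.replace(year=today.year - years)
    return [r["id"] for r in records if cutoff <= r["filing_date"] <= today]


recent = filed_within_years(PATENTS, 25, today=date(2020, 1, 1))
```

With a reference date of 1 January 2020 and a 25-year window, the cutoff is 1 January 1995, so only the two later records remain in `recent`.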
Example of A Typical Search Sequence
The following search sequence is a generic progression of search steps that could be applicable to many prior art investigations. For searching techniques specific to certain technology fields or types of prior art searches, see the full list of available best practices articles.
- 1. Understand the search. This usually requires reading one or more technical publications in the field of search where familiarity is lacking. If the person who requested the search does not have any recommendations, a web search on the general search topic is usually a good place to start for identifying these resources. Performing an entity search on any known authors or applicants can also help to orient the searcher and identify some useful references as a starting point.
- 2. Full-text search to quantify the scope of the art. Where the scope is broad, research the topic to narrow the scope with more specific search terms. For example, in a chemical engineering reactor search, is the topic a fluidized bed reactor or a packed bed reactor? If a packed bed reactor, what other terms are typically used for the reactor type and specific media used therein? Use an industry standard resource to become familiar with the terms of art (in this case, Perry's Handbook would be a good choice).
- 3. Identify related patent documents to determine more specific terms related to art in the field. (To continue the reactor example, a document may disclose silica as a type of inert media used in a packed bed reactor. However, silica is merely one species of inert media used in this type of reactor. Identify the other species and consider including them as additional keywords to broaden the search when appropriate.)
- 4. Narrow the search body with the most relevant classes and subclasses from the appropriate classification area(s) of interest. Patentability searches that encompass US art will benefit from a US class search in that collection, while at least IPC and/or ECLA classes should be used to adequately cover collections from other patent issuing authorities. A healthy discussion with a USPTO Examiner is also sometimes beneficial to determine important US subclasses that may otherwise be overlooked.
- 5. Search all relevant art within each chosen subclass. Review each central reference for additional keywords and structural features that can be used to massage the body of the full-text searching in (3).
- 6. Iterate (4) and (5) to identify additional references.
- 7. After exhausting (6), examine key central references for classes and subclasses not originally considered and repeat with respect to each new subclass.
- 8. Return to the full-text searching body and search the art for more recently identified keywords. If the search engine permits it, exclude search strings or subclasses which were already fully reviewed.
- 9. Search the remaining body of art using keywords found from central references, client notes, Examiner suggestions, etc.
- 10. Perform a forward and backward citation search on each centrally relevant reference found during the search. Examine any relevant document discovered by this process to ascertain why it was not discovered during the text/class search. Perform additional search iterations to cover any newly identified classes or keyword terms.
To learn more about strategies that can be helpful when searching a specific technology, or performing a specific type of legally motivated investigation, see the following articles for help.
- Business Methods
- Chemistry and Pharmaceuticals
- Chemical Engineering
- Computer and Information Sciences
- Electrical Engineering
- Electrical Communications
- Mechanical Engineering
- Medical Devices
- Physical Sciences