Report:MARPAT on STN/Data Coverage/Patent Coverage/Introduction

From Intellogist

This search system report was created by the Intellogist Team and is available for viewing only.


Introduction

MARPAT is a database of generic chemical structures found in the patents contained within the Chemical Abstracts databases (CAplus, CAS REGISTRY, CASREACT, CHEMLIST, and CHEMCATS). It serves as a key for searching patents by generic chemical structure queries, as opposed to queries for known, specific structures. Some users are initially confused as to why STN offers two distinct structure-searchable databases, and the distinction is important to understand. CAS REGISTRY contains records only for known compounds with specific chemical structures. REGISTRY therefore takes no note of potential chemical variants that are disclosed generically through a Markush structure in a patent claim. A Markush claim covers multiple alternative chemical structures and may encompass numerous chemical substitutions. Hundreds of possible alternative structures are often disclosed through the Markush groups in the claims of a single patent. Most of these compounds exist only on paper, but they may still be legally relevant to a search.
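
The combinatorial nature of a Markush claim can be illustrated with a short Python sketch. The variable positions R1 to R3 and their substituent lists are invented for illustration; they do not come from any patent, and the code does not reflect how MARPAT indexes or searches structures. It only shows that the number of specific structures covered is the product of the number of alternatives at each variable position.

```python
from itertools import product
from math import prod

# Hypothetical Markush definition: each variable position on a fixed
# scaffold lists its permitted alternatives (all values are illustrative).
markush_groups = {
    "R1": ["H", "CH3", "C2H5", "Cl", "Br"],
    "R2": ["OH", "OCH3", "NH2", "F"],
    "R3": ["phenyl", "pyridyl", "thienyl", "furyl", "cyclohexyl"],
}

def count_specific_structures(groups):
    """Return the number of specific structures: the product of the
    number of alternatives at each variable position."""
    return prod(len(options) for options in groups.values())

def enumerate_specific_structures(groups):
    """Yield one dict per specific combination of substituents."""
    positions = list(groups)
    for combo in product(*(groups[p] for p in positions)):
        yield dict(zip(positions, combo))

total = count_specific_structures(markush_groups)
first = next(enumerate_specific_structures(markush_groups))
print(total)  # 5 * 4 * 5 = 100
print(first)
```

In this toy case, three positions with four or five options each already yield 100 specific compounds. Claims with more positions or broader definitions would grow much faster, which is presumably why generic indexing is more practical than recording each compound individually.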

Another reason to use MARPAT is to discover so-called "prophetic" substances in patent documents. Some compounds disclosed in patents exist merely on paper and have never been synthesized; these are sometimes referred to as "paper chemistry" or "prophetic" compounds. Unfortunately for patent searchers, until early 2008 such compounds were excluded entirely from the REGISTRY file. This presents a problem for chemical patent searchers, because numerous prophetic compounds may have been claimed in a Markush construction without ever having been synthesized in a lab. Prior art searchers still need to discover these patent documents, as the claims are still legally enforceable. Until January 2008, CAS policy was that CAS Registry Numbers could be assigned only to compounds that had actually been discovered or synthesized; since then, REGISTRY has also begun to index additional structures that fulfill these four criteria:[1]

  • Identified by CAS as coming from a reputable source, including but not limited to patents, journals, chemical catalogs, and selected substance collections on the web
  • Described in largely unambiguous terms
  • Characterized by physical methods or described in a patent document example or claim
  • Consistent with the laws of atomic covalent organization


When this indexing began, prophetic substances were indexed only from the documents of certain patenting authorities. According to the CAS website:[2]

CAS coverage includes exemplified prophetic substances and uses identified in patents in all languages from 9 of the major patent offices from January 2009-present. Coverage also includes English-, French-, and German-language patents from 1998-2008 and selected patents from the same languages from 1993-1997. CAS has comprehensive coverage of prophetic substances from Japanese patents from 2009 to present. In addition, prophetic substances were extracted from selected Japanese patents from 2008 back through 2004, and publication year 2003 is in progress.

As the quotation above indicates, selected patent documents from before 2008 have been indexed for prophetic substances in REGISTRY, but the file should still be regarded as principally containing compounds that have actually been discovered or synthesized.

The MARPAT database was created in 1988 to address the problem of retrieving patents with generically disclosed structures through a structure search. The indexing in MARPAT differs from that of REGISTRY in two major respects: 1) it is indexed broadly to capture all possible variations of any generically disclosed structure in a patent document, and 2) by nature, it includes compounds that have never actually existed but were described or claimed in a patent document. Although REGISTRY has included prophetic compounds from 2008 forward, it will never be possible for CAS to assign Registry Numbers to every compound disclosed in a Markush patent claim. Thus, MARPAT retains its utility and relevance as a database of generically disclosed structures.

While the Merged Markush Service (MMS) runs on the Markush DARC (or MDARC) software, a different search algorithm was developed for MARPAT, and CAS holds the patent on it.[3] Unlike MMS, MARPAT requires users to input a chemical structure graphically; the structure is then automatically translated into query language for the software to run the search (according to Adams, the system relies on a form of "connection table").


Patent Coverage

Each of the three main CAS structure-searchable files has distinctly different coverage. REGISTRY is a dictionary-style database of chemical names and structures, and CAplus is a bibliographic database of both patents and literature citations related to chemistry; MARPAT contains only those patents from CAplus that generically disclose or claim chemical structures. The MARPAT file contains records from 1988 (and some additional pre-1988 records from INPI, discussed further below). Beginning in 2012, CAS is also adding backfile coverage from 1987 on with Markush content for English-, German-, and Japanese-language patents.[4] As of March 2012, MARPAT contained more than 377,210 patent records[5] and more than 918,500 searchable Markush structures.[6] The database is updated daily with approximately 60-75 patent citations and 150-200 Markush structures. MARPAT principally focuses on claimed structures, but often includes structures disclosed in the patent specification as well.[7]

The database covers organic and organometallic molecules found in patents from most countries covered by the Chemical Abstracts Service (the exception is Russian patents prior to the year 2000). Alloys, metal oxides, inorganic salts, intermetallics, and polymers are not indexed in the database.

Not every patent document issued by a particular country will be indexed by CAS: patent documents are selected on the basis of their relation to one of the 80 Chemical Abstracts sections (listed on the CAS website). In practice, CAS chooses patent documents to index by focusing on those documents with relevant International Patent Classification (IPC) codes. There is comprehensive coverage for all documents in centrally relevant IPC codes, with selective coverage for codes that are on the fringe of the technologies covered by CAS. A list of the core and secondary IPC codes can be found on the CAS website.

MARPAT includes the Markush structure records for records found in CAplus from 1988 to present, with some exceptions made for authorities that were incorporated into CAplus later than 1988. Currently, 63 patenting authorities are covered by the file (this represents real growth in the file: in 2001, only 33 patenting authorities were covered).[8] According to a CAS representative, the country coverage chart on the CAS website corresponds to the country coverage of both CAplus and MARPAT (although coverage dates for MARPAT may be different from the dates listed in the chart).

With the addition of published PCT applications from WIPO to generic structure indexing in 2008, the active patenting authorities in MARPAT are probably the same as those active in CAplus. For more information about CAplus coverage, see the coverage chart on the CAS website.

Although indexing for MARPAT began in 1988, the CAS website states that some supplementary Markush structures from 1961-1987 have been derived from INPI data. INPI, the French Patent Office, has performed Markush structure indexing for six major patenting authorities (United States (US), European Patent Office (EP), Patent Cooperation Treaty (WO/PCT), Germany (DE), France (FR), and United Kingdom (GB)) from 1978, and has also indexed a collection of special French Medicine (Pharmaceutical) Patents from 1961-1976. While MARPAT states that it contains indexed structures from INPI back to 1961, users should know that any INPI content from before 1978 can only come from this special collection of French pharmaceutical patents.

Although documents from the same patenting authorities are indexed to produce the content of both MARPAT and REGISTRY (with a few exceptions), chemical structure searches within the two files often yield different results because of differing indexing conventions (and possibly, at times, indexing errors). Searches should be conducted in both files whenever possible.


Sources

  1. "REGISTRY/ZREGISTRY (CAS REGISTRY℠)." CAS website, http://www.cas.org/ASSETS/9577694C17BA45F4887088140072393A/registry.pdf. Accessed July 7, 2011.
  2. "CAS Coverage of Prophetic Substances." CAS website, http://www.cas.org/expertise/cascontent/prophetics.html. Accessed July 7, 2011.
  3. Adams, Stephen R. Information Sources in Patents, 2nd Ed. Munich: K.G. Saur, 2006. Page 148.
  4. "MARPAT Database Enhanced with Additional Markush Backfile Content for STN." CAS website, http://www.cas.org/support/stngen/stnews/dbnews/mar2012.html#marpat. Accessed March 23, 2012.
  5. "MARPAT." STN International website, http://www.fiz-karlsruhe.de/marpat.html?&L=1. Accessed April 12, 2012.
  6. American Chemical Society. MARPAT User Guide. March 2012. PDFcari website, http://www.pdfcari.com/MARPAT-User-Guide.html#. Accessed April 12, 2012.
  7. "MARPAT® - The CAS Markush database containing the keys to generic substances in patents." CAS website, http://www.cas.org/expertise/cascontent/marpat.html. Accessed July 7, 2011.
  8. Austin, Robert. "The Complete Markush Structure Search: Mission Impossible?" Presented at the PIUG NE Workshop, October 16, 2001. STN website, http://www.stn-international.de/training_center/chemistry/piug1.pdf. Accessed January 28, 2008.