Report:VantagePoint/Data Visualization/Co-occurency Matrices
This report was created by the Intellogist Team and is available for viewing only.
VantagePoint has four main types of matrices: co-occurrence, cross-correlation, auto-correlation, and factor. The first three are accessed via Sheets --> Add Matrix, while the factor matrix is found via Sheets --> Add Factor Matrix. It is often useful to create smaller groups within a field before creating matrices. This not only reduces processing time but also lets users focus on only the important records within a field. For example, before creating the matrix, users could create a top 20 group within the inventor field to capture only those who have made significant contributions in the art.
The first three matrix types are initiated from the same menu. Only one field is selected for the auto-correlation matrix, while two fields are used for co-occurrence and cross-correlation matrices.
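The idea of restricting a field to a top-N group before building a matrix can be sketched outside VantagePoint as follows. This is a minimal illustration only: the record layout, the function name top_n_group, and the toy data are assumptions made for this example and do not reflect how VantagePoint stores or groups data.

```python
from collections import Counter

# Hypothetical toy records: each patent lists its inventors.
records = [
    {"inventor": ["Oomura", "Homan"]},
    {"inventor": ["Oomura", "Homan"]},
    {"inventor": ["Oomura"]},
    {"inventor": ["Perkins"]},
]

def top_n_group(records, field, n=20):
    """Return the n items of `field` that appear on the most records."""
    counts = Counter()
    for rec in records:
        # set() so an item repeated within one record is counted once
        counts.update(set(rec[field]))
    return [item for item, _ in counts.most_common(n)]

top_inventors = top_n_group(records, "inventor", n=2)
```

Only the items in the returned group would then be used as rows or columns of a matrix, which keeps the matrix small and drops items that appear on only a handful of records.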
A co-occurrence matrix lists the number of times items from one field occur with items from a second field. An example co-occurrence matrix of patent assignees against International Patent Classification (IPC) classes is shown in the figure below. The assignee name "Denso Corp." occurs 30 times on documents that have also been classified in IPC class B60H. This shows that Denso has a large number of patents in this class, and may indicate that this technology area is at the core of its business.
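The counting behind a co-occurrence matrix can be sketched in a few lines. This is an illustration under stated assumptions, not VantagePoint's implementation: the record layout, the function name cooccurrence, and the toy data (which do not reproduce the figure's counts) are all invented for this example.

```python
from collections import Counter
from itertools import product

# Hypothetical toy records: each patent lists its assignees and IPC classes.
records = [
    {"assignee": ["Denso Corp."], "ipc": ["B60H", "F25B"]},
    {"assignee": ["Denso Corp."], "ipc": ["B60H"]},
    {"assignee": ["Assignee A"], "ipc": ["B60H"]},
    {"assignee": ["Assignee A", "Denso Corp."], "ipc": ["F25B"]},
]

def cooccurrence(records, row_field, col_field):
    """Count the records on which each (row item, column item) pair appears together."""
    counts = Counter()
    for rec in records:
        # set() so a value repeated within one record is counted once
        for pair in product(set(rec[row_field]), set(rec[col_field])):
            counts[pair] += 1
    return counts

matrix = cooccurrence(records, "assignee", "ipc")
denso_b60h = matrix[("Denso Corp.", "B60H")]
```

Each cell is simply a record count, so a large value in one cell (such as the Denso Corp. and B60H cell in the example above) means many documents carry both items.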
An auto-correlation matrix shows the relationships among data points within a single data field. After selecting the field to view, users can see a numeric value given to each relationship. Values close to 1 indicate a stronger correlation than values close to 0 or negative values. The system allows for color coding of the values. In the picture below, the red shading gets darker as the value approaches 1. For example, the inventors Perkins, Small and Clark have a value of 1 in relation to one another, while Oomura and Homan have a value of 0.858. This would indicate that Perkins, Small and Clark worked exclusively with one another on the patents in the dataset, while Oomura and Homan mostly worked together.
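One way to see why inventors who always appear together score 1 is to correlate their occurrence patterns across records. The sketch below uses the Pearson correlation of 0/1 occurrence vectors; that choice is an assumption made for illustration, since this report does not state the exact formula VantagePoint applies. The patent data are invented and are not meant to reproduce the 0.858 value in the example.

```python
from math import sqrt

# Hypothetical toy data: the set of inventors named on each of five patents.
patents = [
    {"Perkins", "Small", "Clark"},
    {"Perkins", "Small", "Clark"},
    {"Oomura", "Homan"},
    {"Oomura", "Homan"},
    {"Oomura"},
]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector has no defined correlation; report 0
    return cov / sqrt(var_x * var_y)

def auto_correlation(patents):
    """Correlate every member of one field with every other member."""
    members = sorted(set().union(*patents))
    # One 0/1 vector per member: 1 if the member appears on that patent
    vectors = {m: [1 if m in p else 0 for p in patents] for m in members}
    return {(a, b): pearson(vectors[a], vectors[b]) for a in members for b in members}

matrix = auto_correlation(patents)
```

In this toy data Perkins and Small appear on exactly the same patents, so their value is 1, while Oomura has one patent without Homan, which pulls their value below 1. Inventors who never share a patent can receive a negative value.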
A cross-correlation matrix shows the relationship between members of one field, using an additional field as a point of reference. In the example given, inventor names have been used to find similarities between patent assignees. The relationship between Nippon Denso Co. and Denso Corp. has the highest value, indicating that these companies are related more closely than just sharing similar names. This could indicate a partnership or a subsidiary that was not combined into the parent company during the initial list cleanup.
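The cross-correlation idea, comparing members of one field by their profiles over a second field, can be sketched as follows. As before, this is an assumption-laden illustration: Pearson correlation is used as a stand-in for whatever measure VantagePoint applies, and the record layout, the function names, "Assignee X", and the data are hypothetical.

```python
from math import sqrt

# Hypothetical toy records: one assignee and a list of inventors per patent.
records = [
    {"assignee": "Nippon Denso Co.", "inventors": ["Oomura", "Homan"]},
    {"assignee": "Denso Corp.", "inventors": ["Oomura", "Homan"]},
    {"assignee": "Denso Corp.", "inventors": ["Oomura"]},
    {"assignee": "Assignee X", "inventors": ["Perkins", "Small", "Clark"]},
    {"assignee": "Assignee X", "inventors": ["Perkins", "Small"]},
]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector has no defined correlation; report 0
    return cov / sqrt(var_x * var_y)

def cross_correlation(records, field, reference):
    """Correlate members of `field` using their counts over `reference` items."""
    ref_members = sorted({m for rec in records for m in rec[reference]})
    profiles = {}
    for rec in records:
        # Each profile counts how many of the member's records name each reference item
        profile = profiles.setdefault(rec[field], [0] * len(ref_members))
        for m in set(rec[reference]):
            profile[ref_members.index(m)] += 1
    names = sorted(profiles)
    return {(a, b): pearson(profiles[a], profiles[b]) for a in names for b in names}

matrix = cross_correlation(records, "assignee", "inventors")
```

Two assignees whose patents name the same inventors end up with similar profiles and therefore a high value, which is the pattern that flags Nippon Denso Co. and Denso Corp. in the example above.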
A factor matrix uses a statistical analysis (a Principal Component Analysis, to be specific) to determine how members within a list are inter-related. As with a factor map, users select the field to analyze (shown below), along with the number of factors to use in the calculation (typically the square root of the number of members within the field). A factor matrix is best used with a group containing a limited number of individual members, such as a Top 20 list.
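The square-root rule of thumb for the number of factors is easy to compute ahead of time. The helper below is a hypothetical convenience, not part of VantagePoint, and rounding to the nearest whole number is an assumption, since the rule as stated does not say how to round.

```python
import math

def suggested_factor_count(n_members):
    """Square-root rule of thumb for the number of factors (rounding is an assumption)."""
    return max(1, round(math.sqrt(n_members)))

factors_for_top_20 = suggested_factor_count(20)
```

For a Top 20 group, the square root is about 4.47, so this sketch suggests 4 factors as a starting point.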
The results of the analysis are presented in a form similar to the other matrices available in VantagePoint. Again, using the color coding is recommended to help visualize the results. In the example below, the groupings within a factor show relatedness between inventors. Factor 3 groups Homan and Oomura together, while Factor 5 groups Clark, Perkins, and Small together. This confirms the inventor groups found using the other matrices.
Once users determine what type of relationships and correlations they are looking for, the matrices are a good way to find them. However, using smaller groups, such as a top 30 list, is critical for getting good results. Not only does it reduce processing time, but it also removes low-value information that would otherwise clutter the results. Adding color coding to each matrix is highly recommended, as it makes the results easier to read.
Differences between a Factor Matrix and an Auto-Correlation Matrix
The differences between a factor matrix and an auto-correlation matrix are very similar to those found between the factor maps and auto-correlation maps. See Mapping Document Clusters to read more about those differences.