Organization data map
Organization data issues
Organization data sources
The organizations covered by PatentPia are i) right holders of patents (applicants, current right holders, previous right holders, etc.), ii) parties to events, iii) patent agents, iv) organizations of persons such as authors of papers, v) commercial entities in corporate, financial, economic, and industrial data and in social/market/news data, and vi) researchers at universities/research institutes.
Parties to events include i) plaintiffs/defendants in litigation, ii) petitioners/respondents in trials, iii) assignees/assignors in patent assignments, iv) declarants of standard patents, v) applicants/grantors in FDA approvals, vi) participants in government-funded R&D, etc.
The need for organization data maintenance
If organization data is not maintained, quality degrades seriously not only in search but also in aggregation and analysis. Organization data maintenance is also essential for crossing over between two or more nations, two or more languages, or two or more heterogeneous data sets.
Organization identification issues (organizational expressions vs. unique IDs)
Company/organization identification within the patent world vs. outside the patent world
For Korean companies or organizations, we identify companies using the official mapping data provided by the Korean Intellectual Property Office (KIPO) between the 'applicant code (granted by the intellectual property office)' and the 'corporate registration number/business number (granted by the nation)'. Because of this, mapping accuracy is high, even for other numbers outside the patent world (e.g., listed company codes) that are matched 1:1 with the corporate registration number.
For overseas companies (e.g., Apple) that have filed applications with the Korean Intellectual Property Office, however, there is no such mapping between the applicant code and a corporate registration number, so we can use only the applicant code. As a result, the accuracy of mapping to data outside the patent world is limited.
Within a single nation, the applicant code, which is assigned by that nation's intellectual property office, can still be used with high reliability for mapping between 'Applicant Name vs. Applicant Code vs. Patent'. In other words, in countries such as Korea and Japan, which operate an applicant code system, the applicant code can be used to map patent sets. In the United States of America, China, Europe, etc., applicant codes are not open, so patent sets must be mapped by applicant name, and the accuracy is lower than in Korea and Japan.
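The difference between code-based and name-based mapping can be sketched as follows. This is a minimal illustration, not PatentPia's implementation: the record fields, the applicant code "C001", the patent IDs, and the helper names (`name_key`, `grouping_key`, `build_patent_sets`) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records; codes, names, and patent IDs are made up for illustration.
RECORDS = [
    {"office": "KR", "applicant_code": "C001", "applicant_name": "Samsung Electronics Co., Ltd.", "patent": "KR-1"},
    {"office": "KR", "applicant_code": "C001", "applicant_name": "SAMSUNG ELECTRONICS CO LTD", "patent": "KR-2"},
    {"office": "US", "applicant_code": None, "applicant_name": "Apple Inc.", "patent": "US-1"},
    {"office": "US", "applicant_code": None, "applicant_name": "APPLE INC", "patent": "US-2"},
    {"office": "US", "applicant_code": None, "applicant_name": "Apple Incorporated", "patent": "US-3"},
]


def name_key(name):
    """Crude name key: lowercase, drop commas/periods, collapse whitespace."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def grouping_key(record):
    """Prefer the office-assigned applicant code; fall back to the name key."""
    code = record.get("applicant_code")
    if code:
        return (record["office"], "code", code)
    return (record["office"], "name", name_key(record["applicant_name"]))


def build_patent_sets(records):
    """Group patent IDs by applicant code where available, else by name key."""
    sets = defaultdict(set)
    for record in records:
        sets[grouping_key(record)].add(record["patent"])
    return dict(sets)


patent_sets = build_patent_sets(RECORDS)
```

In this sketch the two Korean records fall into one set regardless of spelling because they share a code, while "Apple Incorporated" ends up in a separate set from "Apple Inc.", which illustrates why name-based mapping is less accurate.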
Representation of organization name (including representation of applicant name)
First, an organization's full name can be divided into 'organization name' + 'organization type notation'. For example, in "Samsung Electronics co., ltd.", "Samsung Electronics" is the name of the organization and "co., ltd." is the type of organization. Representation of an organization name means unifying its variant notations under a single representative name. Representation is needed not only because of problems in the organization name itself, such as typos, spacing, transliteration, and partial omission, but also because of variation in the organization type notation, such as abbreviated (ltd.) vs. full (limited) forms and omitted or unnecessary punctuation.
There are three main types of name representation: i) representation of organization names within one nation, ii) representation of organization names in the same language across n countries, and iii) representation of organization names between different languages.
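The split into 'organization name' + 'organization type notation' can be sketched with a small rule table. This is only an illustration under stated assumptions: the `TYPE_PATTERNS` table and the `split_org_name` helper are hypothetical, cover only a few English-language notations, and are not PatentPia's actual rules.

```python
import re
from typing import Optional, Tuple

# Hypothetical, deliberately small rule table:
# (regex for a trailing type notation, canonical form of that notation).
TYPE_PATTERNS = [
    (re.compile(r"\bco\.?\s*,?\s*(ltd\.?|limited)$"), "co., ltd."),
    (re.compile(r"\b(inc\.?|incorporated)$"), "inc."),
    (re.compile(r"\b(corp\.?|corporation)$"), "corp."),
    (re.compile(r"\b(ltd\.?|limited)$"), "ltd."),
]


def split_org_name(raw: str) -> Tuple[str, Optional[str]]:
    """Split a raw name into (organization name, canonical type notation)."""
    # Lowercase and collapse whitespace so spacing/case variants compare equal.
    text = re.sub(r"\s+", " ", raw).strip().lower()
    for pattern, canonical in TYPE_PATTERNS:
        match = pattern.search(text)
        if match:
            # Everything before the type notation is the organization name.
            name = text[: match.start()].rstrip(" ,")
            return name, canonical
    return text, None  # no recognized type notation


print(split_org_name("Samsung Electronics co., ltd."))
print(split_org_name("SAMSUNG ELECTRONICS CO LTD"))
```

Both calls yield the same pair, so the abbreviation and punctuation differences in the type notation no longer prevent the two spellings from being matched.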
Organization name representation within the same language (within 1/n nations)
In real patent data or event data, a particular organization appears in a myriad of variant notations. The variants arise mainly from i) the diversity of notations for organizational forms and ii) misspellings of organization names. Among organizational form notations, dozens of different notations are used for joint stock companies alone, such as 'co., ltd', 'co. ltd.', 'co. limited', etc. In Korea, too, '(株)' and '株式会社' are used interchangeably. Variants multiply further when the organizational form in one language is transliterated into another (e.g., "kabushiki kaisha" and "kabushiki gaisha" for "joint stock company" in Japanese), and they are countless.
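Misspellings cannot be listed in a rule table the way type notations can, so one common approach is approximate string matching. The sketch below uses `difflib.SequenceMatcher` from the Python standard library with a greedy grouping strategy. The threshold of 0.9 and the helper names are assumptions chosen for illustration, and this is not a description of PatentPia's method.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a, b).ratio()


def group_variants(names, threshold=0.9):
    """Greedy grouping: each name joins the first group whose
    representative is at least `threshold` similar, else starts a new group."""
    groups = []  # each group: {"rep": normalized key, "members": [original names]}
    for name in names:
        key = " ".join(name.lower().split())
        for group in groups:
            if similarity(key, group["rep"]) >= threshold:
                group["members"].append(name)
                break
        else:
            groups.append({"rep": key, "members": [name]})
    return groups


names = [
    "Samsung Electronics",
    "Samsung Electornics",  # transposed letters
    "Samsung Electronic",   # partial omission
    "LG Electronics",
]
for group in group_variants(names):
    print(group["rep"], "->", group["members"])
```

The threshold is a trade-off: too low and distinct organizations that share a long suffix such as "Electronics" are merged, too high and real misspellings are missed, so results from a sketch like this would normally need review.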
Organization name representation across languages
Patent data is published in the official language of each nation. Accordingly, Samsung Electronics is written as '삼성전자' in Korea, 'Samsung Electronics' in the United States of America, etc., '三星電子' in Japan, and '三星电子' in China. We therefore need to generate mapping data such as '삼성전자' = 'Samsung Electronics' = '三星電子' = '三星电子', etc.
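Cross-language mapping data of this kind can be held as an alias table in which every known spelling points to one internal organization ID. The following is a minimal sketch: the ID "ORG-0001", the `ALIASES` table, and the helper names are hypothetical, and the Hangul spelling 삼성전자 is included as the Korean-language form of the name.

```python
import unicodedata

# Hypothetical alias table: every known spelling maps to one internal ID.
ALIASES = {
    "삼성전자": "ORG-0001",             # Korean
    "Samsung Electronics": "ORG-0001",  # English
    "三星電子": "ORG-0001",              # Japanese
    "三星电子": "ORG-0001",              # Chinese (simplified)
}


def normalize_key(name: str) -> str:
    """NFKC folds full-width/compatibility characters; casefold removes case differences."""
    return unicodedata.normalize("NFKC", name).casefold().strip()


# Lookup index keyed by the normalized spelling.
INDEX = {normalize_key(alias): org_id for alias, org_id in ALIASES.items()}


def resolve(name: str):
    """Return the organization ID for a known spelling, or None if unknown."""
    return INDEX.get(normalize_key(name))


print(resolve("SAMSUNG ELECTRONICS"))
print(resolve("三星電子") == resolve("三星电子"))
```

Unicode normalization does not convert between traditional and simplified characters (電 vs. 电) or between scripts, so each language form still has to be listed explicitly in the table.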
Organization data processing
PatentPia's efforts
Organization name maintenance system operation
Since 2008, PatentPia has built and operated its own organization name maintenance system. We have been continuously performing i) intra- and inter-nation maintenance within patent data, ii) inter-language maintenance, and iii) maintenance between patent and non-patent (e.g., stock market) data.