FIELD: computer engineering.
SUBSTANCE: invention relates to computer engineering. The technical result is achieved by selecting a predetermined number of columns from a table as subject candidate columns, each of which is potentially the correct subject column of the table and includes a plurality of values. For each subject candidate column: the co-occurrence of its values is determined, including how often the values in the subject candidate column also occur in the correct subject columns of a plurality of other tables; a score is calculated for the subject candidate column based on the determined co-occurrence, the score indicating the likelihood that the subject candidate column is the correct subject column; and the subject candidate column is classified as either a subject column or a non-subject column of the table based on the calculated score.
EFFECT: the technical result is increased efficiency in detecting one or more subject columns of a table.
33 cl, 11 dwg
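The scoring and classification steps described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names, taking the leftmost columns as candidates, the averaging formula used as the score, and the `threshold` value of 0.5.

```python
from collections import Counter


def build_subject_index(other_subject_columns):
    """Count, per value, how many other tables hold it in their subject column."""
    index = Counter()
    for column in other_subject_columns:
        index.update(set(column))  # count each value once per table
    return index


def score_column(values, index, n_tables):
    """Hypothetical score: mean share of other tables whose subject column
    contains each value of the candidate column (0.0 to 1.0)."""
    if not values or n_tables == 0:
        return 0.0
    return sum(index[v] for v in values) / (len(values) * n_tables)


def classify_columns(table, other_subject_columns, max_candidates=3, threshold=0.5):
    """Label each candidate column as 'subject' or 'non-subject'.

    Assumption: the first `max_candidates` columns serve as the candidates.
    """
    index = build_subject_index(other_subject_columns)
    n_tables = len(other_subject_columns)
    labels = {}
    for name, values in list(table.items())[:max_candidates]:
        score = score_column(values, index, n_tables)
        labels[name] = ("subject" if score >= threshold else "non-subject", score)
    return labels


# Toy data: one table to analyse and the known subject columns of two other tables.
table = {
    "country": ["France", "Germany", "Japan"],
    "capital": ["Paris", "Berlin", "Tokyo"],
    "population": ["67", "83", "125"],
}
other_subject_columns = [
    ["France", "Germany", "Italy"],
    ["Japan", "France", "Germany"],
]
labels = classify_columns(table, other_subject_columns)
```

In this toy data the values of `country` recur in the subject columns of the other tables, so that column scores highest and is the only one labelled as a subject column.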
Title | Year | Author | Number |
---|---|---|---|
EXTRACTING INFORMATION FROM STRUCTURED DOCUMENTS CONTAINING TEXT IN NATURAL LANGUAGE | 2015 | | RU2607976C1 |
SYSTEM AND METHOD FOR SELECTING RELEVANT PAGE ITEMS WITH IMPLICITLY SPECIFYING COORDINATES FOR IDENTIFYING AND VIEWING RELEVANT INFORMATION | 2015 | | RU2708790C2 |
METHOD AND SYSTEM OF SEMANTIC PROCESSING TEXT DOCUMENTS | 2016 | | RU2630427C2 |
CONSTRUCTING QUERIES FOR EXECUTION OVER MULTI-DIMENSIONAL DATA STRUCTURES | 2014 | | RU2679977C1 |
METHOD AND SYSTEM FOR STORING AND SEARCHING INFORMATION EXTRACTED FROM TEXT DOCUMENTS | 2015 | | RU2605077C2 |
RECOVERY OF TEXT ANNOTATIONS RELATED TO INFORMATION OBJECTS | 2017 | | RU2665261C1 |
METHOD OF DETECTING TRAINING DATA FOR MACHINE LEARNING OF COMPUTER SYSTEM OF INDUSTRIAL INTERNET OF THINGS POWERED BY RECHARGEABLE BATTERY | 2023 | | RU2819568C1 |
LONG-TERM STORAGE OF TYPES AND COPIES OF NET DATA | 2005 | | RU2400803C2 |
METHOD AND SYSTEM FOR AUTOMATIC LEGAL DECISION-MAKING | 2019 | | RU2732071C1 |
METHODS AND SYSTEMS FOR CONVERTING MATRIXES BASED ON SPARSE VECTORS | 2019 | | RU2764557C1 |
Authors
Dates
2018-10-29—Published
2014-06-30—Filed