FIELD: information technology.
SUBSTANCE: facts are extracted from electronic documents by recognising factual descriptions, which are found by matching the words of the electronic documents against a fact-word table. The words of those factual descriptions may be tagged with the appropriate part of speech. More detailed analysis is then performed only on those factual descriptions rather than on the entire electronic document, and particularly on the text in the neighbourhood of the fact-word matches. The analysis may involve identifying the linguistic constituents of each phrase and determining its role as either subject or object. Exclusion rules, based in part on the linguistic constituents, may be applied to eliminate phrases unlikely to be part of facts. Scoring rules may be applied to the remaining phrases, and for each phrase whose score exceeds a threshold, the corresponding sentence part, whole sentence, paragraph, or other document portion may be presented as representing one or more facts.
EFFECT: more accurate search results.
20 cl, 6 dwg
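The pipeline summarised in the abstract (fact-word matching, analysis of the neighbourhood of each match, exclusion rules, scoring against a threshold) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the claimed method: the names `FACT_WORDS`, `HEDGE_WORDS`, `WINDOW` and `THRESHOLD`, their values, and the scoring heuristics are all hypothetical, and capitalisation and digits stand in for real part-of-speech tagging and constituent analysis.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical fact-word table: words that hint at a factual description.
FACT_WORDS = {"founded", "acquired", "invented", "born", "released"}
# Hypothetical exclusion lexicon: speculative words make a fact unlikely.
HEDGE_WORDS = {"maybe", "perhaps", "might", "rumored"}
WINDOW = 4      # assumed neighbourhood size, in tokens, on each side of a match
THRESHOLD = 2   # assumed score that a candidate must exceed


@dataclass
class Candidate:
    sentence: str
    fact_word: str
    subject: List[str]  # tokens before the fact word (crude subject phrase)
    obj: List[str]      # tokens after the fact word (crude object phrase)
    score: int = 0


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_candidates(text: str) -> List[Candidate]:
    # Step 1: match document words against the fact-word table and keep
    # only the neighbourhood of each match for further analysis.
    found = []
    for sentence in split_sentences(text):
        tokens = re.findall(r"[A-Za-z0-9']+", sentence)
        for i, tok in enumerate(tokens):
            if tok.lower() in FACT_WORDS:
                subject = tokens[max(0, i - WINDOW):i]
                obj = tokens[i + 1:i + 1 + WINDOW]
                found.append(Candidate(sentence, tok.lower(), subject, obj))
    return found


def is_excluded(c: Candidate) -> bool:
    # Step 2: exclusion rules. Drop candidates lacking a subject or an
    # object phrase, or containing a speculative word near the match.
    if not c.subject or not c.obj:
        return True
    return any(t.lower() in HEDGE_WORDS for t in c.subject + c.obj)


def score(c: Candidate) -> int:
    # Step 3: scoring rules (assumed heuristics, one point each).
    points = 1  # the fact-word match itself
    if any(t[0].isupper() for t in c.subject):
        points += 1  # capitalised token as a proxy for a named subject
    if any(t[0].isupper() or t.isdigit() for t in c.obj):
        points += 1  # named entity or number in the object phrase
    return points


def extract_facts(text: str, threshold: int = THRESHOLD) -> List[str]:
    # Step 4: present sentences whose candidate score exceeds the threshold.
    facts = []
    for c in find_candidates(text):
        if is_excluded(c):
            continue
        c.score = score(c)
        if c.score > threshold and c.sentence not in facts:
            facts.append(c.sentence)
    return facts


if __name__ == "__main__":
    sample = ("Acme Corp acquired Widget Ltd in 1999. The weather was nice. "
              "Rumored sources say Bob founded something.")
    print(extract_facts(sample))
```

In this sketch only the text around a fact-word match is analysed, mirroring the abstract's point that detailed analysis is limited to the factual descriptions rather than the whole document. A production system would replace the capitalisation heuristics with an actual tagger and parser.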
Title | Year | Author | Number |
---|---|---|---|
METHOD FOR SYNTHESIS OF SELF-TEACHING SYSTEM FOR EXTRACTING KNOWLEDGE FROM TEXT DOCUMENTS FOR SEARCH ENGINES | 2002 | | RU2273879C2 |
SYSTEM AND METHOD FOR SEMANTIC SEARCH | 2013 | | RU2563148C2 |
METHOD OF CLUSTERING OF SEARCH RESULTS DEPENDING ON SEMANTICS | 2014 | | RU2564629C1 |
METHOD FOR AUTOMATED ANALYSIS OF TEXT AND SELECTION OF RELEVANT RECOMMENDATIONS TO IMPROVE READABILITY THEREOF | 2021 | | RU2769427C1 |
EXPANDING OF INFORMATION SEARCH POSSIBILITY | 2015 | | RU2618375C2 |
METHOD FOR AUTOMATIC TEXT PROCESSING IN NATURAL LANGUAGE THROUGH SEMANTIC INDEXATION, METHOD FOR AUTOMATIC PROCESSING COLLECTION OF TEXTS IN NATURAL LANGUAGE THROUGH SEMANTIC INDEXATION AND COMPUTER READABLE MEDIA | 2008 | | RU2399959C2 |
COMPREHENSIVE AUTOMATIC PROCESSING OF TEXT INFORMATION | 2014 | | RU2662699C2 |
METHOD FOR AUTOMATIC SEMANTIC INDEXING OF NATURAL LANGUAGE TEXT | 2012 | | RU2518946C1 |
METHOD AND SYSTEM OF SEMANTIC PROCESSING TEXT DOCUMENTS | 2016 | | RU2630427C2 |
METHOD OF SEARCHING FOR INFORMATION IN TEXT ARRAY | 2008 | | RU2392660C2 |
Dates
2012-05-27—Published
2007-07-20—Filed