FIELD: physics, computer engineering.
SUBSTANCE: the invention relates to processing natural-language text and can be used to automate the search for required documents in a large collection. When a search request is received, its content is split into sentences. Sentences of the text array and of the search request are compared pairwise, and the relevance of each document of the text array to the request is calculated from the results for the sentences included in that document. The text array is indexed by individual sentences. The precise meanings of the words in each sentence are identified first, and the semantic links between them are established. The precise word meanings are then replaced by decomposing them into elementary meanings, which are stored for each meaning in a thesaurus, after which a matrix is built for each sentence that contains the links between all pairs of objects included in the sentence. An inverted index is then built which records, for each object included in the text array, the documents and sentences in which it occurs and the number of times it occurs.
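The indexing and comparison steps described above can be illustrated with a minimal sketch. It is not the claimed method: the toy `THESAURUS`, the helper names, the Jaccard overlap used to compare sentences, and the choice of the best sentence pair as the document score are all assumptions made for illustration. Word-sense identification and the per-sentence link matrix are omitted.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: word meaning -> elementary meanings.
THESAURUS = {
    "car": {"vehicle", "wheeled"},
    "automobile": {"vehicle", "wheeled"},
    "buy": {"acquire", "pay"},
    "purchase": {"acquire", "pay"},
    "red": {"color"},
}


def split_sentences(text):
    """Naive sentence splitter (assumption: sentences end with a period)."""
    return [s.strip() for s in text.split(".") if s.strip()]


def sentence_objects(sentence):
    """Replace each word by its elementary meanings; unknown words stay as-is."""
    objects = set()
    for word in sentence.lower().split():
        objects |= THESAURUS.get(word, {word})
    return objects


def build_inverted_index(documents):
    """Map each object to {(doc_id, sentence_id): occurrence count}."""
    index = defaultdict(lambda: defaultdict(int))
    for doc_id, text in documents.items():
        for sent_id, sentence in enumerate(split_sentences(text)):
            for word in sentence.lower().split():
                for obj in THESAURUS.get(word, {word}):
                    index[obj][(doc_id, sent_id)] += 1
    return index


def relevance(query, documents, index):
    """Score each document by its best-matching sentence (Jaccard overlap)."""
    scores = defaultdict(float)
    for q_sentence in split_sentences(query):
        q_objects = sentence_objects(q_sentence)
        # Use the inverted index to find sentences sharing at least one object.
        shared = defaultdict(int)
        for obj in q_objects:
            for key in index.get(obj, {}):
                shared[key] += 1
        for (doc_id, sent_id), n_shared in shared.items():
            d_objects = sentence_objects(split_sentences(documents[doc_id])[sent_id])
            score = n_shared / len(q_objects | d_objects)
            scores[doc_id] = max(scores[doc_id], score)
    return dict(scores)


documents = {
    "d1": "I purchase an automobile. It is red",
    "d2": "The weather is cold",
}
index = build_inverted_index(documents)
print(relevance("buy car", documents, index))
```

In this sketch the request "buy car" shares no surface words with document `d1`, yet matches its first sentence because both reduce to the same elementary meanings, which is the kind of comparison by meaning the abstract describes.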
EFFECT: the invention enables phrases to be compared by meaning.
2 cl
Authors
Dates
2010-06-20—Published
2008-04-15—Filed