FIELD: information technology.
SUBSTANCE: a method for automatic semantic classification of natural-language texts comprises: presenting each text to be classified in digital form for subsequent processing; indexing the text to obtain elementary units of the first through fifth levels; determining the frequency of occurrence of the units of the fourth level, each being a semantically significant object or attribute, and the frequency of occurrence of the semantically significant relationships that link objects to other objects and objects to attributes; forming a semantic network from triads, which are the units of the fifth level; renormalising the frequencies of occurrence into semantic weights of the units of the fourth level; ranking the units of the fourth level by semantic weight, comparing each weight with a threshold value and discarding the units whose weight is below the threshold; determining the degree of overlap between the semantic network of the text and the semantic networks of text samples; and selecting as classes for the text those subject areas whose semantic network overlaps with the semantic network of the text to a degree greater than a threshold.
EFFECT: faster process of comparing texts.
6 cl, 2 dwg, 24 tbl
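The steps summarised under SUBSTANCE can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not give the renormalisation formula or the overlap measure, so the sketch assumes division by the maximum frequency for the semantic weight and the share of the text's retained weight that falls on units also present in a sample network for the degree of overlap. All names (SemanticNetwork, build_network, overlap, classify) and the sample data are hypothetical, and the indexing step that produces the triads is taken as already done.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SemanticNetwork:
    units: dict   # retained fourth-level unit -> semantic weight
    triads: set   # retained fifth-level triads (unit, relation, unit)


def build_network(triads, weight_threshold):
    """Count unit frequencies, renormalise them to weights, drop weak units."""
    freq = Counter()
    for left, _relation, right in triads:
        freq[left] += 1
        freq[right] += 1
    if not freq:
        return SemanticNetwork({}, set())
    top = max(freq.values())
    # Assumption: semantic weight = frequency / maximum frequency.
    weights = {unit: count / top for unit, count in freq.items()}
    kept = {u: w for u, w in weights.items() if w >= weight_threshold}
    # Triads are kept only if both linked units survived the threshold;
    # they are stored so a relation-aware overlap could be swapped in.
    edges = {t for t in triads if t[0] in kept and t[2] in kept}
    return SemanticNetwork(kept, edges)


def overlap(text_net, sample_net):
    """Assumed measure: share of the text's weight on units the sample also has."""
    total = sum(text_net.units.values())
    if total == 0:
        return 0.0
    shared = sum(w for u, w in text_net.units.items() if u in sample_net.units)
    return shared / total


def classify(text_triads, class_samples, weight_threshold, overlap_threshold):
    """Return every class whose sample network overlaps the text enough."""
    text_net = build_network(text_triads, weight_threshold)
    selected = []
    for label, sample_triads in class_samples.items():
        sample_net = build_network(sample_triads, weight_threshold)
        if overlap(text_net, sample_net) > overlap_threshold:
            selected.append(label)
    return selected


# Hypothetical data: triads as they might come out of the indexing step.
text_triads = [
    ("car", "has", "engine"),
    ("car", "has", "wheel"),
    ("engine", "is", "loud"),
    ("driver", "drives", "car"),
]
class_samples = {
    "automotive": [("car", "has", "engine"), ("car", "has", "wheel"),
                   ("engine", "uses", "fuel")],
    "cooking": [("chef", "cooks", "soup"), ("soup", "has", "salt"),
                ("chef", "uses", "knife")],
}

print(classify(text_triads, class_samples, 0.3, 0.5))  # prints ['automotive']
```

Because each network is reduced to thresholded, weighted units before comparison, matching a text against a class sample is a single pass over the text's retained units. Under the assumptions above, this is one plausible reading of how the stated effect (faster comparison of texts) could arise.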
Authors
Dates
2015-01-10—Published
2013-08-22—Filed