FIELD: information technology.
SUBSTANCE: invention relates to means of summarising an electronic document. Method comprises generating a feature vector for an electronic document, wherein feature vector comprises a plurality of features of electronic document. Weight coefficient is assigned to each of plurality of features. Summarisability score is assigned to electronic document to be summarised in accordance with weight coefficient assigned to each of plurality of features, wherein summarisability score indicates whether electronic document is summarisable. Method then includes determining whether electronic document is summarisable. Electronic document is split into a plurality of parts, wherein each of plurality of parts is associated with a respective length, a respective informativeness score, and a respective coherence score. Method then includes automatically selecting a subset of plurality of parts, such that an aggregate informativeness score of subset is maximised, while an aggregate length of subset is less than or equal to a maximum length. Subset is then arranged as a summary of electronic document.
EFFECT: technical result is improved relevance for finding documents.
23 cl, 7 dwg
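
The two computational steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the weighted features are combined or how the subset is found, so the following are assumptions made purely for illustration: the summarisability score is taken to be a weighted sum compared against a hypothetical threshold, the selection is treated as a 0/1 knapsack solved by dynamic programming over integer lengths, and the selected parts are arranged in their original document order. The names Part, summarisability_score, is_summarisable and select_parts are hypothetical and do not come from the patent; the coherence score is carried on each part but not used here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Part:
    """One part of the document (hypothetical structure)."""
    text: str
    length: int              # non-negative integer, e.g. a word count
    informativeness: float
    coherence: float         # carried along but unused in this sketch


def summarisability_score(features: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Assumed form of the score: a weighted sum of the feature vector."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights))


def is_summarisable(features: Sequence[float],
                    weights: Sequence[float],
                    threshold: float) -> bool:
    """Assumed decision rule: the score must reach a chosen threshold."""
    return summarisability_score(features, weights) >= threshold


def select_parts(parts: Sequence[Part], max_length: int) -> List[Part]:
    """Pick the subset with the highest total informativeness whose total
    length does not exceed max_length (0/1 knapsack, dynamic programming).
    The result keeps the original document order."""
    n = len(parts)
    # best[i][cap]: best total informativeness using the first i parts
    # with a length budget of cap.
    best = [[0.0] * (max_length + 1) for _ in range(n + 1)]
    for i, part in enumerate(parts, start=1):
        for cap in range(max_length + 1):
            best[i][cap] = best[i - 1][cap]          # skip this part
            if part.length <= cap:
                candidate = (best[i - 1][cap - part.length]
                             + part.informativeness)  # take this part
                if candidate > best[i][cap]:
                    best[i][cap] = candidate

    # Walk the table backwards to recover which parts were taken.
    chosen: List[Part] = []
    cap = max_length
    for i in range(n, 0, -1):
        if best[i][cap] != best[i - 1][cap]:
            chosen.append(parts[i - 1])
            cap -= parts[i - 1].length
    chosen.reverse()
    return chosen
```

A caller would first check is_summarisable on the document's feature vector and, only if it passes, split the document into Part objects and join the texts returned by select_parts. The dynamic programme is exact but assumes integer lengths; a greedy or approximate selection would be a different technique with different guarantees.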
Dates
2016-08-27—Published
2012-09-11—Filed