FIELD: information technology.
SUBSTANCE: the invention relates to means of extracting thematic sentences from web pages. Candidate web pages and a pre-built machine learning model are obtained, wherein each candidate web page comprises a plurality of pre-selected candidate thematic sentences and each candidate thematic sentence comprises several word segments. For each candidate web page, word feature values indicating the levels of importance of the word segments are determined and input into the machine learning model to obtain an importance value for each word segment. For each candidate web page, a partial order value is determined for each candidate thematic sentence in accordance with the importance values of the word segments contained in that sentence. For each candidate web page, one of the candidate thematic sentences whose partial order value exceeds a predetermined threshold value is selected as the target thematic sentence of that page.
EFFECT: improved accuracy of the thematic sentences extracted from web pages.
20 cl, 6 dwg
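The selection procedure summarized in the abstract can be illustrated with a minimal sketch. Several details here are assumptions and are not stated in the abstract: the function name `select_thematic_sentence`, the use of the mean segment importance as the partial order value, the choice of the highest-scoring sentence among those above the threshold, and the toy one-feature "model" used in the demonstration.

```python
from typing import Callable, Dict, List, Optional, Sequence

Features = Sequence[float]


def select_thematic_sentence(
    candidates: List[List[str]],
    features: Dict[str, Features],
    model: Callable[[Features], float],
    threshold: float,
) -> Optional[List[str]]:
    """Pick a target sentence for one page (hypothetical helper).

    candidates: candidate sentences, each a list of word segments.
    features:   feature values per word segment for this page.
    model:      stand-in for the pre-built model; maps features to importance.
    """
    # Importance value of every word segment, obtained from the model.
    importance = {seg: model(f) for seg, f in features.items()}

    best, best_score = None, threshold
    for sentence in candidates:
        if not sentence:
            continue
        # Assumption: partial order value = mean importance of the segments.
        score = sum(importance.get(s, 0.0) for s in sentence) / len(sentence)
        # Keep only sentences strictly above the threshold, preferring the highest.
        if score > best_score:
            best, best_score = sentence, score
    return best


# Toy data: one feature per segment, and a model that simply returns it.
candidates = [["cheap", "flights", "paris"], ["click", "here"]]
features = {
    "cheap": (0.5,), "flights": (0.9,), "paris": (0.7,),
    "click": (0.1,), "here": (0.1,),
}
toy_model = lambda f: f[0]

print(select_thematic_sentence(candidates, features, toy_model, threshold=0.5))
```

If no candidate sentence exceeds the threshold, the sketch returns `None`; how such pages are handled is not covered by the abstract.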
Title | Year | Author | Number |
---|---|---|---|
CONSTRUCTION AND APPLICATION OF WEB-CATALOGUES FOR FOCUSED SEARCH | 2005 | | RU2382400C2 |
METHOD AND SYSTEM FOR CREATING ANNOTATION VECTORS FOR DOCUMENT | 2017 | | RU2720074C2 |
METHOD OF PROCESSING TARGET MESSAGE, METHOD OF PROCESSING NEW TARGET MESSAGE AND SERVER (VERSIONS) | 2014 | | RU2589856C2 |
COLLECTING DATA ON USER BEHAVIOUR DURING WEB SEARCH TO INCREASE WEB SEARCH RELEVANCE | 2007 | | RU2435212C2 |
METHOD OF DETERMINING PROFILE OF MOBILE DEVICE USER ON MOBILE DEVICE ITSELF AND DEMOGRAPHIC PROFILING SYSTEM | 2016 | | RU2647661C1 |
METHOD FOR DETERMINING SEQUENCE OF WEB BROWSING AND SERVER USED | 2014 | | RU2634218C2 |
METHOD AND SYSTEM OF SEARCH QUERY PROCESSING | 2015 | | RU2640639C2 |
CHECKING METHOD OF WEB PAGES FOR CONTENT IN THEM OF TARGET AUDIO AND/OR VIDEO (AV) CONTENT OF REAL TIME | 2013 | | RU2530671C1 |
METHOD OF SELECTING EFFECTIVE VERSIONS IN SEARCH AND RECOMMENDATION SYSTEMS (VERSIONS) | 2013 | | RU2543315C2 |
SYSTEM AND METHOD FOR PROVIDING PREFERRED LANGUAGE FOR SORTING SEARCH RESULTS | 2004 | | RU2319202C2 |
Authors
Dates
2020-08-05—Published
2016-11-18—Filed