FIELD: computer engineering.
SUBSTANCE: a method of classifying and filtering content in a network is performed on a computing device comprising at least a processor and a memory storing instructions for executing the following steps. A preparatory step comprises: forming a collection of HTML documents such that the documents included in it can be assigned to different classes of content; converting the data obtained from the HTML documents into text; generating a token matrix for training an ensemble of classifiers; and, based on the generated token matrix, creating an ensemble of classifiers comprising at least four classifiers, wherein a decision priority is predetermined for each classifier. A working step comprises: obtaining a URL and downloading the associated HTML document; converting the HTML document into text; generating a vector of clean tokens for the classifier ensemble; running the ensemble of classifiers trained at the preparatory step; and outputting an analysis result comprising the result of classifying the content of the received document. The content is then filtered based on the obtained class.
EFFECT: the technical result is improved accuracy of classification and filtering of prohibited content in a network.
12 cl, 6 dwg
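The two steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which classifiers are used or how their priorities are combined, so the `OneClassTokenRule` class, the "highest-priority classifier that does not abstain decides" rule, the class names (`gambling`, `drugs`, `sport`, `news`), the `BLOCKED` set, and the toy training pages are all assumptions made for illustration. Downloading the document by URL is left out; the sketch starts from an HTML string.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Convert an HTML document into whitespace-normalized plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def token_vector(text, vocabulary):
    """Count of each vocabulary token in the text (one row of the token matrix)."""
    counts = Counter(tokenize(text))
    return [counts[token] for token in vocabulary]


class OneClassTokenRule:
    """Hypothetical stand-in for a trained classifier: names one class or abstains."""

    def __init__(self, label, priority):
        self.label = label
        self.priority = priority  # assumption: lower number decides first
        self.columns = set()

    def fit(self, matrix, labels):
        # Keep the columns (tokens) seen in this class and in no other class.
        for col in range(len(matrix[0])):
            inside = any(r[col] for r, l in zip(matrix, labels) if l == self.label)
            outside = any(r[col] for r, l in zip(matrix, labels) if l != self.label)
            if inside and not outside:
                self.columns.add(col)
        return self

    def predict(self, vector):
        return self.label if any(vector[c] for c in self.columns) else None


def classify(vector, ensemble, default="unknown"):
    """Assumed rule: the first non-abstaining classifier in priority order decides."""
    for member in sorted(ensemble, key=lambda m: m.priority):
        label = member.predict(vector)
        if label is not None:
            return label
    return default


# Preparatory step: labeled HTML collection -> text -> token matrix -> ensemble.
TRAINING = [  # toy data, invented for this sketch
    ("<p>Casino poker jackpot bets</p>", "gambling"),
    ("<p>Buy pills without prescription</p>", "drugs"),
    ("<p>Football match score today</p>", "sport"),
    ("<p>Weather forecast rain today</p>", "news"),
]
TEXTS = [html_to_text(html) for html, _ in TRAINING]
LABELS = [label for _, label in TRAINING]
VOCABULARY = sorted({tok for text in TEXTS for tok in tokenize(text)})
MATRIX = [token_vector(text, VOCABULARY) for text in TEXTS]
ENSEMBLE = [
    OneClassTokenRule(label, priority).fit(MATRIX, LABELS)
    for priority, label in enumerate(["gambling", "drugs", "sport", "news"], start=1)
]
BLOCKED = {"gambling", "drugs"}  # assumed set of prohibited classes


# Working step: HTML -> text -> token vector -> ensemble -> filtering decision.
def classify_html(html):
    return classify(token_vector(html_to_text(html), VOCABULARY), ENSEMBLE)


def is_blocked(html):
    return classify_html(html) in BLOCKED
```

In this sketch a page that mentions both poker and football is labeled `gambling`, because the gambling rule has the higher priority, and is therefore filtered. A page with no known tokens falls through to `unknown` and is not blocked.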
Title | Year | Author | Number |
---|---|---|---|
METHOD FOR ATTRIBUTION OF PARTIALLY STRUCTURED TEXTS FOR FORMATION OF NORMATIVE-REFERENCE INFORMATION | 2020 | | RU2750852C1 |
METHOD FOR TEXTUAL INFORMATION RECOGNITION AND ITS INTEGRITY EVALUATION IN INTERNET ELECTRONIC DOCUMENTS | 2013 | | RU2550543C1 |
METHOD AND SYSTEM FOR EXTRACTING NAMED ENTITIES | 2021 | | RU2823914C2 |
RETRIEVAL OF INFORMATION OBJECTS USING A COMBINATION OF CLASSIFIERS ANALYZING LOCAL AND NON-LOCAL SIGNS | 2018 | | RU2686000C1 |
METHOD AND SYSTEM FOR STATIC ANALYSIS OF EXECUTABLE FILES BASED ON PREDICTIVE MODELS | 2020 | | RU2759087C1 |
METHOD AND SYSTEM FOR ARRANGING DIALOGUE WITH USER IN USER-FRIENDLY CHANNEL | 2018 | | RU2688758C1 |
ESG-RATING WORD PROCESSING SYSTEM | 2023 | | RU2825081C1 |
METHOD OF DETERMINING PROFILE OF MOBILE DEVICE USER ON MOBILE DEVICE ITSELF AND DEMOGRAPHIC PROFILING SYSTEM | 2016 | | RU2647661C1 |
METHOD AND SERVER FOR PROCESSING TEXT SEQUENCE IN MACHINE PROCESSING TASK | 2020 | | RU2775820C2 |
AUTOMATED LEGAL ADVICE SYSTEM CONTROL METHOD | 2019 | | RU2718978C1 |
Authors
Dates
2020-12-11—Published
2020-05-12—Filed