The invention discloses a video content judgment method based on character recognition, comprising the following steps: A. taking a screenshot of a video picture; B. calling a pre-trained character detection model to analyze the character areas of the screenshot picture, finding and dividing the character areas in the picture, and obtaining one or more character areas; C. after the character areas are detected, calling a pre-trained character recognition model, performing character recognition on each character area in a loop, and recognizing the character content of each character area; D. carrying out natural language processing on the recognized character content, understanding its semantics, and making corresponding video playing settings. The video content judgment method can run in real time on an embedded platform, can identify the character information in a video, and can configure scenes according to the prompts given by the character information.
Description
Video content judgment method based on character recognition
Technical Field
The invention relates to the technical field of image recognition, and in particular to a video content judgment method based on character recognition.
Background
With the rapid development of artificial intelligence technology, artificial intelligence has gradually entered many aspects of human life. Using artificial intelligence technology to make televisions intelligent can greatly improve the user experience and make people's lives more convenient.
Video images in a television often carry a large amount of information. Besides the picture itself, a frame of image may also contain text information, which usually displays important information about the currently playing scene. Compared with the ever-changing image information, analyzing the text information generally makes it easier to determine which scene is currently playing.
At present, the artificial intelligence technology of most products runs on cloud servers on the internet. Owing to the hardware limitations of Android systems, large-scale computation cannot be run on the device, nor can too many resources (such as CPU time) be occupied, so there is as yet no good technical scheme for character recognition in image scenes running on an embedded platform.
Disclosure of Invention
The invention aims to overcome the defects in the background art and provides a video content judgment method based on character recognition, which can run in real time on an embedded platform, can recognize character information in a video, and can perform scene setting (image or voice settings) according to the prompts given by the character information; it is suitable for specific fields such as the television field.
In order to achieve the technical effects, the invention adopts the following technical scheme:
a video content judgment method based on character recognition comprises the following steps:
A. taking a screenshot of a video picture;
B. calling a pre-trained character detection model to analyze character areas of the screenshot picture, finding out and dividing the character areas in the picture, and obtaining one or more character areas;
C. after the character areas are detected, calling a pre-trained character recognition model, performing character recognition on each character area in a loop, and recognizing the character content of each character area;
D. carrying out natural language processing on the recognized character content, understanding its semantics, and making corresponding video playing settings.
Further, step A also comprises dividing and setting, on the screenshot picture, a plurality of image areas requiring character recognition;
step B specifically comprises the following steps:
B1. calling a pre-trained character detection model to analyze the character areas of the screenshot picture, finding and dividing the character areas in the picture, and obtaining one or more character areas;
B2. if the detected character area is within a preset image area requiring character recognition, proceeding to step C; otherwise, returning to step A.
Further, the character detection model in step B is a convolutional neural network.
Further, the convolutional neural network is a MobileNet-SSD neural network based on TensorFlow.
Further, the training procedure for the convolutional neural network is as follows:
S1. collecting a preset number of video image samples with text content according to the input characteristics of the neural network;
S2. for each video image sample with text content, extracting at least the rectangular frame coordinates of the text, the text content, the text language category, and the image size and image format of the sample itself;
S3. for the image samples and the sample information obtained in steps S1 and S2, generating training files and verification files in the tfrecord format supported by TensorFlow, wherein the training files and the verification files contain different images but store the same image format and image information format (an illustrative sketch of this packing step follows these steps);
S4. training the model with the training files to generate the predetermined character detection model, and verifying the generated character detection model with the verification files;
S5. if the verification accuracy is greater than or equal to a preset threshold, or the number of training steps reaches a predetermined number, the training is finished;
S6. if the verification accuracy is lower than the preset threshold, adding video image samples with text content or adjusting model parameters, and repeating steps S1 to S4 until the training is finished.
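Where steps S1 to S3 are realized in code, each labeled sample can be packed into a tf.train.Example record. The sketch below is illustrative only: the feature keys, the (x1, y1, x2, y2) box layout, and the file names are assumptions, since the patent prescribes no particular tfrecord schema.

```python
import tensorflow as tf

def _bytes(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))

def _floats(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))

def _ints(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

def make_example(image_bytes, boxes, texts, language, width, height, img_format):
    # boxes: list of (x1, y1, x2, y2) rectangles, flattened into one float list
    flat = [float(c) for box in boxes for c in box]
    return tf.train.Example(features=tf.train.Features(feature={
        "image/encoded": _bytes([image_bytes]),
        "image/format": _bytes([img_format.encode()]),
        "image/width": _ints([width]),
        "image/height": _ints([height]),
        "image/object/bbox": _floats(flat),
        "image/object/text": _bytes([t.encode("utf-8") for t in texts]),
        "image/object/language": _bytes([language.encode()]),
    }))

# Training and verification files hold different images (step S3):
# with tf.io.TFRecordWriter("train.tfrecord") as writer:
#     writer.write(make_example(...).SerializeToString())
```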
Further, the character recognition model in step C is a convolutional recurrent neural network based on an attention model.
Further, the convolutional recurrent neural network based on the attention model is an Attention-CRNN neural network based on TensorFlow.
Further, the training steps of the convolutional recurrent neural network based on the attention model are as follows:
S101. creating a Chinese dictionary, cutting out the character area images from the video image samples used for the character detection model, and generating a character image sample data set;
S102. combining the sample data set with the Chinese dictionary to generate the tfrecord format files required for training, which are divided into training files and verification files; the training files and the verification files contain different images but store the same image format and image information format;
S103. training the model with the training files to generate the predetermined character recognition model, and verifying the generated character recognition model with the verification files;
S104. if the verification accuracy is greater than or equal to a preset threshold, or the number of training steps reaches a predetermined number, the training is finished;
S105. if the verification accuracy is lower than the preset threshold, adding video image samples with text content or adjusting model parameters, and repeating steps S101 to S103 until the training is finished.
Further, step D specifically includes the following steps:
D1. performing word segmentation on the recognized characters to divide them into individual phrases;
D2. performing keyword matching between each phrase and a predetermined phrase table;
D3. if a phrase in the current image is a predetermined phrase and several consecutive frames of images all contain the predetermined phrase, judging that the current image scene is the predetermined phrase scene and performing corresponding scene processing.
Compared with the prior art, the invention has the following beneficial effects:
With the video content judgment method based on character recognition, the content displayed by the current video image can be judged by automatically recognizing the characters in the video image, and corresponding scene content processing can be carried out.
Drawings
Fig. 1 is a flowchart illustrating a video content judgment method based on character recognition according to an embodiment of the present invention.
Detailed Description
The invention will be further elucidated and described with reference to the embodiments described hereinafter.
Embodiment 1:
As shown in Fig. 1, a video content judgment method based on character recognition, applied in this embodiment to an intelligent television system, mainly comprises the following steps:
Step 1: taking a screenshot of the video picture:
When the television system detects that a video stream is present, a current video image is captured from the video stream every 1 s; the video image is 1080P (size: 1920 × 1080). After the background program acquires the images, it uniformly scales them to a width of 640 and a height of 360 using an image scaling technique and sends them to a pre-trained character detection model for detection.
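A minimal sketch of this scaling step, assuming OpenCV (cv2) is available on the platform; how the raw frame is grabbed from the TV video pipeline is vendor-specific and is not shown.

```python
import cv2

def prepare_frame(frame_1080p):
    """Scale a 1920x1080 screenshot to the 640x360 detector input size."""
    # cv2.resize takes the target size as (width, height)
    return cv2.resize(frame_1080p, (640, 360), interpolation=cv2.INTER_AREA)
```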
Step 2: calling the pre-trained character detection model to analyze the character areas of the screenshot picture, finding the character areas in the picture and dividing them to obtain one or more character areas:
The pre-trained character detection model analyzes the character areas of the image, automatically finds the character areas in the picture, gives the coordinates and the width and height of each character area, and thereby obtains a plurality of character areas.
To improve efficiency, a plurality of image areas to be subjected to character recognition are generally set in advance. When a detected character area lies within a preset image area requiring character recognition, the method proceeds to the next step for character recognition; if it does not, character recognition is skipped and character detection and recognition are performed directly on the next frame of image.
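A hedged sketch of this preset-area filter; the (x, y, w, h) rectangle format and the containment rule (a detection must lie fully inside a preset area) are assumptions made for illustration.

```python
def inside(det, roi):
    """True if detection rectangle det lies fully inside roi; both are (x, y, w, h)."""
    dx, dy, dw, dh = det
    rx, ry, rw, rh = roi
    return dx >= rx and dy >= ry and dx + dw <= rx + rw and dy + dh <= ry + rh

def filter_detections(detections, preset_areas):
    """Keep only detected character areas that fall inside a preset image area."""
    return [d for d in detections if any(inside(d, roi) for roi in preset_areas)]
```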
In this embodiment, the detected text regions are scaled as follows:
For a horizontal text region, the height is fixed at 150; when the width is less than 600, the image is padded to a width of 600. When the width is greater than 600, the image is cut into several 500 × 150 images (the first cut is special: 550 × 150), and the edges are given special processing: 50 × 150 strips taken at the left and right edges of the original image are incorporated into the newly cut images. If the last image is less than 600 wide, it is padded to a width of 600. Finally, the text area image is turned into several 600 × 150 images.
For a vertical character area, based on the principle that the font proportion is about 0.7-1 and on the aspect ratio of the character area, the characters are cut into single characters for recognition; after a single character image is cut out, it is scaled to a fixed height of 150 and padded to a width of 600. Finally, a plurality of character area images are generated.
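The horizontal-region tiling can be sketched as a sliding window. The version below simplifies the embodiment's 550/500 first-tile scheme into uniform 600-wide windows with a 50-pixel overlap and assumes three-channel images; it is an approximation, not the exact cutting rule above.

```python
import cv2
import numpy as np

def tile_horizontal_region(region, tile_w=600, tile_h=150, overlap=50):
    """Scale a horizontal text region to height 150 and cut it into 600x150 tiles."""
    scale = tile_h / region.shape[0]
    width = max(1, int(region.shape[1] * scale))
    img = cv2.resize(region, (width, tile_h))
    tiles, x = [], 0
    while x < img.shape[1]:
        tile = img[:, x:x + tile_w]
        if tile.shape[1] < tile_w:  # pad the last (or only) tile to width 600
            pad = tile_w - tile.shape[1]
            tile = np.pad(tile, ((0, 0), (0, pad), (0, 0)), constant_values=0)
        tiles.append(tile)
        x += tile_w - overlap
    return tiles
```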
Step 3: after the character areas are detected, calling the pre-trained character recognition model, performing character recognition on each character area in a loop, and recognizing the character content of each character area:
The pre-trained Chinese text recognition model is called, character recognition is performed on each character area in a loop, and the character content of each character area is recognized.
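The recognition loop might look as follows, assuming the recognizer has been exported to a .tflite file (as in the training steps later) that takes one 600 × 150 image and emits a sequence of dictionary indices; the model path, tensor layout, and decoding are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

def recognize_regions(region_images, dictionary, model_path="recognizer.tflite"):
    """Run the character recognition model on each character area in a loop."""
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    texts = []
    for img in region_images:
        x = np.expand_dims(img.astype(np.float32) / 255.0, axis=0)
        interpreter.set_tensor(inp["index"], x)
        interpreter.invoke()
        ids = interpreter.get_tensor(out["index"])[0]
        # map dictionary indices back to characters (padding handling omitted)
        texts.append("".join(dictionary.get(int(i), "") for i in ids))
    return texts
```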
Step 4: performing natural language processing on the recognized text content, understanding its semantics, and making corresponding video playing settings:
the method specifically comprises the following steps:
Step 4.1: performing word segmentation on the recognized characters to divide them into phrases;
Step 4.2: performing keyword matching between each phrase and a predetermined phrase table;
Step 4.3: if a phrase in the current image is a predetermined phrase and several consecutive frames of images all contain the predetermined phrase, judging that the current image scene is the predetermined phrase scene and performing corresponding scene processing.
Specifically, in this embodiment the predefined text includes phrases such as "advertisement" and "news". These phrases each represent an image content category of the current video playback. Natural language processing such as word segmentation and phrase matching is performed on the extracted character content. When a predefined phrase such as an advertisement phrase or a news phrase is detected at the same position in several consecutive frames, the current scene is determined to be an advertisement or news scene, and corresponding video playing settings are subsequently made for the different scenes.
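A minimal sketch of this decision rule, assuming a phrase table mapping predefined phrases to scenes and a requirement of N consecutive matching frames (N = 3 here is an arbitrary choice). English keys stand in for the Chinese phrases, and whitespace splitting stands in for real Chinese word segmentation, for which a library such as jieba would typically be used.

```python
PHRASE_TABLE = {"advertisement": "ad_scene", "news": "news_scene"}

class SceneJudge:
    """Confirm a scene only after the same phrase is seen in N consecutive frames."""

    def __init__(self, required_consecutive=3):
        self.required = required_consecutive
        self.last_scene = None
        self.count = 0

    def update(self, recognized_text):
        words = recognized_text.split()  # placeholder for real word segmentation
        scene = next((PHRASE_TABLE[w] for w in words if w in PHRASE_TABLE), None)
        if scene is not None and scene == self.last_scene:
            self.count += 1
        else:
            self.count = 1 if scene is not None else 0
        self.last_scene = scene
        return scene if self.count >= self.required else None
```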
Specifically, the pre-trained character detection model is a convolutional neural network; in this embodiment a MobileNet-SSD neural network based on Google's TensorFlow is adopted. The training process of the neural network is as follows:
A. according to the input characteristics of the neural network, collecting about 5,000 video image samples with character content from television playing video, and uniformly setting their size to 640 × 360;
B. for each video image sample with text content, extracting information such as the rectangular frame coordinates of the area where the text is located, the text content and the text language category, as well as the image size and image format of the image sample itself;
C. for the image samples and the sample information obtained in the previous two steps, generating training files and verification files in the tfrecord format supported by TensorFlow, wherein the training files and the verification files contain different images but store the same image format and image information format;
D. training the model with the training files to generate the predetermined character detection model, and verifying the generated character detection model with the verification files;
E. if the verification accuracy is greater than or equal to a preset threshold (in this embodiment the preset verification accuracy threshold is 95%), or the number of training steps reaches a predetermined number (20,000 steps), the training is complete;
F. if the verification accuracy is lower than the preset threshold (95%), adding video image samples with text content or adjusting model parameters, and repeating steps A to E until the training is complete;
G. generating a tflite model file to be called by the Android program.
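Step G could be realized as follows, assuming the trained detector was saved as a TensorFlow SavedModel; the directory and file names are illustrative, as the patent only states that a tflite model file is produced for the Android side.

```python
import tensorflow as tf

# Convert the trained detection model to a .tflite file for the Android program.
converter = tf.lite.TFLiteConverter.from_saved_model("mobilenet_ssd_text_detector")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional size/latency optimization
tflite_model = converter.convert()

with open("text_detector.tflite", "wb") as f:
    f.write(tflite_model)
```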
Specifically, the pre-trained character recognition model is a convolutional recurrent neural network based on an attention model; in this embodiment an Attention-CRNN neural network based on Google's TensorFlow is adopted. Although the Attention-CRNN is composed of several different neural networks and components (CNN, RNN, attention), they can be trained end to end with a single loss function. The model can therefore be trained as a whole, and the training process is as follows:
A. creating a Chinese dictionary containing 5,462 Chinese characters. The dictionary has only two columns: the left column contains serial numbers (0, 1, ...) and the right column contains the corresponding Chinese characters (an illustrative sketch follows these steps);
B. cutting out the character area images from the video image samples used for the character detection model, and generating a character image sample data set with an image width of 600 and a height of 150;
C. combining the sample data set with the Chinese dictionary to generate the tfrecord format files required for training, which are likewise divided into training files and verification files; the training files and the verification files contain different images but store the same image format and image information format;
D. training the model with the training files to generate the predetermined character recognition model, and verifying the generated character recognition model with the verification files;
E. if the verification accuracy is greater than or equal to a preset threshold (in this embodiment the preset verification accuracy threshold is 90%), or the number of training steps reaches a predetermined number (20,000 steps), the training is complete;
F. if the verification accuracy is lower than the preset threshold (90%), adding video image samples with text content or adjusting model parameters, and repeating steps A to E until the training is complete;
G. generating a tflite model file to be called by the Android program.
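The two-column dictionary of step A might be stored as a plain text file, one character per line; the file name and tab separator below are assumptions, since the patent only specifies a serial-number column and a character column covering 5,462 characters. The mapping returned by load_dictionary is the kind of index-to-character table assumed by the recognition-loop sketch earlier.

```python
def write_dictionary(chars, path="chinese_dict.txt"):
    """Write the two-column dictionary: left column serial number, right column character."""
    with open(path, "w", encoding="utf-8") as f:
        for idx, ch in enumerate(chars):
            f.write(f"{idx}\t{ch}\n")

def load_dictionary(path="chinese_dict.txt"):
    """Load the dictionary back as a mapping from serial number to character."""
    with open(path, encoding="utf-8") as f:
        return {int(i): ch for i, ch in (line.rstrip("\n").split("\t") for line in f)}
```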
It will be understood that the above embodiments are merely exemplary embodiments adopted to illustrate the principles of the present invention, and the present invention is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.
Claims
1. A video content judgment method based on character recognition, characterized by comprising the following steps:
A. taking a screenshot of a video picture;
B. calling a pre-trained character detection model to analyze character areas of the screenshot picture, finding out and dividing the character areas in the picture, and obtaining one or more character areas;
C. after the character areas are detected, calling a character recognition model trained in advance, circularly performing character recognition on each character area, and recognizing the character content of each character area;
D. carrying out natural language processing on the recognized character content, understanding its semantics, and making corresponding video playing settings;
step A further comprises dividing and setting, on the screenshot picture, a plurality of image areas requiring character recognition;
step B specifically comprises the following steps:
B1. calling a pre-trained character detection model to analyze character areas of the screenshot picture, finding out and dividing the character areas in the picture, and obtaining one or more character areas;
B2. if the detected character area is within a preset image area requiring character recognition, proceeding to step C; otherwise, returning to step A;
the character detection model in step B is a convolutional neural network, specifically a MobileNet-SSD neural network based on TensorFlow; the training procedure for the convolutional neural network is as follows:
S1. collecting a preset number of video image samples with text content according to the input characteristics of the neural network;
S2. for each video image sample with text content, extracting at least the rectangular frame coordinates of the text, the text content, the text language category, and the image size and image format of the sample itself;
S3. for the image samples and the sample information obtained in steps S1 and S2, generating training files and verification files in the tfrecord format supported by TensorFlow, wherein the training files and the verification files contain different images but store the same image format and image information format;
S4. training the model with the training files to generate the predetermined character detection model, and verifying the generated character detection model with the verification files;
S5. if the verification accuracy is greater than or equal to a preset threshold, or the number of training steps reaches a predetermined number, the training is finished;
S6. if the verification accuracy is lower than the preset threshold, adding video image samples with text content or adjusting model parameters, and repeating the above steps until the training is finished.
2. The method as claimed in claim 1, wherein the character recognition model in step C is a convolutional recurrent neural network based on an attention model.
3. The method as claimed in claim 2, wherein the convolutional recurrent neural network based on the attention model is an Attention-CRNN neural network based on TensorFlow.
4. The method of claim 3, wherein the training of the convolutional recurrent neural network based on the attention model comprises the following steps:
S101. creating a Chinese dictionary, cutting out the character area images from the video image samples used for the character detection model, and generating a character image sample data set;
S102. combining the sample data set with the Chinese dictionary to generate the tfrecord format files required for training, which are divided into training files and verification files; the training files and the verification files contain different images but store the same image format and image information format;
S103. training the model with the training files to generate the predetermined character recognition model, and verifying the generated character recognition model with the verification files;
S104. if the verification accuracy is greater than or equal to a preset threshold, or the number of training steps reaches a predetermined number, the training is finished;
S105. if the verification accuracy is lower than the preset threshold, adding video image samples with text content or adjusting model parameters, and repeating the above steps until the training is finished.
5. The method for determining video content based on character recognition according to any one of claims 1 to 4, wherein step D specifically comprises the following steps:
D1. performing word segmentation on the recognized characters, and dividing them into individual phrases;
D2. performing keyword matching between each phrase and a predetermined phrase table;
D3. if a phrase in the current image is a predetermined phrase and several consecutive frames of images all contain the predetermined phrase, judging that the current image scene is the predetermined phrase scene and performing corresponding scene processing.
Publication: CN109583443B (application CN201811360543.9A, filed 2018-11-15), "Video content judgment method based on character recognition".