
CN109583443B - Video content judgment method based on character recognition

Publication number
CN109583443B
Authority
CN
China
Prior art keywords
character
training
image
files
model
Prior art date
2018-11-15
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811360543.9A
Other languages
Chinese (zh)
Other versions
CN109583443A (en)
Inventor
周建波
高岚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-11-15
Filing date
2018-11-15
Publication date
2022-10-18
2018-11-15 Application filed by Sichuan Changhong Electric Co Ltd filed Critical Sichuan Changhong Electric Co Ltd
2018-11-15 Priority to CN201811360543.9A priority Critical patent/CN109583443B/en
2019-04-05 Publication of CN109583443A publication Critical patent/CN109583443A/en
2022-10-18 Application granted granted Critical
2022-10-18 Publication of CN109583443B publication Critical patent/CN109583443B/en
Status Active legal-status Critical Current
2038-11-15 Anticipated expiration legal-status Critical
Abstract

The invention discloses a video content judgment method based on character recognition, which comprises the following steps: A. taking a screenshot of a video picture; B. calling a pre-trained character detection model to analyze character areas of the screenshot, finding and segmenting the character areas in the picture to obtain one or more character areas; C. after the character areas are detected, calling a pre-trained character recognition model and performing character recognition on each character area in turn, recognizing the character content of each area; D. performing natural language processing on the recognized character content, understanding its semantics, and making the corresponding video playback settings. The video content judgment method can run on an embedded platform in real time, can recognize the character information in a video, and can perform scene setting according to the prompts in that character information.

Description Video content judgment method based on character recognition

Technical Field

The invention relates to the technical field of image recognition, in particular to a video content judgment method based on character recognition.

Background

With the rapid development of artificial intelligence technology, artificial intelligence has gradually entered into various aspects of human life. By utilizing the artificial intelligence technology, the television is intelligentized, the use experience of a user can be greatly improved, and the life of people becomes more convenient.

Video image information in a television often contains a large amount of information content. In addition to the image frame, a frame of image may also contain text information, which is usually a display of important information of the currently playing scene. Compared with ever-changing image information, the text information is analyzed, and generally, which scene is played currently is easier to know.

At present, the artificial intelligence technology of most products runs on cloud servers on the internet. On an embedded Android system, hardware constraints rule out large-scale computation and heavy resource usage such as high CPU (central processing unit) occupation, so there is currently no good technical scheme for character recognition in image scenes that runs on an embedded platform.

Disclosure of Invention

The invention aims to overcome the defects in the background art, and provides a video content judgment method based on character recognition, which can run on an embedded platform in real time, can recognize character information in a video, and can perform scene setting (image or voice setting) according to the prompt of the character information, and is suitable for specific fields, such as the television field and the like.

In order to achieve the technical effects, the invention adopts the following technical scheme:

a video content judgment method based on character recognition comprises the following steps:

A. screenshot is carried out on a video picture;

B. calling a pre-trained character detection model to analyze character areas of the screenshot picture, finding out and dividing the character areas in the picture, and obtaining one or more character areas;

C. after the character areas are detected, calling a character recognition model trained in advance, circularly recognizing characters of each character area, and recognizing the character content of each character area;

D. performing natural language processing on the recognized character content, understanding its semantics, and making the corresponding video playback settings.
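The four steps above can be sketched as a simple orchestration loop; the function names here (detect_regions, recognize_text, handle_scene) are hypothetical placeholders for the detection model, recognition model, and NLP stage described later, not APIs from the patent.

```python
def judge_video_content(frame, detect_regions, recognize_text, handle_scene):
    """Steps A-D: detect character regions in a captured frame, recognize
    each one in turn, then hand the texts to the NLP/scene-setting stage."""
    regions = detect_regions(frame)                 # B: one or more character regions
    texts = [recognize_text(r) for r in regions]    # C: loop over every region
    return handle_scene(texts)                      # D: semantics -> playback setting
```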

Further, the step A also comprises the step of dividing and setting a plurality of image areas needing character recognition on the screenshot picture;

the step B specifically comprises the following steps:

B1. calling a character detection model which is trained in advance to analyze character areas of the screenshot picture, finding out and dividing the character areas in the picture, and obtaining one or more character areas;

B2. if the detected character area is within a preset image area needing character recognition, entering step C; otherwise, returning to step A.

Further, the character detection model in the step B is a convolutional neural network.

Further, the convolutional neural network is a MobileNet-SSD neural network based on TensorFlow.

Further, the training procedure for the convolutional neural network is as follows:

s1, collecting a preset number of video image samples with text contents according to the input characteristics of a neural network;

S2, for each video image sample with text content, extracting at least the rectangular frame coordinates, text content and text language category of each text region, as well as the image size and image format of the sample itself;

S3, from the image samples and sample information obtained in steps S1 and S2, generating training files and verification files in the TFRecord format supported by TensorFlow, wherein the training files and verification files contain different images but store the images and image information in the same formats;

s4, training the model by using the training file to generate a predetermined character detection model, and verifying the generated character detection model by using the verification file;

s5, if the verification accuracy is larger than or equal to a preset threshold value, or the training step number reaches a certain step number, finishing the training;

and S6, if the verification accuracy is lower than the preset threshold, adding video image samples with text content or tuning the model parameters, and repeating steps S1 to S4 until the training is finished.
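Steps S4 to S6 amount to a train/validate loop with an accuracy threshold and a step budget. A minimal sketch, assuming hypothetical train_step and validate callables (the evaluation interval is an assumption; the patent does not state one):

```python
def train_until_done(train_step, validate, threshold=0.95, max_steps=20000,
                     eval_every=1000):
    """S4-S6: train, periodically validate, and stop when the validation
    accuracy reaches the threshold or the step budget is exhausted."""
    for step in range(1, max_steps + 1):
        train_step()
        if step % eval_every == 0 and validate() >= threshold:
            return step, True      # accuracy target met (S5)
    return max_steps, False        # budget exhausted; add samples or tune (S6)
```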

Further, the character recognition model in the step C is a convolutional recurrent neural network based on an attention model.

Further, the convolutional recurrent neural network based on the attention model is an Attention-CRNN neural network based on TensorFlow.

Further, the training step of the convolutional recurrent neural network based on the attention model is as follows:

s101, creating a Chinese dictionary, cutting a character area image in a video image sample used in a character detection model, and generating a character image sample data set;

S102, combining the sample data with the Chinese dictionary to generate the TFRecord files required for training, divided into training files and verification files, wherein the two contain different images but store the images and image information in the same formats;

s103, training the model by using a training file to generate a predetermined character recognition model, and verifying the generated character recognition model by using a verification file;

s104, if the verification accuracy is larger than or equal to a preset threshold value, or the training step number reaches a certain step number, finishing training;

and S105, if the verification accuracy is lower than the preset threshold, adding video image samples with text content or tuning the model parameters, and repeating steps S101 to S103 until the training is finished.

Further, the step D specifically includes the following steps:

D1. dividing the recognized characters into words and single phrases;

D2. carrying out keyword matching on each phrase and a predetermined phrase table;

D3. if the phrase in the current image is the predetermined phrase and the continuous frames of images are all the predetermined phrase, the current image scene is judged to be the predetermined phrase scene, and corresponding scene processing is performed.
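Step D2's keyword matching can be sketched as a lookup against the predetermined phrase table; the table contents below are illustrative examples taken from the embodiment, not an exhaustive list.

```python
PHRASE_TABLE = {"advertisement", "news"}   # hypothetical predetermined phrase table

def match_phrase(phrases, table=PHRASE_TABLE):
    """D2: return the first recognized phrase found in the table, else None."""
    for phrase in phrases:
        if phrase in table:
            return phrase
    return None
```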

Compared with the prior art, the invention has the following beneficial effects:

according to the video content judgment method based on character recognition, the content displayed by the current video image can be judged by automatically recognizing the characters in the video image, and corresponding scene content processing is carried out.

Drawings

Fig. 1 is a flowchart illustrating a method for determining video content based on text recognition according to an embodiment of the present invention.

Detailed Description

The invention will be further elucidated and described with reference to the embodiments of the invention described hereinafter.

Embodiment 1:

as shown in fig. 1, a method for determining video content based on text recognition, which is applied to an intelligent television system in this embodiment, mainly includes the following steps:

step 1: screenshot is carried out on a video picture:

when the television system detects a video stream, it captures the current video image from the stream every second; the video image is 1080P (1920 × 1080). After the background program acquires the image, it is uniformly scaled down to a width of 640 and a height of 360 and sent to the pre-trained character detection model for detection.
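The uniform scaling preserves the 16:9 aspect ratio; a sketch of the size computation (pure arithmetic, no image library assumed):

```python
def detection_input_size(width, height, target_w=640):
    """Scale a frame uniformly so its width becomes target_w;
    a 1920x1080 capture becomes 640x360."""
    scale = target_w / width
    return target_w, round(height * scale)
```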

Step 2: calling a character detection model which is trained in advance to analyze character areas of the screenshot picture, finding out the character areas in the picture and dividing the character areas to obtain one or more character areas:

and analyzing the character areas of the image by the character detection model trained in advance, automatically finding the character areas in the picture, giving the coordinates and the width and the height of the character areas, and obtaining a plurality of character areas.

In order to improve efficiency, a plurality of image areas to be subjected to character recognition are generally set in advance. When the detected character area is in a preset image area needing character recognition, entering the next step for character recognition; if the detected character area is not in the preset image area needing character recognition, the next character recognition is not carried out, and the character detection and recognition are directly carried out on the next frame of image.
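The region-of-interest filter above can be sketched as a containment test over boxes; the (x, y, width, height) tuple format for detections and ROIs is an assumption for illustration.

```python
def inside_roi(box, rois):
    """True if the detected box lies entirely within any preset ROI;
    boxes and ROIs are (x, y, width, height) tuples."""
    x, y, w, h = box
    return any(rx <= x and ry <= y and x + w <= rx + rw and y + h <= ry + rh
               for rx, ry, rw, rh in rois)
```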

In this embodiment, for the detected text region, the picture is scaled, and the method includes:

for a horizontal text region, the height is fixed at 150; when the width is less than 600, the image is padded to 600. When the width is greater than 600, the image is cut into a plurality of 500 × 150 images (the first one is special and is cut at 550 × 150), with special handling at the edges: a 50 × 150 strip from each neighboring side of the original image is attached to the left and right edges of each newly cut image. If the last image is less than 600 wide, it is padded to 600. Finally, the text area is turned into a plurality of 600 × 150 images.
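The tiling rule can be sketched as computing the x-ranges of the crops. The bookkeeping below (550-wide first body, 500-wide bodies afterwards, 50 px of neighboring context at interior edges, 600-wide output) is a plausible reading of the description, not a definitive implementation:

```python
def horizontal_crops(width, body=500, first_body=550, overlap=50, out_w=600):
    """Return (left, right) x-ranges for crops of a height-150 text strip.
    Strips no wider than 600 come back whole; the caller pads narrow
    crops to the 600-pixel output width."""
    if width <= out_w:
        return [(0, width)]
    crops, start = [], 0
    while start < width:
        body_w = first_body if start == 0 else body
        left = max(start - overlap, 0)
        end = start + body_w
        right = min(end + (overlap if end < width else 0), width)
        crops.append((left, right))
        start = end
    return crops
```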

For a vertical character area, the characters are cut into single characters for recognition based on the aspect ratio of the character area, assuming a glyph height of about 0.7-1 times the strip width. After a single-character image is cut out, it is scaled to a fixed height of 150 and its width is padded to 600. Finally, a plurality of character area images are generated.
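For the vertical case, the number of single-character cuts follows from the region's aspect ratio; a sketch, assuming a glyph height of roughly 0.7-1 times the strip width (0.85 is used as a midpoint here and is not specified by the patent):

```python
def estimated_char_count(region_w, region_h, glyph_ratio=0.85):
    """Estimate how many single characters a vertical strip holds,
    assuming each glyph is about glyph_ratio * width tall."""
    return max(1, round(region_h / (region_w * glyph_ratio)))
```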

Step 3: after the character areas are detected, calling the pre-trained character recognition model, performing character recognition on each character area in turn, and recognizing the character content of each character area:

the pre-trained Chinese text recognition model is called, character recognition is performed on each character area in turn, and the character content of each character area is recognized.

Step 4: performing natural language processing on the recognized text content, understanding its semantics, and making the corresponding video playback settings:

the method specifically comprises the following steps:

step 4.1, dividing the recognized characters into words and phrases;

step 4.2, carrying out keyword matching on each phrase and a predetermined phrase table;

and 4.3, if the phrase in the current image is the predetermined phrase and the continuous frames of images are all the predetermined phrase, judging that the current image scene is the predetermined phrase scene, and carrying out corresponding scene processing.

Specifically, in this embodiment, for the text content, the pre-defined text includes phrases such as "advertisement" and "news". These phrases may all represent the image content category of the current video playback. And (4) carrying out natural language processing such as word segmentation, word group matching and the like aiming at the extracted character contents. When a predefined phrase such as an advertisement phrase or a news phrase is detected in the same position in a plurality of continuous frames, the current scene is determined to be a scene such as an advertisement or news, and corresponding video playing setting is subsequently made according to different scenes.
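The consecutive-frame rule can be sketched as a small stateful detector; the window of 3 frames is an assumption, since the description only says "a plurality of continuous frames".

```python
from collections import deque

class SceneDetector:
    """Declare a scene only when the same predefined phrase is seen
    in n consecutive frames."""

    def __init__(self, n=3):
        self.history = deque(maxlen=n)

    def update(self, phrase):
        """Feed the matched phrase of one frame (or None); return the
        confirmed scene, or None while the detection is not yet stable."""
        self.history.append(phrase)
        if (phrase is not None
                and len(self.history) == self.history.maxlen
                and all(p == phrase for p in self.history)):
            return phrase
        return None
```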

Specifically, the pre-trained character detection model is a convolutional neural network; this embodiment specifically adopts Google's TensorFlow-based MobileNet-SSD neural network. The training process of the neural network is as follows:

A. according to the input characteristics of the neural network, collecting about 5000 video image samples with character content from television video playback, all uniformly sized to 640 × 360;

B. for each video image sample with text content, extracting information such as the rectangular frame coordinates, text content and text language category of each region where text is located, as well as information such as the image size and image format of the sample itself;

C. from the image samples and sample information obtained in the two steps above, generating training files and verification files in the TFRecord format supported by TensorFlow, wherein the two contain different images but store the images and image information in the same formats.

D. Training the model by using a training file to generate a predetermined character detection model, and verifying the generated character detection model by using a verification file;

E. if the verification accuracy is greater than or equal to a preset threshold (95% in this embodiment), or the number of training steps reaches a certain count (20,000 steps), the training is complete;

F. if the verification accuracy is lower than the preset threshold (95%), adding video image samples with text content or tuning the model parameters, and repeating steps A to E until the training is complete.

G. generating a tflite model file to be called by the Android program.

Specifically, the pre-trained character recognition model is a convolutional recurrent neural network based on an attention model; this embodiment specifically adopts Google's TensorFlow-based Attention-CRNN neural network. Although Attention-CRNN is composed of several different neural networks and components (CNN, RNN, attention), they can be trained end-to-end with a single loss function. The model can therefore be trained as a whole, and the training process is as follows:

A. creating a Chinese dictionary containing 5462 Chinese characters. The dictionary has only two columns: the left column holds the serial numbers (0, 1, ...), and the right column holds the Chinese characters themselves;

B. cutting out the character area images from the video image samples used for the character detection model, generating a character image sample data set with image width 600 and height 150;

C. combining the sample data set with the Chinese dictionary to generate the TFRecord files required for training, likewise divided into training files and verification files, wherein the two contain different images but store the images and image information in the same formats.

D. training the model with the training file to generate the predetermined character recognition model, and verifying the generated character recognition model with the verification file;

E. if the verification accuracy is greater than or equal to a preset threshold (90% in this embodiment), or the number of training steps reaches a certain count (20,000 steps), the training is complete;

F. if the verification accuracy is lower than the preset threshold (90%), adding video image samples with text content or tuning the model parameters, and repeating steps A to E until the training is complete.

G. generating a tflite model file to be called by the Android program.
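The two-column dictionary described in step A can be parsed into the index-to-character maps a recognizer needs; the whitespace-separated "index character" line format here is an assumption for illustration.

```python
def load_char_dictionary(lines):
    """Parse 'index character' lines into id->char and char->id maps."""
    id_to_char, char_to_id = {}, {}
    for line in lines:
        idx, ch = line.split()
        id_to_char[int(idx)] = ch
        char_to_id[ch] = int(idx)
    return id_to_char, char_to_id
```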

It will be understood that the above embodiments are merely exemplary embodiments adopted to illustrate the principles of the present invention, and the present invention is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims (5)

1. A video content judgment method based on character recognition is characterized by comprising the following steps:

A. screenshot is carried out on a video picture;

B. calling a pre-trained character detection model to analyze character areas of the screenshot picture, finding out and dividing the character areas in the picture, and obtaining one or more character areas;

C. after the character areas are detected, calling a character recognition model trained in advance, circularly performing character recognition on each character area, and recognizing the character content of each character area;

D. performing natural language processing on the recognized character content, understanding its semantics, and making the corresponding video playback settings;

the step A also comprises the division setting of a plurality of image areas needing character recognition on the screenshot picture;

the step B specifically comprises the following steps:

B1. calling a pre-trained character detection model to analyze character areas of the screenshot picture, finding out and dividing the character areas in the picture, and obtaining one or more character areas;

B2. if the detected character area is in a preset image area needing character recognition, entering the step C, otherwise, returning to the step A;

the character detection model in step B is a convolutional neural network, a MobileNet-SSD neural network based on TensorFlow; the training procedure for the convolutional neural network is as follows:

s1, collecting a preset number of video image samples with text contents according to the input characteristics of a neural network;

S2, for each video image sample with text content, extracting at least the rectangular frame coordinates, text content and text language category of each text region, as well as the image size and image format of the sample itself;

S3, from the image samples and sample information obtained in steps S1 and S2, generating training files and verification files in the TFRecord format supported by TensorFlow, wherein the training files and verification files contain different images but store the images and image information in the same formats;

s4, training the model by using the training file to generate a predetermined character detection model, and verifying the generated character detection model by using the verification file;

s5, if the verification accuracy is larger than or equal to a preset threshold value, or the training step number reaches a certain step number, finishing the training;

and S6, if the verification accuracy is lower than the preset threshold, adding video image samples with text content or tuning the model parameters, and repeating the above steps until the training is finished.

2. The method as claimed in claim 1, wherein the character recognition model in step C is a convolutional recurrent neural network based on attention model.

3. The method as claimed in claim 2, wherein the convolutional recurrent neural network based on the attention model is an Attention-CRNN neural network based on TensorFlow.

4. The method of claim 3, wherein the training of the convolutional recurrent neural network based on the attention model comprises the following steps:

s101, creating a Chinese dictionary, cutting a character area image in a video image sample used in a character detection model, and generating a character image sample data set;

S102, combining the sample data with the Chinese dictionary to generate the TFRecord files required for training, divided into training files and verification files, wherein the two contain different images but store the images and image information in the same formats;

s103, training the model by using a training file to generate a predetermined character recognition model, and verifying the generated character recognition model by using a verification file;

s104, if the verification accuracy is larger than or equal to a preset threshold value, or the training step number reaches a certain step number, finishing training;

and S105, if the verification accuracy is lower than the preset threshold, adding a video image sample with text content or debugging model parameters, and repeatedly executing the steps until the training is finished.

5. The method for determining video content based on character recognition according to any one of claims 1 to 4, wherein the step D specifically comprises the following steps:

D1. performing word segmentation on the recognized characters, and dividing the recognized characters into single phrases;

D2. carrying out keyword matching on each phrase and a predetermined phrase table;

D3. if the phrase in the current image is the predetermined phrase and the continuous frames of images are all the predetermined phrase, the current image scene is judged to be the predetermined phrase scene, and corresponding scene processing is carried out.

