Text Localization & Segmentation in Images, Web Pages and Videos
Media Mining I





Multimedia Computing, Universität Augsburg
Rainer.Lienhart@informatik.uni-augsburg.de
www.multimedia-computing.{de,org}

Goal: Text Extraction
- Locate text of any size at any position in images, web pages and videos
- Segment and recognize text
- Encode extracted text as a rigid foreground object in MPEG-4 (with Yen-Kuang Chen)
[Figure: PSNR_Y versus bit rate (KBits/sec) for single VOP and multiple VOP encoding]

Related Work
1. Y. Zhong, K. Karu and A. K. Jain. Locating Text in Complex Color Images. Pattern Recognition, Vol. 28, No. 10, pp. 1523-1535, October 1995.
2. Rainer Lienhart and Frank Stuber. Automatic Text Recognition in Digital Videos. In Image and Video Processing IV 1996, Proc. SPIE 2666-20, pp. 180-188, Jan. 1996; also TR-95-036, Dec. 1995.
3. B.-L. Yeo, B. Liu. Visual Content Highlighting via Automatic Extraction of Embedded Captions on MPEG Compressed Video. IS&T / SPIE Digital Video Compression: Algorithms and Technologies, Feb. 1996.
4. Rainer Lienhart. Automatic Text Recognition for Video Indexing. Proc. ACM Multimedia 96, Boston, MA, Nov. 1996, pp. 11-20.
5. S. Sato and T. Kanade. NAME-IT: Association of Face and Name in Video. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, 17-19 June, 1997.
6. Sato, T., Kanade, T., Hughes, E., Smith, M. Video OCR for Digital News Archives. IEEE Workshop on Content-Based Access of Image and Video Databases (CAIVD'98), Bombay, India, January, 1998.
7. Anil K. Jain and Bin Yu. Automatic Text Location in Images and Video Frames. Pattern Recognition, Vol. 31, No. 12, pp. 2055-2076, 1998.
8. H. Li, O. Kia and D. Doermann. Text Enhancement in Digital Videos. In Proceedings of SPIE99, Document Recognition and Retrieval, 1999.
9. Rainer Lienhart and Wolfgang Effelsberg. Automatic Text Segmentation and Text Recognition for Video Indexing. ACM/Springer Multimedia Systems Magazine, Vol. 8, pp. 69-81, Jan. 2000.
10. Huiping Li, David Doermann, Omid Kia. Automatic Text Detection and Tracking in Digital Video. IEEE Transactions on Image Processing, Vol. 9, No. 1, Jan. 2000.
11. Daniel Lopresti and Jiangying Zhou. Locating and Recognizing Text in WWW Images. Information Retrieval 2 (Kluwer Academic Publishers), pp. 177-206, 2000.
12. Axel Wernicke and Rainer Lienhart. On the Segmentation of Text in Videos. IEEE Int. Conference on Multimedia and Expo (ICME2000), Vol. 3, pp. 1511-1514, July 2000.
[Figure: timeline placing references 1 to 12 between 1996 and 2000]

More information at www.videoanalysis.org
Rainer Lienhart, Axel Wernicke. Localizing and Segmenting Text in Images and Videos. IEEE Transactions on Circuits and Systems for Video Technology, pp. 256-268, April 2002.

Design Decisions
- What kind of text occurrences? Scene text, overlay text
- With what style attributes? Font size, font type, text color (any)
- In what kind of media data? Image-based, video-based (both)
- What should be achieved? Localization, segmentation, recognition, integrated recognition
- How will the results be used? Indexing, object-based video encoding (both)

Overview
OCR result: Dec 25 1998

Text Localization (1/2)

Text Box Consolidation (2/2)
- Derive initial text bounding boxes
- Refine bounding boxes
- Remove text boxes which
  - are too small/large, or
  - have a bad width-to-height aspect ratio
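
The removal step can be sketched as a simple filter over candidate boxes. This is a minimal illustration only: the `Box` type, the function names, and every threshold value below are assumptions chosen for the example, since the slides do not state the limits that were actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical text bounding box in pixel coordinates."""
    x: int
    y: int
    width: int
    height: int


def keep_box(box, min_height=8, max_height=200, min_aspect=1.0, max_aspect=50.0):
    """Return True if the box passes the size and aspect-ratio checks.

    All limits are illustrative assumptions, not values from the slides.
    """
    if box.width <= 0 or box.height <= 0:
        return False
    # Reject boxes that are too small or too large.
    if not (min_height <= box.height <= max_height):
        return False
    # Reject boxes with an implausible width-to-height ratio for a text line.
    aspect = box.width / box.height
    return min_aspect <= aspect <= max_aspect


def consolidate(boxes, **limits):
    """Keep only the boxes that pass keep_box."""
    return [b for b in boxes if keep_box(b, **limits)]


candidates = [
    Box(10, 10, 120, 20),  # plausible text line
    Box(0, 0, 3, 2),       # too small
    Box(5, 5, 10, 300),    # too large
    Box(0, 50, 15, 40),    # taller than wide: bad aspect ratio
]
kept = consolidate(candidates)
```

With these assumed limits only the first candidate survives; in practice the thresholds would depend on the expected font sizes and image resolution.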

Monitoring + Tracking
Result: Text Objects
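
One way to picture how per-frame text boxes become text objects is to link boxes in consecutive frames that overlap strongly. The slides do not describe the matching rule, so the greedy intersection-over-union (IoU) matching below, the `(x, y, w, h)` box layout, and the `min_iou` threshold are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def track(frames, min_iou=0.5):
    """Group per-frame boxes into text objects.

    frames: list (one entry per frame) of lists of boxes.
    Returns a list of objects, each a list of (frame_index, box).
    An object is only extended by a box in the directly following frame.
    """
    objects = []
    active = []  # indices of objects that were seen in the previous frame
    for t, boxes in enumerate(frames):
        next_active = []
        unmatched = list(active)
        for box in boxes:
            best, best_score = None, min_iou
            for idx in unmatched:
                score = iou(objects[idx][-1][1], box)
                if score >= best_score:
                    best, best_score = idx, score
            if best is None:
                # No sufficiently overlapping predecessor: start a new object.
                objects.append([(t, box)])
                next_active.append(len(objects) - 1)
            else:
                objects[best].append((t, box))
                unmatched.remove(best)
                next_active.append(best)
        active = next_active
    return objects


frames = [
    [(10, 10, 100, 20)],
    [(12, 10, 100, 20)],
    [(12, 10, 100, 20), (200, 50, 60, 15)],
    [],
]
text_objects = track(frames)
```

Here the first text line is followed over frames 0 to 2 as one object, and the box that appears in frame 2 starts a second object. Greedy matching is the simplest choice; a real tracker would likely tolerate short dropouts as well.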

Background Removal
- Temporal alignment of text lines
- 3 bitmaps at t, t+45, t+90
- Low variance image
- Border floodfilling
- Binarized image
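
The stages named on this slide can be sketched on tiny grayscale bitmaps stored as nested lists. This is a rough illustration under stated assumptions, not the authors' implementation: the bitmaps are taken to be already aligned, text is assumed to be bright and static while the background changes over time, and the thresholds `max_variance` and `min_intensity` as well as all function names are invented for the example.

```python
from collections import deque


def temporal_stats(bitmaps):
    """Per-pixel mean and variance over aligned, equally sized bitmaps."""
    n = len(bitmaps)
    rows, cols = len(bitmaps[0]), len(bitmaps[0][0])
    mean = [[0.0] * cols for _ in range(rows)]
    var = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [bm[r][c] for bm in bitmaps]
            m = sum(values) / n
            mean[r][c] = m
            var[r][c] = sum((v - m) ** 2 for v in values) / n
    return mean, var


def segment_text(bitmaps, max_variance=100.0, min_intensity=128):
    """Return a binary image (1 = text) from temporally aligned bitmaps."""
    mean, var = temporal_stats(bitmaps)
    rows, cols = len(mean), len(mean[0])
    # Low-variance step: keep pixels that are stable over time and bright.
    mask = [[var[r][c] <= max_variance and mean[r][c] >= min_intensity
             for c in range(cols)] for r in range(rows)]
    # Border flood fill: candidate regions connected to the box border are
    # treated as background and cleared (4-connectivity).
    queue = deque((r, c) for r in range(rows) for c in range(cols)
                  if (r in (0, rows - 1) or c in (0, cols - 1)) and mask[r][c])
    for r, c in queue:
        mask[r][c] = False
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]:
                mask[nr][nc] = False
                queue.append((nr, nc))
    # Binarized result.
    return [[1 if mask[r][c] else 0 for c in range(cols)] for r in range(rows)]


def frame(background):
    """Synthetic 5x5 frame: changing background, static text, static border blob."""
    img = [[background] * 5 for _ in range(5)]
    img[2][1] = img[2][2] = img[2][3] = 255  # static overlay text
    img[0][0] = img[0][1] = 250              # static bright blob touching the border
    return img


# Three bitmaps standing in for the samples at t, t+45 and t+90.
bitmaps = [frame(0), frame(100), frame(200)]
binary = segment_text(bitmaps)
```

In this toy input the changing background is rejected by the variance test, the static blob is removed because it touches the border, and only the three text pixels in the middle row remain set.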

Experimental Results
- Text localization
  - Image-based: 69.5% (boxes) / 85% (pixels)
  - Video-based: 94.9% (boxes)
- Text segmentation
  - 79.6% correctly segmented
  - 7.6% damaged, but still recognizable
- Text recognition
  - 70% (over all steps)

Demo