Bases de données avancées Bases de données multimédia



Similar documents
Open issues and research trends in Content-based Image Retrieval

Multimedia Databases. Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig

Big Data: Image & Video Analytics

The Scientific Data Mining Process

Topics in basic DBMS course

Managing large sound databases using Mpeg7

Multimedia Databases. Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig

Statistical Models in Data Mining

Professor, D.Sc. (Tech.) Eugene Kovshov MSTU «STANKIN», Moscow, Russia

Multimedia Data Mining: A Survey

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Introduction to Computer Graphics

The basic data mining algorithms introduced may be enhanced in a number of ways.

Module II: Multimedia Data Mining

Extracting, Storing And Viewing The Data From Dicom Files

Clustering & Visualization

Chapter 3 Data Warehouse - technological growth

MultiMedia and Imaging Databases

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Content-Based Recommendation

Standardized Multimedia Retrieval in Distributed Heterogenous Database Systems. Dr. Mario Döller

Data Warehousing und Data Mining

Image Databases INLS 623 Lina Huang Elise Moore Ketan Palshikar Nevin Yang

ANALYTICS IN BIG DATA ERA

Content-Based Image Retrieval

CHAPTER 1 INTRODUCTION

RANDOM PROJECTIONS FOR SEARCH AND MACHINE LEARNING

PSG College of Technology, Coimbatore Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.

Big Data Text Mining and Visualization. Anton Heijs

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Android Ros Application

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Fast Matching of Binary Features

Chapter 14: Databases and Database Management Systems

Traffic Monitoring Systems. Technology and sensors

Similarity Search in a Very Large Scale Using Hadoop and HBase

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Multiscale Object-Based Classification of Satellite Images Merging Multispectral Information with Panchromatic Textural Features

Tracking and Recognition in Sports Videos

ELECTRONIC JOURNAL OF POLISH AGRICULTURAL UNIVERSITIES

Multimedia Database Systems: Where are we now?

MULTIMEDIA MINING RESEARCH AN OVERVIEW

Topic Maps Visualization

New Paltz Central School District Technology Computer Graphics 1 and 2. Time Essential Questions/Content Standards/Skills Assessments

Visual Structure Analysis of Flow Charts in Patent Images

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

Applications of Dynamic Representation Technologies in Multimedia Electronic Map

Unsupervised Data Mining (Clustering)

Data Mining. Vera Goebel. Department of Informatics, University of Oslo

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

TOOLS FOR 3D-OBJECT RETRIEVAL: KARHUNEN-LOEVE TRANSFORM AND SPHERICAL HARMONICS

Modern Databases. Database Systems Lecture 18 Natasha Alechina

TouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Databases in Organizations

ImageLab Group: Digital Library research directions

Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari

What is Visualization? Information Visualization An Overview. Information Visualization. Definitions

Master s Program in Information Systems

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Visual Data Mining. Motivation. Why Visual Data Mining. Integration of visualization and data mining : Chidroop Madhavarapu CSE 591:Visual Analytics

Data Mining as Part of Knowledge Discovery in Databases (KDD)

Microsoft SQL Business Intelligence Boot Camp

Oracle Database 10g: Building GIS Applications Using the Oracle Spatial Network Data Model. An Oracle Technical White Paper May 2005

An Oracle White Paper February Managing Unstructured Data with Oracle Database 11g

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2

Approaches of Using a Word-Image Ontology and an Annotated Image Corpus as Intermedia for Cross-Language Image Retrieval

Introduzione alle Biblioteche Digitali Audio/Video

B2.53-R3: COMPUTER GRAPHICS. NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions.

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

Multi-dimensional index structures Part I: motivation

Big Data: Rethinking Text Visualization

MMGD0203 Multimedia Design MMGD0203 MULTIMEDIA DESIGN. Chapter 3 Graphics and Animations

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Module 1: Getting Started with Databases and Transact-SQL in SQL Server 2008

Colorado School of Mines Computer Vision Professor William Hoff

Filling the Semantic Gap: A Genetic Programming Framework for Content-Based Image Retrieval

A Multimedia Database Server for information storage and querying

Mining Text Data: An Introduction

CHAPTER 6 TEXTURE ANIMATION

Image Compression through DCT and Huffman Coding Technique

Writing Queries Using Microsoft SQL Server 2008 Transact-SQL

Security and privacy for multimedia database management systems

ANIMATION a system for animation scene and contents creation, retrieval and display

An Information System to Analize Cultural Heritage Information

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Search and Information Retrieval

Technology in Action. Alan Evans Kendall Martin Mary Anne Poatsy. Eleventh Edition. Copyright 2015 Pearson Education, Inc.

Environmental Remote Sensing GEOG 2021

Building Views and Charts in Requests Introduction to Answers views and charts Creating and editing charts Performing common view tasks

Structural Design Patterns Used in Data Structures Implementation

Transcription:

Bases de données avancées Bases de données multimédia Université de Cergy-Pontoise Master Informatique M1 Cours BDA Multimedia DB Multimedia data include images, text, video, sound, spatial data Data of large volume; their use is widespread Photos, personal videos Digital television, online music Web pages (text, images, video, sound) Maps, directions, GPS "non-traditional data requiring special management incompatible with conventional DB Very weak structure (image, sound, video, text) or very specific (spatial) Specific methods of access / modification The special case of image DB Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 2

Images Unstructured content Matrix of dots (pixels) Each point color information Further comments Textual, structured, semantic... Search images in a collection needs Detect objects in images Summarizing a collection of images Search images on various criteria (similarity, objects) methods Based on the annotations Based on the content Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 3 Methods of image search By annotation First method developed, using the text Problem: how to annotate an image? disadvantages Ambiguity inherent subjectivity Depends on the language and application Expensive manual annotation The visual content Basic idea: extract visual features (descriptors) and use the similarity of these characteristics? Advantages: extraction program, language-independent, less choice problems and dependence on the application Problem: the semantic gap = gap between the intention of the user and the search criteria used for annotation Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 4

Applications of content search Medical images: detection of pathologies Satellite Images: detection of object, meteorological phenomena Audiovisual: find a film plane or a character, copy detection Investigation: Fingerprint research, identification of a face Culture: seeking information about a work of art, e.g. monument from a photo Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 5 Types of content-based search Search by Example: The mostly used It gives a picture and seeks similar images in descending order of similarity Variant: given an area of an image and find images containing similar areas Generally it is satisfied with the k best answers (top-k) Search with relevance feedback Extension that tries to reduce the semantic gap Among the best answers, the user marks those that are not relevant The system changes the similarity function to "stick" to the indication several iterations Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 6

Image Descriptors Feature descriptor extracted from the image Extraction method Similarity measure Ex. color histogram Signature = vector representing a descriptor Ex. components of the color histogram Images space description Multidimensional space where each signature is a point Number of dimensions = number of components comprising the signature vector Image similarity = signatures in the feature space are near Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 7 Types of descriptors Global descriptor Approximate description of the entire image Describes an atmosphere, an overall impression Ex. color, texture, shape descriptor Produces a signature (point in the feature space) per picture Large Feature space Local descriptor Accurate description of parts of the image Suitable for objects (parts of images) Ex descriptor areas, points of interest Produced several signatures picture Feature space of smaller size Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 8

Example: color histogram Space representation of color (eg RGB) Each color represented by the intensity values of the components Objective: two colors close to the eye have close representations Sampling colors in the image For eg RGB 24 bits (8 bits = 256 levels per component) Sampling of 6 values component 6 3 =216 colors Choice values : eg levels 0, 256/6, 2 * 256/6,..., 5 * 256/6 Result: Histogram of 216 colors = 216 vector elements Example of calculating the color histogram C for each color among the 216 colors of the histogram as the percentage of pixels of the image with a color "near" c Histogram vector = 216 values between 0 and 100 Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 9 Indexing of Images Index = data structure that allows to accelerate search General Idea: organize points in space description to quickly find the nearest neighbors of a given point Same type of problem in conventional DB Given a value, find the objects in the database that match this value The index allows not to browse the entire database... But it needs special index structures multidimensional space Similarity search, no exact match Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 10

Architecture of an image storing/querying system Image Database Image Signature Description space Index Indexing Search Image Query Signature Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 11 The curse of dimensionality For a point in the feature space, the index should allow to quickly find neighboring images but in a multidimensional space there are problems! Main problem: the points tend to be equidistant When we seek the neighborhood of a point is recovered either nothing or almost all BD! index no longer useless, as do a sequential scan The problem increases with increasing number of dimensions Solutions Reduce the number of dimensions of local descriptors, project spaces of smaller size (eg find separately on groups of dimensions of the color histogram and merge results) Approximate search by hash (LSH) Classification (interactive) points for a given query Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 12

Index Types General Idea: group of close points in bundles, packages superpackets, etc. A hierarchy of packages we obtain = tree neighboring points in the same package, super-package, etc.. Most of the time: Space Division Division of the feature space into zones, areas into sub-areas, etc. Product division of the data paquet = packet zone points Data Division Dividing the data into packets, packets into sub-packets, etc. A subdivision of space zone = region (bounding sphere) package containing the points Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 13 Example Index: KD-tree Division of space according to each dimension At each step a size and pivot value is chosen The two sub-areas continue recursively for each It stops when the number of points of an area is under a given threshold Example: points in a 2D space (source: thesis Eduardo Valle, "Local Descriptor for Image Matching Identification Systems) Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 14

Search with KD-tree Simple case points in a single zone Depth-first traversal from the root The coordinates of the motion guide the course Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 15 Search with KD-tree (continued) More complex cases: points in several areas The depth-first traversal can follow several paths Can possibly give priority to the most interesting areas Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 16

SQL and multimédia Multimedia in the relational database Consideration from SQL3 and object-relational model General Idea: introduction of types of multimedia objects with properties / methods specific Standard SQL / MM (2001), several types multimedia FullText: text (words, phrases, stemming,...) Spatial: spatial and geometric data Types: ST_Geometry, ST_Point, ST_Curve, ST_MultiPolygon... Still Image: Image Types: SI_StillImage, SI_ColorHistogram, SI_Texture,... Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 17 Multimédia in Oracle Extension intermedia, named Oracle Multimedia found in version 11g Data Types Adaptation of SQL type / MM in Oracle Four types: ORDAudio, ORDDoc, ORDImage, ORDVideo Type ORDDicom dedicated to medical imaging (DICOM) Type ORDSource for storing multimedia content ORDImage = content (ORDSource) + image properties Properties: size, content size, format... Metadata stored in the content, standard formats (XMP, EXIF ) based on XML Oracle also supports type SQL / MM Still Image Key Features Storage: in a BLOB or outside of DB Modification of content and properties Signature extraction and metadata from the content Research on properties, metadata and content Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 18

Image search in Oracle approaches considered By the attributes that accompany the image in the database tables By the properties of the image (size, format,...) By XML metadata The content from the signature image Search by content Signature images: ORDImageSignature Contains descriptions of color, texture and shape as BLOBs Methods to extract the signature of an image, measure the similarity of two signatures Special index for signatures ORDImageIndex Problem: image descriptors imposed Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 19 Sources Cours BD multimédia, Valérie Gouet, CNAM Paris Local-Descriptor Matching for Image Identification Systems, Eduardo Valle, thèse de doctorat, 2008 SQL Multimedia and Application Packages (SQL/MM), Jim Melton, Andrew Eisenberg, 2001 Oracle Multimedia User's Guide / Reference 11g Oracle intermedia Reference 10g Cours BDA (UCP/M1) : Bases de données objet, multimédia, flux 20