Standardized Multimedia Retrieval in Distributed Heterogenous Database Systems. Dr. Mario Döller



1 Standardized Multimedia Retrieval in Distributed Heterogenous Database Systems Dr. Mario Döller

2 Motivation: Current Situation
Query languages: SQL/MM, MOQL, XIRQL, MMDOC-QL, XQuery, IMAQL, XPath, SVQL
MMRS: Oracle InterMedia, IBM, Informix, Blobworld
Metadata annotation: MPEG-7, Dublin Core, TV-Anytime, NISO
Further label: Proprietary
Content providers: professional, private

3 Standardized Retrieval in Distributed MM Systems
[Architecture diagram: MPEG Query Format requests pass through a Multimedia Middleware, which uses the MPEG Query Format to reach the back ends: MPEG-7 MM Database, Dublin Core Image Database, Audio Database]

4 The MPEG Query Format

5 The MPEG Query Format (MPQF)
Becomes an International Standard in spring 2009
General concepts
o based on XML and defined by an XML Schema
o decoupled from any other metadata standard (including MPEG-7)
o support for any XML-based MM metadata description
o integration of limited XQuery functionality
o MPQF is divided into 3 main categories: Management, Input Query Format, Output Query Format

6 MPQF Concepts: Management
How to find MPQF-based MM retrieval systems (MMRSs)?
o 2 scenarios
  1. Scenario: the MMRS is known to the client (query, response)
  2. Scenario: the MMRS(s) is/are not known to the client (search for available MMRSs, then query, response)
[Diagram labels: Search MMRS, Available MMRS, Query, Response, Proprietary Systems, MMRS]

7 MPQF Concepts: Query I
How to query an MMRS satisfactorily?
Query design: a Query consists of
o QFDeclaration (0..1)
o OutputDescription (0..1)
o QueryCondition (0..1)
MPQF supports:
o synchronous/asynchronous mode
o timeout functionality

8 MPQF Concepts: Query II
QFDeclaration
o declaration of resources for query conditions
o a resource can be (structured) text, raw media or any metadata description type (e.g., DominantColorType of MPEG-7)
OutputDescription
o defines the content as well as the structure of the result set
o uses XPath for selecting desired types/elements of the supported metadata description; supports absolute and relative addressing
o independent of the metadata description
o supports grouping/sorting
o limitation/paging of the result set

9 MPQF Concepts: Query Condition
o modular filter architecture
o filter data according to TargetMediaType
o join functionality
o specification of granularity

10 MPQF Concepts: Query Condition
[Diagram labels: an AND operator with a scoringfunction; two QbMedia conditions with preferencevalue and thresholdvalue; values in [0..1]]
o assign a preferencevalue and a thresholdvalue to every condition
o assign a scoringfunction to every Boolean operator (AND, OR, XOR); it is recommended to follow t-norm/t-conorm rules
o results in a rank and confidence evaluation for every item
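A minimal sketch of how such a scored condition tree could be evaluated. The slide only recommends t-norm/t-conorm behaviour; the concrete functions (minimum/maximum and product/probabilistic sum), the helper names, and the way the preference and threshold values are applied (cut off below the threshold, then scale by the preference) are assumptions made for illustration and are not taken from the MPQF specification. XOR is not covered.

```python
def t_norm_min(a: float, b: float) -> float:
    """Minimum t-norm: one common scoring function for AND."""
    return min(a, b)


def t_conorm_max(a: float, b: float) -> float:
    """Maximum t-conorm: the dual scoring function for OR."""
    return max(a, b)


def t_norm_product(a: float, b: float) -> float:
    """Product t-norm: an alternative scoring function for AND."""
    return a * b


def t_conorm_probabilistic_sum(a: float, b: float) -> float:
    """Probabilistic sum: the t-conorm dual to the product t-norm."""
    return a + b - a * b


def condition_score(raw: float, preferencevalue: float = 1.0,
                    thresholdvalue: float = 0.0) -> float:
    """Hypothetical handling of one condition's score in [0, 1].

    Assumption: scores below the threshold are discarded (0.0),
    the remaining scores are scaled by the preference value.
    """
    if raw < thresholdvalue:
        return 0.0
    return raw * preferencevalue


# Two query-by-media conditions evaluated for one candidate item.
color = condition_score(0.8, preferencevalue=1.0, thresholdvalue=0.5)
texture = condition_score(0.6, preferencevalue=0.5, thresholdvalue=0.5)

and_score = t_norm_min(color, texture)   # score of "color AND texture"
or_score = t_conorm_max(color, texture)  # score of "color OR texture"
```

Swapping the scoring function changes the ranking behaviour without touching the conditions themselves, which is the point of attaching it to the Boolean operator.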

11 MPQF Examples: Management I
Request: give me all available MMRSs!
  <MpegQuery>
    <Management>
      <Input />
    </Management>
  </MpegQuery>
Request: give me all available MMRSs that fit my requirements!
  <MpegQuery>
    <Management>
      <Input>
        <DesiredCapability>
          <SupportedMetadata href="urn:mpeg:mpeg7:schema:2004" />
          <SupportedQueryTypes href="querybymedia" />
          <SupportedQueryTypes href="querybyfreetext" />
        </DesiredCapability>
      </Input>
    </Management>
  </MpegQuery>
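The matching that a discovery service might perform for the second request can be sketched as follows. The registry layout, the service names and the function name are hypothetical; only the capability identifiers are taken from the slide.

```python
from typing import Dict, List, Set

# Hypothetical registry: service name -> advertised capabilities.
Registry = Dict[str, Dict[str, Set[str]]]


def find_matching_mmrs(services: Registry,
                       desired_metadata: Set[str],
                       desired_query_types: Set[str]) -> List[str]:
    """Return the names of all services that support every desired capability.

    Empty sets express "no requirement", which corresponds to the
    first request (all available MMRSs).
    """
    matches = []
    for name, caps in services.items():
        if (desired_metadata <= caps["metadata"]
                and desired_query_types <= caps["query_types"]):
            matches.append(name)
    return sorted(matches)


services: Registry = {
    "image-db": {
        "metadata": {"urn:mpeg:mpeg7:schema:2004"},
        "query_types": {"querybymedia", "querybyfreetext"},
    },
    "audio-db": {
        "metadata": {"urn:mpeg:mpeg7:schema:2004"},
        "query_types": {"querybymedia"},
    },
}

all_services = find_matching_mmrs(services, set(), set())
fitting = find_matching_mmrs(
    services,
    {"urn:mpeg:mpeg7:schema:2004"},
    {"querybymedia", "querybyfreetext"},
)
```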

12 MPQF Examples: Query I
Browsing query:
  <MpegQuery>
    <Query>
      <Input />
    </Query>
  </MpegQuery>
QueryByFreeText:
  <MpegQuery>
    <Query>
      <Input>
        <QueryCondition>
          <Condition xsi:type="QueryByFreeText">
            <FreeText>This is a free text query</FreeText>
          </Condition>
        </QueryCondition>
      </Input>
    </Query>
  </MpegQuery>
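A client could assemble the free text query programmatically. The sketch below uses Python's standard xml.etree.ElementTree module and mirrors the element names of the slide; the function name is hypothetical, the type name is written as QueryByFreeText, and the MPQF namespace declaration is omitted as it is on the slide, so the output is an illustration rather than a schema-valid MPQF document.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace, needed for the xsi:type attribute.
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def build_free_text_query(text: str) -> str:
    """Build a QueryByFreeText request shaped like the slide example."""
    ET.register_namespace("xsi", XSI)
    root = ET.Element("MpegQuery")
    query = ET.SubElement(root, "Query")
    query_input = ET.SubElement(query, "Input")
    query_condition = ET.SubElement(query_input, "QueryCondition")
    condition = ET.SubElement(
        query_condition, "Condition", {f"{{{XSI}}}type": "QueryByFreeText"}
    )
    ET.SubElement(condition, "FreeText").text = text
    return ET.tostring(root, encoding="unicode")


request = build_free_text_query("This is a free text query")
```

Building the tree instead of concatenating strings keeps the request well-formed and escapes special characters in the user's text.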

13 Multimedia Middleware

14 Specifications: Existing Frameworks
o RMI (Remote Method Invocation): communication protocol, Java
o CORBA (Common Object Request Broker Architecture): cross-platform protocols and services
Frameworks:
o dlimit
o Framework of the HERON project
o MOCHA (Middleware based On a Code shipping Architecture)

15 MPQF Web Service based MM Middleware

16 MPQF Mobile Agent based MM Middleware

17 Result Aggregation: Existing Algorithms
Identical content
o combination techniques: Min, Max, CombSum, CombMNZ
o weighted combination techniques
o Borda-fuse voting model
Overlapping content
o data fusion techniques
o shadow document method
Disjoint content
o raw score merging
o round robin
o reference statistics
o feature distance ranking algorithms
o cross rank similarity comparison
o query-based sampling
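For illustration, two of the combination techniques named above can be sketched in a few lines. CombSum adds up the scores an item receives from all systems; CombMNZ multiplies that sum by the number of systems that returned the item. The sketch assumes that the scores of all systems are already normalized to a comparable range; the function names and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

ResultList = Dict[str, float]  # item id -> normalized score from one system


def comb_sum(result_lists: List[ResultList]) -> Dict[str, float]:
    """CombSum: sum of the scores an item received across all systems."""
    totals: Dict[str, float] = defaultdict(float)
    for results in result_lists:
        for item, score in results.items():
            totals[item] += score
    return dict(totals)


def comb_mnz(result_lists: List[ResultList]) -> Dict[str, float]:
    """CombMNZ: CombSum multiplied by the number of systems returning the item."""
    sums = comb_sum(result_lists)
    hits: Dict[str, int] = defaultdict(int)
    for results in result_lists:
        for item in results:
            hits[item] += 1
    return {item: sums[item] * hits[item] for item in sums}


def ranked(scores: Dict[str, float]) -> List[str]:
    """Order items by descending fused score, ties broken by item id."""
    return sorted(scores, key=lambda item: (-scores[item], item))


system_a = {"img1": 0.9, "img2": 0.4}
system_b = {"img2": 0.7, "img3": 0.8}

sum_ranking = ranked(comb_sum([system_a, system_b]))
mnz_ranking = ranked(comb_mnz([system_a, system_b]))
```

Both techniques reward items that several systems agree on, which is why they suit the identical-content case.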

18 Applicable Result Aggregation Algorithms
Multimedia databases:
o degree of overlap not known
o different content (image, audio, etc.)
o uncooperative
Result: list containing rank, record number, confidence information, ...
Applicable algorithms:
(weighted) combination techniques
Borda-fuse voting model
Shadow document method
Round robin:
o Precondition: no elimination of duplicates
o Problem: leads to duplicated elements in the result set
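Two of the applicable algorithms work on ranks alone, which fits result lists from uncooperative sources. The sketch below shows a Borda count style fusion and a plain round robin merge. It assumes the common Borda formulation in which, for c candidates overall, a system gives c points to its top item, c-1 to the next, and shares the leftover points evenly among the items it did not return; the function names are hypothetical, and each input ranking is assumed to be free of duplicates.

```python
from itertools import zip_longest
from typing import Dict, List


def borda_points(rankings: List[List[str]]) -> Dict[str, float]:
    """Sum Borda points per item over all rankings."""
    candidates = set().union(*rankings)
    c = len(candidates)
    points = {item: 0.0 for item in candidates}
    for ranking in rankings:
        for position, item in enumerate(ranking):
            points[item] += c - position
        unranked = candidates - set(ranking)
        if unranked:
            # Leftover points 1..(c - k) split evenly over the unranked items.
            share = (c - len(ranking) + 1) / 2
            for item in unranked:
                points[item] += share
    return points


def borda_fuse(rankings: List[List[str]]) -> List[str]:
    """Fused ranking by descending Borda points, ties broken by item id."""
    points = borda_points(rankings)
    return sorted(points, key=lambda item: (-points[item], item))


def round_robin(rankings: List[List[str]]) -> List[str]:
    """Interleave the rankings tier by tier, without removing duplicates."""
    merged: List[str] = []
    for tier in zip_longest(*rankings):
        merged.extend(item for item in tier if item is not None)
    return merged


list_a = ["r1", "r2", "r3"]
list_b = ["r2", "r4"]

fused = borda_fuse([list_a, list_b])
interleaved = round_robin([list_a, list_b])
```

In this example the round robin result contains r2 twice, which is the duplicate problem noted on the slide; the Borda fusion lists every item once.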

19 Multimedia Database

20 MM Metadata Description
MPEG-7 concentrates on describing multimedia content in a semantically rich manner.
o ISO / IEC : Systems
o ISO / IEC : Description Definition Language
o ISO / IEC : Visual
o ISO / IEC : Audio
o ISO / IEC : Multimedia Description Schemes (MDS)
o ISO / IEC : Reference Software
o ISO / IEC : Conformance
o ISO / IEC : Extraction and Use
o ISO / IEC : Profile
o ISO / IEC : Schema Definition
o ISO / IEC : MPEG-7 Schema Profiles
o ISO / IEC : MPEG Query Format

21 MPEG-7 Network
[Diagram labels: Description Definition Language (definition, tags, extension); Descriptors D1 to D10 (syntax & semantics of feature representation); structuring into Description Schemes DS1 to DS4; instantiation, e.g. <scene id=1> <time>... <camera>.. <annotation </scene>; Encoding & Delivery]

22 Introduction to Extensible ORDBMS
[Diagram labels: Multimedia Extension, ORDBMS, Extensibility Interfaces]
Extensibility interfaces:
o Type System
o Server Execution
o Query Processing
o Query Optimizer
o Data Indexing
Currently available systems
o Oracle Data Cartridges
o Informix DataBlades
o DB2 Extenders

23 MPEG-7 MMDB Architecture

24 Multimedia Schema based on MPEG-7
Approach
o selected MPEG-7 types become DB types / tables
o use of XMLType to reduce the number of datatypes in the database
o reduce the inheritance hierarchy by manually skipping abstract types
o navigation via references (top-down) and keys (bottom-up)

25 Multimedia Schema based on MPEG-7 (2)
Relational keys (DOC_ID, PART_ID)
o to distinguish data from various MPEG-7 files
o to store several occurrences of the same element in the MPEG-7 document
xsi:type references
Collections:
o table of type
o table of REF type
MPEG-7 attributes/elements are mapped to DB table attributes
Metadata for the insertion process

26 Multimedia Schema based on MPEG-7 (3)
  <!-- Definition of StillRegion DS -->
  <complexType name="StillRegionType">
    <complexContent>
      <extension base="mpeg7:SegmentType">
        <sequence>
          <choice minOccurs="0">
            <element name="SpatialLocator" type="mpeg7:RegionLocatorType"/>
            <element name="SpatialMask" type="mpeg7:SpatialMaskType"/>
          </choice>
          ...
          <choice minOccurs="0" maxOccurs="unbounded">
            <element name="VisualDescriptor" type="mpeg7:VisualDType"/>
            <element name="VisualDescriptionScheme" type="mpeg7:VisualDSType"/>
            <element name="GridLayoutDescriptors" type="mpeg7:GridLayoutType"/>
          </choice>
          ...
        </sequence>
      </extension>
    </complexContent>
  </complexType>
[Class diagram of the resulting database types]
dummynttype27
StillRegionType
  DOC_ID : Integer
  PART_ID : Integer
  MediaLocator : SYS.XMLType
  TextAnnotation : dummynttype83
  SpatialLocator : SYS.XMLType
  SpatialMask : SYS.XMLType
  dummyattr23 : dummynttype27
  Image()
  StillRegion()
dummynttype83
TextAnnotationType
  DOC_ID : Integer
  PART_ID : Integer
  dummyattr12 : dummynttype13
  relevance : Float(126)
  confidence : Float(126)
  lang : Varchar2(50)
  TextAnnotation()
dummytype28
  VisualDescriptor : Varchar2(60)
  VisualDescriptor_PART_ID : Integer
  VisualDescriptionScheme : SYS.XMLType
  GridLayoutDescriptors : SYS.XMLType

27 Multimedia Indexing Framework
Main features:
GistService
o GiST
GistWrapper
Multimedia Index Type
o consists of several index types and operators
o rt_nearest_point(clob, CLOB, number, number);
appropriate implementation (objects)

28 Multimedia Query Optimizer
Enhance the cost-based query optimizer
o cost (k-nearest neighbor searches): the implementation relies on a model for kNN query cost by Ju-Hong Lee, Guang-Ho Cha and Chin-Wan Chung
o selectivity (range searches)

29 Query Optimizer: Selectivity
Approach:
o cluster the data set (feature vectors) with a clustering algorithm (the hierarchical CURE or the density-based DBSCAN)
o calculate the density factor for every cluster: #points / V(MBR), where V(MBR) is the volume of the cluster's minimum bounding rectangle
o compute the selectivity for a point by identifying the correct cluster (intersect operation with the MBR) and multiplying the density factor by the volume of the range search
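The estimate described above can be sketched as follows, with the clustering step taken as given. The class and function names are hypothetical, and two details are assumptions made only for this sketch: the range search is modelled as a hypercube with side length 2 * radius, and the estimate is capped at the number of points in the cluster. MBRs are assumed to have non-zero volume.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Sequence


@dataclass
class Cluster:
    low: Sequence[float]    # lower corner of the MBR
    high: Sequence[float]   # upper corner of the MBR
    n_points: int           # number of feature vectors in the cluster

    def volume(self) -> float:
        """V(MBR): product of the side lengths."""
        return prod(h - l for l, h in zip(self.low, self.high))

    def density(self) -> float:
        """Density factor: #points / V(MBR)."""
        return self.n_points / self.volume()

    def contains(self, point: Sequence[float]) -> bool:
        """Intersect test of a query point with the MBR."""
        return all(l <= x <= h for l, x, h in zip(self.low, point, self.high))


def estimate_hits(clusters: List[Cluster], point: Sequence[float],
                  radius: float) -> float:
    """Estimated number of results of a range search around `point`."""
    for cluster in clusters:
        if cluster.contains(point):
            range_volume = (2 * radius) ** len(point)  # assumed hypercube range
            return min(cluster.n_points, cluster.density() * range_volume)
    return 0.0  # the point lies in no cluster


clusters = [Cluster(low=(0.0, 0.0), high=(10.0, 10.0), n_points=200)]
hits = estimate_hits(clusters, point=(5.0, 5.0), radius=1.0)
```

Dividing the estimated hit count by the total number of stored feature vectors yields the selectivity fraction a cost-based optimizer would typically consume.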

30 Querying MPEG-7 MMDB
Query traditionally with the help of XPath expressions:
  SELECT extract(MediaLocator, '/MediaLocator/MediaUri/text()') FROM audio;
Query the database with Oracle's XML DB functionality to produce well-formed MPEG-7 output
Query the database with the help of our internal QueryLib

31 Content-Based Image Retrieval: MPEG-7 Blobworld
The original system was introduced by Prof. Hellerstein and his group at the University of California, Berkeley
Improvements:
o images and their blobs are described with MPEG-7 and stored and indexed by our MPEG-7 MMDB
o means are provided for extending the image pool at runtime
Retrieval is based on kNN search (MIF) over the following weighted features: color, shape and texture
The result is returned as an MPEG-7 document
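A weighted kNN search over several features can be sketched with a brute-force scan. This is only an illustration of the principle: the index support of the MIF is not modelled, and the weights, the Euclidean per-feature distance, the feature vectors and all names are assumptions.

```python
import heapq
from math import dist
from typing import Dict, List, Sequence

Features = Dict[str, Sequence[float]]  # feature name -> feature vector

# Hypothetical feature weights.
WEIGHTS = {"color": 0.5, "shape": 0.2, "texture": 0.3}


def weighted_distance(a: Features, b: Features,
                      weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the per-feature Euclidean distances."""
    return sum(w * dist(a[name], b[name]) for name, w in weights.items())


def knn(query: Features, blobs: Dict[str, Features], k: int,
        weights: Dict[str, float] = WEIGHTS) -> List[str]:
    """Ids of the k blobs closest to the query, nearest first."""
    return heapq.nsmallest(
        k, blobs, key=lambda blob_id: weighted_distance(query, blobs[blob_id], weights)
    )


query = {"color": (0.0, 0.0), "shape": (0.0,), "texture": (0.0,)}
blobs = {
    "a": {"color": (1.0, 0.0), "shape": (0.0,), "texture": (0.0,)},
    "b": {"color": (0.0, 0.0), "shape": (1.0,), "texture": (0.0,)},
    "c": {"color": (3.0, 4.0), "shape": (0.0,), "texture": (0.0,)},
}

nearest = knn(query, blobs, k=2)
```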

32 Audio Recognition Tool
Developed in cooperation with RWTH Aachen (Holger Crysandt)
Extracted MPEG-7 descriptions of audio files are stored and indexed by the MPEG-7 MMDB
Retrieval allows:
o exact-match search by artist, genre or song title
o similarity search that relies on the audio signature

33 Outlook

34 MPQF MM Database System
[Architecture diagram of an MPQF-based Multimedia Retrieval Engine. Labels: MPQF Parser, MPQF Optimizer, Validation, LL-Feature Extraction, MPQF Algebra, MM Metadata, MM Raw Data, MPQF Query Processing, Text Retrieval, XQuery Engine, Indexer, MPQF Result Generation, QBE]

35 Outlook: Future Work
o 88th MPEG Meeting (April 2008): Reference Software
o JPEG decided to use MPQF as the query language for the JPSearch project
o MPQF is used in the following EU projects: SAPIR, PHAROS, ENTHRONE
o Research
  Develop an MPQF Retrieval Engine
  Develop an MPQF Service Aggregation
  Develop applications using the MPQF Retrieval Engine and/or the JPSearch environment

36 Questions?