Caching Dynamic Skyline Queries
|
|
|
- Delilah Caldwell
- 10 years ago
- Views:
Transcription
1 Caching Dynamic Skyline Queries D. Sacharidis 1, P. Bouros 1, T. Sellis 1,2 1 National Technical University of Athens 2 Institute for Management of Information Systems R.C. Athena
2 Outline Introduction Skyline (SL) and dynamic skyline queries (DSL) Related work Evaluating dynamic skyline queries Computing orthant skylines (OSL) Computing dynamic skyline via caching LRU, LFU, LPP cache replacement policies Experimental evaluation Conclusions and Future work
3 Skyline queries (SL) Given a dataset of d- dimensional points SL contains points not dominated by others x dominates y iff x as good as y in all dimensions and strictly better in at least one
4 Skyline queries (SL) Given a dataset of d- dimensional points SL contains points not dominated by others x dominates y iff x as good as y in all dimensions and strictly better in at least one Example Dataset of hotels Prefer cheap hotels close to the sea
5 Skyline queries (SL) Given a dataset of d- dimensional points SL contains points not dominated by others x dominates y iff x as good as y in all dimensions and strictly better in at least one Example Skyline points Dataset of hotels Prefer cheap hotels close to the sea
6 Skyline queries (SL) Given a dataset of d- dimensional points SL contains points not dominated by others x dominates y iff x as good as y in all dimensions and strictly better in at least one Example Skyline points Dataset of hotels Prefer cheap hotels close to the sea p 1
7 Skyline queries (SL) Given a dataset of d- dimensional points SL contains points not dominated by others x dominates y iff x as good as y in all dimensions and strictly better in at least one Example p 2 Skyline points Dataset of hotels Prefer cheap hotels close to the sea p 1
8 Dynamic skyline queries (DSL) Extension of skyline queries Given a query point q DSL contains points not dynamically dominated by others w.r.t q x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one Can be treated as static SL Transform points w.r.t. q
9 Dynamic skyline queries (DSL) Extension of skyline queries Given a query point q DSL contains points not dynamically dominated by others w.r.t q x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one Can be treated as static SL Transform points w.r.t. q Example Query point q User defines ideal hotel q
10 Dynamic skyline queries (DSL) Extension of skyline queries Given a query point q DSL contains points not dynamically dominated by others w.r.t q x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one Can be treated as static SL Transform points w.r.t. q q Example Dynamic Skyline points User defines ideal hotel q
11 Dynamic skyline queries (DSL) Extension of skyline queries Given a query point q DSL contains points not dynamically dominated by others w.r.t q x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one Can be treated as static SL Transform points w.r.t. q p 4 p 5 q Example Dynamic Skyline points User defines ideal hotel q
12 Intuition (1) Traditional SL algorithms need to run anew for each DSL query Our idea Exploit results from past queries to reduce processing cost for future DSL queries Cache past queries Decide which queries in cache are useful
13 Intuition (2)
14 Intuition (2) 2 past DSL queries q a, q b Each query partitions space in 4 quadrants q a q b
15 Intuition (3) A new query q arrives Consider DSL for q a p 1 is contained DSL(q a ) p 1 dominates p 2, p 3, p 4 p 1 lies in upper right quadrant w.r.t. q a q a lies in upper right quadrant w.r.t. q p 1 dominates also p 2, p 3, p 4 w.r.t. to q Exclude p 2, p 3, p 4 from dominance test for DSL(q) q b q q a p 1 p 2 p 3 p 4 Shaded area denotes points dominated by p 1
16 Contribution in brief Caching past DSL queries cannot reduce processing cost for future ones We need more information about dominance relationships Introduce orthant skylines (OSL) and examine their relationship with DSL Extend Bitmap algorithm to compute OSL in parallel with DSL Cache OSL to enhance DSL queries evaluation Present 3 cache replacement policies LRU, LFU, LPP Experimental evaluation of caching mechanism
17 Related work Non-indexed methods Block-Nested Loops (BnL) Bitmap Multidimensional Divide and Conquer (DnC) Sort First Scan (SFS) Index-based methods B-tree sort points according to the lowest valued coordinate R-tree Nearest neighbor based (NN) Branch and bound (BBS)
18 Related work Non-indexed methods Block-Nested Loops (BnL) Bitmap Multidimensional Divide and Conquer (DnC) Sort First Scan (SFS) Index-based methods B-tree sort points according to the lowest valued coordinate R-tree Nearest neighbor based (NN) Branch and bound (BBS)
19 Bitmap BnL variant Suitable for domains with low cardinality and discrete In brief Computes a bitmap representation of the points in the dataset Examines each point separately (dominance test) Checks whether it is contained in the skyline or not Exploits fast bitwise operations OR/AND
20 Bitmap Dominance test For each point p Define A = A 1 & A 2 & & A d Denotes the points as good as p in all dimensions Define B = B 1 B 2 B d Denotes the points strictly better than p in at least one dimension Dominance test: If C = A & B has all bits set to 0 then p is in SL
21 Orthant skyline (OSL) OSL provides more information about dominance relationships than DSL Useful for pruning Given a dataset of d- dimensional points and a query point q Space partitioned in 2 d orthants o-th orthant skyline (OSL) of q contains points of the o-th orthant not dynamically dominated by others inside orthant o w.r.t q
22 Orthant skyline (OSL) OSL provides more information about dominance relationships than DSL Useful for pruning Given a dataset of d- dimensional points and a query point q Space partitioned in 2 d orthants o-th orthant skyline (OSL) of q contains points of the o-th orthant not dynamically dominated by others inside orthant o w.r.t q Quadrant 1 Quadrant 0 Query point q Quadrant 3 Quadrant 2
23 Orthant skyline (OSL) OSL provides more information about dominance relationships than DSL Useful for pruning Given a dataset of d- dimensional points and a query point q Space partitioned in 2 d orthants o-th orthant skyline (OSL) of q contains points of the o-th orthant not dynamically dominated by others inside orthant o w.r.t q Quadrant 1 Quadrant 0 Query point q Quadrant 3 Quadrant 2
24 Orthant skyline (OSL) OSL provides more information about dominance relationships than DSL Useful for pruning Given a dataset of d- dimensional points and a query point q Space partitioned in 2 d orthants o-th orthant skyline (OSL) of q contains points of the o-th orthant not dynamically dominated by others inside orthant o w.r.t q Quadrant 1 Quadrant 3 Quadrant 0 Query point q Quadrant 2 skyline points Quadrant 2
25 OSL and DSL relationship Quadrant 1 Quadrant 0 q Quadrant 3 Quadrant 2
26 OSL and DSL relationship Quadrant 1 Quadrant 0 q Quadrant 3 Quadrant 2
27 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Quadrant 1 q Quadrant 0 Quadrant 3 Quadrant 2
28 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Compute DSL w.r.t. q Quadrant 1 q Quadrant 0 Quadrant 3 Quadrant 2
29 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Compute DSL w.r.t. q Union of all OSLs is superset of DSL w.r.t. to q Quadrant 1 q Quadrant 3 Quadrant 0 Quadrant 2
30 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Compute DSL w.r.t. q Union of all OSLs is superset of DSL w.r.t. to q Quadrant 1 Quadrant 3 p 1 p 2 q Quadrant 0 Quadrant 2
31 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Compute DSL w.r.t. q Union of all OSLs is superset of DSL w.r.t. to q Quadrant 1 Quadrant 3 p 2 q p 3 Quadrant 0 Quadrant 2
32 Computing orthant skylines Algorithm DBM Extends Bitmap to compute DSL and OSLs at the same time Method: Compute bitmap representation Transform each point coordinates w.r.t. to query q Dominance test, point p, orthant o p not in OSL o and not in DSL p not in DSL, but in OSL o p in DSL and in OSL o
33 Dynamic skylines Via Caching Cache OSLs instead of DSLs Query cache contains (query point q j, OSLs) OSLs encode by bitmaps Algorithm cdbm OSL contains information about dominance test inside orthant Discard points inside orthants from dominance tests Method: Compute bitmap representation For each point p consider its position (orthant) w.r.t. to cache queries q j If p in the same orthant o w.r.t q j as q j w.r.t. q and p not in OSL o (q j ) then exclude it from OSL o (q), DSL(q)
34 Cache Replacement Policies General idea Limited cache space Identify least useful query point in cache Replace it with new one
35 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q Consider as input the query point cache Q Only query points in OSL of Q w.r.t. q are useful Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point
36 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q Consider as input the query point cache Q Only query points in OSL of Q w.r.t. q are useful Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point q q b q d q a q c
37 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q Consider as input the query point cache Q Only query points in OSL of Q w.r.t. q are useful Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point q q b q d q a Redundant queries q c
38 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q Consider as input the query points in cache Q Only query points in OSL of Q w.r.t. q are useful Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point q q b q d q a Redundant queries q c
39 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q Consider as input the query points in cache Q Only query points in OSL of Q w.r.t. q are useful Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point q q b q d q a Redundant queries q c
40 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query Great pruning power Probability that a query can prune points of dataset from DSL computation Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache remove Query point with less pruning power (LPP) q a q
41 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query Great pruning power Probability that a query can prune points of dataset from DSL computation Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache remove Query point with less pruning power (LPP) 2:2 4 3:2 4 q a 5:2 4 74:2 4 q
42 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query Great pruning power Probability that a query can prune points of dataset from DSL computation Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache remove Query point with less pruning power (LPP) 2:3 4 3:4 4 q a 5:7 4 74:88 4 q
43 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query Great pruning power Probability that a query can prune points of dataset from DSL computation Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache remove Query point with less pruning power (LPP) 2: :4 4 q a 5:7 4 74:88 4 q
44 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query Great pruning power Probability that a query can prune points of dataset from DSL computation Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache remove Query point with less pruning power (LPP) 2: :4 21 q a 5: : q
45 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query Great pruning power Probability that a query can prune points of dataset from DSL computation Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache remove Query point with less pruning power (LPP) 2: :4 21 q a 5: : q
46 Experimental Evaluation Synthetic datasets Distribution types Independent, correlated, anti-correlated Number of points N 10k, 20k, 50k, 100k, Dimensionality d = {2,3,4,5,6} Domain size for dimension D = {10,20,50} Compare Bitmap (NO-CACHE) cdbm with LFU,LRU,LPP cache replacement policies Query cache Q = {10,20,30,40,50} past query points Cache size is Q *N bits uncompressed
47 Varying query cache size Independent Anti-correlated Dataset: N = 50k points, with d = 4 dimensions of D = 20 domain size LFU,LRU cache queries not representative for future ones LPP caches queries with great pruning power
48 Effect of distribution parameters Correlated vary N Correlated vary d Relative improvement in running time over NO-CACHE Vary number of points N d = 4 dimensions of D = 20 domain size Vary number of dimensions d N = 50k, D = 20
49 Conclusions and Future work Conclusions Introduced orthant skylines (OSLs) and discussed its relationship with DSL Extended Bitmap to compute OSLs and DSL at the same time (DBM algorithm) Proposed caching mechanism of OSLs to reduce cost for future DSL queries LRU, LFU, LPP cache replacement policies Experimentally verified the efficiency of caching mechanism Future work Apply caching mechanism to index-based methods Further increase pruning power of cached queries
50 Questions?
Bitmap Index as Effective Indexing for Low Cardinality Column in Data Warehouse
Bitmap Index as Effective Indexing for Low Cardinality Column in Data Warehouse Zainab Qays Abdulhadi* * Ministry of Higher Education & Scientific Research Baghdad, Iraq Zhang Zuping Hamed Ibrahim Housien**
Data Warehousing und Data Mining
Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data
Efficient Processing of Joins on Set-valued Attributes
Efficient Processing of Joins on Set-valued Attributes Nikos Mamoulis Department of Computer Science and Information Systems University of Hong Kong Pokfulam Road Hong Kong [email protected] Abstract Object-oriented
Bitmap Index an Efficient Approach to Improve Performance of Data Warehouse Queries
Bitmap Index an Efficient Approach to Improve Performance of Data Warehouse Queries Kale Sarika Prakash 1, P. M. Joe Prathap 2 1 Research Scholar, Department of Computer Science and Engineering, St. Peters
Multi-dimensional index structures Part I: motivation
Multi-dimensional index structures Part I: motivation 144 Motivation: Data Warehouse A definition A data warehouse is a repository of integrated enterprise data. A data warehouse is used specifically for
Indexing Techniques in Data Warehousing Environment The UB-Tree Algorithm
Indexing Techniques in Data Warehousing Environment The UB-Tree Algorithm Prepared by: Yacine ghanjaoui Supervised by: Dr. Hachim Haddouti March 24, 2003 Abstract The indexing techniques in multidimensional
Similarity Search in a Very Large Scale Using Hadoop and HBase
Similarity Search in a Very Large Scale Using Hadoop and HBase Stanislav Barton, Vlastislav Dohnal, Philippe Rigaux LAMSADE - Universite Paris Dauphine, France Internet Memory Foundation, Paris, France
2 Associating Facts with Time
TEMPORAL DATABASES Richard Thomas Snodgrass A temporal database (see Temporal Database) contains time-varying data. Time is an important aspect of all real-world phenomena. Events occur at specific points
Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)
Comp 5311 Database Management Systems 16. Review 2 (Physical Level) 1 Main Topics Indexing Join Algorithms Query Processing and Optimization Transactions and Concurrency Control 2 Indexing Used for faster
Indexing Techniques for Data Warehouses Queries. Abstract
Indexing Techniques for Data Warehouses Queries Sirirut Vanichayobon Le Gruenwald The University of Oklahoma School of Computer Science Norman, OK, 739 [email protected] [email protected] Abstract Recently,
International Journal of Advance Research in Computer Science and Management Studies
Volume 3, Issue 11, November 2015 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
Multimedia Databases. Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.
Multimedia Databases Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 14 Previous Lecture 13 Indexes for Multimedia Data 13.1
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding
R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants
R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions
KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS
ABSTRACT KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS In many real applications, RDF (Resource Description Framework) has been widely used as a W3C standard to describe data in the Semantic Web. In practice,
PartJoin: An Efficient Storage and Query Execution for Data Warehouses
PartJoin: An Efficient Storage and Query Execution for Data Warehouses Ladjel Bellatreche 1, Michel Schneider 2, Mukesh Mohania 3, and Bharat Bhargava 4 1 IMERIR, Perpignan, FRANCE [email protected] 2
FlexPref: A Framework for Extensible Preference Evaluation in Database Systems
1 FlexPref: A Framework for Extensible Preference Evaluation in Database Systems Justin J. Levandoski 1, Mohamed F. Mokbel 2, Mohamed E. Khalefa 3 Department of Computer Science and Engineering, University
Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1
Slide 29-1 Chapter 29 Overview of Data Warehousing and OLAP Chapter 29 Outline Purpose of Data Warehousing Introduction, Definitions, and Terminology Comparison with Traditional Databases Characteristics
Survey On: Nearest Neighbour Search With Keywords In Spatial Databases
Survey On: Nearest Neighbour Search With Keywords In Spatial Databases SayaliBorse 1, Prof. P. M. Chawan 2, Prof. VishwanathChikaraddi 3, Prof. Manish Jansari 4 P.G. Student, Dept. of Computer Engineering&
Data Structures for Moving Objects
Data Structures for Moving Objects Pankaj K. Agarwal Department of Computer Science Duke University Geometric Data Structures S: Set of geometric objects Points, segments, polygons Ask several queries
SQL Query Evaluation. Winter 2006-2007 Lecture 23
SQL Query Evaluation Winter 2006-2007 Lecture 23 SQL Query Processing Databases go through three steps: Parse SQL into an execution plan Optimize the execution plan Evaluate the optimized plan Execution
Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices
Proc. of Int. Conf. on Advances in Computer Science, AETACS Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Ms.Archana G.Narawade a, Mrs.Vaishali Kolhe b a PG student, D.Y.Patil
CUBE INDEXING IMPLEMENTATION USING INTEGRATION OF SIDERA AND BERKELEY DB
CUBE INDEXING IMPLEMENTATION USING INTEGRATION OF SIDERA AND BERKELEY DB Badal K. Kothari 1, Prof. Ashok R. Patel 2 1 Research Scholar, Mewar University, Chittorgadh, Rajasthan, India 2 Department of Computer
Efficient Data Access and Data Integration Using Information Objects Mica J. Block
Efficient Data Access and Data Integration Using Information Objects Mica J. Block Director, ACES Actuate Corporation [email protected] Agenda Information Objects Overview Best practices Modeling Security
The Classical Architecture. Storage 1 / 36
1 / 36 The Problem Application Data? Filesystem Logical Drive Physical Drive 2 / 36 Requirements There are different classes of requirements: Data Independence application is shielded from physical storage
Big Data Paradigms in Python
Big Data Paradigms in Python San Diego Data Science and R Users Group January 2014 Kevin Davenport! http://kldavenport.com [email protected] @KevinLDavenport Thank you to our sponsors: Setting up
Chapter 5: Overview of Query Processing
Chapter 5: Overview of Query Processing Query Processing Overview Query Optimization Distributed Query Processing Steps Acknowledgements: I am indebted to Arturas Mazeika for providing me his slides of
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
Decision Trees from large Databases: SLIQ
Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values
low-level storage structures e.g. partitions underpinning the warehouse logical table structures
DATA WAREHOUSE PHYSICAL DESIGN The physical design of a data warehouse specifies the: low-level storage structures e.g. partitions underpinning the warehouse logical table structures low-level structures
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Chapter 23, Part A
Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical
Analysis Services Step by Step
Microsoft' Microsoft SQL Server 2008 Analysis Services Step by Step Scott Cameron, Hitachi Consulting Table of Contents Acknowledgments Introduction xi xiii Part I Understanding Business Intelligence and
Graph Database Proof of Concept Report
Objectivity, Inc. Graph Database Proof of Concept Report Managing The Internet of Things Table of Contents Executive Summary 3 Background 3 Proof of Concept 4 Dataset 4 Process 4 Query Catalog 4 Environment
Clustering and scheduling maintenance tasks over time
Clustering and scheduling maintenance tasks over time Per Kreuger 2008-04-29 SICS Technical Report T2008:09 Abstract We report results on a maintenance scheduling problem. The problem consists of allocating
Fast Sequential Summation Algorithms Using Augmented Data Structures
Fast Sequential Summation Algorithms Using Augmented Data Structures Vadim Stadnik [email protected] Abstract This paper provides an introduction to the design of augmented data structures that offer
Operating Systems. Virtual Memory
Operating Systems Virtual Memory Virtual Memory Topics. Memory Hierarchy. Why Virtual Memory. Virtual Memory Issues. Virtual Memory Solutions. Locality of Reference. Virtual Memory with Segmentation. Page
Data warehousing with PostgreSQL
Data warehousing with PostgreSQL Gabriele Bartolini http://www.2ndquadrant.it/ European PostgreSQL Day 2009 6 November, ParisTech Telecom, Paris, France Audience
Chapter 3 Data Warehouse - technological growth
Chapter 3 Data Warehouse - technological growth Computing began with data storage in conventional file systems. In that era the data volume was too small and easy to be manageable. With the increasing
Cluster Description Formats, Problems and Algorithms
Cluster Description Formats, Problems and Algorithms Byron J. Gao Martin Ester School of Computing Science, Simon Fraser University, Canada, V5A 1S6 [email protected] [email protected] Abstract Clustering is
New Approach of Computing Data Cubes in Data Warehousing
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 14 (2014), pp. 1411-1417 International Research Publications House http://www. irphouse.com New Approach of
Cloud Service Placement via Subgraph Matching
Cloud Service Placement via Subgraph Matching Bo Zong, Ramya Raghavendra *, Mudhakar Srivatsa *, Xifeng Yan, Ambuj K. Singh, Kang-Won Lee * Dept. of Computer Science, University of California at Santa
Invited Applications Paper
Invited Applications Paper - - Thore Graepel Joaquin Quiñonero Candela Thomas Borchert Ralf Herbrich Microsoft Research Ltd., 7 J J Thomson Avenue, Cambridge CB3 0FB, UK [email protected] [email protected]
EFFICIENT EXTERNAL SORTING ON FLASH MEMORY EMBEDDED DEVICES
ABSTRACT EFFICIENT EXTERNAL SORTING ON FLASH MEMORY EMBEDDED DEVICES Tyler Cossentine and Ramon Lawrence Department of Computer Science, University of British Columbia Okanagan Kelowna, BC, Canada [email protected]
Vector storage and access; algorithms in GIS. This is lecture 6
Vector storage and access; algorithms in GIS This is lecture 6 Vector data storage and access Vectors are built from points, line and areas. (x,y) Surface: (x,y,z) Vector data access Access to vector
Database Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
Physical DB design and tuning: outline
Physical DB design and tuning: outline Designing the Physical Database Schema Tables, indexes, logical schema Database Tuning Index Tuning Query Tuning Transaction Tuning Logical Schema Tuning DBMS Tuning
Best Practices for DB2 on z/os Performance
Best Practices for DB2 on z/os Performance A Guideline to Achieving Best Performance with DB2 Susan Lawson and Dan Luksetich www.db2expert.com and BMC Software September 2008 www.bmc.com Contacting BMC
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
On Saying Enough Already! in MapReduce
On Saying Enough Already! in MapReduce Christos Doulkeridis and Kjetil Nørvåg Department of Computer and Information Science (IDI) Norwegian University of Science and Technology (NTNU) Trondheim, Norway
ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001
ICOM 6005 Database Management Systems Design Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001 Readings Read Chapter 1 of text book ICOM 6005 Dr. Manuel
UVA. Failure and Recovery. Failure and inconsistency. - transaction failures - system failures - media failures. Principle of recovery
Failure and Recovery Failure and inconsistency - transaction failures - system failures - media failures Principle of recovery - redundancy - DB can be protected by ensuring that its correct state can
Procedia Computer Science 00 (2012) 1 21. Trieu Minh Nhut Le, Jinli Cao, and Zhen He. [email protected], [email protected], [email protected].
Procedia Computer Science 00 (2012) 1 21 Procedia Computer Science Top-k best probability queries and semantics ranking properties on probabilistic databases Trieu Minh Nhut Le, Jinli Cao, and Zhen He
Clustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
5.4 Closest Pair of Points
5.4 Closest Pair of Points Closest Pair of Points Closest pair. Given n points in the plane, find a pair with smallest Euclidean distance between them. Fundamental geometric primitive. Graphics, computer
Fast Contextual Preference Scoring of Database Tuples
Fast Contextual Preference Scoring of Database Tuples Kostas Stefanidis Department of Computer Science, University of Ioannina, Greece Joint work with Evaggelia Pitoura http://dmod.cs.uoi.gr 2 Motivation
Learning in Abstract Memory Schemes for Dynamic Optimization
Fourth International Conference on Natural Computation Learning in Abstract Memory Schemes for Dynamic Optimization Hendrik Richter HTWK Leipzig, Fachbereich Elektrotechnik und Informationstechnik, Institut
Graphical Representation of Multivariate Data
Graphical Representation of Multivariate Data One difficulty with multivariate data is their visualization, in particular when p > 3. At the very least, we can construct pairwise scatter plots of variables.
RESEARCH ON THE FRAMEWORK OF SPATIO-TEMPORAL DATA WAREHOUSE
RESEARCH ON THE FRAMEWORK OF SPATIO-TEMPORAL DATA WAREHOUSE WANG Jizhou, LI Chengming Institute of GIS, Chinese Academy of Surveying and Mapping No.16, Road Beitaiping, District Haidian, Beijing, P.R.China,
Base Table. (a) Conventional Relational Representation Basic DataIndex Representation. Basic DataIndex on RetFlag
Indexing and Compression in Data Warehouses Kiran B. Goyal Computer Science Dept. Indian Institute of Technology, Bombay [email protected] Anindya Datta College of Computing. Georgia Institute of
Common Patterns and Pitfalls for Implementing Algorithms in Spark. Hossein Falaki @mhfalaki [email protected]
Common Patterns and Pitfalls for Implementing Algorithms in Spark Hossein Falaki @mhfalaki [email protected] Challenges of numerical computation over big data When applying any algorithm to big data
Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services
Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services Length: Delivery Method: 3 Days Instructor-led (classroom) About this Course Elements of this syllabus are subject
Efficient and Scalable Method for Processing Top-k Spatial Boolean Queries
Efficient and Scalable Method for Processing Top-k Spatial Boolean Queries Ariel Cary 1, Ouri Wolfson 2, and Naphtali Rishe 1 1 School of Computing and Information Sciences, Florida International University,
Unsupervised Data Mining (Clustering)
Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in
Partitioning under the hood in MySQL 5.5
Partitioning under the hood in MySQL 5.5 Mattias Jonsson, Partitioning developer Mikael Ronström, Partitioning author Who are we? Mikael is a founder of the technology behind NDB
Bounded Treewidth in Knowledge Representation and Reasoning 1
Bounded Treewidth in Knowledge Representation and Reasoning 1 Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien Luminy, October 2010 1 Joint work with G.
GameTime: A Toolkit for Timing Analysis of Software
GameTime: A Toolkit for Timing Analysis of Software Sanjit A. Seshia and Jonathan Kotker EECS Department, UC Berkeley {sseshia,jamhoot}@eecs.berkeley.edu Abstract. Timing analysis is a key step in the
The enhancement of the operating speed of the algorithm of adaptive compression of binary bitmap images
The enhancement of the operating speed of the algorithm of adaptive compression of binary bitmap images Borusyak A.V. Research Institute of Applied Mathematics and Cybernetics Lobachevsky Nizhni Novgorod
Arrangements And Duality
Arrangements And Duality 3.1 Introduction 3 Point configurations are tbe most basic structure we study in computational geometry. But what about configurations of more complicated shapes? For example,
Big Fast Data Hadoop acceleration with Flash. June 2013
Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional
Comparison of different image compression formats. ECE 533 Project Report Paula Aguilera
Comparison of different image compression formats ECE 533 Project Report Paula Aguilera Introduction: Images are very important documents nowadays; to work with them in some applications they need to be
A Note on Maximum Independent Sets in Rectangle Intersection Graphs
A Note on Maximum Independent Sets in Rectangle Intersection Graphs Timothy M. Chan School of Computer Science University of Waterloo Waterloo, Ontario N2L 3G1, Canada [email protected] September 12,
A Multi-layered Domain-specific Language for Stencil Computations
A Multi-layered Domain-specific Language for Stencil Computations Christian Schmitt, Frank Hannig, Jürgen Teich Hardware/Software Co-Design, University of Erlangen-Nuremberg Workshop ExaStencils 2014,
In-Memory BigData. Summer 2012, Technology Overview
In-Memory BigData Summer 2012, Technology Overview Company Vision In-Memory Data Processing Leader: > 5 years in production > 100s of customers > Starts every 10 secs worldwide > Over 10,000,000 starts
Indexing and Processing Big Data
Indexing and Processing Big Data Patrick Valduriez INRIA, Montpellier Why Big Data Today? Overwhelming amounts of data Generated by all kinds of devices, networks and programs E.g. sensors, mobile devices,
Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.
Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet
Indexing Spatio-Temporal archive As a Preprocessing Alsuccession
The VLDB Journal manuscript No. (will be inserted by the editor) Indexing Spatio-temporal Archives Marios Hadjieleftheriou 1, George Kollios 2, Vassilis J. Tsotras 1, Dimitrios Gunopulos 1 1 Computer Science
Concept of Cache in web proxies
Concept of Cache in web proxies Chan Kit Wai and Somasundaram Meiyappan 1. Introduction Caching is an effective performance enhancing technique that has been used in computer systems for decades. However,
Image Compression through DCT and Huffman Coding Technique
International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul
Building Views and Charts in Requests Introduction to Answers views and charts Creating and editing charts Performing common view tasks
Oracle Business Intelligence Enterprise Edition (OBIEE) Training: Working with Oracle Business Intelligence Answers Introduction to Oracle BI Answers Working with requests in Oracle BI Answers Using advanced
Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.
Phases of database design Application requirements Conceptual design Database Management Systems Conceptual schema Logical design ER or UML Physical Design Relational tables Logical schema Physical design
Enterprise Applications
Enterprise Applications Chi Ho Yue Sorav Bansal Shivnath Babu Amin Firoozshahian EE392C Emerging Applications Study Spring 2003 Functionality Online Transaction Processing (OLTP) Users/apps interacting
Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
