Data Integration and Exchange. L. Libkin 1 Data Integration and Exchange
|
|
|
- Rosanna Banks
- 9 years ago
- Views:
Transcription
1 Data Integration and Exchange L. Libkin 1 Data Integration and Exchange
2 Traditional approach to databases A single large repository of data. Database administrator in charge of access to data. Users interact with the database through application programs. Programmers write those (embedded SQL, other ways of combining general purpose programming languages and DBMSs) Queries dominate; updates less common. DMBS takes care of lots of things for you such as query processing and optimisation concurrency control enforcing database integrity L. Libkin 2 Data Integration and Exchange
3 Traditional approach to databases cont d This model works very within a single organisation that either does not interact much with the outside world, or the interaction is heavily controlled by the DB administrators What do we expect from such a system? 1. Data is relatively clean; little incompleteness 2. Data is consistent (enforced by the DMBS) 3. Data is there (resides on the disk) 4. Well-defined semantics of query answering (if you ask a query, you know what you want to get) 5. Access to data is controlled L. Libkin 3 Data Integration and Exchange
4 The world is changing The traditional model still dominates, but the world is changing. Many huge repositories are publicly available In fact many are well-organised databases, e.g., imdb.com, the CIA World Factbook, many genome databases, the DBLP server of CS publications, etc etc etc) Many queries cannot be answered using a single source. Often data from various sources needs to be combined, e.g. company mergers restructuring databases within a single organisation combining data from several private and public sources L. Libkin 4 Data Integration and Exchange
5 Course info No text. Because there is no text at this time... Slides will be posted on the course webpage: Tutorials by Lenzerini and Kolaitis (see links on the webpage) 3 assignments final exam Office hours: by appointment (usually works better for UG4) L. Libkin 5 Data Integration and Exchange
6 Why do you need this course Databases are everywhere these days (> $ /year business whatever that means today) Every enterprise has a database; they merge, combine data hence data integration In addition, a lot of data is available on the web, but often one needs many sources to answer a query Hence (almost) everyone needs to integrate data Huge investment from leading companies, IBM, Oracle, Microsoft Very ad hoc solutions; but finally we understand what the real problems in data integration are, and have some solutions (but not all!) L. Libkin 6 Data Integration and Exchange
7 Background Requirement: Database Systems (3rd year) or fluency in relational databases: relational model relational algebra/calculus SQL An understanding of the basic mathematical tools that serve as the foundation of computer science: basic set theory, graph theory, theory of computation, first-order logic. L. Libkin 7 Data Integration and Exchange
8 Outline of the course Introduction to the problems of data integration and exchange. Key new components: incomplete information query rewriting certain answers Data integration scenarios: global-as-view, local-as-view, combined virtual vs materialized How to distinguish easy queries from hard queries? Query answering in data integration scenarios: view-based rewritings L. Libkin 8 Data Integration and Exchange
9 Outline of the course cont d Incomplete information in databases theory, tables, complexity practice (the ugly reality SQL) Open and closed worlds Data exchange: settings, source-to-target constraints, solutions Data exchange query answering: conjunctive (select-project-join) queries full relational algebra queries closed vs open worlds L. Libkin 9 Data Integration and Exchange
10 Outline of the course cont d Data exchange: XML data tree patterns consistency problems query answering Schema management: composition, other operations, schema evolution Inconsistent databases, repairs, query answering If time permits: ranking queries L. Libkin 10 Data Integration and Exchange
11 Query answering from multiple sources Data resides in several different databases They may have different structures, different access policies etc Our view of the world may be very different from the view of the databases we need to use. Only portions of the data from some database could be available. That is, the sources do not conform to the schema of the database into which the data will be loaded. L. Libkin 11 Data Integration and Exchange
12 What industry offers now: ETL tools ETL stands for Extract Transform Load Extract data from multiple sources Transform it so it is compatible with the schema Load it into a database Many self-built tools in the 80s and the 90s; through acquisition fewer products exist now The big players IBM, Microsoft, Oracle all have their ETL products; Microsoft and Oracle offer them with their database products. A few independent vendors, e.g. Informatica PowerCenter. Several open source products exist, e.g. Clover ETL. L. Libkin 12 Data Integration and Exchange
13 ETL tools Focus: Data profiling Data cleaning Simple transformations Bulk loading Latency requirements What they don t do yet: nontrivial transformations query answering But techniques now exist for interesting data integration and for query answering and we shall learn them. They soon will be reflected in products (IBM and Microsoft are particularly active in this area) L. Libkin 13 Data Integration and Exchange
14 Data profiling/cleaning Data profiling: gives the user a view of data: Samples over large tables statistics (how many different values etc) Graphical tools for exploring the database Cleaning: Same properties may have different names e.g. Last Name, L Name, LastName Same data may have different representations e.g. (0131) vs , George Str. vs George Street Some data may be just wrong L. Libkin 14 Data Integration and Exchange
15 Data transformation Most transformation rules tend to be simple: Copy attribute LName to Last Name Set age to be current year DOB Heavy emphasis on industry specific formats For example, Informatica B2B Data Exchange product offers versions for Healthcare and Financial services as well as specialised tools for formats including: MS Word, Excel, PDF, UN/EDIFACT (Data Interchange For Administration, Commerce, and Transport), RosettaNet for B2B, and many specialised healthcare and financial form. These are format/industry specific and have little to do with the general tasks of data integration. L. Libkin 15 Data Integration and Exchange
16 Data integration, scenario 1 DB1 DB2 DB3... DBn GLOBAL SCHEMA QUERY: Q? L. Libkin 16 Data Integration and Exchange
17 Data integration DB1 DB2 DB3... DBn Q 1 Q 2 Q 3 Q n V 1 A B C D A E B C F A C V V V n GLOBAL SCHEMA QUERY: Q? L. Libkin 17 Data Integration and Exchange
18 Data integration DB1 DB2 DB3... DBn Q 1 Q 2 Q 3 Q n V 1 A B C D A E B C F A C V V V n GLOBAL SCHEMA QUERY: Q? Answer to Q is obtained by querying the views V 1,..., V n L. Libkin 18 Data Integration and Exchange
19 Data integration, query answering We have our view of the world (the Global Schema) We can access (parts of) databases DB 1,...,DB n to get relevant data. It comes in the form of views, V 1,...,V n Our query against the global schema must be reformulated as a query against the views V 1,...,V n The approach is completely virtual: we never create a database the conforms to the global schema. L. Libkin 19 Data Integration and Exchange
20 Data integration, query answering, a toy example List courses taught by permanent teaching staff during Winter 2007 We have two databases: D 1 (name, age, salary) of permanent staff D 2 (teacher, course, semester, enrollment) of courses D 1 only publishes the value of the name attribute D 2 does not reveal enrollments The views: V 1 = π name (D 1 ) V 2 = π teacher,course,semester (D 2 ) Next step: establish correspondence between attributes name of V 1 and teacher of V 2 L. Libkin 20 Data Integration and Exchange
21 Data integration, query answering, a toy example cont d To answer query, we need to import the following data: V 1 W 2 = σ semester= Winter 2007 (V 2) Answering query: {course name,sem V 1 (name) W 2 (name,course,sem)} Or, in relational algebra π course (V 1 name=teacher W 2 ) L. Libkin 21 Data Integration and Exchange
22 Toy example, lessons learned We don t have access to all the data Some human intervention is essential (someone needs to tell us that teacher and name refer to the same entity) We don t run a query against a single database. Instead, we run queries against different databases based on restrictions they impose get results to use them locally run another query against those results L. Libkin 22 Data Integration and Exchange
23 Toy example, things getting more complicated Find informatics permanent staff who taught during the Winter 2007 semester, and their phone numbers We have additional personnel databases: an informatics database D 3 (employee, phone,office), and a university-wide database D 4 (employee, school, phone) for simplicity, assume all this information is public Now we have a choice: use D 3 to get information about phones use D 4 to get information about phones use both D 3 and D 4 to get information about phones L. Libkin 23 Data Integration and Exchange
24 Toy example cont d First, we need some human involvement to see that employee, name, and teacher refer to the same category of objects If one uses D 3, then the query is {name, phone sem,office V 1 (name) W 2 (name,course,sem) D 3 (name, phone,office)} If one uses D 4, then the query is {name, phone sem, school V 1 (name) W 2 (name,course,sem) D 4 (name, school,phone)} But what if one uses both D 3 and D 4? L. Libkin 24 Data Integration and Exchange
25 Toy example cont d We could insist on the phone number being: in either D 3 or D 4 in both D 3 and D 4, but not necessarily the same in both D 3 and D 4, and the same in both databases One can write queries for all the cases, but which one should we use? New lessons: databases that are being integrated are often inconsistent query answering is by no means unique there could be several ways to answer a query different possibilities for answering queries are a result of inconsistencies and incomplete information L. Libkin 25 Data Integration and Exchange
26 Toy example cont d Suppose phone numbers in D 3 and D 4 are different. What is a sensible query answer then? A common approach is to use certain answers these are guaranteed to be true. Another question: what if there is no record at all for the phone number in D 3 and D 4? Then we have an instance of incomplete information. L. Libkin 26 Data Integration and Exchange
27 A different scenario So far we looked at virtual integration: no database of the global schema was created. Sometimes we need such a database to be created, for example, if many queries are expected to be asked against it. In general, this is a common problem with data integration: materialize vs federate. Materialize = create a new database based on integrating data from different sources. Federate = the virtual approach: obtain data from various sources and use them to answer queries. L. Libkin 27 Data Integration and Exchange
28 Virtual vs Materialization A common situation for the materialization approach: merger of different organizations. A common situation for the federated approach: we don t have full access to the data, and the data changes often. L. Libkin 28 Data Integration and Exchange
29 Common tasks in data integration How do we represent information? Global schema, attributes, constraints data formats of attributes reconciling data from different sources abbreviations, terminology, ontologies How do we deal with imperfect information? resolve overlaps handling missing data handling inconsistencies L. Libkin 29 Data Integration and Exchange
30 Common tasks in data integration cont d How do we answer queries? what information is available? Can we get the answer? if not, what is the semantics of query answering? Is query answering feasible? Is it possible to compute query answers at all? If now, how do we approximate? Materialize or federate? L. Libkin 30 Data Integration and Exchange
31 Common tasks in data integration cont d Do it from scratch or use commercial tools? many are available (just google for data integration ) but do we fully understand them? lots of them are very ad hoc, with poorly defined semantics this is why it is so important to understand what really happens in data integration L. Libkin 31 Data Integration and Exchange
32 Data Exchange SOURCE DATABASE Source Schema S Target Schema T L. Libkin 32 Data Integration and Exchange
33 Data Exchange SOURCE DATABASE TARGET DATABASE????? Source Schema S Target Schema T L. Libkin 33 Data Integration and Exchange
34 Data Exchange SOURCE DATABASE TARGET DATABASE????? Source Schema S Target Schema T Query over the target schema: Q How to answer Q so that the answer is consistent with the data in the source database? L. Libkin 34 Data Integration and Exchange
35 Data exchange vs Data integration Data exchange appears to be an easier problem: there is only one source database; and one has complete access to the source data. But there could be many different target instances. Problem: which one to use for query answering? L. Libkin 35 Data Integration and Exchange
36 When do we have the need for data exchange A typical scenario: Two organizations have their legacy databases, schemas cannot be changed. Data from one organization 1 needs to be transfered to data from organization 2. Queries need to be answered against the transferred data. L. Libkin 36 Data Integration and Exchange
37 Data exchange towards multiple instances A simple example: we want to create a target database with the schema Flight(city1,city2,aircraft,departure,arrival) Served(city,country,population,agency) We don t start from scratch: there is a source database containing relations Route(source,destination,,departure) BG(country,city) We want to transfer data from the source to the target. L. Libkin 37 Data Integration and Exchange
38 Data exchange relationships between the source and the target How to specify the relationship? ROUTE Source Dest Departure city1 city2 aircraft departure arrival FLIGHT BG Country City city country population agency SERVED Semantics??? For example, arrows from city is the meaning and or or? L. Libkin 38 Data Integration and Exchange
39 Data exchange relationships between the source and the target Formal specification: we have a relational calculus query over both the source and the target schema. The query is of a restricted form, and can be thought of as a sequence of rules: Flight(c1, c2,, dept, ) : Route(c1, c2, dept) Served(city, country,, ) : Route(city,, ), BG(city, country) Served(city, country,, ) : Route(, city, ), BG(city, country) L. Libkin 39 Data Integration and Exchange
40 Data exchange targets Target instances should satisfy the rules. What does it mean to satisfy a rule? Formally, if we take: Flight(c1, c2,, dept, ) : Route(c1, c2, dept) then it is satisfied by a source S and a target T if the constraint ( c 1, c 2, d (Route(c 1, c 2, d) a 1,a 2 Flight(c1, c 2, a 1,d,a 2 ) )) This constraint is a relational calculus query that evaluates to true or false L. Libkin 40 Data Integration and Exchange
41 Data exchange targets What happens if there no values for some attributes, e.g. aircraft, arrival? We put in null values or some real values. But then we may have multiple solutions! L. Libkin 41 Data Integration and Exchange
42 Data exchange targets Source Database: ROUTE: Source Destination Departure Edinburgh Amsterdam 0600 Edinburgh London 0615 Edinburgh Frankfurt 0700 BG: Country UK UK NL GER City London Edinburgh Amsterdam Frankfurt Look at the rule Flight(c1, c2,, dept, ) : Route(c1, c2, dept) The right hand side is satisfied by But what can we put in the target? Route(Edinburgh, Amsterdam, 0600) L. Libkin 42 Data Integration and Exchange
43 Data exchange targets Rule: Flight(c1, c2,, dept, ) : Route(c1, c2, dept) Satisfied by: Route(Edinburgh, Amsterdam, 0600) Possible targets: Flight(Edinburgh, Amsterdam, 1, 0600, 2 ) Flight(Edinburgh, Amsterdam, B737, 0600, ) Flight(Edinburgh, Amsterdam,, 0600, 0845) Flight(Edinburgh, Amsterdam, B737, 0600, 0845) They all satisfy the constraints! L. Libkin 43 Data Integration and Exchange
44 Data exchange queries Now consider two queries: Q 1 : Is there a flight from Edinburgh to Amsterdam that departs before 7am? Q 2 : Is there a flight from Edinburgh to Amsterdam that arrives before 9am? What is the difference? Q 1 can be answered with certainty: in every solution we have a tuple Flight(Edinburgh, Amsterdam,, 0600, ) Q 2 cannot be answered with certainty: in some solutions we don t have a tuple Flight(Edinburgh, Amsterdam, a, t 1, t 2 ) with t 2 earlier than 9am. Our goal is to find certain answers. L. Libkin 44 Data Integration and Exchange
45 Data exchange queries But computing certain answers requires checking seemingly an infinite number of databases! How else can we do it? Create a good target instance T good so that: for a query Q we can define a query Q r (its rewriting) that satisfies the property: certain answers to Q = Q r (T good ) Questions: can we always find such a T good and a rewriting algorithm Q Q r? and if not, what restrictions do we impose on data exchange settings and/or queries? L. Libkin 45 Data Integration and Exchange
46 Inconsistencies in databases If we integrate data, we shall always have inconsistencies: One database says that we have John Smith with salary 20K in office 100 another says that we have John Smith with salary 30K in office 100 and the database must satisfy a key constraint: the name field is a key. Hence if we put Name Office Salary John Smith K John Smith K in our database, we have inconsistent data. L. Libkin 46 Data Integration and Exchange
47 Inconsistencies in databases: query answering Q 1 : Does John Smith sit in office 100? Q 2 : Does John Smith make 20K? Difference: Q 1 can be answered with certainty; Q 2 cannot be. What does it mean to answer a query with certainty? If we repair a database so that it satisfies the constraints, the answer is true no matter how we repair repair it. L. Libkin 47 Data Integration and Exchange
48 Inconsistencies in databases: query answering In our example, two ways to repair: R 1 : Name Office Salary John Smith K R 2 : Name Office Salary John Smith K Q 1 is always true, Q 2 is not. But the number of repairs could be very large (exponential why?). Hence prohibitively expensive query answering algorithm. Question: when can query answering be made efficient? Perhaps it involves a rewriting of the original query. The key idea: query rewriting to obtain certain answers. L. Libkin 48 Data Integration and Exchange
49 Schema mappings Last subject we deal with in this course. Still the least understood, but extremely important. Schema evolution: schema changes over time. Question how to transfer data? Single step data exchange. But what if we go through many steps? How do we transfer data, how do we answer queries? L. Libkin 49 Data Integration and Exchange
50 Schema mappings Two data exchange scenarios: Schema1 Schema2 Constraints12 Schema2 Schema3 Constraints23 Suppose we know how to move data from Schema1 to Schema2, and then from Schema2 to Schema3? Can we describe this by a single set of schema constraints: Schema1 Schema3 Constraints13 This turns out to be a very nontrivial task, but it occurs very often in database schema management. And there are other operations inverse, for example: (Schema1 Schema2 Constraints12) (Schema2 Schema1 Constraints21) L. Libkin 50 Data Integration and Exchange
Data exchange. L. Libkin 1 Data Integration and Exchange
Data exchange Source schema, target schema; need to transfer data between them. A typical scenario: Two organizations have their legacy databases, schemas cannot be changed. Data from one organization
ETL Tools. L. Libkin 1 Data Integration and Exchange
ETL Tools ETL = Extract Transform Load Typically: data integration software for building data warehouse Pull large volumes of data from different sources, in different formats, restructure them and load
CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?
CS2Bh: Current Technologies Introduction to XML and Relational Databases Spring 2005 Introduction to Databases CS2 Spring 2005 (LN5) 1 Why databases? Why not use XML? What is missing from XML: Consistency
Data Integration. Maurizio Lenzerini. Universitá di Roma La Sapienza
Data Integration Maurizio Lenzerini Universitá di Roma La Sapienza DASI 06: Phd School on Data and Service Integration Bertinoro, December 11 15, 2006 M. Lenzerini Data Integration DASI 06 1 / 213 Structure
Relational Database Basics Review
Relational Database Basics Review IT 4153 Advanced Database J.G. Zheng Spring 2012 Overview Database approach Database system Relational model Database development 2 File Processing Approaches Based on
Query Processing in Data Integration Systems
Query Processing in Data Integration Systems Diego Calvanese Free University of Bozen-Bolzano BIT PhD Summer School Bressanone July 3 7, 2006 D. Calvanese Data Integration BIT PhD Summer School 1 / 152
Data Quality in Information Integration and Business Intelligence
Data Quality in Information Integration and Business Intelligence Leopoldo Bertossi Carleton University School of Computer Science Ottawa, Canada : Faculty Fellow of the IBM Center for Advanced Studies
1 File Processing Systems
COMP 378 Database Systems Notes for Chapter 1 of Database System Concepts Introduction A database management system (DBMS) is a collection of data and an integrated set of programs that access that data.
Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file?
Files What s it all about? Information being stored about anything important to the business/individual keeping the files. The simple concepts used in the operation of manual files are often a good guide
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
Database Systems. Lecture 1: Introduction
Database Systems Lecture 1: Introduction General Information Professor: Leonid Libkin Contact: [email protected] Lectures: Tuesday, 11:10am 1 pm, AT LT4 Website: http://homepages.inf.ed.ac.uk/libkin/teach/dbs09/index.html
Objectives of Lecture 1. Labs and TAs. Class and Office Hours. CMPUT 391: Introduction. Introduction
Database Management Systems Winter 2003 CMPUT 391: Introduction Dr. Osmar R. Zaïane Objectives of Lecture 1 Introduction Get a rough initial idea about the content of the course: Lectures Resources Activities
ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001
ICOM 6005 Database Management Systems Design Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001 Readings Read Chapter 1 of text book ICOM 6005 Dr. Manuel
A Tutorial on Data Integration
A Tutorial on Data Integration Maurizio Lenzerini Dipartimento di Informatica e Sistemistica Antonio Ruberti, Sapienza Università di Roma DEIS 10 - Data Exchange, Integration, and Streaming November 7-12,
COMP5138 Relational Database Management Systems. Databases are Everywhere!
COMP5138 Relational Database Management Systems Week 1: COMP 5138 Intro to Database Systems Professor Joseph Davis and Boon Ooi Databases are Everywhere! Database Application Examples: Banking: all transactions
Number of hours in the semester L Ex. Lab. Projects SEMESTER I 1. Economy 45 18 27. 2. Philosophy 18 18. 4. Mathematical Analysis 45 18 27 Exam
Year 1 Lp. Course name Number of hours in the semester L Ex. Lab. Projects SEMESTER I 1. Economy 45 18 7. Philosophy 18 18 3. Linear Algebra 45 18 7 Exam 4. Mathematical Analysis 45 18 7 Exam 5. Economical
Data Integration and ETL Process
Data Integration and ETL Process Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, second
Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya
Chapter 6 Basics of Data Integration Fundamentals of Business Analytics Learning Objectives and Learning Outcomes Learning Objectives 1. Concepts of data integration 2. Needs and advantages of using data
INTEGRATION OF XML DATA IN PEER-TO-PEER E-COMMERCE APPLICATIONS
INTEGRATION OF XML DATA IN PEER-TO-PEER E-COMMERCE APPLICATIONS Tadeusz Pankowski 1,2 1 Institute of Control and Information Engineering Poznan University of Technology Pl. M.S.-Curie 5, 60-965 Poznan
Web-Based Genomic Information Integration with Gene Ontology
Web-Based Genomic Information Integration with Gene Ontology Kai Xu 1 IMAGEN group, National ICT Australia, Sydney, Australia, [email protected] Abstract. Despite the dramatic growth of online genomic
ECS 165A: Introduction to Database Systems
ECS 165A: Introduction to Database Systems Todd J. Green based on material and slides by Michael Gertz and Bertram Ludäscher Winter 2011 Dept. of Computer Science UC Davis ECS-165A WQ 11 1 1. Introduction
Overview of Data Management
Overview of Data Management Grant Weddell Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Winter 2015 CS 348 (Intro to DB Mgmt) Overview of Data Management
Objectives of Lecture 1. Class and Office Hours. Labs and TAs. CMPUT 391: Introduction. Introduction
Database Management Systems Winter 2004 CMPUT 391: Introduction Dr. Osmar R. Zaïane Objectives of Lecture 1 Introduction Get a rough initial idea about the content of the course: Lectures Resources Activities
Databases. DSIC. Academic Year 2010-2011
Databases DSIC. Academic Year 2010-2011 1 Lecturer José Hernández-Orallo Office 236, 2nd floor DSIC. Email: [email protected] http://www.dsic.upv.es/~jorallo/docent/bda/bdaeng.html Attention hours On
Lecture 7: Concurrency control. Rasmus Pagh
Lecture 7: Concurrency control Rasmus Pagh 1 Today s lecture Concurrency control basics Conflicts and serializability Locking Isolation levels in SQL Optimistic concurrency control Transaction tuning Transaction
Course: CSC 222 Database Design and Management I (3 credits Compulsory)
Course: CSC 222 Database Design and Management I (3 credits Compulsory) Course Duration: Three hours per week for 15weeks with practical class (45 hours) As taught in 2010/2011 session Lecturer: Oladele,
Optimizing the Performance of the Oracle BI Applications using Oracle Datawarehousing Features and Oracle DAC 10.1.3.4.1
Optimizing the Performance of the Oracle BI Applications using Oracle Datawarehousing Features and Oracle DAC 10.1.3.4.1 Mark Rittman, Director, Rittman Mead Consulting for Collaborate 09, Florida, USA,
Scope of this Course. Database System Environment. CSC 440 Database Management Systems Section 1
CSC 440 Database Management Systems Section 1 Acknowledgment: Slides borrowed from Dr. Rada Chirkova. This presentation uses slides and lecture notes available from http://www-db.stanford.edu/~ullman/dscb.html#slides
Chapter 1: Introduction
Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db book.com for conditions on re use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases
Principles of Database. Management: Summary
Principles of Database Management: Summary Pieter-Jan Smets September 22, 2015 Contents 1 Fundamental Concepts 5 1.1 Applications of Database Technology.............................. 5 1.2 Definitions.............................................
SQL Query Evaluation. Winter 2006-2007 Lecture 23
SQL Query Evaluation Winter 2006-2007 Lecture 23 SQL Query Processing Databases go through three steps: Parse SQL into an execution plan Optimize the execution plan Evaluate the optimized plan Execution
Database Optimizing Services
Database Systems Journal vol. I, no. 2/2010 55 Database Optimizing Services Adrian GHENCEA 1, Immo GIEGER 2 1 University Titu Maiorescu Bucharest, Romania 2 Bodenstedt-Wilhelmschule Peine, Deutschland
The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1
The Relational Model Ramakrishnan&Gehrke, Chapter 3 CS4320 1 Why Study the Relational Model? Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. Legacy systems in older models
Schema Mappings and Data Exchange
Schema Mappings and Data Exchange Phokion G. Kolaitis University of California, Santa Cruz & IBM Research-Almaden EASSLC 2012 Southwest University August 2012 1 Logic and Databases Extensive interaction
Oracle BI Applications (BI Apps) is a prebuilt business intelligence solution.
1 2 Oracle BI Applications (BI Apps) is a prebuilt business intelligence solution. BI Apps supports Oracle sources, such as Oracle E-Business Suite Applications, Oracle's Siebel Applications, Oracle's
www.dotnetsparkles.wordpress.com
Database Design Considerations Designing a database requires an understanding of both the business functions you want to model and the database concepts and features used to represent those business functions.
Topics. Introduction to Database Management System. What Is a DBMS? DBMS Types
Introduction to Database Management System Linda Wu (CMPT 354 2004-2) Topics What is DBMS DBMS types Files system vs. DBMS Advantages of DBMS Data model Levels of abstraction Transaction management DBMS
Week 1 Part 1: An Introduction to Database Systems. Databases and DBMSs. Why Use a DBMS? Why Study Databases??
Week 1 Part 1: An Introduction to Database Systems Databases and DBMSs Data Models and Data Independence Concurrency Control and Database Transactions Structure of a DBMS DBMS Languages Databases and DBMSs
3. Relational Model and Relational Algebra
ECS-165A WQ 11 36 3. Relational Model and Relational Algebra Contents Fundamental Concepts of the Relational Model Integrity Constraints Translation ER schema Relational Database Schema Relational Algebra
CSE 132A. Database Systems Principles
CSE 132A Database Systems Principles Prof. Victor Vianu 1 Data Management An evolving, expanding field: Classical stand-alone databases (Oracle, DB2, SQL Server) Computer science is becoming data-centric:
Chapter 3: Data Mining Driven Learning Apprentice System for Medical Billing Compliance
Chapter 3: Data Mining Driven Learning Apprentice System for Medical Billing Compliance 3.1 Introduction This research has been conducted at back office of a medical billing company situated in a custom
XML Data Integration
XML Data Integration Lucja Kot Cornell University 11 November 2010 Lucja Kot (Cornell University) XML Data Integration 11 November 2010 1 / 42 Introduction Data Integration and Query Answering A data integration
Data Integration: A Theoretical Perspective
Data Integration: A Theoretical Perspective Maurizio Lenzerini Dipartimento di Informatica e Sistemistica Università di Roma La Sapienza Via Salaria 113, I 00198 Roma, Italy [email protected] ABSTRACT
Database System Concepts
s Design Chapter 1: Introduction Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth
Writing a Requirements Document For Multimedia and Software Projects
Writing a Requirements Document For Multimedia and Software Projects Rachel S. Smith, Senior Interface Designer, CSU Center for Distributed Learning Introduction This guide explains what a requirements
Business Benefits From Microsoft SQL Server Business Intelligence Solutions How Can Business Intelligence Help You? PTR Associates Limited
Business Benefits From Microsoft SQL Server Business Intelligence Solutions How Can Business Intelligence Help You? www.ptr.co.uk Business Benefits From Microsoft SQL Server Business Intelligence (September
Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations
1 Topics for this week: 1. Good Design 2. Functional Dependencies 3. Normalization Readings for this week: 1. E&N, Ch. 10.1-10.6; 12.2 2. Quickstart, Ch. 3 3. Complete the tutorial at http://sqlcourse2.com/
www.gr8ambitionz.com
Data Base Management Systems (DBMS) Study Material (Objective Type questions with Answers) Shared by Akhil Arora Powered by www. your A to Z competitive exam guide Database Objective type questions Q.1
Introduction. Introduction: Database management system. Introduction: DBS concepts & architecture. Introduction: DBS versus File system
Introduction: management system Introduction s vs. files Basic concepts Brief history of databases Architectures & languages System User / Programmer Application program Software to process queries Software
Graham Kemp (telephone 772 54 11, room 6475 EDIT) The examiner will visit the exam room at 15:00 and 17:00.
CHALMERS UNIVERSITY OF TECHNOLOGY Department of Computer Science and Engineering Examination in Databases, TDA357/DIT620 Tuesday 17 December 2013, 14:00-18:00 Examiner: Results: Exam review: Grades: Graham
Foundations of Information Management
Foundations of Information Management - WS 2009/10 Juniorprofessor Alexander Markowetz Bonn Aachen International Center for Information Technology (B-IT) Alexander Markowetz Born 1976 in Brussels, Belgium
Objectives. Distributed Databases and Client/Server Architecture. Distributed Database. Data Fragmentation
Objectives Distributed Databases and Client/Server Architecture IT354 @ Peter Lo 2005 1 Understand the advantages and disadvantages of distributed databases Know the design issues involved in distributed
EHR Standards Landscape
EHR Standards Landscape Dr Dipak Kalra Centre for Health Informatics and Multiprofessional Education (CHIME) University College London [email protected] A trans-national ehealth Infostructure Wellness
ISM 318: Database Systems. Objectives. Database. Dr. Hamid R. Nemati
ISM 318: Database Systems Dr. Hamid R. Nemati Department of Information Systems Operations Management Bryan School of Business Economics Objectives Underst the basics of data databases Underst characteristics
Information Management
Information Management Dr Marilyn Rose McGee-Lennon [email protected] What is Information Management about Aim: to understand the ways in which databases contribute to the management of large amounts
Introduction: Database management system
Introduction Databases vs. files Basic concepts Brief history of databases Architectures & languages Introduction: Database management system User / Programmer Database System Application program Software
David Dye. Extract, Transform, Load
David Dye Extract, Transform, Load Extract, Transform, Load Overview SQL Tools Load Considerations Introduction David Dye [email protected] HTTP://WWW.SQLSAFETY.COM Overview ETL Overview Extract Define
Temporal Database System
Temporal Database System Jaymin Patel MEng Individual Project 18 June 2003 Department of Computing, Imperial College, University of London Supervisor: Peter McBrien Second Marker: Ian Phillips Abstract
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
Journal of Information Technology Management SIGNS OF IT SOLUTIONS FAILURE: REASONS AND A PROPOSED SOLUTION ABSTRACT
Journal of Information Technology Management ISSN #1042-1319 A Publication of the Association of Management SIGNS OF IT SOLUTIONS FAILURE: REASONS AND A PROPOSED SOLUTION MAJED ABUSAFIYA NEW MEXICO TECH
What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World
COSC 304 Introduction to Systems Introduction Dr. Ramon Lawrence University of British Columbia Okanagan [email protected] What is a database? A database is a collection of logically related data for
Database Systems. Lecture Handout 1. Dr Paolo Guagliardo. University of Edinburgh. 21 September 2015
Database Systems Lecture Handout 1 Dr Paolo Guagliardo University of Edinburgh 21 September 2015 What is a database? A collection of data items, related to a specific enterprise, which is structured and
Introduction. Chapter 1. Introducing the Database. Data vs. Information
Chapter 1 Objectives: to learn The difference between data and information What a database is, the various types of databases, and why they are valuable assets for decision making The importance of database
The Relational Model. Why Study the Relational Model? Relational Database: Definitions. Chapter 3
The Relational Model Chapter 3 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Study the Relational Model? Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle, Sybase,
Integrating Pattern Mining in Relational Databases
Integrating Pattern Mining in Relational Databases Toon Calders, Bart Goethals, and Adriana Prado University of Antwerp, Belgium {toon.calders, bart.goethals, adriana.prado}@ua.ac.be Abstract. Almost a
The basic data mining algorithms introduced may be enhanced in a number of ways.
DATA MINING TECHNOLOGIES AND IMPLEMENTATIONS The basic data mining algorithms introduced may be enhanced in a number of ways. Data mining algorithms have traditionally assumed data is memory resident,
The ESB and Microsoft BI
Business Intelligence The ESB and Microsoft BI The role of the Enterprise Service Bus in Microsoft s BI Framework Gijsbert Gijs in t Veld CTO, BizTalk Server MVP [email protected] About motion10
Chapter 1: Introduction. Database Management System (DBMS) University Database Example
This image cannot currently be displayed. Chapter 1: Introduction Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Database Management System (DBMS) DBMS contains information
How To Program A Computer
Object Oriented Software Design Introduction Giuseppe Lipari http://retis.sssup.it/~lipari Scuola Superiore Sant Anna Pisa September 23, 2010 G. Lipari (Scuola Superiore Sant Anna) OOSD September 23, 2010
CS2Bh: Current Technologies. Introduction to XML and Relational Databases. The Relational Model. The relational model
CS2Bh: Current Technologies Introduction to XML and Relational Databases Spring 2005 The Relational Model CS2 Spring 2005 (LN6) 1 The relational model Proposed by Codd in 1970. It is the dominant data
Overview of Database Management Systems
Overview of Database Management Systems Goals: DBMS basic concepts Introduce underlying managerial issues Prepare for discussion of uses of DBMS, such as OLAP and database mining 1 Overview of Database
DATA MINING AND WAREHOUSING CONCEPTS
CHAPTER 1 DATA MINING AND WAREHOUSING CONCEPTS 1.1 INTRODUCTION The past couple of decades have seen a dramatic increase in the amount of information or data being stored in electronic format. This accumulation
MS 20467: Designing Business Intelligence Solutions with Microsoft SQL Server 2012
MS 20467: Designing Business Intelligence Solutions with Microsoft SQL Server 2012 Description: This five-day instructor-led course teaches students how to design and implement a BI infrastructure. The
Is ETL Becoming Obsolete?
Is ETL Becoming Obsolete? Why a Business-Rules-Driven E-LT Architecture is Better Sunopsis. All rights reserved. The information contained in this document does not constitute a contractual agreement with
The Relational Model. Why Study the Relational Model? Relational Database: Definitions
The Relational Model Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Why Study the Relational Model? Most widely used model. Vendors: IBM, Microsoft, Oracle, Sybase, etc. Legacy systems in
Reverse Engineering in Data Integration Software
Database Systems Journal vol. IV, no. 1/2013 11 Reverse Engineering in Data Integration Software Vlad DIACONITA The Bucharest Academy of Economic Studies [email protected] Integrated applications
II. PREVIOUS RELATED WORK
An extended rule framework for web forms: adding to metadata with custom rules to control appearance Atia M. Albhbah and Mick J. Ridley Abstract This paper proposes the use of rules that involve code to
Answers to Review Questions
Tutorial 2 The Database Design Life Cycle Reference: MONASH UNIVERSITY AUSTRALIA Faculty of Information Technology FIT1004 Database Rob, P. & Coronel, C. Database Systems: Design, Implementation & Management,
Database 2 Lecture I. Alessandro Artale
Free University of Bolzano Database 2. Lecture I, 2003/2004 A.Artale (1) Database 2 Lecture I Alessandro Artale Faculty of Computer Science Free University of Bolzano Room: 221 [email protected] http://www.inf.unibz.it/
Distribution transparency. Degree of transparency. Openness of distributed systems
Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science [email protected] Chapter 01: Version: August 27, 2012 1 / 28 Distributed System: Definition A distributed
Combining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery
Combining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery Dimitrios Kourtesis, Iraklis Paraskakis SEERC South East European Research Centre, Greece Research centre of the University
DATABASE MANAGEMENT SYSTEM
REVIEW ARTICLE DATABASE MANAGEMENT SYSTEM Sweta Singh Assistant Professor, Faculty of Management Studies, BHU, Varanasi, India E-mail: [email protected] ABSTRACT Today, more than at any previous
Database Management Systems. Chapter 1
Database Management Systems Chapter 1 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 2 What Is a Database/DBMS? A very large, integrated collection of data. Models real-world scenarios
There are five fields or columns, with names and types as shown above.
3 THE RELATIONAL MODEL Exercise 3.1 Define the following terms: relation schema, relational database schema, domain, attribute, attribute domain, relation instance, relation cardinality, andrelation degree.
Turning ClearPath MCP Data into Information with Business Information Server. White Paper
Turning ClearPath MCP Data into Information with Business Information Server White Paper 1 Many Unisys ClearPath MCP Series customers have Enterprise Database Server (DMSII) databases to support a variety
MDM and Data Warehousing Complement Each Other
Master Management MDM and Warehousing Complement Each Other Greater business value from both 2011 IBM Corporation Executive Summary Master Management (MDM) and Warehousing (DW) complement each other There
Overview of Database Management
Overview of Database Management M. Tamer Özsu David R. Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Fall 2012 CS 348 Overview of Database Management
The Relational Model. Why Study the Relational Model?
The Relational Model Chapter 3 Instructor: Vladimir Zadorozhny [email protected] Information Science Program School of Information Sciences, University of Pittsburgh 1 Why Study the Relational Model?
Introduction to Database Systems CS4320. Instructor: Christoph Koch [email protected] CS 4320 1
Introduction to Database Systems CS4320 Instructor: Christoph Koch [email protected] CS 4320 1 CS4320/1: Introduction to Database Systems Underlying theme: How do I build a data management system? CS4320
THE SEMANTIC WEB AND IT`S APPLICATIONS
15-16 September 2011, BULGARIA 1 Proceedings of the International Conference on Information Technologies (InfoTech-2011) 15-16 September 2011, Bulgaria THE SEMANTIC WEB AND IT`S APPLICATIONS Dimitar Vuldzhev
THE BCS PROFESSIONAL EXAMINATIONS Professional Graduate Diploma. Advanced Database Management Systems
THE BCS PROFESSIONAL EXAMINATIONS Professional Graduate Diploma April 2007 EXAMINERS' REPORT Advanced Database Management Systems General Comment The performance of this paper is similar to previous years.
myschool Web Based Education Management Solution
myschool Web Based Education Management Solution 4 Emad El-Deen Kamel St. Nasr City, Cairo Egypt Ph (202) 403-4407 myschool is a highly interactive web based School Management System that provides a complete
Using distributed database system and consolidation resources for Data server consolidation
Using distributed database system and consolidation resources for Data server consolidation Alexander Bogdanov 1 1 High-performance computing Institute and the integrated systems, e-mail: [email protected],
The Import & Export of Data from a Database
The Import & Export of Data from a Database Introduction The aim of these notes is to investigate a conceptually simple model for importing and exporting data into and out of an object-relational database,
