What Goes Around Comes Around

Size: px
Start display at page:

Download "What Goes Around Comes Around"

Transcription

1 What Goes Around Comes Around Michael Stonebraker Joseph M. Hellerstein Abstract This paper provides a summary of 35 years of data model proposals, grouped into 9 different eras. We discuss the proposals of each era, and show that there are only a few basic data modeling ideas, and most have been around a long time. Later proposals inevitably bear a strong resemblance to certain earlier proposals. Hence, it is a worthwhile exercise to study previous proposals. In addition, we present the lessons learned from the exploration of the proposals in each era. Most current researchers were not around for many of the previous eras, and have limited (if any) understanding of what was previously learned. There is an old adage that he who does not understand history is condemned to repeat it. By presenting ancient history, we hope to allow future researchers to avoid replaying history. Unfortunately, the main proposal in the current XML era bears a striking resemblance to the CODASYL proposal from the early 1970 s, which failed because of its complexity. Hence, the current era is replaying history, and what goes around comes around. Hopefully the next era will be smarter. I Introduction Data model proposals have been around since the late 1960 s, when the first author came on the scene. Proposals have continued with surprising regularity for the intervening 35 years. Moreover, many of the current day proposals have come from researchers too young to have learned from the discussion of earlier ones. Hence, the purpose of this paper is to summarize 35 years worth of progress and point out what should be learned from this lengthy exercise. We present data model proposals in nine historical epochs: Hierarchical (IMS): late 1960 s and 1970 s Network (CODASYL): 1970 s Relational: 1970 s and early 1980 s Entity-Relationship: 1970 s Extended Relational: 1980 s Semantic: late 1970 s and 1980 s Object-oriented: late 1980 s and early 1990 s Object-relational: late 1980 s and early 1990 s

2 What Goes Around Comes Around 3 Semi-structured (XML): late 1990 s to the present In each case, we discuss the data model and associated query language, using a neutral notation. Hence, we will spare the reader the idiosyncratic details of the various proposals. We will also attempt to use a uniform collection of terms, again in an attempt to limit the confusion that might otherwise occur. Throughout much of the paper, we will use the standard example of suppliers and parts, from [CODD70], which we write for now in relational form in Figure 1. Supplier (sno, sname, scity, sstate) Part (pno, pname, psize, pcolor) Supply (sno, pno, qty, price) A Relational Schema Figure 1 Here we have Supplier information, Part information and the Supply relationship to indicate the terms under which a supplier can supply a part. II IMS Era IMS was released around 1968, and initially had a hierarchical data model. It understood the notion of a record type, which is a collection of named fields with their associated data types. Each instance of a record type is forced to obey the data description indicated in the definition of the record type. Furthermore, some subset of the named fields must uniquely specify a record instance, i.e. they are required to be a key. Lastly, the record types must be arranged in a tree, such that each record type (other than the root) has a unique parent record type. An IMS data base is a collection of instances of record types, such that each instance, other than root instances, has a single parent of the correct record type. This requirement of tree-structured data presents a challenge for our sample data, because we are forced to structure it in one of the two ways indicated in Figure 2. These representations share two common undesirable properties: 1) Information is repeated. In the first schema, Part information is repeated for each Supplier who supplies the part. In the second schema, Supplier information is repeated for each part he supplies. Repeated information is undesirable, because it offers the possibility for inconsistent data. For example, a repeated data element could be changed in some, but not all, of the places it appears, leading to an inconsistent data base. 2) Existence depends on parents. In the first schema it is impossible for there to be a part that is not currently supplied by anybody. In the second schema, it is impossible to have a supplier which does not currently supply anything. There is no support for these corner cases in a strict hierarchy.

3 4 Chapter 1: Data Models and DBMS Architecture Supplier (sno, sname, scity, sstate) Part (pno, pname, psize, pcolor) Part (pno, pname, psize, pcolor, qty, price) Supplier (sno, sname, scity, sstate, qty, price) Two Hierarchical Organizations Figure 2 IMS chose a hierarchical data base because it facilitates a simple data manipulation language, DL/1. Every record in an IMS data base has a hierarchical sequence key (HSK). Basically, an HSK is derived by concatenating the keys of ancestor records, and then adding the key of the current record. HSK defines a natural order of all records in an IMS data base, basically depth-first, left-to-right. DL/1 intimately used HSK order for the semantics of commands. For example, the get next command returns the next record in HSK order. Another use of HSK order is the get next within parent command, which explores the subtree underneath a given record in HSK order. Using the first schema, one can find all the red parts supplied by Supplier 16 as: Get unique Supplier (sno = 16) Until failure do Get next within parent (color = red) Enddo The first command finds Supplier 16. Then we iterate through the subtree underneath this record in HSK order, looking for red parts. When the subtree is exhausted, an error is returned. Notice that DL/1 is a record-at-a-time language, whereby the programmer constructs an algorithm for solving his query, and then IMS executes this algorithm. Often there are multiple ways to solve a query. Here is another way to solve the above specification:

4 What Goes Around Comes Around 5 Until failure do Get next Part (color = red) Enddo Although one might think that the second solution is clearly inferior to the first one; in fact if there is only one supplier in the data base (number 16), the second solution will outperform the first. The DL/1 programmer must make such optimization tradeoffs. IMS supported four different storage formats for hierarchical data. Basically root records can either be: Stored sequentially Indexed in a B-tree using the key of the record Hashed using the key of the record Dependent records are found from the root using either Physical sequentially Various forms of pointers. Some of the storage organizations impose restrictions on DL/1 commands. For example the purely sequential organization will not support record inserts. Hence, it is appropriate only for batch processing environments in which a change list is sorted in HSK order and then a single pass of the data base is made, the changes inserted in the correct place, and a new data base written. This is usually referred to as old-master-new-master processing. In addition, the storage organization that hashes root records on a key cannot support get next, because it has no easy way to return hashed records in HSK order. These various quirks in IMS are designed to avoid operations that would have impossibly bad performance. However, this decision comes at a price: One cannot freely change IMS storage organizations to tune a data base application because there is no guarantee that the DL/1 programs will continue to run. The ability of a data base application to continue to run, regardless of what tuning is performed at the physical level will be called physical data independence. Physical data independence is important because a DBMS application is not typically written all at once. As new programs are added to an application, the tuning demands may change, and better DBMS performance could be achieved by changing the storage organization. IMS has chosen to limit the amount of physical data independence that is possible. In addition, the logical requirements of an application may change over time. New record types may be added, because of new business requirements or because of new government requirements. It may also be desirable to move certain data elements from one record type to another. IMS supports a certain level of logical data independence, because DL/1 is actually defined on a logical data base, not on the actual physical data base that is stored. Hence, a DL/1 program can be written initially by defining the logical

5 6 Chapter 1: Data Models and DBMS Architecture data base to be exactly same as the physical data base. Later, record types can be added to the physical data base, and the logical data base redefined to exclude them. Hence, an IMS data base can grow with new record types, and the initial DL/1 program will continue to operate correctly. In general, an IMS logical data base can be a subtree of a physical data base. It is an excellent idea to have the programmer interact with a logical abstraction of the data, because this allows the physical organization to change, without compromising the runability of DL/1 programs. Logical and physical data independence are important because DBMS application have a much longer lifetime (often a quarter century or more) than the data on which they operate. Data independence will allow the data to change without requiring costly program maintenance. One last point should be made about IMS. Clearly, our sample data is not amenable to a tree structured representation as noted earlier. Hence, there was quickly pressure on IMS to represent our sample data without the redundancy or dependencies mentioned above. IMS responded by extending the notion of logical data bases beyond what was just described. Supplier (sno, sname, scity, sstate) Part (pno, pname, psize, pcolor) Supply (pno, qty, price) Two IMS Physical Data Bases Figure 3 Suppose one constructs two physical data bases, one containing only Part information and the second containing Supplier and Supply information as shown in the diagram of Figure 3. Of course, DL/1 programs are defined on trees; hence they cannot be used directly on the structures of Figure 3. Instead, IMS allowed the definition of the logical data base shown in Figure 4. Here, the Supply and Part record types from two different data bases are fused (joined) on the common value of part number into the hierarchical structure shown.

6 What Goes Around Comes Around 7 Basically, the structure of Figure 3 is actually stored, and one can note that there is no redundancy and no bad existence dependencies in this structure. The programmer is presented with the hierarchical view shown in Figure 4, which supports standard DL/1 programs. Supplier (sno, sname, scity, sstate) Supply(pno, qty, price) Part (pno, pname, psize, pcolor) An IMS Logical Data Base Figure 4 Speaking generally, IMS allow two different tree-structured physical data bases to be grafted together into a logical data base. There are many restrictions (for example in the use of the delete command) and considerable complexity to this use of logical data bases, but it is a way to represent non-tree structured data in IMS. The complexity of these logical data bases will be presently seen to be pivotial in determining how IBM decided to support relational data bases a decade later. We will summarize the lessons learned so far, and then turn to the CODASYL proposal. Lesson 1: Physical and logical data independence are highly desirable Lesson 2: Tree structured data models are very restrictive Lesson 3: It is a challenge to provide sophisticated logical reorganizations of tree structured data Lesson 4: A record-at-a-time user interface forces the programmer to do manual query optimization, and this is often hard.

7 8 Chapter 1: Data Models and DBMS Architecture III CODASYL Era In 1969 the CODASYL (Committee on Data Systems Languages) committee released their first report [CODA69], and then followed in 1971 [CODA71] and 1973 [CODA73] with language specifications. CODASYL was an ad-hoc committee that championed a network data model along with a record-at-a-time data manipulation language. This model organized a collection of record types, each with keys, into a network, rather than a tree. Hence, a given record instance could have multiple parents, rather than a single one, as in IMS. As a result, our Supplier-Parts-Supply example could be represented by the CODASYL network of Figure 5. Supplier (sno, sname, scity, sstate) Part (pno, pname, psize, pcolor) Supplies Supplied_by Supply(qty, price) A CODASYL Network Figure 5 Here, we notice three record types arranged in a network, connected by two named arcs, called Supplies and Supplied_by. A named arc is called a set in CODASYL, though it is not technically a set at all. Rather it indicates that for each record instance of the owner record type (the tail of the arrow) there is a relationship with zero or more record instances of the child record type (the head of the arrow). As such, it is a 1-to-n relationship between owner record instances and child record instances. A CODASYL network is a collection of named record types and named set types that form a connected graph. Moreover, there must be at least one entry point (a record type that is not a child in any set). A CODASYL data base is a collection of record instances and set instances that obey this network description.

8 What Goes Around Comes Around 9 Notice that Figure 5 does not have the existence dependencies present in a hierarchical data model. For example, it is ok to have a part that is not supplied by anybody. This will merely be an empty instance of the Supplied_by set. Hence, the move to a network data model solves many of the restrictions of a hierarchy. However, there are still situations that are hard to model in CODASYL. Consider, for example, data about a marriage ceremony, which is a 3-way relationship between a bride, a groom, and a minister. Because CODASYL sets are only two-way relationships, one is forced into the data model indicated in Figure 6. Bride Groom Participates-1 Participates-2 Ceremony Participates-3 Minister A CODASYL Solution Figure 6 This solution requires three binary sets to express a three-way relationship, and is somewhat unnatural. Although much more flexible than IMS, the CODASYL data model still had limitations. The CODASYL data manipulation language is a record-at-a-time language whereby one enters the data base at an entry point and then navigates to desired data by following sets. To find the red parts supplied by Supplier 16 in CODASYL, one can use the following code:

9 10 Chapter 1: Data Models and DBMS Architecture Find Supplier (SNO = 16) Until no-more { Find next Supply record in Supplies Find owner Part record in Supplied_by Get current record -check for red } One enters the data base at supplier 16, and then iterates over the members of the Supplies set. This will yield a collection of Supply records. For each one, the owner in the Supplied_by set is identified, and a check for redness performed. The CODASYL proposal suggested that the records in each entry point be hashed on the key in the record. Several implementations of sets were proposed that entailed various combinations of pointers between the parent records and child records. The CODASYL proposal provided essentially no physical data independence. For example, the above program fails if the key (and hence the hash storage) of the Supplier record is changed from sno to something else. In addition, no logical data independence is provided, since the schema cannot change without affecting application programs. The move to a network model has the advantage that no kludges are required to implement graph-structured data, such as our example. However, the CODASYL model is considerably more complex than the IMS data model. In IMS a programmer navigates in a hierarchical space, while a CODASYL programmer navigates in a multi-dimensional hyperspace. In IMS the programmer must only worry about his current position in the data base, and the position of a single ancestor (if he is doing a get next within parent ). In contrast, a CODASYL programmer must keep track of the: The last record touched by the application The last record of each record type touched The last record of each set type touched The various CODASYL DML commands update these currency indicators. Hence, one can think of CODASYL programming as moving these currency indicators around a CODASYL data base until a record of interest is located. Then, it can be fetched. In addition, the CODASYL programmer can suppress currency movement if he desires. Hence, one way to think of a CODASYL programmer is that he should program looking at a wall map of the CODASYL network that is decorated with various colored pins indicating currency. In his 1973 Turing Award lecture, Charlie Bachmann called this navigating in hyperspace [BACH73].

10 What Goes Around Comes Around 11 Hence, the CODASYL proposal trades increased complexity for the possibility of easily representing non-hierarchical data. CODASYL offers poorer logical and physical data independence than IMS. There are also some more subtle issues with CODASYL. For example, in IMS each data base could be independently bulk-loaded from an external data source. However, in CODASYL, all the data was typically in one large network. This much larger object had to be bulk-loaded all at once, leading to very long load times. Also, if a CODASYL data base became corrupted, it was necessary to reload all of it from a dump. Hence, crash recovery tended to be more involved than if the data was divided into a collection of independent data bases. In addition, a CODASYL load program tended to be complex because large numbers of records had to be assembled into sets, and this usually entailed many disk seeks. As such, it was usually important to think carefully about the load algorithm to optimize performance. Hence, there was no general purpose CODASYL load utility, and each installation had to write its own. This complexity was much less important in IMS. Hence, the lessons learned in CODASYL were: Lesson 5: Networks are more flexible than hierarchies but more complex Lesson 6: Loading and recovering networks is more complex than hierarchies IV Relational Era Against this backdrop, Ted Codd proposed his relational model in 1970 [CODD70]. In a conversation with him years later, he indicated that the driver for his research was the fact that IMS programmers were spending large amounts of time doing maintenance on IMS applications, when logical or physical changes occurred. Hence, he was focused on providing better data independence. His proposal was threefold: Store the data in a simple data structure (tables) Access it through a high level set-at-a-time DML No need for a physical storage proposal With a simple data structure, one has a better change of providing logical data independence. With a high level language, one can provide a high degree of physical data independence. Hence, there is no need to specify a storage proposal, as was required in both IMS and CODASYL. Moreover, the relational model has the added advantage that it is flexible enough to represent almost anything. Hence, the existence dependencies that plagued IMS can be easily handled by the relational schema shown earlier in Figure 1. In addition, the three-

11 12 Chapter 1: Data Models and DBMS Architecture way marriage ceremony that was difficult in CODASYL is easily represented in the relational model as: Ceremony (bride-id, groom-id, minister-id, other-data) Codd made several (increasingly sophisticated) relational model proposals over the years [CODD79, CODDXX]. Moreover, his early DML proposals were the relational calculus (data language/alpha) [CODD71a] and the relational algebra [CODD72a]. Since Codd was originally a mathematician (and previously worked on cellular automata), his DML proposals were rigorous and formal, but not necessarily easy for mere mortals to understand. Codd s proposal immediately touched off the great debate, which lasted for a good part of the 1970 s. This debate raged at SIGMOD conferences (and it predecessor SIGFIDET). On the one side, there was Ted Codd and his followers (mostly researchers and academics) who argued the following points: a) Nothing as complex as CODASYL can possibly be a good idea b) CODASYL does not provide acceptable data independence c) Record-at-a-time programming is too hard to optimize d) CODASYL and IMS are not flexible enough to easily represent common situations (such as marriage ceremonies) On the other side, there was Charlie Bachman and his followers (mostly DBMS practitioners) who argued the following: a) COBOL programmers cannot possibly understand the new-fangled relational languages b) It is impossible to implement the relational model efficiently c) CODASYL can represent tables, so what s the big deal? The highlight (or lowlight) of this discussion was an actual debate at SIGMOD 74 between Codd and Bachman and their respective seconds [RUST74]. One of us was in the audience, and it was obvious that neither side articulated their position clearly. As a result, neither side was able to hear what the other side had to say. In the next couple of years, the two camps modified their positions (more or less) as follows: Relational advocates a) Codd is a mathematician, and his languages are not the right ones. SQL [CHAM74] and QUEL [STON76] are much more user friendly.

12 What Goes Around Comes Around 13 b) System R [ASTR76] and INGRES [STON76] prove that efficient implementations of Codd s ideas are possible. Moreover, query optimizers can be built that are competitive with all but the best programmers at constructing query plans. c) These systems prove that physical data independence is achievable. Moreover, relational views [STON75] offer vastly enhanced logical data independence, relative to CODASYL. d) Set-at-a-time languages offer substantial programmer productivity improvements, relative to record-at-a-time languages. CODASYL advocates a) It is possible to specify set-at-a-time network languages, such as LSL [TSIC76], that provide complete physical data independence and the possibility of better logical data independence. b) It is possible to clean up the network model [CODA78], so it is not so arcane. Hence, both camps responded to the criticisms of the other camp. The debate then died down, and attention focused on the commercial marketplace to see what would happen. Fortuitously for the relational camp, the minicomputer revolution was occurring, and VAXes were proliferating. They were an obvious target for the early commercial relational systems, such as Oracle and INGRES. Happily for the relational camp, the major CODASYL systems, such as IDMS from Culinaine Corp. were written in IBM assembler, and were not portable. Hence, the early relational systems had the VAX market to themselves. This gave them time to improve the performance of their products, and the success of the VAX market went hand-in-hand with the success of relational systems. On mainframes, a very different story was unfolding. IBM sold a derivative of System R on VM/370 and a second derivative on VSE, their low end operating system. However, neither platform was used by serious business data processing users. All the action was on MVS, the high-end operating system. Here, IBM continued to sell IMS, Cullinaine successfully sold IDMS, and relational systems were nowhere to be seen. Hence, VAXes were a relational market and mainframes were a non-relational market. At the time all serious data management was done on mainframes. This state of affairs changed abruptly in 1984, when IBM announced the upcoming release of DB/2 on MVS. In effect, IBM moved from saying that IMS was their serious DBMS to a dual data base strategy, in which both IMS and DB/2 were declared strategic. Since DB/2 was the new technology and was much easier to use, it was crystal clear to everybody who the long-term winner was going to be.

13 14 Chapter 1: Data Models and DBMS Architecture IBM s signal that it was deadly serious about relational systems was a watershed moment. First, it ended once-and-for-all the great debate. Since IBM held vast marketplace power at the time, they effectively announced that relational systems had won and CODASYL and hierarchical systems had lost. Soon after, Cullinaine and IDMS went into a marketplace swoon. Second, they effectively declared that SQL was the de facto standard relational language. Other (substantially better) query languages, such as QUEL, were immediately dead. For a scathing critique of the semantics of SQL, consult [DATE84]. A little known fact must be discussed at this point. It would have been natural for IBM to put a relational front end on top of IMS, as shown in Figure 7. This architecture would have allowed IMS customers to continue to run IMS. New application could be written to the relational interface, providing an elegant migration path to the new technology. Hence, over time a gradual shift from DL/1 to SQL would have occurred, all the while preserving the high-performance IMS underpinnings In fact, IBM attempted to execute exactly this strategy, with a project code-named Eagle. Unfortunately, it proved too hard to implement SQL on top of the IMS notion of logical data bases, because of semantic issues. Hence, the complexity of logical data bases in IMS came back to haunt IBM many years later. As a result, IBM was forced to move to the dual data base strategy, and to declare a winner of the great debate. Old programs new programs Relational interface IMS The Architecture of Project Eagle Figure 7 In summary, the CODASL versus relational argument was ultimately settled by three events:

14 What Goes Around Comes Around 15 a) the success of the VAX b) the non-portability of CODASYL engines c) the complexity of IMS logical data bases The lessons that were learned from this epoch are: Lesson 7: Set-a-time languages are good, regardless of the data model, since they offer much improved physical data independence. Lesson 8: Logical data independence is easier with a simple data model than with a complex one. Lesson 9: Technical debates are usually settled by the elephants of the marketplace, and often for reasons that have little to do with the technology. Lesson 10: Query optimizers can beat all but the best record-at-a-time DBMS application programmers. V The Entity-Relationship Era In the mid 1970 s Peter Chen proposed the entity-relationship (E-R) data model as an alternative to the relational, CODASYL and hierarchical data models [CHEN76]. Basically, he proposed that a data base be thought of a collection of instances of entities. Loosely speaking these are objects that have an existence, independent of any other entities in the data base. In our example, Supplier and Parts would be such entities. In addition, entities have attributes, which are the data elements that characterize the entity. In our example, the attributes of Part would be pno, pname, psize, and pcolor. One or more of these attributes would be designated to be unique, i.e. to be a key. Lastly, there could be relationships between entities. In our example, Supply is a relationship between the entities Part and Supplier. Relationships could be 1-to-1, 1-to-n, n-to-1 or m-to-n, depending on how the entities participate in the relationship. In our example, Suppliers can supply multiple parts, and parts can be supplied by multiple suppliers. Hence, the Supply relationship is m-to-n. Relationships can also have attributes that describe the relationship. In our example, qty and price are attributes of the relationship Supply. A popular representation for E-R models was a boxes and arrows notation as shown in Figure 8. The E-R model never gained acceptance as the underlying data model that is implemented by a DBMS. Perhaps the reason was that in the early days there was no query language proposed for it. Perhaps it was simply overwhelmed by the interest in the relational model in the 1970 s. Perhaps it looked too much like a cleaned up version of the CODASYL model. Whatever the reason, the E-R model languished in the 1970 s.

15 16 Chapter 1: Data Models and DBMS Architecture Part Pno, pname, psize, pcolor. Supply qty, price Supplier Sno, sname, scity, sstate An E-R Diagram Figure 8 There is one area where the E-R model has been wildly successful, namely in data base (schema) design. The standard wisdom from the relational advocates was to perform data base design by constructing an initial collection of tables. Then, one applied normalization theory to this initial design. Throughout the decade of the 1970 s there were a collection of normal forms proposed, including second normal form (2NF) [CODD71b], third normal form [CODD71b], Boyce-Codd normal form (BCNF) [CODD72b], fourth normal form (4NF) [FAGI77a], and project-join normal form [FAGI77b]. There were two problems with normalization theory when applied to real world data base design problems. First, real DBAs immediately asked How do I get an initial set of tables? Normalization theory had no answer to this important question. Second, and perhaps more serious, normalization theory was based on the concept of functional dependencies, and real world DBAs could not understand this construct. Hence, data base design using normalization was dead in the water. In contrast, the E-R model became very popular as a data base design tool. Chen s papers contained a methodology for constructing an initial E-R diagram. In addition, it was straightforward to convert an E-R diagram into a collection of tables in third normal form [WONG79]. Hence, a DBA tool could perform this conversion automatically. As such, a DBA could construct an E-R model of his data, typically using a boxes and arrows drawing tool, and then be assured that he would automatically get a good relational schema. Essentially all data base design tools, such as Silverrun from Magna Solutions, ERwin from Computer Associates, and ER/Studio from Embarcadero work in this fashion. Lesson 11: Functional dependencies are too difficult for mere mortals to understand. Another reason for KISS (Keep it simple stupid). VI R++ Era Beginning in the early 1980 s a (sizeable) collection of papers appeared which can be described by the following template:

16 What Goes Around Comes Around 17 Consider an application, call it X Try to implement X on a relational DBMS Show why the queries are difficult or why poor performance is observed Add a new feature to the relational model to correct the problem Many X s were investigated including mechanical CAD [KATZ86], VLSI CAD [BATO85], text management [STON83], time [SNOD85] and computer graphics [SPON84]. This collection of papers formed the R++ era, as they all proposed additions to the relational model. In our opinion, probably the best of the lot was Gem [ZANI83]. Zaniolo proposed adding the following constructs to the relational model, together with corresponding query language extensions: 1) set-valued attributes. In a Parts table, it is often the case that there is an attribute, such as available_colors, which can take on a set of values. It would be nice to add a data type to the relational model to deal with sets of values. 2) aggregation (tuple-reference as a data type). In the Supply relation noted above, there are two foreign keys, sno and pno, that effectively point to tuples in other tables. It is arguably cleaner to have the Supply table have the following structure: Supply (PT, SR, qty, price) Here the data type of PT is tuple in the Part table and the data type of SR is tuple in the Supplier table. Of course, the expected implementation of these data types is via some sort of pointer. With these constructs however, we can find the suppliers who supply red parts as: Select Supply.SR.sno From Supply Where Supply.PT.pcolor = red This cascaded dot notation allowed one to query the Supply table and then effectively reference tuples in other tables. This cascaded dot notation is similar to the path expressions seen in high level network languages such as LSL. It allowed one to traverse between tables without having to specify an explicit join. 3) generalization. Suppose there are two kinds of parts in our example, say electrical parts and plumbing parts. For electrical parts, we record the power consumption and the voltage. For plumbing parts we record the diameter and the material used to make the part. This is shown pictorially in Figure 9, where we see a root part with two specializations. Each specialization inherits all of the data attributes in its ancestors. Inheritance hierarchies were put in early programming languages such as Planner [HEWI69] and Conniver [MCDO73]. The same concept has been included in more recent programming languages, such as C++. Gem merely applied this well known concept to data bases.

17 18 Chapter 1: Data Models and DBMS Architecture Part (pno, pname, psize, pcolor Electrical (power, voltage) Plumbing (diameter, material) An Inheritance Hierarchy Figure 9 In Gem, one could reference an inheritance hierarchy in the query language. For example to find the names of Red electrical parts, one would use: Select E.pname From Electrical E Where E.pcolor = red In addition, Gem had a very elegant treatment of null values. The problem with extensions of this sort is that while they allowed easier query formulation than was available in the conventional relational model, they offered very little performance improvement. For example, primary-key-foreign-key relationships in the relational model easily simulate tuple as a data type. Moreover, since foreign keys are essentially logical pointers, the performance of this construct is similar to that available from some other kind of pointer scheme. Hence, an implementation of Gem would not be noticeably faster than an implementation of the relational model In the early 1980 s, the relational vendors were singularly focused on improving transaction performance and scalability of their systems, so that they could be used for large scale business data processing applications. This was a very big market that had major revenue potential. In contrast, R++ ideas would have minor impact. Hence, there was little technology transfer of R++ ideas into the commercial world, and this research focus had very little long-term impact.

18 What Goes Around Comes Around 19 Lesson 12: Unless there is a big performance or functionality advantage, new constructs will go nowhere. VII The Semantic Data Model Era At around the same time, there was another school of thought with similar ideas, but a different marketing strategy. They suggested that the relational data model is semantically impoverished, i.e. it is incapable of easily expressing a class of data of interest. Hence, there is a need for a post relational data model. Post relational data models were typically called semantic data models. Examples included the work by Smith and Smith [SMIT77] and Hammer and McLeod [HAMM81]. SDM from Hammer and McLeod is arguably the more elaborate semantic data model, and we focus on its concepts in this section. SDM focuses on the notion of classes, which are a collection of records obeying the same schema. Like Gem, SDM exploited the concepts of aggregation and generalization and included a notion of sets. Aggregation is supported by allowing classes to have attributes that are records in other classes. However, SDM generalizes the aggregation construct in Gem by allowing an attribute in one class to be a set of instances of records in some class. For example, there might be two classes, Ships and Countries. The Countries class could have an attribute called Ships_registered_here, having as its value a collection of ships. The inverse attribute, country_of_registration can also be defined in SDM. In addition, classes can generalize other classes. Unlike Gem, generalization is extended to be a graph rather than just a tree. For example, Figure 10 shows a generalization graph where American_oil_tankers inherits attributes from both Oil_tankers and American_ships. This construct is often called multiple inheritance. Classes can also be the union, intersection or difference between other classes. They can also be a subclass of another class, specified by a predicate to determine membership. For example, Heavy_ships might be a subclass of Ships with weight greater than 500 tons. Lastly, a class can also be a collection of records that are grouped together for some other reason. For example Atlantic_convoy might be a collection of ships that are sailing together across the Atlantic Ocean. Lastly, classes can have class variables, for example the Ships class can have a class variable which is the number of members of the class. Most semantic data models were very complex, and were generally paper proposals. Several years after SDM was defined, Univac explored an implementation of Hammer and McLeod s ideas. However, they quickly discovered that SQL was an intergalactic standard, and their incompatible system was not very successful in the marketplace.

19 20 Chapter 1: Data Models and DBMS Architecture Ships Oil_tankers American_ship American_Oil_tankers A Example of Multiple Inheritance Figure 10 In our opinion, SDMs had the same two problems that faced the R++ advocates. Like the R++ proposals, they were a lot of machinery that was easy to simulate on relational systems. Hence, there was very little leverage in the constructs being proposed. The SDM camp also faced the second issue of R++ proposals, namely that the established vendors were distracted with transaction processing performance. Hence, semantic data models had little long term influence. VIII OO Era Beginning in the mid 1980 s there was a tidal wave of interest in Object-oriented DBMSs (OODB). Basically, this community pointed to an impedance mismatch between relational data bases and languages like C++. In practice, relational data bases had their own naming systems, their own data type systems, and their own conventions for returning data as a result of a query. Whatever programming language was used alongside a relational data base also had its own version of all of these facilities. Hence, to bind an application to the data base required a conversion from programming language speak to data base speak and back. This was like gluing an apple onto a pancake, and was the reason for the so-called impedance mismatch.

20 What Goes Around Comes Around 21 For example, consider the following C++ snippet which defines a Part Structure and then allocates an Example_part. Struct Part { Int number; Char* name; Char* bigness; Char* color; } Example_part; All SQL run-time systems included mechanisms to load variables in the above Struct from values in the data base. For example to retrieve part 16 into the above Struct required the following stylized program: Define cursor P as Select * From Part Where pno = 16; Open P into Example_part Until no-more{ Fetch P (Example_part.number = pno, Example_name = pname Example_part.bigness = psize Example_part.color = pcolor) } First one defined a cursor to range over the answer to the SQL query. Then, one opened the cursor, and finally fetched a record from the cursor and bound it to programming language variables, which did not need to be the same name or type as the corresponding data base objects. If necessary, data type conversion was performed by the run-time interface. The programmer could now manipulate the Struct in the native programming language. When more than one record could result from the query, the programmer had to iterate the cursor as in the above example. It would seem to be much cleaner to integrate DBMS functionality more closely into a programming language. Specifically, one would like a persistent programming language, i.e. one where the variables in the language could represent disk-based data as well as main memory data and where data base search criteria were also language constructs. Several prototype persistent languages were developed in the late 1970 s, including Pascal-R [SCHM77], Rigel [ROWE79], and a language embedding for PL/1 [DATE76]. For example, Rigel allowed the above query to be expressed as:

What Goes Around Comes Around

What Goes Around Comes Around What Goes Around Comes Around By Michael Stonebraker Joey Hellerstein Abstract This paper provides a summary of 35 years of data model proposals, grouped into 9 different eras. We discuss the proposals

More information

Overview of Database Management

Overview of Database Management Overview of Database Management M. Tamer Özsu David R. Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Fall 2012 CS 348 Overview of Database Management

More information

Overview of Data Management

Overview of Data Management Overview of Data Management Grant Weddell Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Winter 2015 CS 348 (Intro to DB Mgmt) Overview of Data Management

More information

Chapter 2. Data Model. Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel

Chapter 2. Data Model. Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel Chapter 2 Data Model Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel 1 In this chapter, you will learn: Why data models are important About the basic data-modeling

More information

Fundamentals of Database Systems, 4 th Edition By Ramez Elmasri and Shamkant Navathe. Table of Contents. A. Short Table of Contents

Fundamentals of Database Systems, 4 th Edition By Ramez Elmasri and Shamkant Navathe. Table of Contents. A. Short Table of Contents Fundamentals of Database Systems, 4 th Edition By Ramez Elmasri and Shamkant Navathe Table of Contents A. Short Table of Contents (This Includes part and chapter titles only) PART 1: INTRODUCTION AND CONCEPTUAL

More information

Chapter 1: Introduction. Database Management System (DBMS) University Database Example

Chapter 1: Introduction. Database Management System (DBMS) University Database Example This image cannot currently be displayed. Chapter 1: Introduction Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Database Management System (DBMS) DBMS contains information

More information

CS352 Lecture - Object-Based Databases

CS352 Lecture - Object-Based Databases CS352 Lecture - Object-Based Databases Objectives: Last revised 10/7/08 1. To elucidate fundamental differences between OO and the relational model 2. To introduce the idea of adding persistence to an

More information

Introduction. Introduction: Database management system. Introduction: DBS concepts & architecture. Introduction: DBS versus File system

Introduction. Introduction: Database management system. Introduction: DBS concepts & architecture. Introduction: DBS versus File system Introduction: management system Introduction s vs. files Basic concepts Brief history of databases Architectures & languages System User / Programmer Application program Software to process queries Software

More information

DATABASE MANAGEMENT SYSTEM

DATABASE MANAGEMENT SYSTEM REVIEW ARTICLE DATABASE MANAGEMENT SYSTEM Sweta Singh Assistant Professor, Faculty of Management Studies, BHU, Varanasi, India E-mail: sweta.v.singh27@gmail.com ABSTRACT Today, more than at any previous

More information

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML? CS2Bh: Current Technologies Introduction to XML and Relational Databases Spring 2005 Introduction to Databases CS2 Spring 2005 (LN5) 1 Why databases? Why not use XML? What is missing from XML: Consistency

More information

Introduction: Database management system

Introduction: Database management system Introduction Databases vs. files Basic concepts Brief history of databases Architectures & languages Introduction: Database management system User / Programmer Database System Application program Software

More information

ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001

ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001 ICOM 6005 Database Management Systems Design Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001 Readings Read Chapter 1 of text book ICOM 6005 Dr. Manuel

More information

1 File Processing Systems

1 File Processing Systems COMP 378 Database Systems Notes for Chapter 1 of Database System Concepts Introduction A database management system (DBMS) is a collection of data and an integrated set of programs that access that data.

More information

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to: 14 Databases 14.1 Source: Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: Define a database and a database management system (DBMS)

More information

www.gr8ambitionz.com

www.gr8ambitionz.com Data Base Management Systems (DBMS) Study Material (Objective Type questions with Answers) Shared by Akhil Arora Powered by www. your A to Z competitive exam guide Database Objective type questions Q.1

More information

æ A collection of interrelated and persistent data èusually referred to as the database èdbèè.

æ A collection of interrelated and persistent data èusually referred to as the database èdbèè. CMPT-354-Han-95.3 Lecture Notes September 10, 1995 Chapter 1 Introduction 1.0 Database Management Systems 1. A database management system èdbmsè, or simply a database system èdbsè, consists of æ A collection

More information

Overview of Database Management Systems

Overview of Database Management Systems Overview of Database Management Systems Goals: DBMS basic concepts Introduce underlying managerial issues Prepare for discussion of uses of DBMS, such as OLAP and database mining 1 Overview of Database

More information

BCA. Database Management System

BCA. Database Management System BCA IV Sem Database Management System Multiple choice questions 1. A Database Management System (DBMS) is A. Collection of interrelated data B. Collection of programs to access data C. Collection of data

More information

6.830 Lecture 3 9.16.2015 PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization

6.830 Lecture 3 9.16.2015 PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization 6.830 Lecture 3 9.16.2015 PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization Animals(name,age,species,cageno,keptby,feedtime) Keeper(id,name)

More information

Foreign and Primary Keys in RDM Embedded SQL: Efficiently Implemented Using the Network Model

Foreign and Primary Keys in RDM Embedded SQL: Efficiently Implemented Using the Network Model Foreign and Primary Keys in RDM Embedded SQL: Efficiently Implemented Using the Network Model By Randy Merilatt, Chief Architect - January 2012 This article is relative to the following versions of RDM:

More information

Introduction to Object-Oriented and Object-Relational Database Systems

Introduction to Object-Oriented and Object-Relational Database Systems , Professor Uppsala DataBase Laboratory Dept. of Information Technology http://www.csd.uu.se/~udbl Extended ER schema Introduction to Object-Oriented and Object-Relational Database Systems 1 Database Design

More information

Complex Data and Object-Oriented. Databases

Complex Data and Object-Oriented. Databases Complex Data and Object-Oriented Topics Databases The object-oriented database model (JDO) The object-relational model Implementation challenges Learning objectives Explain what an object-oriented data

More information

1. INTRODUCTION TO RDBMS

1. INTRODUCTION TO RDBMS Oracle For Beginners Page: 1 1. INTRODUCTION TO RDBMS What is DBMS? Data Models Relational database management system (RDBMS) Relational Algebra Structured query language (SQL) What Is DBMS? Data is one

More information

Chapter 1: Introduction

Chapter 1: Introduction Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db book.com for conditions on re use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases

More information

2. Basic Relational Data Model

2. Basic Relational Data Model 2. Basic Relational Data Model 2.1 Introduction Basic concepts of information models, their realisation in databases comprising data objects and object relationships, and their management by DBMS s that

More information

Object Oriented Databases. OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar

Object Oriented Databases. OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar Object Oriented Databases OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar Executive Summary The presentation on Object Oriented Databases gives a basic introduction to the concepts governing OODBs

More information

10. Creating and Maintaining Geographic Databases. Learning objectives. Keywords and concepts. Overview. Definitions

10. Creating and Maintaining Geographic Databases. Learning objectives. Keywords and concepts. Overview. Definitions 10. Creating and Maintaining Geographic Databases Geographic Information Systems and Science SECOND EDITION Paul A. Longley, Michael F. Goodchild, David J. Maguire, David W. Rhind 005 John Wiley and Sons,

More information

Ques 1. Define dbms and file management system? Ans- Database management system (DBMS) A file management system

Ques 1. Define dbms and file management system? Ans- Database management system (DBMS) A file management system UNIT-1 Ques 1. Define dbms and file management system? Ans- Database management system (DBMS) is a collection of interrelated data and a set of programs to access those data. Some of the very well known

More information

CSE 132A. Database Systems Principles

CSE 132A. Database Systems Principles CSE 132A Database Systems Principles Prof. Victor Vianu 1 Data Management An evolving, expanding field: Classical stand-alone databases (Oracle, DB2, SQL Server) Computer science is becoming data-centric:

More information

Contents RELATIONAL DATABASES

Contents RELATIONAL DATABASES Preface xvii Chapter 1 Introduction 1.1 Database-System Applications 1 1.2 Purpose of Database Systems 3 1.3 View of Data 5 1.4 Database Languages 9 1.5 Relational Databases 11 1.6 Database Design 14 1.7

More information

DATABASE MANAGEMENT SYSTEMS. Question Bank:

DATABASE MANAGEMENT SYSTEMS. Question Bank: DATABASE MANAGEMENT SYSTEMS Question Bank: UNIT 1 1. Define Database? 2. What is a DBMS? 3. What is the need for database systems? 4. Define tupule? 5. What are the responsibilities of DBA? 6. Define schema?

More information

Database Design Patterns. Winter 2006-2007 Lecture 24

Database Design Patterns. Winter 2006-2007 Lecture 24 Database Design Patterns Winter 2006-2007 Lecture 24 Trees and Hierarchies Many schemas need to represent trees or hierarchies of some sort Common way of representing trees: An adjacency list model Each

More information

History of Database Systems

History of Database Systems History of Database Systems By Kaushalya Dharmarathna(030087) Sandun Weerasinghe(040417) Early Manual System Before-1950s Data was stored as paper records. Lot of man power involved. Lot of time was wasted.

More information

C# Cname Ccity.. P1# Date1 Qnt1 P2# Date2 P9# Date9 1 Codd London.. 1 21.01 20 2 23.01 2 Martin Paris.. 1 26.10 25 3 Deen London.. 2 29.

C# Cname Ccity.. P1# Date1 Qnt1 P2# Date2 P9# Date9 1 Codd London.. 1 21.01 20 2 23.01 2 Martin Paris.. 1 26.10 25 3 Deen London.. 2 29. 4. Normalisation 4.1 Introduction Suppose we are now given the task of designing and creating a database. How do we produce a good design? What relations should we have in the database? What attributes

More information

Database Management. Chapter Objectives

Database Management. Chapter Objectives 3 Database Management Chapter Objectives When actually using a database, administrative processes maintaining data integrity and security, recovery from failures, etc. are required. A database management

More information

WHITE PAPER A COMPARISON BETWEEN RELATIONAL AND OBJECT ORIENTED DATABASES FOR OBJECT ORIENTED APPLICATION DEVELOPMENT JONATHAN ROBIE DIRK BARTELS

WHITE PAPER A COMPARISON BETWEEN RELATIONAL AND OBJECT ORIENTED DATABASES FOR OBJECT ORIENTED APPLICATION DEVELOPMENT JONATHAN ROBIE DIRK BARTELS WHITE PAPER A COMPARISON BETWEEN RELATIONAL AND OBJECT ORIENTED DATABASES FOR OBJECT ORIENTED APPLICATION DEVELOPMENT BY: JONATHAN ROBIE DIRK BARTELS POET SOFTWARE CORPORATION 800-950- 8845 Introduction

More information

Week 1 Part 1: An Introduction to Database Systems. Databases and DBMSs. Why Use a DBMS? Why Study Databases??

Week 1 Part 1: An Introduction to Database Systems. Databases and DBMSs. Why Use a DBMS? Why Study Databases?? Week 1 Part 1: An Introduction to Database Systems Databases and DBMSs Data Models and Data Independence Concurrency Control and Database Transactions Structure of a DBMS DBMS Languages Databases and DBMSs

More information

Study Notes for DB Design and Management Exam 1 (Chapters 1-2-3) record A collection of related (logically connected) fields.

Study Notes for DB Design and Management Exam 1 (Chapters 1-2-3) record A collection of related (logically connected) fields. Study Notes for DB Design and Management Exam 1 (Chapters 1-2-3) Chapter 1 Glossary Table data Raw facts; that is, facts that have not yet been processed to reveal their meaning to the end user. field

More information

The ObjectStore Database System. Charles Lamb Gordon Landis Jack Orenstein Dan Weinreb Slides based on those by Clint Morgan

The ObjectStore Database System. Charles Lamb Gordon Landis Jack Orenstein Dan Weinreb Slides based on those by Clint Morgan The ObjectStore Database System Charles Lamb Gordon Landis Jack Orenstein Dan Weinreb Slides based on those by Clint Morgan Overall Problem Impedance mismatch between application code and database code

More information

DATABASE SYSTEM CONCEPTS AND ARCHITECTURE CHAPTER 2

DATABASE SYSTEM CONCEPTS AND ARCHITECTURE CHAPTER 2 1 DATABASE SYSTEM CONCEPTS AND ARCHITECTURE CHAPTER 2 2 LECTURE OUTLINE Data Models Three-Schema Architecture and Data Independence Database Languages and Interfaces The Database System Environment DBMS

More information

Building Applications Using Micro Focus COBOL

Building Applications Using Micro Focus COBOL Building Applications Using Micro Focus COBOL Abstract If you look through the Micro Focus COBOL documentation, you will see many different executable file types referenced: int, gnt, exe, dll and others.

More information

ECS 165A: Introduction to Database Systems

ECS 165A: Introduction to Database Systems ECS 165A: Introduction to Database Systems Todd J. Green based on material and slides by Michael Gertz and Bertram Ludäscher Winter 2011 Dept. of Computer Science UC Davis ECS-165A WQ 11 1 1. Introduction

More information

OBJECTS AND DATABASES. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 21

OBJECTS AND DATABASES. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 21 OBJECTS AND DATABASES CS121: Introduction to Relational Database Systems Fall 2015 Lecture 21 Relational Model and 1NF 2 Relational model specifies that all attribute domains must be atomic A database

More information

Database Design and Programming

Database Design and Programming Database Design and Programming Peter Schneider-Kamp DM 505, Spring 2012, 3 rd Quarter 1 Course Organisation Literature Database Systems: The Complete Book Evaluation Project and 1-day take-home exam,

More information

Object Oriented Database Management System for Decision Support System.

Object Oriented Database Management System for Decision Support System. International Refereed Journal of Engineering and Science (IRJES) ISSN (Online) 2319-183X, (Print) 2319-1821 Volume 3, Issue 6 (June 2014), PP.55-59 Object Oriented Database Management System for Decision

More information

Gov 1008 Introduction to Geographical Information Systems

Gov 1008 Introduction to Geographical Information Systems Gov 1008 Introduction to Geographical Information Systems Lecture 5: Creating Data and Maintaining Databases Sumeeta Srinivasan References include: Bolstad; Worboys; NCGIA Core Curriculum: csiss.org/learning_resources/content/giscc/giscc_contents.html;

More information

Course Notes on A Short History of Database Technology

Course Notes on A Short History of Database Technology Course Notes on A Short History of Database Technology Traditional File-Based Approach Three Eras of Database Technology (1) Prehistory file systems hierarchical and network systems (2) The revolution:

More information

Course Notes on A Short History of Database Technology

Course Notes on A Short History of Database Technology Course Notes on A Short History of Database Technology Three Eras of Database Technology (1) Prehistory file systems hierarchical and network systems (2) The revolution: relational database technology

More information

Maintaining Stored Procedures in Database Application

Maintaining Stored Procedures in Database Application Maintaining Stored Procedures in Database Application Santosh Kakade 1, Rohan Thakare 2, Bhushan Sapare 3, Dr. B.B. Meshram 4 Computer Department VJTI, Mumbai 1,2,3. Head of Computer Department VJTI, Mumbai

More information

Databases Model the Real World. The Entity- Relationship Model. Conceptual Design. Steps in Database Design. ER Model Basics. ER Model Basics (Contd.

Databases Model the Real World. The Entity- Relationship Model. Conceptual Design. Steps in Database Design. ER Model Basics. ER Model Basics (Contd. The Entity- Relationship Model R &G - Chapter 2 A relationship, I think, is like a shark, you know? It has to constantly move forward or it dies. And I think what we got on our hands is a dead shark. Woody

More information

Stay Tuned for Today s Session! NAVIGATING THE DATABASE UNIVERSE"

Stay Tuned for Today s Session! NAVIGATING THE DATABASE UNIVERSE Stay Tuned for Today s Session! NAVIGATING THE DATABASE UNIVERSE" Dr. Michael Stonebraker and Scott Jarr! Navigating the Database Universe" A Few Housekeeping Items! Remember to mute your line! Type your

More information

Index Selection Techniques in Data Warehouse Systems

Index Selection Techniques in Data Warehouse Systems Index Selection Techniques in Data Warehouse Systems Aliaksei Holubeu as a part of a Seminar Databases and Data Warehouses. Implementation and usage. Konstanz, June 3, 2005 2 Contents 1 DATA WAREHOUSES

More information

Information Systems SQL. Nikolaj Popov

Information Systems SQL. Nikolaj Popov Information Systems SQL Nikolaj Popov Research Institute for Symbolic Computation Johannes Kepler University of Linz, Austria popov@risc.uni-linz.ac.at Outline SQL Table Creation Populating and Modifying

More information

DBMS / Business Intelligence, SQL Server

DBMS / Business Intelligence, SQL Server DBMS / Business Intelligence, SQL Server Orsys, with 30 years of experience, is providing high quality, independant State of the Art seminars and hands-on courses corresponding to the needs of IT professionals.

More information

Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file?

Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file? Files What s it all about? Information being stored about anything important to the business/individual keeping the files. The simple concepts used in the operation of manual files are often a good guide

More information

ISM 318: Database Systems. Objectives. Database. Dr. Hamid R. Nemati

ISM 318: Database Systems. Objectives. Database. Dr. Hamid R. Nemati ISM 318: Database Systems Dr. Hamid R. Nemati Department of Information Systems Operations Management Bryan School of Business Economics Objectives Underst the basics of data databases Underst characteristics

More information

2) What is the structure of an organization? Explain how IT support at different organizational levels.

2) What is the structure of an organization? Explain how IT support at different organizational levels. (PGDIT 01) Paper - I : BASICS OF INFORMATION TECHNOLOGY 1) What is an information technology? Why you need to know about IT. 2) What is the structure of an organization? Explain how IT support at different

More information

DBMS Questions. 3.) For which two constraints are indexes created when the constraint is added?

DBMS Questions. 3.) For which two constraints are indexes created when the constraint is added? DBMS Questions 1.) Which type of file is part of the Oracle database? A.) B.) C.) D.) Control file Password file Parameter files Archived log files 2.) Which statements are use to UNLOCK the user? A.)

More information

Module 4 Creation and Management of Databases Using CDS/ISIS

Module 4 Creation and Management of Databases Using CDS/ISIS Module 4 Creation and Management of Databases Using CDS/ISIS Lesson 1 Introduction to Concepts of Database Design UNESCO EIPICT Module 4. Lesson 1 1 Rationale Keeping up with library automation technology

More information

The Import & Export of Data from a Database

The Import & Export of Data from a Database The Import & Export of Data from a Database Introduction The aim of these notes is to investigate a conceptually simple model for importing and exporting data into and out of an object-relational database,

More information

B.Com(Computers) II Year RELATIONAL DATABASE MANAGEMENT SYSTEM Unit- I

B.Com(Computers) II Year RELATIONAL DATABASE MANAGEMENT SYSTEM Unit- I B.Com(Computers) II Year RELATIONAL DATABASE MANAGEMENT SYSTEM Unit- I 1 1. What is Data? A. Data is a collection of raw information. 2. What is Information? A. Information is a collection of processed

More information

Network Model APPENDIXD. D.1 Basic Concepts

Network Model APPENDIXD. D.1 Basic Concepts APPENDIXD Network Model In the relational model, the data and the relationships among data are represented by a collection of tables. The network model differs from the relational model in that data are

More information

[Refer Slide Time: 05:10]

[Refer Slide Time: 05:10] Principles of Programming Languages Prof: S. Arun Kumar Department of Computer Science and Engineering Indian Institute of Technology Delhi Lecture no 7 Lecture Title: Syntactic Classes Welcome to lecture

More information

The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1

The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1 The Relational Model Ramakrishnan&Gehrke, Chapter 3 CS4320 1 Why Study the Relational Model? Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. Legacy systems in older models

More information

low-level storage structures e.g. partitions underpinning the warehouse logical table structures

low-level storage structures e.g. partitions underpinning the warehouse logical table structures DATA WAREHOUSE PHYSICAL DESIGN The physical design of a data warehouse specifies the: low-level storage structures e.g. partitions underpinning the warehouse logical table structures low-level structures

More information

Modern Databases. Database Systems Lecture 18 Natasha Alechina

Modern Databases. Database Systems Lecture 18 Natasha Alechina Modern Databases Database Systems Lecture 18 Natasha Alechina In This Lecture Distributed DBs Web-based DBs Object Oriented DBs Semistructured Data and XML Multimedia DBs For more information Connolly

More information

2 Associating Facts with Time

2 Associating Facts with Time TEMPORAL DATABASES Richard Thomas Snodgrass A temporal database (see Temporal Database) contains time-varying data. Time is an important aspect of all real-world phenomena. Events occur at specific points

More information

A Software and Hardware Architecture for a Modular, Portable, Extensible Reliability. Availability and Serviceability System

A Software and Hardware Architecture for a Modular, Portable, Extensible Reliability. Availability and Serviceability System 1 A Software and Hardware Architecture for a Modular, Portable, Extensible Reliability Availability and Serviceability System James H. Laros III, Sandia National Laboratories (USA) [1] Abstract This paper

More information

Introduction to database management systems

Introduction to database management systems Introduction to database management systems Database management systems module Myself: researcher in INRIA Futurs, Ioana.Manolescu@inria.fr The course: follows (part of) the book "", Fourth Edition Abraham

More information

In-memory Tables Technology overview and solutions

In-memory Tables Technology overview and solutions In-memory Tables Technology overview and solutions My mainframe is my business. My business relies on MIPS. Verna Bartlett Head of Marketing Gary Weinhold Systems Analyst Agenda Introduction to in-memory

More information

Microsoft Access is an outstanding environment for both database users and professional. Introduction to Microsoft Access and Programming SESSION

Microsoft Access is an outstanding environment for both database users and professional. Introduction to Microsoft Access and Programming SESSION 539752 ch01.qxd 9/9/03 11:38 PM Page 5 SESSION 1 Introduction to Microsoft Access and Programming Session Checklist Understanding what programming is Using the Visual Basic language Programming for the

More information

Lesson 8: Introduction to Databases E-R Data Modeling

Lesson 8: Introduction to Databases E-R Data Modeling Lesson 8: Introduction to Databases E-R Data Modeling Contents Introduction to Databases Abstraction, Schemas, and Views Data Models Database Management System (DBMS) Components Entity Relationship Data

More information

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre NoSQL systems: introduction and data models Riccardo Torlone Università Roma Tre Why NoSQL? In the last thirty years relational databases have been the default choice for serious data storage. An architect

More information

Chapter 2 Database System Concepts and Architecture

Chapter 2 Database System Concepts and Architecture Chapter 2 Database System Concepts and Architecture Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Outline Data Models, Schemas, and Instances Three-Schema Architecture

More information

Data and Business Systems

Data and Business Systems Data and Business Systems Dr J A Briginshaw 21 May 2008 Silhouette Solutions Limited www.silhouettesolutions.com j.briginshaw@silhouettesolutions.com Abstract We discuss the problems businesses typically

More information

Introduction to Database Systems. Chapter 1 Introduction. Chapter 1 Introduction

Introduction to Database Systems. Chapter 1 Introduction. Chapter 1 Introduction Introduction to Database Systems Winter term 2013/2014 Melanie Herschel melanie.herschel@lri.fr Université Paris Sud, LRI 1 Chapter 1 Introduction After completing this chapter, you should be able to:

More information

16.1 MAPREDUCE. For personal use only, not for distribution. 333

16.1 MAPREDUCE. For personal use only, not for distribution. 333 For personal use only, not for distribution. 333 16.1 MAPREDUCE Initially designed by the Google labs and used internally by Google, the MAPREDUCE distributed programming model is now promoted by several

More information

Basic Concepts of Database Systems

Basic Concepts of Database Systems CS2501 Topic 1: Basic Concepts 1.1 Basic Concepts of Database Systems Example Uses of Database Systems - account maintenance & access in banking - lending library systems - airline reservation systems

More information

Object Oriented Databases (OODBs) Relational and OO data models. Advantages and Disadvantages of OO as compared with relational

Object Oriented Databases (OODBs) Relational and OO data models. Advantages and Disadvantages of OO as compared with relational Object Oriented Databases (OODBs) Relational and OO data models. Advantages and Disadvantages of OO as compared with relational databases. 1 A Database of Students and Modules Student Student Number {PK}

More information

Relational Database Basics Review

Relational Database Basics Review Relational Database Basics Review IT 4153 Advanced Database J.G. Zheng Spring 2012 Overview Database approach Database system Relational model Database development 2 File Processing Approaches Based on

More information

Oracle8i Spatial: Experiences with Extensible Databases

Oracle8i Spatial: Experiences with Extensible Databases Oracle8i Spatial: Experiences with Extensible Databases Siva Ravada and Jayant Sharma Spatial Products Division Oracle Corporation One Oracle Drive Nashua NH-03062 {sravada,jsharma}@us.oracle.com 1 Introduction

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

Database Management System

Database Management System ISSN: 2349-7637 (Online) RESEARCH HUB International Multidisciplinary Research Journal Research Paper Available online at: www.rhimrj.com Database Management System Viral R. Dagli Lecturer, Computer Science

More information

Database Implementation: SQL Data Definition Language

Database Implementation: SQL Data Definition Language Database Systems Unit 5 Database Implementation: SQL Data Definition Language Learning Goals In this unit you will learn how to transfer a logical data model into a physical database, how to extend or

More information

Database Design and Normalization

Database Design and Normalization Database Design and Normalization 3 CHAPTER IN THIS CHAPTER The Relational Design Theory 48 46 Database Design Unleashed PART I Access applications are database applications, an obvious statement that

More information

The Relational Model. Why Study the Relational Model? Relational Database: Definitions. Chapter 3

The Relational Model. Why Study the Relational Model? Relational Database: Definitions. Chapter 3 The Relational Model Chapter 3 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Study the Relational Model? Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle, Sybase,

More information

Chapter 1: Introduction

Chapter 1: Introduction Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation Language Transaction Management Storage Management Database Administrator Database

More information

Database System Concepts

Database System Concepts s Design Chapter 1: Introduction Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth

More information

Facilitating Efficient Data Management by Craig S. Mullins

Facilitating Efficient Data Management by Craig S. Mullins Facilitating Efficient Data Management by Craig S. Mullins Most modern applications utilize database management systems (DBMS) to create, store and manage business data. The DBMS software enables end users

More information

Search help. More on Office.com: images templates

Search help. More on Office.com: images templates Page 1 of 14 Access 2010 Home > Access 2010 Help and How-to > Getting started Search help More on Office.com: images templates Access 2010: database tasks Here are some basic database tasks that you can

More information

Data Modeling Basics

Data Modeling Basics Information Technology Standard Commonwealth of Pennsylvania Governor's Office of Administration/Office for Information Technology STD Number: STD-INF003B STD Title: Data Modeling Basics Issued by: Deputy

More information

12 File and Database Concepts 13 File and Database Concepts A many-to-many relationship means that one record in a particular record type can be relat

12 File and Database Concepts 13 File and Database Concepts A many-to-many relationship means that one record in a particular record type can be relat 1 Databases 2 File and Database Concepts A database is a collection of information Databases are typically stored as computer files A structured file is similar to a card file or Rolodex because it uses

More information

SQL Server. 1. What is RDBMS?

SQL Server. 1. What is RDBMS? SQL Server 1. What is RDBMS? Relational Data Base Management Systems (RDBMS) are database management systems that maintain data records and indices in tables. Relationships may be created and maintained

More information

Databases in Organizations

Databases in Organizations The following is an excerpt from a draft chapter of a new enterprise architecture text book that is currently under development entitled Enterprise Architecture: Principles and Practice by Brian Cameron

More information

Chapter 14: Databases and Database Management Systems

Chapter 14: Databases and Database Management Systems 15 th Edition Understanding Computers Today and Tomorrow Comprehensive Chapter 14: Databases and Database Management Systems Deborah Morley Charles S. Parker Copyright 2015 Cengage Learning Learning Objectives

More information

IT2304: Database Systems 1 (DBS 1)

IT2304: Database Systems 1 (DBS 1) : Database Systems 1 (DBS 1) (Compulsory) 1. OUTLINE OF SYLLABUS Topic Minimum number of hours Introduction to DBMS 07 Relational Data Model 03 Data manipulation using Relational Algebra 06 Data manipulation

More information

Using In-Memory Computing to Simplify Big Data Analytics

Using In-Memory Computing to Simplify Big Data Analytics SCALEOUT SOFTWARE Using In-Memory Computing to Simplify Big Data Analytics by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T he big data revolution is upon us, fed

More information

Object-Relational Query Processing

Object-Relational Query Processing Object-Relational Query Processing Johan Petrini Department of Information Technology Uppsala University, Sweden Johan.Petrin@it.uu.se 1. Introduction In the beginning, there flat files of data with no

More information

PRACTICE BOOK COMPUTER SCIENCE TEST. Graduate Record Examinations. This practice book contains. Become familiar with. Visit GRE Online at www.gre.

PRACTICE BOOK COMPUTER SCIENCE TEST. Graduate Record Examinations. This practice book contains. Become familiar with. Visit GRE Online at www.gre. This book is provided FREE with test registration by the Graduate Record Examinations Board. Graduate Record Examinations This practice book contains one actual full-length GRE Computer Science Test test-taking

More information