Databases

File and Database Concepts

A database is a collection of information. Databases are typically stored as computer files. A structured file is similar to a card file or Rolodex because it uses a uniform format to store data for each person or thing in the file.

A field contains the smallest unit of meaningful information, so you might call it the basic building block for a structured database. Each field has a unique field name that describes its contents.

A field can be either variable length or fixed length. A variable-length field is like an accordion: it expands to fit the data you enter, up to some maximum number of characters. A fixed-length field contains a predetermined number of characters (bytes). The data that you enter in a fixed-length field cannot exceed the allocated field length.

A record refers to a collection of data fields. The person who creates a data file defines the fields it contains. Each kind of record is referred to as a record type. A record type is similar to a blank form, and it is usually shown without any data in the fields. A record that contains data is referred to as a record occurrence, or simply a record.
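
The difference between a record type and a record occurrence, and the limit imposed by a fixed-length field, can be sketched in a few lines of Python. This is only an illustration: the Customer record type, its field names, the sample data, and the 10-character width are all assumptions made for the example, not part of any particular file format.

```python
from dataclasses import dataclass, fields

FIXED_WIDTH = 10  # hypothetical allocated field length, in characters


def to_fixed_length(value: str, width: int = FIXED_WIDTH) -> str:
    """Store a value in a fixed-length field, rejecting data that is too long."""
    if len(value) > width:
        raise ValueError(f"{value!r} exceeds the {width}-character field")
    return value.ljust(width)  # pad with spaces up to the allocated length


@dataclass
class Customer:
    """A record type: the fields are defined, but no data is present yet."""
    first_name: str
    last_name: str
    city: str


# The record type alone is like a blank form: it only lists field names.
field_names = [f.name for f in fields(Customer)]

# A record occurrence is the record type filled in with data (sample values).
record = Customer(first_name="Gilbert", last_name="Grape", city="Sample City")

# The last name fits the fixed-length field and is padded to 10 characters.
padded = to_fixed_length(record.last_name)
```

A variable-length field would skip the padding step and only enforce the maximum length.
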
A data file that contains only one record type is often referred to as a flat file. In contrast, a database can contain a variety of different record types.

In database jargon, a relationship is an association between data that is stored in different record types. One important aspect of the relationship between record types is cardinality. Cardinality refers to the number of associations that can exist between two record types. For example, one customer can place many orders, but a particular order cannot be placed jointly by two customers. When one record is related to many records, the relationship is referred to as a one-to-many relationship.
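
A one-to-many relationship between customers and orders might be sketched as follows, using Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical; the point is only that each order row carries exactly one customer identifier, while one customer identifier can appear on many order rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the reference
conn.executescript("""
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    -- Each order stores a single CustomerID, so an order cannot be
    -- placed jointly by two customers.
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID)
    );
    INSERT INTO Customers VALUES (1, 'Customer A'), (2, 'Customer B');
    INSERT INTO Orders VALUES (100, 1), (101, 1), (102, 2);
""")

# Count the "many" side: how many orders belong to each customer.
orders_per_customer = dict(conn.execute(
    "SELECT CustomerID, COUNT(*) FROM Orders "
    "GROUP BY CustomerID ORDER BY CustomerID"
).fetchall())
```
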
A many-to-many relationship means that one record in a particular record type can be related to many records in another record type, and vice versa. A one-to-one relationship means that a record in one record type is related to only one record in another record type. Relationships between record types can be graphically depicted using diagramming techniques, such as an entity-relationship diagram, sometimes called an ER diagram or ERD.

Several database models exist. Some models work with all of the relationships described earlier in this section, whereas other models work with only a subset of the relationships. The four main types of database models in use today are hierarchical, network, relational, and object oriented.

The simplest database model arranges record types as a hierarchy. In a hierarchical database, a record type is referred to as a node or segment. The top node of the hierarchy is referred to as the root node. A parent node can have more than one child node, but a child node can have only one parent node.

The network database model allows many-to-many relationships in addition to one-to-many relationships. Related record types are referred to as a network set, or simply a set. A set contains an owner and members. An owner is similar to a parent record in a hierarchical database. A member is roughly equivalent to the child records in a hierarchical database. This model produces databases that are difficult to create, manipulate, and maintain.

A relational database stores data in a collection of related tables. Each table (also called a relation) is a sequence, or list, of records.
All of the records in a table are of the same record type. A row of the table is called a tuple, and is equivalent to a record. The columns in the table are called attributes, and are equivalent to fields.
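
The correspondence between rows and tuples, and between columns and attributes, can be seen directly in a small sqlite3 sketch. The Tracks table layout and its sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: every record in it has the same record type.
conn.execute("CREATE TABLE Tracks (TrackTitle TEXT, TrackLength INTEGER)")
conn.executemany(
    "INSERT INTO Tracks VALUES (?, ?)",
    [("Track One", 215), ("Track Two", 187)],
)

cursor = conn.execute("SELECT * FROM Tracks ORDER BY TrackTitle")

# Attributes: the column names, equivalent to field names.
attributes = [column[0] for column in cursor.description]

# Tuples: each row comes back as a Python tuple, equivalent to one record.
rows = cursor.fetchall()
```
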
In a relational database, relationships are specified through the use of common data stored in the fields of records in different tables.

An object-oriented database stores data as objects, which can be grouped into object classes and defined by attributes and methods. An object class specifies the attributes and methods that are shared by all objects in a class. An object attribute is equivalent to a field, and contains the smallest unit of data. An object method is any behavior that the object is capable of performing.

Data Management Tools

The simplest tools for managing data are software packages designed for a specific data management task, such as keeping track of appointments or managing your checking account. To use one of these tools, you just enter your data. The software provides menus that allow you to manipulate your data after it is entered.

Several popular applications provide simple file design and management capabilities. Your word processing software probably allows you to maintain data as a set of records. Some spreadsheet software also includes simple data management commands. The simple file management tools provided by word processing and spreadsheet software are popular with individuals who want to maintain flat files that contain hundreds, not thousands, of records.

It is possible to simply enter data as an ASCII text file, then use a programming language to write any routines that you need to access the data. Custom software can be created to accommodate hierarchical, network, relational, and object-oriented databases, as well as flat file management. Custom software requires skilled programmers.

Poorly designed custom software can result in data dependence, a term that refers to the undesirable situation in which data and program modules are so tightly interrelated that they become difficult to modify. Modern file and database management software supports data independence, which means keeping data separated from the programs that manipulate the data. A single data management tool can be used to maintain many different files and databases.
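
One rough way to picture data independence is a routine that takes its field names from the data file rather than hard-coding them, so the same routine can serve files with different layouts. The sketch below assumes plain text files with a header row; the file contents and the load_records name are hypothetical, and a real DBMS separates data from programs in far more elaborate ways.

```python
import csv
import io


def load_records(text: str) -> list:
    """Read records from delimited text, taking field names from the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Two flat files with different layouts (sample data), handled by one routine.
customers_file = "Name,City\nCustomer A,Town X\n"
products_file = "Item,Price,InStock\nWidget,9.99,12\n"

customers = load_records(customers_file)
products = load_records(products_file)
```

Because nothing in load_records mentions Name, City, or Price, adding a field to either file does not require changing the routine.
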
The term DBMS (database management system) refers to software that is designed to manage data stored in a database. An XML DBMS, for example, is optimized for handling data that exists in XML format. An OODBMS (object-oriented database management system) is optimized for the object-oriented database model, allowing you to store and manipulate data classes, attributes, and methods.

An RDBMS (relational database management system) allows you to create, update, and administer a relational database. Today's most popular RDBMS software also provides the capability to handle object classes and XML data, making it unnecessary to purchase a separate OODBMS or XML DBMS.

Most database projects are implemented with a relational database system. The particular RDBMS package that you choose depends on the scope of your project, the number of people who will use the database, and the expected volume of records, queries, and updates. Entry-level RDBMS software is designed for personal and small business uses. An entry-level DBMS typically includes all of the tools that you need to manipulate the data in a database, create data entry forms, query the database, and generate reports.

Database client software allows any remote computer or network workstation to access data in a database. An entry-level DBMS that resides on a network server may be able to handle many simultaneous searches. In situations with many users who make simultaneous updates, it is usually necessary to move to database server software. Database server software is designed to manage billions of records and several hundred transactions every second. It may also handle a distributed database, in which a database is stored on different computers, on different networks, or in different locations. Database server software replaces an entry-level DBMS, and users continue to communicate with the server by means of client software.

The Web provides both opportunities and challenges for accessing the information in a database. The Web provides an opportunity for many people to gain access to data from many locations. Web access is constrained by the stateless nature of HTTP, and by the necessity to provide access by using a browser as client software.
Web access to hierarchical, network, relational, and object-oriented databases is possible, though access to relational databases is most common.

A technique called static Web publishing is a simple way to display the data in a database by converting a database report into an HTML document, which can be displayed as a Web page by a browser. Data on the Web page cannot be manipulated, except to be searched in a rudimentary way by the Find feature of your Web browser. The advantages of static publishing include security and simplicity.

A dynamic Web publishing process generates customized Web pages as needed, or on the fly. Dynamic Web publishing relies on a program or script, referred to as a server-side program, that resides on a Web server and acts as an intermediary between your browser and a DBMS. The architecture for dynamic publishing requires a Web server, in addition to a database server, a database, and a browser.

In several situations, it is important for people to use a browser to add or update the records in a database. These dynamic database updates require an architecture similar to that used for dynamic Web publishing, plus the use of forms. A form usually exists on a Web server, which sends it to your browser.

An emerging technology called XForms provides an alternative to HTML forms. XForms are designed to provide more flexibility than HTML forms, and they interface to XML documents. Use of XForms requires an XForms-enabled browser.

A form created with HTML or XForms can collect data, or it can collect the specifications for a query. A completed form is sent from your browser to the Web server, which sends it to the DBMS. Results are sent to the Web server, formatted into an HTML document, and sent back to your browser.

Several tools, including ASP, CGI, and PHP, help you create server-side programs. ASP (Active Server Pages) technology can be used to generate an HTML document that contains scripts, which are run before the document is displayed as a Web page. These scripts are small embedded programs that can be designed to get user input, run queries, and display query results. CGI (Common Gateway Interface) provides a non-proprietary way to create HTML pages based on data in a database.
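
The core job of a server-side program in dynamic Web publishing (take a request, query the DBMS, format the results as HTML) might be sketched as below. The sketch is written in Python rather than ASP, CGI, or PHP, leaves out the Web server itself, and uses an in-memory SQLite database with made-up rows; the render_results_page function name is hypothetical.

```python
import html
import sqlite3


def render_results_page(conn: sqlite3.Connection, max_price: float) -> str:
    """Query the database and format the results as an HTML document."""
    rows = conn.execute(
        "SELECT CDName, DiscountPrice FROM CompactDisks "
        "WHERE DiscountPrice < ? ORDER BY CDName",
        (max_price,),  # parameter supplied by the user's form
    ).fetchall()
    # Escape the data so that it is displayed as text, not interpreted as markup.
    items = "".join(
        f"<li>{html.escape(name)}: {price:.2f}</li>" for name, price in rows
    )
    return f"<html><body><ul>{items}</ul></body></html>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CompactDisks (CDName TEXT, DiscountPrice REAL)")
conn.executemany(
    "INSERT INTO CompactDisks VALUES (?, ?)",
    [("Disk A", 8.5), ("Disk B & C", 12.0)],  # sample data
)

# The page is generated on the fly for this particular request.
page = render_results_page(conn, 10.00)
```

Static publishing would run the same kind of routine once and save the resulting HTML file, instead of generating it for each request.
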
A CGI script can be written in a variety of different programming languages, such as C, C++, Java, and Perl. PHP (PHP: Hypertext Preprocessor, originally Personal Home Page) is a cross-platform scripting language that can be used to accomplish the same tasks as CGI scripts. Specialized Web database development tools provide a way to link HTML pages to a database without programming or scripting.

XML Documents

XML is a markup language that allows field tags, data, and tables to be incorporated into a Web document. Although HTML documents contain lots of information, that information is not in context. XML provides tags that can be embedded in an XML document to put data in context.

One of XML's most positive contributions to data management is the ability to add context to the information contained in a widely diverse pool of documents on the Web. An XML document can also contain structured data organized into records and fields. Storing data in an XML document provides several advantages; for example, it is portable.
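
To make the idea of records and fields inside an XML document concrete, here is a minimal sketch that reads such a document with Python's standard xml.etree.ElementTree module. The tag names and values are invented for the example.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document: each <CD> element is a record,
# and each child element is a field whose tag gives the data its context.
document = """
<CompactDisks>
  <CD><CDName>Disk A</CDName><DiscountPrice>8.50</DiscountPrice></CD>
  <CD><CDName>Disk B</CDName><DiscountPrice>12.00</DiscountPrice></CD>
</CompactDisks>
"""

root = ET.fromstring(document)

# Convert each record into a dictionary of field name -> field value.
records = [{field.tag: field.text for field in cd} for cd in root.findall("CD")]
```

All values arrive as text, which hints at why an XML document is convenient for exchange but not optimized for sorting or arithmetic.
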
XML documents are not, however, optimized for many operations that you would customarily associate with databases, such as fast sorts, searches, and updates. To get the best out of XML and relational databases, some experts recommend storing data in a relational database, managing it with RDBMS software, and using server-side software to generate XML documents for exchanging data via the Web.

Database Design

The first step in designing a relational database is to determine what data must be collected and stored. A database designer might begin by listing available data, as well as any additional data that is necessary to produce on-screen output or printed reports. The next step is to organize that data into fields.

The treatment of first and last names illustrates the concept of breaking data into fields. With the entire name in one field, the database would not be able to access individual parts of the name, making it difficult to alphabetize customers by last name, or produce a report in which names appear in a format like Grape, Gilbert B.
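
The benefit of splitting a name into separate fields can be shown with a short sketch. The field names and the second customer are assumptions for illustration; only the "Grape, Gilbert B" format comes from the text above.

```python
from operator import itemgetter

# Each name is stored in separate fields rather than as one string.
customers = [
    {"first": "Gilbert", "middle_initial": "B", "last": "Grape"},
    {"first": "Ann", "middle_initial": "C", "last": "Adams"},  # sample record
]


def report_name(customer: dict) -> str:
    """Format a name as Last, First M for a report."""
    return f'{customer["last"]}, {customer["first"]} {customer["middle_initial"]}'


# Alphabetizing by last name is straightforward because it is its own field.
by_last_name = sorted(customers, key=itemgetter("last", "first"))
report = [report_name(c) for c in by_last_name]
```

With the whole name in one field, both the sort and the report format would require guessing where one part of the name ends and the next begins.
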
The data that can be entered into a field depends on the field's data type. From a technical perspective, a data type specifies the way data is represented on the disk and in RAM. From a user perspective, the data type determines the way that data can be manipulated.

The two most common data types are numeric and character. A database designer can assign a numeric data type to fields containing numbers that will be manipulated mathematically by adding, averaging, multiplying, and so forth. There are two main numeric types: real and integer. A real number is formatted to include decimal places. An integer is a whole number.

For fields that contain data that would not be used for calculations, a database designer can specify a character data type, which is also referred to as a string data type. Character fields sometimes hold data that looks like numbers, but doesn't need to be mathematically manipulated.

Some file and database management systems provide additional data types such as date, logical, and memo. The date data type is used to store dates in a format that allows them to be manipulated. The logical data type is used to store true/false or yes/no data using minimal storage space. A memo data type usually provides a variable-length field into which users can enter comments.

Some file and database management systems also include additional data types, such as images or BLOBs. An image is a picture or graphic. A BLOB (binary large object) is a collection of binary data that is stored in a single field of a database.

A computed field is a calculation that a DBMS performs during processing, and then temporarily stores in a memory location.
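
A computed field can be sketched as an expression in a query: the result exists only in the query output and is never stored in the table. The OrderDetails layout, the LineTotal name, and the sample values below are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderDetails (Item TEXT, Quantity INTEGER, UnitPrice REAL)"
)
conn.executemany(
    "INSERT INTO OrderDetails VALUES (?, ?, ?)",
    [("Disk A", 2, 8.5), ("Disk B", 1, 12.0)],  # sample data
)

# LineTotal is computed during processing from two stored fields.
line_totals = conn.execute(
    "SELECT Item, Quantity * UnitPrice AS LineTotal "
    "FROM OrderDetails ORDER BY Item"
).fetchall()

# The table itself still has only the three stored columns.
stored_columns = [row[1] for row in conn.execute("PRAGMA table_info(OrderDetails)")]
```
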
An efficiently designed database uses computed fields whenever possible because they do not require disk storage space.

The information supplied by reports and processing routines is only as accurate as the information in the database. Data entry errors can compromise the accuracy and validity of a database. Most DBMSs provide tools that the database designer can use to prevent some, but not all, data entry errors.

In a case-sensitive database, uppercase letters are not equivalent to their lowercase counterparts. Most, but not all, DBMSs give the database designer an option to turn case sensitivity on or off. They may also provide the option to force data to all uppercase or all lowercase as it is entered.
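
The two case-handling options just described can be illustrated with a small sketch. The function names are hypothetical, and a real DBMS exposes these choices as settings rather than as code the designer writes.

```python
def matches(entered: str, stored: str, case_sensitive: bool = True) -> bool:
    """Compare an entry to stored data, with case sensitivity on or off."""
    if case_sensitive:
        return entered == stored
    return entered.casefold() == stored.casefold()


def force_upper(entered: str) -> str:
    """Force data to all uppercase as it is entered."""
    return entered.upper()


# With case sensitivity on, a lowercase entry does not match the stored value.
strict = matches("tigerlily", "Tigerlily")

# With case sensitivity off, the same entry is treated as equivalent.
relaxed = matches("tigerlily", "Tigerlily", case_sensitive=False)
```
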
People who enter data may not be consistent about the way they enter numbers. To prevent this sort of inconsistent formatting, a database designer can specify a field format. A field format is a picture of how the data is supposed to look when it is entered. If someone attempts to enter data in the wrong format, the database rejects that entry.

Sometimes people who enter data simply make a mistake and press the wrong keys. It is possible to catch some of these errors by using field validation rules, list boxes, or lookups. A field validation rule is a specification that the database designer sets up to filter the data entered into a particular field.

Another technique that prevents typographical and case-sensitivity errors is to limit data entry to the items on a specified list. Most databases allow the database designer to specify a list of acceptable entries for each field. Database designers can also prevent entry errors by using lookup routines. A lookup routine validates an entry by searching for the same data in a file or database table.

When designing a database, a process called normalization helps the database designer create a database structure that can save storage space and increase processing efficiency. The goal of normalization is to minimize data redundancy. The first step to grouping fields is to get an idea of the big picture of the data. Groupings correspond to the physical items, or entities, that are tracked in the database.

It can be necessary to use two tables to store data. Keeping everything in a single table creates data redundancy, which not only requires extra storage space, but also may lead to storing inconsistent or inaccurate data. The solution is to create separate tables.

If the designer provides fields for ordering ten items, the database cannot handle large orders for more than ten items. If a customer orders fewer than ten, space is wasted by having empty fields in each record. A one-to-many relationship exists between an order and the ordered items. The database designer should separate the data into two tables, such as Orders and OrderDetails.

Database tables can be organized in different ways depending on how people want to use them.
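
The split into Orders and OrderDetails described above might look like the following sketch. The column names and sample rows are hypothetical; the point is that an order can carry any number of items without reserving empty fields and without repeating the customer data on every item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per order: data about the order is stored once.
    CREATE TABLE Orders (
        OrderNumber  INTEGER PRIMARY KEY,
        CustomerName TEXT
    );
    -- One row per ordered item: the "many" side of the relationship.
    CREATE TABLE OrderDetails (
        OrderNumber INTEGER REFERENCES Orders(OrderNumber),
        Item        TEXT,
        Quantity    INTEGER
    );
    INSERT INTO Orders VALUES (1, 'Customer A');
""")

# A large order with twelve items, more than a fixed ten-item record allows.
items = [(1, f"Item {n}", 1) for n in range(1, 13)]
conn.executemany("INSERT INTO OrderDetails VALUES (?, ?, ?)", items)

item_count = conn.execute(
    "SELECT COUNT(*) FROM OrderDetails WHERE OrderNumber = 1"
).fetchone()[0]
order_rows = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
```
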
No single way of organizing the data accommodates everyone's needs, but tables can be sorted or indexed in multiple ways.

A table's sort order is the order in which records are stored on disk. Sorted tables typically produce faster queries and updates. Queries and updates within an unsorted database are slow. A table's sort key is one or more fields that are used to specify where new records are inserted in a table. A table can have only one sort key at a time, but the sort key can be changed.

A database index contains a list of keys, and each key provides a pointer to the record that contains the rest of the fields related to that key. An index has no bearing on the physical sequence of records on disk. A table can have multiple indexes, but only one sort order. Database tables should be indexed by any field or fields that are commonly used as search fields. The database designer typically creates indexes at the time the database structure is designed.

The way that database records, queries, and reports appear on the screen depends on the database's user interface. A professional user interface designer typically creates and maintains the user interface. Large databases may even require a group of user interface designers to maintain the user interface. The user interface for smaller databases is most likely created by the database designer. A well-defined user interface for a database should be clear, intuitive, and efficient.

A report generator is a software tool that provides the ability to create report templates for a database. A report template contains the outline or general specifications for a report. The template does not, however, contain data from the database. Data is merged into the template when you actually run a report, so the report is based on the data currently contained in the database table.
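
The separation between a report template and the data merged into it at run time can be sketched with Python's string.Template. The report layout, the run_report name, and the sample rows are assumptions; a real report generator offers far richer layout options.

```python
from string import Template

# The template holds only the outline of the report, with placeholders.
report_template = Template("CDs in stock\n$lines\nTotal titles: $count")


def run_report(rows: list) -> str:
    """Merge the current data into the template to produce a report."""
    lines = "\n".join(f"{name:<12}{qty:>4}" for name, qty in rows)
    return report_template.substitute(lines=lines, count=len(rows))


# Running the report twice with different table contents gives different output.
first_run = run_report([("Disk A", 3)])
second_run = run_report([("Disk A", 2), ("Disk B", 7)])
```
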
The database designer can create templates for reports that effectively present information by observing the following guidelines:
Present only the information required.
Present information in a usable format.
Information should be timely.
Information should be presented in a clear, unambiguous format, and include necessary titles, page numbers, dates, labels, and column headings.
Present information in the format most appropriate for the audience.

Data can be loaded into a database manually by using generic data entry tools supplied with the DBMS, or by using a customized data entry module created by the database designer. If the data exists electronically in another type of database file or in flat files, it is usually possible to transfer the data using a custom-written conversion routine, or import and export routines.

A conversion routine converts the data from its current format into a format that can be automatically incorporated into the new database. An import routine brings data into a database. An export routine copies data out of a software package and into the database. Typically, you would use either an import routine or an export routine, but not both.

SQL

Query languages like SQL typically work behind the scenes as an intermediary between the database client software provided to users and the database itself. The client software collects your input, then converts it into an SQL query, which can operate directly on the database to carry out your instructions. An SQL query is a sequence of words, much like a sentence:

    SELECT TrackTitle FROM Tracks WHERE TrackTitle = 'Fly Away'

The SQL query language provides a collection of special command words called SQL keywords, such as SELECT, FROM, INSERT, and WHERE, which issue instructions to the database. Most SQL queries can be divided into three simple elements that specify an action, the name of a database table, and a set of parameters.

An SQL query typically begins with an action keyword, or command, which specifies the operation that you want carried out. For example, the command word DELETE removes a record from a table. SQL keywords such as USE, FROM, or INTO can be used to construct a clause specifying the table that you want to access.
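
The three elements of a query (an action, a table, and parameters) can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the Tracks table is reduced to a single column, the second track title is invented, and SQLite stands in for whatever DBMS the client software actually talks to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tracks (TrackTitle TEXT)")
conn.executemany(
    "INSERT INTO Tracks VALUES (?)",
    [("Fly Away",), ("Another Track",)],  # sample data
)

# Action: SELECT. Table: FROM Tracks. Parameters: the WHERE clause.
selected = conn.execute(
    "SELECT TrackTitle FROM Tracks WHERE TrackTitle = 'Fly Away'"
).fetchall()

# Action: DELETE. The same table clause and parameters, a different command.
conn.execute("DELETE FROM Tracks WHERE TrackTitle = 'Fly Away'")
remaining = conn.execute("SELECT TrackTitle FROM Tracks").fetchall()
```

Note that the text value is enclosed in single quotes, which is how SQL distinguishes a string from a field name.
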
The clause consists of the keyword followed by the name of the table. The term parameter is technical jargon that refers to the detailed specifications for a command. Keywords such as WHERE usually begin an SQL clause that contains the parameters for a command.

The client software that you use collects the data that you enter in the form and generates an SQL statement using the INSERT command, which adds your data to the database. One of the most common database operations is to query for a particular record or group of records using the SELECT command.

SQL uses search operators to form complex queries. Because search operators were originally the idea of mathematician George Boole, they are also referred to as Boolean operators.

AND (indicated by a + sign in some search tools, though not in SQL) is used when you want to retrieve records that meet more than one criterion. The AND operator specifies that both of the search criteria must be true for the record to be selected:

    SELECT CDName FROM CompactDisks WHERE ArtistName = 'Natalie Merchant' AND DiscountPrice < 10.00

OR means to pick each record that meets one criterion or the other, or both:

    SELECT CDName FROM CompactDisks WHERE (ArtistName = 'Natalie Merchant' OR ArtistName = '10,000 Maniacs') AND DiscountPrice < 10.00

Note the use of parentheses around the OR clause. Parentheses tell the DBMS to process this part of the search criteria first. The placement of parentheses can change the results of a query, sometimes drastically:

    SELECT CDName FROM CompactDisks WHERE ArtistName = 'Natalie Merchant' OR (ArtistName = '10,000 Maniacs' AND DiscountPrice < 10.00)

The NOT operator can be used to specify a not-equal relationship.
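
The effect of parenthesis placement on AND and OR can be checked against a small table. In this sketch the CompactDisks table holds only the three columns used in the queries, and the CD names and prices are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CompactDisks (CDName TEXT, ArtistName TEXT, DiscountPrice REAL)"
)
conn.executemany(
    "INSERT INTO CompactDisks VALUES (?, ?, ?)",
    [
        ("CD 1", "Natalie Merchant", 9.00),
        ("CD 2", "Natalie Merchant", 15.00),
        ("CD 3", "10,000 Maniacs", 8.00),
        ("CD 4", "10,000 Maniacs", 14.00),
    ],
)


def cd_names(where_clause: str) -> list:
    """Run a SELECT with the given WHERE clause and return the CD names."""
    query = f"SELECT CDName FROM CompactDisks WHERE {where_clause} ORDER BY CDName"
    return [row[0] for row in conn.execute(query)]


# Either artist, but only discs under 10.00.
either_artist_cheap = cd_names(
    "(ArtistName = 'Natalie Merchant' OR ArtistName = '10,000 Maniacs') "
    "AND DiscountPrice < 10.00"
)

# Every Natalie Merchant disc, plus 10,000 Maniacs discs under 10.00.
merchant_or_cheap_maniacs = cd_names(
    "ArtistName = 'Natalie Merchant' "
    "OR (ArtistName = '10,000 Maniacs' AND DiscountPrice < 10.00)"
)
```

With this sample data the second query also returns the 15.00 Natalie Merchant disc, because the price condition applies only inside the parentheses.
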
A query that uses NOT might look like this:

    SELECT CDName FROM CompactDisks WHERE NOT (ArtistName = 'Natalie Merchant')

Sometimes NOT relationships are specified using a not-equal operator, like <> or !=, depending on the specifications of the query language:

    SELECT CDName FROM CompactDisks WHERE ArtistName <> 'Natalie Merchant'

You can change records in a database only if you have the rights to do so. The UPDATE command means that you want to change the data in some or all of the records:

    UPDATE CompactDisks SET QtyInStock = QtyInStock - 1 WHERE CDName = 'Tigerlily'

In addition to changing the data in a single record, SQL can perform a global update that changes the data in more than one record at a time. Rather than changing records one at a time, it is easier to change all of them with a single command:

    UPDATE CompactDisks SET DiscountPrice = 12.00 WHERE ArtistName = 'The Rolling Stones'

A global update only works on records that have similar characteristics. Custom programming is required if it is necessary to perform global operations on information that does not have any similar characteristics. Such custom programs are typically written by the database designer, and supplied as a menu option within the database client software.

In SQL terminology, creating a relationship between tables is referred to as joining tables. To take advantage of the relationship between two tables, you first have to join the tables. The SQL JOIN command allows you to temporarily join and simultaneously access the data in more than one table, so a single SQL query can retrieve data from the two tables.

SQL is a very extensive and powerful language that can be used not only to manipulate data, but also to create databases, tables, and reports.
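
A single-record update, a global update, and a join can be tried together in one sketch. Several things here are assumptions: the stock update is taken to subtract one from QtyInStock, the Tracks table and its CDName linking field are hypothetical, and the Rolling Stones CD names, prices, and quantities are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CompactDisks (
        CDName TEXT PRIMARY KEY, ArtistName TEXT,
        DiscountPrice REAL, QtyInStock INTEGER
    );
    CREATE TABLE Tracks (TrackTitle TEXT, CDName TEXT);  -- hypothetical table
    INSERT INTO CompactDisks VALUES
        ('Tigerlily', 'Natalie Merchant', 9.00, 5),
        ('Stones CD 1', 'The Rolling Stones', 14.00, 3),
        ('Stones CD 2', 'The Rolling Stones', 15.00, 2);
    INSERT INTO Tracks VALUES ('Track 1', 'Tigerlily');
""")

# Single-record update: reduce the stock of one title by one.
conn.execute(
    "UPDATE CompactDisks SET QtyInStock = QtyInStock - 1 WHERE CDName = 'Tigerlily'"
)
qty = conn.execute(
    "SELECT QtyInStock FROM CompactDisks WHERE CDName = 'Tigerlily'"
).fetchone()[0]

# Global update: every record that shares the characteristic is changed.
updated = conn.execute(
    "UPDATE CompactDisks SET DiscountPrice = 12.00 "
    "WHERE ArtistName = 'The Rolling Stones'"
).rowcount

# Join: the common CDName field links the two tables for this one query.
joined = conn.execute(
    "SELECT Tracks.TrackTitle, CompactDisks.ArtistName "
    "FROM Tracks JOIN CompactDisks ON Tracks.CDName = CompactDisks.CDName"
).fetchall()
```

The join is temporary: neither table is changed, and the combined rows exist only in the query result.
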