Vendor Solutions to Web Enabled Data Warehousing

Prepackaged Solutions

Database vendors are moving towards the delivery of prepackaged products rather than individual technologies. The majority of vendors are focusing on data mart implementation, even though prepackaged high-end systems are available. These prepackaged systems have been dubbed "data warehouses-in-a-box". An increasing number of data warehouse frameworks, templates, suites, and packages on the market promise to make it quicker, easier, and cheaper to implement data warehouse systems (Whiting). It is predicted that prepackaged data warehouse implementations will account for approximately 40% of the market within the next three years, while more conventional data warehouse systems will account for the remaining 60%.

One of the major reasons for this movement is the cost of a data warehouse project. A typical data warehouse costs approximately $3 million, while smaller prepackaged systems can be implemented for around $50 thousand. The data warehouse price tag immediately rules out this technology for small businesses. Prepackaged systems, on the other hand, enable these smaller companies to try out data warehousing technology without a huge initial outlay of funds. The benefits of prepackaged solutions are as follows:

- Lower cost
- Quicker implementation
- Not as complex
- Lower IS resource requirements
- Lower risk
- Easier to sell to management
- ROI can be achieved quickly
Vendors are reacting by either enhancing their current product offerings or purchasing technologies. For example, large vendors are moving aggressively into the OLAP and data mining space by either developing their own solutions or acquiring best-of-breed solutions (Berson). Examples include Informix's acquisition of Red Brick and Microsoft's integration of OLAP processing into its SQL Server product. Prepackaged solutions that are available include the following:

- Visual Warehouse by IBM
- RightStart by NCR
- Warehouse Studio by Sybase
- Decision Frontier Solution Suite by Informix

With the complexities of Internet/Intranet delivery, vendors are also integrating Web-based technologies into their products. For example, both Oracle and IBM offer Web servers with their solutions. In addition, IBM is delivering several e-commerce technologies that are targeted toward specific areas of business such as retail and finance.

Prepackaged solutions are not a data warehousing utopia. They are typically not as scalable as other technologies. In addition, because of the tight integration between the DBMS and additional products such as data transformation tools, prepackaged solutions may not be as flexible. Lastly, prepackaged solutions may be limited in the number and type of data sources that they support.

Relational versus Object-Oriented Warehousing

Because of the emerging issue of multimedia data, two views have evolved on the handling of complex data. Some vendors feel that it should be stored outside of the database, while others feel that it should be merged into the database. Vendors must also incorporate options for user-defined objects in their products.
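The two views on complex data described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for a commercial DBMS; the table names, column names, and file path are hypothetical and are not taken from any vendor's product. One table keeps only a link to a multimedia file held outside the database, and the other stores the bytes inside the database as a BLOB.

```python
import sqlite3


def build_catalog(image_bytes: bytes, external_path: str) -> sqlite3.Connection:
    """Create an in-memory database that stores one image both ways."""
    conn = sqlite3.connect(":memory:")
    # View 1: the complex data stays outside; the database holds only a link.
    conn.execute(
        "CREATE TABLE product_linked "
        "(id INTEGER PRIMARY KEY, name TEXT, image_path TEXT)"
    )
    # View 2: the complex data is merged into the database as a BLOB column.
    conn.execute(
        "CREATE TABLE product_embedded "
        "(id INTEGER PRIMARY KEY, name TEXT, image BLOB)"
    )
    conn.execute(
        "INSERT INTO product_linked (name, image_path) VALUES (?, ?)",
        ("widget", external_path),
    )
    conn.execute(
        "INSERT INTO product_embedded (name, image) VALUES (?, ?)",
        ("widget", image_bytes),
    )
    conn.commit()
    return conn


# Hypothetical sample data: a few placeholder bytes and an illustrative path.
conn = build_catalog(b"\x89PNG-fake-bytes", "/media/widget.png")
linked = conn.execute(
    "SELECT image_path FROM product_linked WHERE name = 'widget'"
).fetchone()[0]
embedded = conn.execute(
    "SELECT image FROM product_embedded WHERE name = 'widget'"
).fetchone()[0]
```

In the linked form the database can be queried for the location of the file, but the file itself must be managed separately. In the embedded form the content travels with the row, at the cost of a larger database.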
What are the big six vendors doing?

Oracle, Sybase, Informix, IBM, and Microsoft continue to leapfrog each other in several areas of database technology, with major efforts directed toward Internet transactions; improved scalability to support very large databases and advanced processing techniques such as OLAP; sophisticated data replication aimed at mobile users, as in salesforce automation; and the ability to handle (not just access, but perform data manipulation operations on) complex data types (Berson). Vendors like IBM and Oracle are enabling their databases to work closely with Web servers. Oracle, Informix, Sybase, and IBM are creating extensions to their core relational databases to perform online analytical processing (OLAP) with browsers, as well as to handle different data types (Greenberg). For data warehousing, industry analysts predict that the top market players for the next few years will be the major database vendors (Berson).

Computer Associates

Computer Associates partners with other organizations in developing tools to manage data warehouses on the Web. Computer Associates has a two-database strategy: it is maintaining its relational database (Ingres) while developing in parallel a pure object database (Jasmine). Recently, OpenIngres/ICE (Internet-based electronic commerce) was integrated with Spyglass WebServer. Ingres supports HTTP, but does not include provisions for complex datatype extensibility. Jasmine, an OODBMS, includes multimedia and Internet enablement and supports Java and ActiveX.

For security, Computer Associates offers Unicenter TNG Web Management. It provides secure access through TCP/IP ports and Web servers and the use of Secure Sockets Layer (SSL). Unicenter monitors Web server performance, storage management, and event management. For Web management, Web Traffic Analysis is offered.
IBM

IBM offers DB2 Universal Database which, when integrated with WebSphere Application Server, is a complete package for designing, developing, debugging, and deploying e-business applications. To enable an open environment, WebSphere supports XML and JDBC.

"Our focus here is to provide DB2 as the infrastructure for business transformation to e-business. Operating over the Internet requires high availability, reliable and scalable performance, and seamless access to enterprise information. That's what we're delivering." -- Janet Perna, General Manager of Data Management / IBM Software Solutions

Other products from IBM are Net.Data, for connectivity to corporate data; VisualAge, for developing Java applets; and Web Control Center, for remote management of multiple DB2 servers. IBM's approach to nontraditional data is to provide links to it through the database. DB2 Universal Database Extenders are used for complex types of data, which can be manipulated through an SQL query.

Informix

Informix recently reorganized its corporate structure. It is vying to be a force in the data warehouse market after purchasing Red Brick and creating two divisions: a Data Warehousing and Web Division and an E-Commerce Division. Informix relies on third-party development for products and tools. The Informix Dynamic Server provides BLOB support and security measures. Recently, its Universal Server (relational) was merged with Illustra's database (object-oriented). This will enable the use of DataBlade modules for multimedia data access. DataBlades will have 25
more modules added soon, and users can also develop their own formats. For management of databases, Enterprise Command Center can be used.

Oracle

Of the vendors reviewed, Oracle is doing the most towards enabling its products for the Web. Oracle is currently transforming all of its database and development tool products to be Web-enabled, and is tying everything that it does to the Internet. Oracle8i is the cornerstone of Oracle's Web-based initiatives. Oracle8i offers significant improvements in very large database support, including increased scalability, robust data partitioning, and enhanced availability. The enhancements contained within Oracle8i will spur more companies to deploy databases on the Web. Oracle's chairman Larry Ellison stated, "It's not just a database, it's a complete unified extensible platform for running applications and distributed applications worldwide -- where the only things you need are Oracle8i, a browser, and nothing else" (Cornetto). Oracle8i will incorporate the following technologies:

- Java Virtual Machine (JVM)
- Support for Java stored procedures
- Hypertext Transfer Protocol (HTTP)
- Web development tools
- Internet File System (iFS)
- XML support

Oracle8i has been specifically designed for Internet computing by moving applications back onto the server while distributing the access to data. With Oracle8i, companies will be able to reconsolidate their data while maintaining and even extending data access through the Web (Rosen).

Oracle8i is an object-relational database. The object-relational model allows developers to create business objects and store these objects within the database in a relational form. Oracle8i is intended to simplify the development and deployment of Internet-based applications.
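The idea of storing business objects in a relational form can be sketched as follows. This is an illustration only, written in Python with the built-in sqlite3 module rather than Oracle8i; the `Customer` class, the `customer` table, and the `save` and `load` helpers are hypothetical names chosen for the example and do not represent Oracle's object-relational API.

```python
import sqlite3
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass
class Customer:
    """A hypothetical business object with three attributes."""
    customer_id: int
    name: str
    region: str


def save(conn: sqlite3.Connection, customer: Customer) -> None:
    """Store the object as one relational row, one column per attribute."""
    conn.execute("INSERT INTO customer VALUES (?, ?, ?)", astuple(customer))
    conn.commit()


def load(conn: sqlite3.Connection, customer_id: int) -> Optional[Customer]:
    """Rebuild the object from its row, or return None if it is absent."""
    row = conn.execute(
        "SELECT customer_id, name, region FROM customer WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return Customer(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer "
    "(customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
)
original = Customer(1, "Acme Ltd", "East")
save(conn, original)
restored = load(conn, 1)
missing = load(conn, 99)
```

Because each attribute lands in its own column, the stored objects remain reachable through ordinary SQL, which is the practical appeal of keeping objects in relational form.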
Oracle8i will allow developers to use Java for database programming rather than requiring the use of Oracle's PL/SQL language. In addition, Oracle's transaction processing, data warehousing, mobile computing, and availability features have been expanded in Oracle8i. The Java support within Oracle8i will enable Java applications to query Oracle databases and will support the deployment of these applications on any kind of client. Oracle states that its Java support is the first database-resident JVM. Java applications will be able to run inside the Oracle8i environment. The JVM will allow developers to extend the functionality of Oracle8i by building components that run inside the database engine.

Internet File System (iFS) allows users to drag and drop data from Windows-based applications into the database, where it can be indexed, managed, and queried (Foley). As a result, a single query could span e-mail, text files, and the data warehouse. Proprietary file formats such as Word and Excel can be imported into the database, where they can subsequently be viewed using a Web browser.

Oracle announced that it would support XML in Oracle8i, Oracle Application Server, and its development tools. Oracle8i will include an XML parser for processing XML documents, and iFS will automate rendering of data from XML and the database. The XML support within Oracle8i and iFS will allow for the support of non-relational data files. Oracle's XML support will be made up of three components (Walsh):

- An XML parser to process the XML document
- Internet File System (iFS) for parsing and rendering the XML to the database
- interMedia to provide better search tools for processing XML documents

Oracle Application Server 4.0 is a Web development platform. Oracle's Web application server aims to create a live link between back-end data and a browser-enabled client so that users can query and scroll through databases (Greenberg).
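The "live link" between back-end data and a browser can be pictured as a middle tier that runs a query and renders one page of results as HTML. The sketch below is a minimal illustration of that idea using Python's standard library and sqlite3; it is not Oracle Application Server code, and the `sales` table, its columns, and the `render_page` function are assumptions made for the example.

```python
import sqlite3
from html import escape


def render_page(conn: sqlite3.Connection, page: int, page_size: int = 2) -> str:
    """Run a query for one page of rows and return it as an HTML table."""
    cursor = conn.execute(
        "SELECT region, total FROM sales ORDER BY region LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    )
    # Column names come from the cursor so the header follows the query.
    header = "".join(f"<th>{escape(col[0])}</th>" for col in cursor.description)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(value))}</td>" for value in row) + "</tr>"
        for row in cursor
    )
    return f"<table><tr>{header}</tr>{body}</table>"


# Hypothetical sample data standing in for the back-end database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 120), ("North", 75), ("South", 90), ("West", 60)],
)
conn.commit()

first_page = render_page(conn, page=0)
second_page = render_page(conn, page=1)
empty_page = render_page(conn, page=2)
```

Requesting successive pages is what lets a browser user scroll through a result set without the whole table being sent at once. Values are escaped before being placed in the markup so that stored data cannot inject HTML.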
Oracle Application Server 4.0 supplies a mechanism for storing your application logic along with the underlying communications that allow client workstations to interface with this application logic. In addition, Oracle Application Server 4.0 offers services that simplify the development of Web-enabled applications.

An additional database development tool offered by Oracle is JDeveloper 2.0. JDeveloper is a development tool set for building multi-tier database applications using Java. JDeveloper will offer tight integration with Oracle8i and Oracle Application Server. In addition, JDeveloper will support Java Database Connectivity (JDBC), JavaBeans, Enterprise JavaBeans, and CORBA.

Oracle recently announced a new product called Warehouse Builder 2.0, a Java-based data warehouse design product. Warehouse Builder includes visual modeling, design, aggregation, metadata management, data extraction, and loading capabilities. Using graphical wizards, Warehouse Builder supports the designing, creating, and managing of all data warehouse activities. The wizards reduce the amount of code that developers have to write to build the data warehouse and should reduce the cost of the implementation. Warehouse Builder is targeted for release in early 1999. To help smaller companies, Oracle will offer Data Mart Suite 2.0, a turnkey package for building data marts.

Oracle President and Chief Operating Officer Ray Lane drove home the message that Oracle's products provide the foundation for an Internet platform on which businesses can deploy Internet-based applications and services (Niccolai). In support of this message, Oracle is offering Web-enabled updates to the following products:

- Oracle Reports
- Oracle Discoverer
- Oracle Express (OLAP engine)

Oracle is even considering bypassing the operating system by running its database on top of a microkernel. This would in effect eliminate the need for an underlying operating system
such as Unix or Windows NT. Oracle feels that for some applications the operating system even hampers the performance of the system. Oracle has approached Compaq, Hewlett-Packard, Intel, and Sun about building a microkernel database server, which it calls Raw Iron (Johnston).

Interestingly, nine of the top ten business sites on the Internet use Oracle databases. The tenth business site is IBM, which uses its own DB2 database. The businesses that are using Oracle products include Amazon.com Inc. and E*Trade Group Inc. Oracle is placing all bets on the Internet. Larry Ellison stated, "If the Internet turns out not to be the future of computing, we're toast" (Foley).

Sybase

Sybase products function independently or together with products from Sybase and other vendors, allowing organizations to build flexible, open information systems (Sybase). The goal is to supply information to users when they need it, however they want it, and wherever they are. Currently, Sybase has three business objectives:

- On-line Transaction Processing (OLTP)
- Data warehouse and decision support systems
- Mass deployment

Sybase Adaptive Server Enterprise 11.9.2 is the product targeted at OLTP systems. Version 11.9.2 is to include improved query optimization to support better performance in distributed environments, along with new recovery capabilities that will maximize database availability.

Sybase Adaptive Server IQ is designed for high-performance data analysis and is targeted at data warehousing. Adaptive Server IQ supports access to heterogeneous data sources such as Oracle, DB2, Informix, and flat files.
The mass deployment strategy is focused not so much on the Internet as on personal computers and personal digital assistants (PDAs). The SQL Anywhere product line is geared towards this market. Sybase is trying to address offline data access to support a more mobile workforce. The mass deployment strategy targets workers who are geographically dispersed, on the road, and at customer locations. The key to supporting this new workforce is to enable information access without requiring the user to connect to a network. Sybase feels that users will gain greater accessibility to their data and will see a performance increase because they are not limited by modem or network speeds. The ability for users to operate offline is made possible through data replication, so the SQL Anywhere products have strong replication capabilities. Through replication, users can retrieve information from a central location and, at the end of the day, forward updates to the central location.

Sybase is enhancing its Adaptive Server Enterprise product lines to better support the integration of data from disparate data sources. The following products are being offered by Sybase to improve database integration:

- EnterpriseConnect
- DirectConnect
- Replication Server
- MainframeConnect

To aid in the development of a data warehouse, Sybase offers Warehouse Studio. Warehouse Studio includes the following products that can aid in the development of a data warehouse or data mart:

- WarehouseArchitect for design
- PowerStage for integration
- Adaptive Server for DBMS support
- PowerDimensions for visualization
- Warehouse Control Center for metadata management

Sybase offers jConnect, a JDBC driver that can be used to connect to Sybase data sources. The driver is written entirely in Java and is fully JDBC compliant. jConnect will enable Java applications to easily connect to Sybase data sources. In addition, the JDBC driver eliminates any client workstation software installation: the driver will be automatically downloaded once the data source is referenced.

Sybase also offers a product called Dynamo that acts as a middle-tier link between Web servers and ODBC-compliant database management systems. Dynamo converts requests received from the Web server into SQL and passes the SQL on to the DBMS. Once the query is satisfied, Dynamo converts the results into HTML and returns the information to the Web server. Dynamo is a technology that can bring many data sources together through a common access mechanism.

Microsoft

Recently Microsoft has received an enormous amount of press for its latest database management system release, SQL Server 7.0, with the primary focus being on the integrated OLAP server code-named Plato. Microsoft is going to bring OLAP to the masses and could drive the acceptance of OLAP within the business community. Microsoft plans to overcome the greatest barriers to OLAP implementation: cost of systems and complexity of development and administration (Angus). With OLAP functions so tightly tied in with the relational back end, administrators will be able to provide usable summary reporting capabilities to users.

OLAP-specific vendors will be affected the most by Microsoft's entry into the OLAP market. Microsoft's online analytical processing (OLAP) services may be the most
important development to happen to the OLAP market since its inception (Cornetto). What Microsoft offers is the ability to sell in volume at a lower price than its competitors IBM and Oracle, which will make OLAP available to small and midsize organizations. Microsoft's SQL Server 7.0 will change the landscape of data warehousing, bringing it within the reach of more IT organizations (Schwartz).

Another possible use of Microsoft's OLAP server is to install it in conjunction with another DBMS. The OLAP service is a separate component that can support other industry DBMSs.

SQL Server also offers better performance than its predecessor. Queries execute up to ten times faster than in previous versions, and batch performance is approximately four times faster. SQL Server is offering better usability, performance, scalability, and reliability. With this release, SQL Server 7.0 is also capable of managing up to one terabyte of data.

On the other hand, SQL Server does not support database clustering, object database design, or Java database access and programming. Both Oracle8 and DB2 support these features. In addition, SQL Server 7 will likely not match IBM, Informix, Oracle, and Sybase for their enterprise-level functions and performance, even in the economy end of the market (Surveyer).

For data transformation, Microsoft offers Data Transformation Services (DTS). DTS is used to import and export data between SQL Server and other data formats. DTS helps to bring data from alternative databases and data sources into the data warehouse or data mart. Microsoft also offers Repository 2.0 for metadata management.

Microsoft's strategy is to reduce the cost and complexity of data warehousing while making the technology accessible to a wider audience. "Our comprehensive approach to the process of data warehousing enables customers to build cost-effective solutions via a combination of vendor alliances, services and technologies" (SQL).
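The kind of import step that a data transformation tool performs can be sketched in a few lines. The example below is an assumption-laden illustration in Python using the standard csv and sqlite3 modules, not the DTS interface: the `orders` table, the sample CSV content, and the `import_orders` function are all hypothetical. It reads records from a flat-file format, converts each field to the warehouse column type, rejects records that fail conversion, and loads the rest.

```python
import csv
import io
import sqlite3
from typing import Tuple

# Hypothetical flat-file extract; the second record has a bad amount.
SOURCE_CSV = """order_id,region,amount
1,East,100.50
2,West,not-a-number
3,East,49.50
"""


def import_orders(conn: sqlite3.Connection, csv_text: str) -> Tuple[int, int]:
    """Load CSV records into the orders table; return (loaded, rejected)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    loaded = 0
    rejected = 0
    for record in csv.DictReader(io.StringIO(csv_text)):
        try:
            # Transformation: text fields become typed warehouse values.
            row = (
                int(record["order_id"]),
                record["region"].strip(),
                float(record["amount"]),
            )
        except ValueError:
            rejected += 1  # Skip records that cannot be converted.
            continue
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)", row)
        loaded += 1
    conn.commit()
    return loaded, rejected


conn = sqlite3.connect(":memory:")
loaded, rejected = import_orders(conn, SOURCE_CSV)
total_east = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE region = 'East'"
).fetchone()[0]
```

Counting rejected records rather than failing the whole load is one possible design choice; a production transformation step would typically also log the rejected rows for later correction.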
SQL Server is tightly integrated with Microsoft's Internet Information Server, Site Server, and Proxy Server. Microsoft's Internet strategy is based upon this tight integration. To ensure that products work with SQL Server 7.0, Microsoft has formed a Data Warehouse Alliance with vendors. This alliance is intended to ensure that products that work with SQL Server 7.0 also work with each other. Microsoft currently does not support embedded Java or native JDBC drivers.