A Migration Methodology for Transferring Database Structures and Data

Database migration is occasionally needed when copying the contents of a database, or a subset of it, to another DBMS instance, perhaps due to a change of DBMS product or edition, business restructuring, a company merger, or outsourcing. Often part of the contents of a new database can be copied from some existing data source. Copying contents can also be a regularly repeated operation, as in replication or in the extract-transform-load (ETL) collection of data into a data warehouse, for which an ETL tool is typically used. For one-time migrations some advanced tools may be available, but the generic tools do not always support the required new DBMS versions. Instead of presenting migration or ETL tools, in this tutorial we focus on the technical parts of an export/import operation using SQL, typical basic utilities, and some sample Java code of our own, which introduces a not-so-typical use of JDBC metadata.

For loading existing data from other data sources, DBMS products typically include tools, such as the LOAD and IMPORT utilities of DB2, the BCP utility and IMPORT wizards of SQL Server, or SQL*Loader of Oracle. Often some application programs are written to sort out mapping problems between the structures in the data source and in the destination database.

The Migration Methodology

[Figure 1: workflow diagram of the methodology. Transfer of structures/objects: 1. reverse engineering into a logical model (types, table definition, constraints, views, procedures, triggers, XML Schemas, indexes); 2. select the destination DBMS and generate DDL scripts of the physical model; 3. create the table in an SQL editor; 9. apply constraints and indexes, migrate types, procedures, triggers. Transfer of data: 4. create a (filtering) view in the source; 5. exportdata writes the data to files in the \transfer directory (<schema.table>.del, xml, clob, blob); 6. zip the files; 7. transport and unzip the files into the \transfer directory of the destination; 8. importdata loads the data into the destination.]

Figure 1 Transfer of database structures/objects and data into another database

Figure 1 presents the migration methodology of our case study: the migration steps in copying a single table from a source database in an SQL Server instance into an existing destination database in a DB2 instance.
Steps 1-2. Using some more advanced tool, such as Oracle Data Modeler, we could use reverse engineering to first build a logical model of the source database and, after selecting the destination DBMS product, edition, and current version, generate the proper DDL commands for re-creating the data types and the table, with ALTER TABLE commands for constraints, views, indexes, stored procedures, and triggers. Using a reverse engineering tool, in our case SQL Server Management Studio, we copy the selected table structure with its related structures, such as user-defined data types (UDTs), constraints, views, procedures, triggers, and indexes, as DDL commands in DDL script files, which we then need to rewrite manually in the SQL DDL dialect of the destination DBMS product.

Step 3. As the first operation in the destination database we create the table, including its primary key, unique, and check constraints. If we need a clustered index for the table, this is the time to create it. In SQL Server, a clustered index is generated automatically for the primary key by default.

Step 4. If we need to make changes in the table columns, or to collect and join data from multiple tables, we create a temporary filtering view over the table(s), and in step 5 we use this view instead of the actual base table. In Oracle, the SELECT statement of the view can also define the order of the exported data.

Step 5. The data contents of the selected table/view are exported using the Java program "exportdata" into a comma-delimited data file named "<schema>.<table>.del", in which data of non-numeric columns is enclosed in quotes. Contents of LOB columns, such as XML, CLOB, and BLOB, are written into separate data files (LOB files), which are referenced by XDS elements of the form "<XDS FIL='<column>.<LOB type>' />" in the DEL data file. We have adopted the format of the DEL data file from DB2. All these files are generated in the local transfer directory.

Steps 6-7.
If the destination database is on a remote server, we ZIP all generated files into a ZIP archive, which we transfer and UNZIP into the corresponding transfer directory on the destination server.

Step 8. Using the "importdata" program, the contents of the DEL data file and the referenced LOB files are loaded into the table created in the destination database. Successful transfer of the data needs to be verified at least by comparing the numbers of rows, and the numbers of non-null values per column, between the source and the destination database, and possibly also sums of values in some numeric columns. A selected sample of rows also needs to be compared manually. If we need to transfer multiple tables, it is recommended that transferred LOB files are deleted after each successful transfer. Steps 1-8 are then repeated for every table to be transferred.

Step 9. After all necessary tables have been transferred to the destination database, all relevant constraints are added using the ALTER TABLE commands generated (and possibly modified) in step 2, and all necessary UDTs, user-defined functions, views, indexes, stored procedures, and triggers are created. Note that these may first need to be modified manually for the SQL syntax and object types of the destination DBMS. The same applies to XML indexes and XML Schema management solutions, since the XML implementations of the mainstream DBMS products are very different, as pointed out in our XML and Databases tutorial at http://www.dbtechnet.org/papers/xmlanddatabasestutorial.pdf.
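The row-count and non-null checks of step 8 can be generated mechanically. The following is a minimal sketch of the idea (the class and method names are our own, not part of the exportdata/importdata programs): the generated query is run against both the source and the destination, and the single result rows are compared.

```java
import java.util.List;

public class VerifyCounts {
    // Builds a step 8 verification query for one table: the total row
    // count followed by the count of non-null values in each listed
    // column. Comparing the result rows from the source and the
    // destination catches lost rows and lost column values.
    static String countsQuery(String schemaTable, List<String> columns) {
        StringBuilder sb = new StringBuilder("SELECT COUNT(*)");
        for (String c : columns) {
            sb.append(", COUNT(").append(c).append(")");
        }
        return sb.append(" FROM ").append(schemaTable).toString();
    }
}
```

For example, `countsQuery("Production.Products", List.of("Description", "Instructions"))` yields `SELECT COUNT(*), COUNT(Description), COUNT(Instructions) FROM Production.Products`.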
Finally, before the QA test of the application, we need to update the statistics of the imported tables and indexes.

Note: If we want to load the data in a special order, supporting for example a clustering index of the table, we need to make some special arrangements, since the DEL file structure, with its variable-length fields, does not support any sort processing. If the source database is Oracle, we can sort the data using an ORDER BY clause in the filtering view of step 4. A more general solution is to copy the contents of the original table into a new temporary table in the source database using the following type of INSERT command

INSERT INTO <temptable> SELECT * FROM <sourcetable> ORDER BY <clusteringkey>

In SQL Server the temporary table can be built in the same command as follows

SELECT * INTO <temptable> FROM <sourcetable> ORDER BY <clusteringkey>

SQL Server to DB2 Case Study

In this case study we export just one table from the AdventureWorks2008 sample database of SQL Server 2008 and import it into a DB2 Express-C database named TEST. We will transfer the table without major changes in its structure and contents. We apply the methodology in a different order, starting from step 4.

Step 4: To keep the case simple, but still include some LOB examples, we create in the AdventureWorks2008 database a temporary schema, view, and table as follows:

CREATE SCHEMA Temp;

CREATE VIEW Temp.ProductSample AS
SELECT P.ProductID, P.Name, P.ProductNumber, P.ListPrice,
       P.ProductModelID, P.SellStartDate,
       CAST(PM.CatalogDescription AS NVARCHAR(MAX)) AS Description,
       PM.Instructions, PP.ThumbNailPhoto
FROM Production.Product P
LEFT JOIN Production.ProductModel PM
       ON P.ProductModelID = PM.ProductModelID
LEFT JOIN (Production.ProductProductPhoto PPP
           JOIN Production.ProductPhoto PP
             ON PPP.ProductPhotoID = PP.ProductPhotoID)
       ON P.ProductID = PPP.ProductID
WHERE P.ProductID IN (740, 771, 850, 886);

We cast the CatalogDescription column to NVARCHAR(MAX) to include also some "CLOB" data in our test.
The following command creates for us a new, empty table whose structure maps the original data types of the view Temp.ProductSample, from which we will export the test data:
SELECT * INTO Temp.Products FROM Temp.ProductSample WHERE ProductID IS NULL;

Let's see what kind of rows we get as test data from our view to be migrated to DB2:

SELECT * FROM Temp.ProductSample;

Step 5: We now export the sample data using the exportdata program, entering our parameter values as Windows environment variables and then passing the values as command line parameters as follows:

C:\WORK\transfer>SET DRIVER="com.microsoft.sqlserver.jdbc.SQLServerDriver"
C:\WORK\transfer>SET URL="jdbc:sqlserver://SERVER1;DatabaseName=AdventureWorks2008"
C:\WORK\transfer>SET USER="user1"
C:\WORK\transfer>SET PSW="salasana"
C:\WORK\transfer>SET SCHEMA_TABLE="Temp.ProductSample"
C:\WORK\transfer>SET CLASSPATH=.;sqljdbc4.jar
C:\WORK\transfer>java exportdata %DRIVER% %URL% %USER% %PSW% %SCHEMA_TABLE%
exportdata Version 0.6
sql=select * FROM Temp.ProductSample
colcount=9
i=1 name=productid type=4 typename=int
i=2 name=name type=-9 typename=nvarchar
i=3 name=productnumber type=-9 typename=nvarchar
i=4 name=listprice type=3 typename=money
i=5 name=productmodelid type=4 typename=int
i=6 name=sellstartdate type=93 typename=datetime
i=7 name=description type=-9 typename=nvarchar
i=8 name=instructions type=-16 typename=xml
i=9 name=thumbnailphoto type=-3 typename=varbinary
processing the rows..
740,"HL Mountain Frame - Silver, 44","FR-M94S-44",1364.5000,5,"2001-07-01",,,"<XDS FIL='ThumbNailPhoto.del.001.BLOB' />"
771,"Mountain-100 Silver, 38","BK-M82S-38",3399.9900,19,"2001-07-01","<XDS FIL='Description.del.002.CLOB' />",,"<XDS FIL='ThumbNailPhoto.del.002.BLOB' />"
850,"Men's Sports Shorts, L","SH-M897-L",59.9900,13,"2002-07-01",,,"<XDS FIL='ThumbNailPhoto.del.003.BLOB' />"
886,"LL Touring Frame - Yellow, 62","FR-T67Y-62",333.4200,10,"2003-07-01",,"<XDS FIL='Instructions.del.004.XML' />","<XDS FIL='ThumbNailPhoto.del.004.BLOB' />"
Total number of rows: 4
closing..
C:\WORK\transfer>

The program first lists the JDBC metadata for every column of the selected table/view and then echoes at most the first nine data lines written into the generated DEL (comma-delimited) file.

Note: Based on the JDBC metadata it is not possible to decide which columns contain derived data and which persistent data, nor which columns are based on user-defined types (UDTs). The only way to avoid derived columns is to eliminate them in the filtering view of step 4. In some cases we also need to test both the type code and the type name to decide how the program should handle the content.
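The last point can be illustrated with a sketch of the kind of type-based decisions such an export program must make. The class, the method names, and the exact rules below are our own illustration, not the actual exportdata code:

```java
import java.sql.Types;

public class DelTypeRules {

    // In the DEL format, numeric values are written bare and all other
    // values are enclosed in double quotes.
    static boolean needsQuotes(int jdbcType) {
        switch (jdbcType) {
            case Types.TINYINT: case Types.SMALLINT: case Types.INTEGER:
            case Types.BIGINT: case Types.DECIMAL: case Types.NUMERIC:
            case Types.REAL: case Types.FLOAT: case Types.DOUBLE:
                return false;
            default:
                return true;
        }
    }

    // LOB columns are exported into separate files referenced by
    // <XDS FIL='...' /> elements. The JDBC type code alone is not always
    // enough: the SQL Server driver reports the xml column above as
    // type -16 (Types.LONGNVARCHAR), so the type name must be tested too.
    static boolean isLobColumn(int jdbcType, String typeName) {
        switch (jdbcType) {
            case Types.BLOB: case Types.CLOB: case Types.NCLOB:
            case Types.SQLXML: case Types.LONGVARBINARY:
                return true;
            default:
                return "xml".equalsIgnoreCase(typeName);
        }
    }

    // Formats one DEL field: NULL becomes an empty field, and quoted
    // values have embedded double quotes doubled.
    static String delField(Object value, boolean quoted) {
        if (value == null) return "";
        String s = value.toString();
        return quoted ? "\"" + s.replace("\"", "\"\"") + "\"" : s;
    }
}
```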
Step 1: We now apply step 1 using SQL Server Management Studio (SSMS), right-clicking the source database and selecting Generate Scripts.. from the pop-up menu, as presented in figure 2.

Figure 2 Start reverse engineering of the table/view structure

Figure 3 Selecting the table
Figure 4 Selecting the script file

Clicking Next until we reach the Finish button, we finally get the following DDL script file for the table:

USE [AdventureWorks2008]
/****** Object: Table [Temp].[Products] Script Date: 05/23/2011 17:54:53 ******/
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
SET ANSI_PADDING ON
CREATE TABLE [Temp].[Products](
    [ProductID] [int] NOT NULL,
    [Name] [dbo].[name] NOT NULL,
    [ProductNumber] [nvarchar](25) NOT NULL,
    [ListPrice] [money] NOT NULL,
    [ProductModelID] [int] NULL,
    [SellStartDate] [datetime] NOT NULL,
    [Description] [nvarchar](max) NULL,
    [Instructions] [xml] (CONTENT [Production].[ManuInstructionsSchemaCollection]) NULL,
    [ThumbNailPhoto] [varbinary](max) NULL
) ON [PRIMARY]
SET ANSI_PADDING ON

Step 2: The generated DDL commands, in the Transact-SQL language and data types, cannot be processed by other mainstream DBMS products, so we need to modify the script manually, for example into the following form, in which we have eliminated the extra commands and the square brackets surrounding the identifiers and data types. For the non-standard data types we have tried to select corresponding standard data types which are compatible with the SQL implementation of DB2. To point out some differences between Transact-SQL and
the SQL dialect of DB2, we have written the new data types in upper case letters and left some of the original clauses as line comments. In the following we have also changed the schema and the table name.

Note: The Name column in our test is originally of the UDT dbo.name, based on the built-in data type VARCHAR(50); if we don't migrate this UDT into the destination database, we need to change the data type manually into the corresponding built-in data type. If the table to be migrated contains IDENTITY columns or derived columns, these will have values in the exported data files, so we need to eliminate the IDENTITY definitions from the new table definition and treat the derived columns as persistent data columns (which can be changed back after the migration process).

CREATE TABLE Production.Products(
    ProductID int NOT NULL,
    name VARCHAR(50) NOT NULL,
    ProductNumber VARCHAR(25) NOT NULL,
    ListPrice DECIMAL(7,4) NOT NULL,
    ProductModelID int, -- NULL,
    SellStartDate DATE NOT NULL,
    Description CLOB, -- NULL,
    Instructions XML, -- (CONTENT Production.ManuInstructionsSchemaCollection) NULL,
    ThumbNailPhoto BLOB -- NULL
) ;

In our case study we create the table in the TEST database of our local DB2 Express-C instance and COMMIT the transaction.

Step 8: We copy the files from step 5 into a local directory and rename Temp.ProductSample.DEL as Production.Products.DEL.
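Looking back at step 2, the type substitutions of this case study can be collected into a small lookup table. The following is only a sketch (the class name and the idea of automating the mapping are ours; the target lengths and precisions were chosen for this sample data and still need per-column judgment in a real migration):

```java
import java.util.Map;

public class TypeMap {
    // SQL Server -> DB2 substitutions used in this case study.
    static final Map<String, String> SQLSERVER_TO_DB2 = Map.of(
        "int",            "INTEGER",
        "dbo.name",       "VARCHAR(50)",  // UDT replaced by its base type
        "nvarchar(25)",   "VARCHAR(25)",
        "money",          "DECIMAL(7,4)", // sufficient for the sample data only
        "datetime",       "DATE",         // time of day dropped in this case
        "nvarchar(max)",  "CLOB",
        "xml",            "XML",          // schema collection clause dropped
        "varbinary(max)", "BLOB"
    );

    // Unknown types are passed through unchanged for manual review.
    static String toDb2(String sqlServerType) {
        return SQLSERVER_TO_DB2.getOrDefault(sqlServerType.toLowerCase(), sqlServerType);
    }
}
```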
Then we import the data into the Production.Products table in the DB2 database using the following script:

SET DRIVER="com.ibm.db2.jcc.DB2Driver"
SET URL="jdbc:db2://localhost:50000/TEST"
SET USER="user1"
SET PSW="salasana"
SET SCHEMA_TABLE="Production.Products"
SET CLASSPATH=.;C:\IBM\SQLLIB\java\db2jcc4.jar
java importdata %DRIVER% %URL% %USER% %PSW% %SCHEMA_TABLE%

C:\work\transfer>java importdata %DRIVER% %URL% %USER% %PSW% %SCHEMA_TABLE%
importdata Version 0.7
Import of rows to jdbc:db2://localhost:50000/test
Database product name: DB2/NT
Database product version: SQL09070
sql=select * FROM Production.Products
colcount=9
column#=1 name=productid type=4 typename=integer
column#=2 name=name type=12 typename=varchar
column#=3 name=productnumber type=12 typename=varchar
column#=4 name=listprice type=3 typename=decimal
column#=5 name=productmodelid type=4 typename=integer
column#=6 name=sellstartdate type=91 typename=date
column#=7 name=description type=2005 typename=clob
column#=8 name=instructions type=2009 typename=xml
column#=9 name=thumbnailphoto type=2004 typename=blob
processing the rows of Production.Products.DEL
Total number of rows: 4
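During the import, the <XDS FIL='...' /> references seen in the exported DEL lines must be distinguished from ordinary field values so that the named LOB files can be streamed into the LOB columns. A minimal sketch of recognizing such a reference (the class and method names are our own, not the actual importdata code):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XdsRef {
    // A DEL field either holds the value itself or an <XDS FIL='...' />
    // reference to a separate LOB file generated by the export.
    private static final Pattern XDS =
        Pattern.compile("<XDS FIL='([^']+)'\\s*/>");

    // Returns the referenced LOB file name, or null for an ordinary value.
    static String lobFileName(String field) {
        Matcher m = XDS.matcher(field);
        return m.matches() ? m.group(1) : null;
    }
}
```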
To verify the migration, we look at the table contents in the DB2 TEST database in the following forms:

Fig 5 Part of the scalar, CLOB and XML columns

Fig 6 The BLOB columns
Fig 7 Result seen in the grid format of the Command Editor, and an XML document as seen by the XML Document Viewer

We have also tested the migration into another SQL Server database, importing into a remote TEST database in an SQL Server instance using the following script

SET DRIVER="com.microsoft.sqlserver.jdbc.SQLServerDriver"
SET URL="jdbc:sqlserver://SERVER1;DatabaseName=TEST"
SET USER="user1"
SET PSW="salasana"
SET SCHEMA_TABLE="Production.Products"
SET CLASSPATH=.;sqljdbc4.jar
java importdata %DRIVER% %URL% %USER% %PSW% %SCHEMA_TABLE%

into the table Production.Products created as

CREATE TABLE Production.Products(
    ProductID int NOT NULL,
    name VARCHAR(50) NOT NULL,
    ProductNumber VARCHAR(25) NOT NULL,
    ListPrice DECIMAL(7,4) NOT NULL,
    ProductModelID int, -- NULL,
    SellStartDate DATE NOT NULL,
    Description VARCHAR(MAX), -- NULL,
    Instructions XML, -- (CONTENT Production.ManuInstructionsSchemaCollection) NULL,
    ThumbNailPhoto VARBINARY(MAX) -- NULL
) ;
The resulting contents are shown in Figure 8.

Figure 8

We have also exported the imported contents from DB2 using the following run

SET DRIVER="com.ibm.db2.jcc.DB2Driver"
SET URL="jdbc:db2://localhost:50000/TEST"
SET USER="user1"
SET PSW="salasana"
SET SCHEMA_TABLE="Production.Products"
SET CLASSPATH=.;C:\IBM\SQLLIB\java\db2jcc4.jar
java exportdata %DRIVER% %URL% %USER% %PSW% %SCHEMA_TABLE%

and, using the following commands, imported the contents into a local SQL Server 2005 database, in which the DATE column was defined as DATETIME:

SET DRIVER="com.microsoft.sqlserver.jdbc.SQLServerDriver"
SET URL="jdbc:sqlserver://localhost;instanceName=SQL2005;DatabaseName=TEST"
SET USER="user1"
SET PSW="salasana"
SET SCHEMA_TABLE="Production.Products"
SET CLASSPATH=.;sqljdbc4.jar
java importdata %DRIVER% %URL% %USER% %PSW% %SCHEMA_TABLE%

C:\WORK\transferDB2>java importdata %DRIVER% %URL% %USER% %PSW% %SCHEMA_TABLE%
importdata Version 0.7
Import of rows to jdbc:sqlserver://localhost;instancename=sql2005;databasename=test
Database product name: Microsoft SQL Server
Database product version: 9.00.3042
sql=select * FROM Production.Products
colcount=9
column#=1 name=productid type=4 typename=int
column#=2 name=name type=12 typename=varchar
column#=3 name=productnumber type=12 typename=varchar
column#=4 name=listprice type=3 typename=decimal
column#=5 name=productmodelid type=4 typename=int
column#=6 name=sellstartdate type=93 typename=datetime
column#=7 name=description type=-1 typename=varchar
column#=8 name=instructions type=-16 typename=xml
column#=9 name=thumbnailphoto type=-4 typename=varbinary
processing the rows of Production.Products.DEL
Total number of rows: 4

and the result is exactly the same as in Figure 8.