An Oracle White Paper August 2013. Express Mode Loading with SQL*Loader in Oracle Database 12c



Similar documents
An Oracle White Paper July Introducing the Oracle Home User in Oracle Database 12c for Microsoft Windows

An Oracle White Paper June Creating an Oracle BI Presentation Layer from Imported Oracle OLAP Cubes

An Oracle White Paper June RESTful Web Services for the Oracle Database Cloud - Multitenant Edition

October Oracle Application Express Statement of Direction

A Joint Oracle Teradata White Paper March Configuring Oracle Business Intelligence Enterprise Edition for Teradata Temporal Tables

How To Load Data Into An Org Database Cloud Service - Multitenant Edition

An Oracle White Paper June High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

G Cloud 7 Pricing Document

G Cloud 7 Pricing Document

An Oracle White Paper November Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

March Oracle Business Intelligence Discoverer Statement of Direction

Driving the Business Forward with Human Capital Management. Five key points to consider before you invest

Migrating Non-Oracle Databases and their Applications to Oracle Database 12c O R A C L E W H I T E P A P E R D E C E M B E R

An Oracle White Paper January Oracle Database 12c: Full Transportable Export/Import

An Oracle White Paper January Using Oracle's StorageTek Search Accelerator

An Oracle White Paper June Migrating Applications and Databases with Oracle Database 12c

THE NEW BUSINESS OF BUSINESS LEADERS. Hiring and Onboarding

An Oracle White Paper June Security and the Oracle Database Cloud Service

Oracle Utilities Customer Care and Billing Release Utility Reference Model Process Customer Request For Literature and Forms

An Oracle Technical Article November Certification with Oracle Linux 6

A Framework for Implementing World-Class Talent Management. The highest performing businesses are re-focusing on talent management

Oracle s BigMachines Solutions. Cloud-Based Configuration, Pricing, and Quoting Solutions for Enterprises and Fast-Growing Midsize Companies

An Oracle White Paper November Oracle Business Intelligence Standard Edition One 11g

An Oracle White Paper December Cloud Candidate Selection Tool: Guiding Cloud Adoption

An Oracle White Paper September Lowering Storage Costs with the World's Fastest, Highest Capacity Tape Drive

An Oracle White Paper July Oracle Linux and Oracle VM Remote Lab User Guide

Primavera Unifier Integration Overview: A Web Services Integration Approach O R A C L E W H I T E P A P E R F E B R U A R Y

An Oracle Communications White Paper December Serialized Asset Lifecycle Management and Property Accountability

An Oracle Technical Article March Certification with Oracle Linux 7

Incorporating Behavioral Analytics into Exception-Based Reporting O R A C L E W H I T E P A P E R D E C E M B E R

An Oracle White Paper September Oracle WebLogic Server 12c on Microsoft Windows Azure

An Oracle White Paper February, Oracle Database In-Memory Advisor Best Practices

Managed Storage Services

An Oracle White Paper March Oracle Data Guard Broker. Best Practices for Configuring Redo Transport for Data Guard and Active Data Guard 12c

Driving Down the High Cost of Storage. Pillar Axiom 600

One View Report Samples Warehouse Management

An Oracle White Paper February Integration with Oracle Fusion Financials Cloud Service

An Oracle White Paper August Oracle Database Auditing: Performance Guidelines

An Oracle White Paper May Distributed Development Using Oracle Secure Global Desktop

PRODUCT HUB STREAMLINED ITEM BATCH USER INTERFACE DEFINE IMPORT FORMATS FOR SPREADSHEET IMPORT CONSOLIDATION OF DIGITAL ASSETS THROUGH THE ITEM BATCH

FAQ: How to create Effective Messages

Oracle SQL Developer Migration

An Oracle White Paper April, Effective Account Origination with Siebel Financial Services Customer Order Management for Banking

Oracle Taleo Enterprise Cloud Service. Talent Intelligence for Employee Insight

2011 Customer Experience Impact Report. Getting to the Heart of the Consumer and Brand Relationship

An Oracle Technical Article October Certification with Oracle Linux 5

An Oracle White Paper February Oracle Data Integrator 12c Architecture Overview

An Oracle Technical White Paper June Oracle VM Windows Paravirtual (PV) Drivers 2.0: New Features

An Oracle White Paper March Oracle s Single Server Solution for VDI

An Oracle Best Practice Guide April Best Practices for Knowledgebase and Search Effectiveness

An Oracle White Paper November Metrics: What They Are and Why They Matter

An Oracle White Paper October Oracle Data Integrator 12c New Features Overview

An Oracle White Paper May Creating Custom PDF Reports with Oracle Application Express and the APEX Listener

An Oracle White Paper May 2011 BETTER INSIGHTS AND ALIGNMENT WITH BUSINESS INTELLIGENCE AND SCORECARDS

An Oracle White Paper September SOA Maturity Model - Guiding and Accelerating SOA Success

Top Ten Reasons for Deploying Oracle Virtual Networking in Your Data Center

An Oracle White Paper June How to Install and Configure a Two-Node Cluster

An Oracle White Paper October BI Publisher 11g Scheduling & Apache ActiveMQ as JMS Provider

SIX QUESTIONS TO ASK ANY VENDOR BEFORE SIGNING A SaaS E-COMMERCE CONTRACT

OpenLDAP Oracle Enterprise Gateway Integration Guide

Field Service Management in the Cloud

Oracle Financial Management Analytics

Improve your Customer Experience with High Quality Information

An Oracle White Paper September Oracle Database and the Oracle Database Cloud

An Oracle White Paper March Managing Metadata with Oracle Data Integrator

An Oracle White Paper December Tutor Top Ten List: Implement a Sustainable Document Management Environment

An Oracle Technical Article November Certification with Oracle Linux 7

An Oracle White Paper December Integrating Oracle Enterprise Single Sign-On Suite Plus with Strong Authentication

Contract Lifecycle Management for Public Sector A Procure to Pay Management System

April Oracle Higher Education Investment Executive Brief

SOCIAL NETWORKING WITH A PURPOSE. Building Your Referral Machine

One View Report Samples Health and Safety Incident Management

An Oracle White Paper April How to Install the Oracle Solaris 10 Operating System on x86 Systems

The new Manage Requisition Approval task provides a simple and user-friendly interface for approval rules management. This task allows you to:

Oracle Sales Cloud for Consumer Goods

June, 2015 Oracle s Siebel CRM Statement of Direction Client Platform Support

Oracle Fusion Applications Splitting Topology from Single to Multiple Host Servers

Oracle s Primavera Prime Capital Plan Management

Oracle JD Edwards EnterpriseOne Mobile Sales Order Entry

An Oracle White Paper July Corporate Treasury: The Role of Technology during and Post-Crisis

An Oracle White Paper May Oracle Database Cloud Service

January A Better RAID Strategy for High Capacity Drives in Mainframe Storage

An Oracle White Paper August Automatic Data Optimization with Oracle Database 12c

Business Driven Process Optimization

Oracle Fusion Middleware

Oracle Primavera Gateway

An Oracle White Paper September Oracle Database Smart Flash Cache

PeopleSoft Compensation

Oracle Sales Cloud for High Tech and Manufacturing

Unbreakable Linux Network An Overview

Enterprise Manager 10g Backup, Recovery and Disaster Recovery Considerations. An Oracle White Paper March 2004

Oracle Hyperion Financial Close Management

The Yin and Yang of Enterprise Project Portfolio Management and Agile Software Development: Combining Creativity and Governance

Guide to Instantis EnterpriseTrack for Multi- Initiative EPPM:

Microsoft Active Directory Oracle Enterprise Gateway Integration Guide

An Oracle White Paper August Oracle Hyperion Project Financial Planning: Aligning Financial and Project Plans

An Oracle White Paper August Higher Security, Greater Access with Oracle Desktop Virtualization

An Oracle White Paper November Hybrid Columnar Compression (HCC) on Exadata

An Oracle White Paper July Oracle Desktop Virtualization Simplified Client Access for Oracle Applications

Oracle Sales Cloud Analytics

Transcription:

An Oracle White Paper August 2013 Express Mode Loading with SQL*Loader in Oracle Database 12c

Introduction... 3 Benefits of Using Express Mode... 3 Overview of CSV Files... 3 CSV files are text files where every record in the file contains data for one row in a table... 3 Each record ends with a newline... 3 A character, usually a comma, terminates the value for each field in the record... 4 CSV files can have another character that is used to enclose the field values... 4 Use of the enclosure character can be optional or mandatory 4 A field can contain the enclosure character if stutter syntax is used... 4 Loading with Express Mode... 4 Customizing SQL*Loader Express Mode... 5 Changing the Field Terminator... 5 Specifying the Order of the Fields in the Data File... 5 Using a Different Data File... 6 Using an Enclosure Character... 7 Using an Enclosure Character with for Fields Containing NEWLINE7 Other Customizations... 8 SQL*Loader Log File for Express Mode... 9 Conclusion... 9

Introduction A common problem for users of SQL*Loader is generating the control file that SQL*Loader uses to load a data file. While SQL*Loader provides flexibility for loading different types of data files, that flexibility can make creating even simply control files for SQL*Loader more difficult than users expect. The more fields there are in the data file, the more room there is for errors when creating the control file. Luckily, most data files are in comma-separated values (CSV) format. In Oracle Database 12c, SQL*Loader has a new feature called express mode that makes loading CSV files faster and easier. With express mode, there is no need to write a control file for most CSV files you load. Instead, you can load the CSV file with just a few parameters on the SQL*Loader command line. This white paper shows you how to get started with express mode. Benefits of Using Express Mode The main benefit of SQL*Loader express mode is the savings for time and effort that results from not needing to write and test a SQL*Loader control file. Instead, you specify a single SQL*Loader command with a few parameters and the load starts. Another benefit of express mode is that it will try to use the fastest mechanism for loading data files: external tables using parallel inserts with the append hint. The append hint on an insert statement tells the database to use direct path for loading table data. Executing the insert in parallel means that multiple processes can load data from the data file at once, reducing the elapsed time for the load. If express mode finds that external tables cannot be used for a particular load operation, it will fall back to the next fastest mechanism, which is a direct path load in SQL*Loader. Overview of CSV Files Express mode for SQL*Loader loads data only from CSV files. CSV files have the following characteristics: CSV files are text files where every record in the file contains data for one row in a table The fields in each record are in the same order in every record. Each record ends with a newline On Unix-like operating systems, a newline is the linefeed character. On Windows, a newline is a carriage return and a linefeed character. 3

A character, usually a comma, terminates the value for each field in the record Use of a comma is not a requirement. Other characters such as colon or vertical bar are also used commonly as terminators. For simplicity, we still call these files CSV files even if the terminator character is something other than a comma. The terminator after the last field in the record is optional. CSV files can have another character that is used to enclose the field values This character is usually the double quote character, but it can be another character. Enclosing a value allows you to have a newline or the field terminator character included as part of the value of the field. Use of the enclosure character can be optional or mandatory CSV files that use the enclosure character can require them around every field or let the use of enclosures be optional. If the use of enclosure is optional, then they are typically used only for field values that contain a newline or the terminator character. A field can contain the enclosure character if stutter syntax is used If a CSV file uses enclosure characters, and if a field value contains the enclosure character, then the enclosure character in the field value has to be duplicated. This is called stutter syntax. It is a way to distinguish the occurrence of the enclosure character in the middle of a field from an enclosure character marking the end of the field. Loading with Express Mode In Oracle Database 12c, SQL*Loader has a new parameter, TABLE, that turns on express mode. The value of the TABLE parameter is the name of the table that SQL*Loader will load. If TABLE is the only parameter specified, then SQL* loader will do the following: 1. Look for a data file in the current directory with the same name as the table being loaded and with an extension of.dat. The case of the name of the data file is the same as the case of the table name specified for the TABLE parameter 2. Assume the order of the fields in the data file matches the order of the columns in the table 3. The fields are terminated by commas, but there is no enclosure character Here is a simple example of a table, a data file, and the SQL*Loader command that will load the table. The table is created with the following command: CREATE TABLE EMP (EMPNO number(4) not null, ENAME varchar2(10), HIREDATE date, DEPTNO number(2)); 4

The data file used to load the table is derived from the table name, emp, and is emp.dat. The content of the data file in this example is shown here: 7782,Clark,09-Jun-81,10 7839,King,17-Nov-81,12 Note that the order of the fields in the data file matches the order of the columns in the table. The following SQL*Loader command will load the table from the data file. sqlldr userid=scott table=emp Note that no control file is needed for this example. After executing the SQL*Loader command, a SELECT from the table will show the following: EMPNO ENAME HIREDATE DEPTNO ---------- ---------- --------- ---------- 7782 Clark 09-JUN-81 10 7839 King 17-NOV-81 10 Customizing SQL*Loader Express Mode While the SQL*Loader command in the previous example was simple, it made some assumptions that may not always be true. For example, CSV files can use a character other than a comma to terminate fields or there might be multiple data files to load, none of which use the name of the table in the file names. For these reasons, SQL*Loader Express Mode has other command line parameters that allow you to modify the load. Changing the Field Terminator The TERMINATED_BY parameter for the SQL*Loader command line specifies the character that is used to terminate the fields. If the data file in the example used vertical bars for separators rather than commas, it would look like this: 7782 Clark 09-Jun-81 10 7839 King 17-Nov-81 12 The command used to load this data file is: sqlldr userid=scott table=emp terminated_by= Specifying the Order of the Fields in the Data File When SQL*Loader Express Mode reads fields the record in the data file, it uses the order of the columns in the table as the order of the fields in the data file. However, it is possible that the order of the fields in the data file is different than the order of the columns in the table. For those data files, use the new FIELD_NAMES parameter. This 5

parameter tells SQL Loader that the first record in the data file specifies the order of the fields in the data file. This example shows a data file emp.dat used to load table EMP where the field order does not match the order of the columns in the table. deptno,empno,ename,hiredate 10,7782,Clark,09-Jun-81 12,7839,King,17-Nov-81 The command used to load this data file is: sqlldr userid=scott table=emp field_names=first The FIELD_NAMES parameter has other options to customize loading of multiple data files. In this example, FIELD_NAMES=FIRST tells SQL*Loader that only the first data file has the field name list as its first record. If the load uses multiple data files, none of the other data files will have a field name list. Specifying FIELD_NAMES=ALL tells SQL*Loader that all data files have the list of field names as its first record. Using a Different Data File The default name for the data file is the name of the table with a.dat extension. To load a file with a different name or to load multiple files, use the DATA parameter. The DATA parameter accepts a list of values, so you can specify multiple files as part of the load. Starting in Oracle Database 12c, SQL*Loader also supports wildcard characters, * and? in file names. The * wild card character matches one or more characters in a file name. The? wild card character matches just one character. This greatly simplifies the command string when loading dozens or hundreds of files with similar names. Assume we have two files to load into table emp. The first file is jan_2012_emp.dat: 7782,Clark,09-Jan-12,10 7839,King,17-Jan-12,12 The second file is feb_2012_emp.dat: 8109,Baby,12-Feb-12,10 8299,Lee,24-Feb-12,12 You could load these files with either of the two following commands for SQL*Loader: sqlldr userid=scott table=emp data=jan_2012_emp.dat,feb_2012_emp.dat sqlldr userid=scott table=emp data=*_2012_emp.dat 6

The first command lists the files to be loaded. The second command uses wild cards in the filenames for the data parameter. Using an Enclosure Character A data file may need to use an enclosure character around fields because the fields might contain the terminator character or newline characters as part of the value of the field. Consider the following table definition: create table part_info (part_number varchar2(10), part_description varchar2(40) ); File part_info.dat contains the following: 10-1002,"size 12 widget, red" 10-1003,"size 12 widget, blue" Table PART_INFO could be loaded with the following command: sqlldr userid=scott table=part_info optionally_enclosed_by=\ \ \ After the load, the table would have the following rows. PART_NUMBE PART_DESCRIPTION ---------- ---------------------------------------- 10-1002 size 12 widget, red 10-1003 size 12 widget, blue This example used OPTIONALLY_ENCLOSED_BY because only some of the field values were enclosed by quotes. Use the ENCLOSED_BY parameter if every field is enclosed. Using an Enclosure Character with for Fields Containing NEWLINE The enclosure character can be used to enclose fields that also contain newlines. For example, here is a table definition with a large text field. CREATE TABLE EMP_RESUMES (EMPNO number(4) not null, EMP_RESUME varchar2(1000)); Here is the data file that needs to be loaded into the table. 7782,"line 1 of resume for 7782 line 2 of resume 7

last line of resume" 7839,"line 1 of resume for 7839 line 2 of resume last line of resume" Note that the EMP_RESUME field is enclosed by double quotes and has newlines embedded in the data. If a data file does contain such a field, then the SQL*Loader command needs to specify the CSV parameter with a value of WITH_EMBEDDED. SQL Loader needs to do extra processing to handle files that contain newline characters inside of fields. This extra processing can slow the load significantly for data files that do not require it. So, rather than make all loads pay the price for this processing, SQL*Loader requires the user to specify CSV=WITH_EMBEDDED to do the special processing when loading data files that need it. The SQL Loader command to load table EMP_RESUMES with the data file shown above is: sqlldr userid=scott table=emp_resumes csv=with_embedded Selecting from the table after the load shows the data from the file: SQL> select * from emp_resumes; EMPNO ---------- EMP_RESUME --------------------------------------------------------------------- 7782 line 1 of resume for 7782 line 2 of resume last line of resume 7839 line 1 of resume for 7839 line 2 of resume last line of resume Other Customizations SQL*Loader Express Mode supports some additional parameters to customize a load. These are listed below. The documentation for SQL*Loader express mode has more information about these options. BAD the file where SQL*Loader writes records that could not be loaded. CHARACTERSET the name of the character set used to encode the data files. DATE_FORMAT the format string to use when interpreting dates in the data file. DEGREE_OF_PARALLELISM the degree of parallelism to use when loading with external tables. 8

DIRECT force the load to use direct path or conventional path load instead of external tables. EXTERNAL_TABLE force the load to use external tables to load the data file, or tell SQL*Loader to write the SQL statements used to load the external tables to the log file without executing those statements. LOAD the number of records to load. The default is to load all records. NULLIF the value for fields in the record that will cause SQL*Loader to insert a NULL for that field. SILENT control what messages are written to the SQL Loader log file. TIMESTAMP_FORMAT the format string to use when interpreting timestamp fields in the data file. TRIM how white space should be trimmed from the beginning and end of a field. SQL*Loader Log File for Express Mode The log file for SQL*Loader Express Mode contains the same information as other SQL*Loader operations, but it also has additional information. SQL*Loader Express Mode writes the contents of a SQL*Loader control file that would perform the same load as if express mode were not being used. If the customization options of express mode are not sufficient for a load, then you can use the control file information in the log files as a starting point for writing and customizing a control file to load the data. For example, the DATE_FORMAT parameter does not help if there are different data formats used by different date fields in the data file. In that case, you can Use SQL*Loader express mode to write the control file information to the log file, Create a new control file with that information, and then Modify the data format strings to the correct values for the date fields. This makes SQL*Loader express mode a good way to learn how to write more complicated SQL*Loader control files. Conclusion Express mode in SQL*Loader greatly simplifies the work needed to load most data files. Use express mode for loading CSV files, the most common file format. Express mode tries to use the fastest mechanism possible for loading the data, and has optional parameters that allow users to customize the load. Getting your load started and completed is faster with express mode. 9

Express Mode Loading with SQL*Loader in Oracle Database 12c June 2013 Author: Jim Stenoish Contributing Authors: Ray Pfau, Roy Swonger Oracle Corporation World Headquarters 500 Oracle Parkway Redwood Shores, CA 94065 U.S.A. Worldwide Inquiries: Phone: +1.650.506.7000 Fax: +1.650.506.7200 oracle.com \ Copyright 2013, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group. 0612