SQL*Loader
Maximazing SQL*Loader
Performance
·
Use Direct Path Loads - The conventional path
loader essentially loads the data by using standard insert
statements. The direct path loader (direct=true) loads
directly into the Oracle data files and creates blocks in Oracle
database block format. There are certain cases, however, in which
direct path loads cannot be used (clustered tables). To prepare
the database for direct path loads, the script
$ORACLE_HOME/rdbms/admin/catldr.sql.sql must be executed.
·
Disable Indexes and Constraints. For
conventional data loads only, the disabling of indexes and constraints
can greatly enhance the performance.
·
Use a Larger Bind Array. For conventional
data loads only, larger bind arrays limit the number of calls to the
database and increase performance. The size of the bind array is
specified using the bindsize
parameter. The bind array's size is equivalent to the number of
rows it contains (rows=) times the maximum length of each row.
·
Use ROWS=n to Commit Less Frequently. For
conventional data loads only, the rows parameter specifies the
number of rows per commit. Issuing fewer commits will
enhance performance.
·
Use Parallel Loads. Available with direct
path data loads only, this option allows multiple SQL*Loader jobs to
execute concurrently.
$ sqlldr
control=first.ctl parallel=true direct=true
$ sqlldr control=second.ctl parallel=true direct=true
·
Use Fixed Width Data. Fixed width data
format saves Oracle some processing when parsing the data. The
savings can be tremendous.
·
Disable Archiving During Load. While this
may not be feasible in certain environments, disabling database
archiving can increase performance considerably.
·
Use unrecoverable. The
unrecoverable option (unrecoverable load data) disables the writing of
the data to the redo logs. This option is available for direct
path loads only.
Using the table table_with_one_million_rows,
the following benchmark tests were performed with the various
SQL*Loader options. The table was truncated after each test.
SQL*Loader
Option
|
Elapsed
Time (Seconds)
|
Time
Reduction
|
direct=false
rows=64
|
135
|
-
|
direct=false
bindsize=512000
rows=10000
|
92
|
32%
|
direct=false
bindsize=512000
rows=10000
database in noarchivelog
|
85
|
37%
|
direct=true
|
47
|
65%
|
direct=true
unrecoverable
|
41
|
70%
|
direct=true
unrecoverable
fixed width data
|
41
|
70%
|
The results above indicate that conventional path
loads take the longest. However, the bindsize
and rows parameters can aid the performance under these
loads. The test involving the conventional load didn’t come close
to the performance of the direct path load with the unrecoverable
option
specified.
It is also worth noting that the fastest import time achieved for this
table (earlier) was 67 seconds, compared to 41 for SQL*Loader direct
path – a 39% reduction in execution time. This proves that
SQL*Loader can load the same data faster than import. These tests did
not compensate for indexes. All database load operations will
execute faster when indexes are disabled.
SQL*Loader Control File
The control file is a text file written in a language that SQL*Loader
understands. The control file describes the task that the SQL*Loader is
to carry out. The control file tells SQL*Loader where to find the data,
how to parse and interpret the data, where to insert the data, and
more.
See Chapter 4, "SQL*Loader Case Studies" for example control files.
Although not precisely defined, a control file can be said to have
three sections:
- The first section contains session-wide information, for example:
- global options such as bindsize, rows, records to skip, etc.
- INFILE clauses to specify where the input data is located
- data character set specification
- The second section consists of one or more "INTO TABLE" blocks.
Each of
these blocks contains information about the table into which the data
is
to be loadedsuch as the table name and the columns of the table.
- The third section is optional and, if present, contains input
data.
Examples
Case 1: Loading Variable-Length Data
Loads stream format records in which the fields
are delimited by commas and may be enclosed by quotation marks. The
data
is found at the end of the control file.
Case 2: Loading Fixed-Format Fields:
Loads a datafile with fixed-length fields,
stream-format
records, all records the same length.
Case 3: Loading a Delimited, Free-Format File
Loads data from stream format records with
delimited
fields and sequence numbers. The data is found at the end of the
control
file.
Case 4: Loading Combined Physical Records
Combines multiple physical records into one
logical
record corresponding to one database row
Case 9: Loading LOBFILEs (CLOBs)
Adds a CLOB column called RESUME to the table
emp,
uses a FILLER field (RES_FILE), and loads multiple LOBFILEs into the
emp
table.
Case 1: Loading
Variable-Length
Data
- A simple control file identifying one
table and three columns to be
loaded.
- Including data to be loaded from the
control file itself, so there is
no
separate datafile.
- Loading data in stream format, with
both types of delimited fields --
terminated
and enclosed.
Control File
1) LOAD DATA
2) INFILE *
3) INTO TABLE dept
4) FIELDS TERMINATED BY ',' OPTIONALLY
ENCLOSED
BY '"'
5) (deptno, dname, loc)
6) BEGINDATA
12,RESEARCH,"SARATOGA"
10,"ACCOUNTING",CLEVELAND
11,"ART",SALEM
13,FINANCE,"BOSTON"
21,"SALES",PHILA.
22,"SALES",ROCHESTER
42,"INT'L","SAN FRAN"
Notes:
- The LOAD DATA statement is required at the beginning
of the control
file.
- INFILE * specifies that the data is found in the
control file and not
in
an external file.
- The INTO TABLE statement is required to identify the
table to be loaded
(DEPT) into. By default, SQL*Loader requires the table to be empty
before
it inserts any records.
- FIELDS TERMINATED BY specifies that the data is
terminated by commas,
but
may also be enclosed by quotation marks. Datatypes for all fields
default
to CHAR.
- Specifies that the names of columns to load are
enclosed in
parentheses.
Since no datatype is specified, the default is a character of length
255.
BEGINDATA specifies the beginning of the data.
Invoking SQL*Loader
To run this example, invoke SQL*Loader with the command:
sqlldr userid=scott/tiger control=ulcase1.ctl
log=ulcase1.log
Case
2: Loading Fixed-Format Fields
- A separate datafile.
- Data conversions.
In this case, the field positions and datatypes are
specified
explicitly.
Control File
1) LOAD DATA
2) INFILE
'ulcase2.dat'
3) INTO
TABLE emp
4)
(empno
POSITION(01:04) INTEGER EXTERNAL,
ename
POSITION(06:15) CHAR,
job
POSITION(17:25) CHAR,
mgr
POSITION(27:30) INTEGER EXTERNAL,
sal
POSITION(32:39) DECIMAL EXTERNAL,
comm
POSITION(41:48) DECIMAL EXTERNAL,
5)
deptno
POSITION(50:51) INTEGER EXTERNAL,
6) modifieddate
"SYSDATE",
7) customerid
constant "0"
)
Notes:
- The LOAD DATA statement is required at the beginning
of the control
file.
- The name of the file containing data follows the
keyword INFILE.
- The INTO TABLE statement is required to identify the
table to be loaded
into.
- Lines 4 and 5 identify a column name and the location
of the data in
the
datafile to be loaded into that column. EMPNO, ENAME, JOB, and so on
are
names of columns in table EMP. The datatypes (INTEGER EXTERNAL, CHAR,
DECIMAL
EXTERNAL) identify the datatype of data fields in the file, not of
corresponding
columns in the EMP table.
- Note that the set of column specifications is enclosed
in parentheses.
- This statement let me insert the current sysdate in
this field
- This statement let me put a constant value
Datafile
Below are a few sample data lines from the file
ULCASE2.DAT.
Blank fields are set to null automatically.
7782
CLARK
MANAGER
7839
2572.50
10
7839
KING
PRESIDENT
5500.00 10
7934
MILLER
CLERK
7782
920.00
10
7566
JONES
MANAGER
7839
3123.75
20
7499
ALLEN SALESMAN
7698 1600.00 300.00 30
7654
MARTIN SALESMAN 7698
1312.50 1400.00 30
.
Case
3: Loading a Delimited, Free Format File
- Loading data (enclosed and terminated) in stream format
- Loading dates using the datatype DATE
- Using SEQUENCE numbers to generate unique keys for
loaded data
- Using APPEND to indicate that the table need not be
empty before
inserting
new records
- Using Comments in the control file set off by double
dashes
- Overriding general specifications with declarations
for individual
fields
In this case, the field positions and datatypes are
specified
explicitly.
Control File
This control file loads the same table as in Case 2,
but
it loads three additional columns (HIREDATE, PROJNO, LOADSEQ). The
demonstration
table EMP does not have columns PROJNO and LOADSEQ. So if you want to
test
this control file, add these columns to the EMP table with the command:
ALTER TABLE EMP ADD (PROJNO NUMBER, LOADSEQ NUMBER)
The data is in a different format than in Case 2. Some
data is enclosed in quotation marks, some is set off by commas, and the
values for DEPTNO and PROJNO are separated by a colon.
1) --
Variable-length, delimited and
enclosed
data format
LOAD DATA
2) INFILE *
3) APPEND
INTO TABLE
emp
4) FIELDS
TERMINATED BY "," OPTIONALLY
ENCLOSED
BY '"'
(empno,
ename, job, mgr,
5) hiredate
DATE(20) "DD-Month-YYYY",
sal, comm,
deptno CHAR TERMINATED BY ':',
projno,
6)
loadseq SEQUENCE(MAX,1))
7) BEGINDATA
8) 7782, "Clark",
"Manager", 7839,
09-June-1981,
2572.50,, 10:101
7839,
"King", "President", ,
17-November-1981,5500.00,,10:102
7934,
"Miller", "Clerk", 7782,
23-January-1982,
920.00,, 10:102
7566,
"Jones", "Manager", 7839,
02-April-1981,
3123.75,, 20:101
7499,
"Allen", "Salesman", 7698,
20-February-1981,
1600.00,
(same line
continued)
300.00, 30:103
7654,
"Martin", "Salesman", 7698,
28-September-1981,
1312.50,
(same line
continued)
1400.00, 3:103
7658,
"Chan", "Analyst", 7566,
03-May-1982,
3450,, 20:101
Notes:
- Comments may appear anywhere in the command lines of
the file, but they
should not appear in data. They are preceded with a double dash that
may
appear anywhere on a line.
- INFILE * specifies that the data is found at the end
of the control file.
- Specifies that the data can be loaded even if the
table already contains
rows. That is, the table need not be empty.
- The default terminator for the data fields is a comma,
and some fields
may be enclosed by double quotation marks (").
- The data to be loaded into column HIREDATE appears in
the format DD-Month-YYYY.
The length of the date field is a maximum of 20. If a length is not
specified,
the length is a maximum of 20. If a length is not specified, then the
length
depends on the length of the date mask.
- The SEQUENCE function generates a unique value in the
column LOADSEQ. This
function finds the current maximum value in column LOADSEQ and adds the
increment (1) to it to obtain the value for LOADSEQ for each row
inserted.
- BEGINDATA specifies the end of the control information
and the beginning
of the data.
- Although each physical record equals one logical
record, the fields vary
in length so that some records are longer than others. Note also that
several
rows have null values for COMM.
Case
4: Loading Combined Physical Records
- Combining multiple physical records to form one
logical record with
CONTINUEIF
- Inserting negative numbers.
- Indicating with REPLACE that the table should be
emptied before the new
data is inserted
- Specifying a discard file in the control file using
DISCARDFILE
- Specifying a maximum number of discards using
DISCARDMAX
- Rejecting records due to duplicate values in a unique
index or due to
invalid
data values
Control File
LOAD DATA
INFILE
'ulcase4.dat'
1)
DISCARDFILE 'ulcase4.dsc'
2)
DISCARDMAX 999
3) REPLACE
4)
CONTINUEIF THIS (1) = '*'
INTO TABLE
emp
(empno
POSITION(1:4) INTEGER
EXTERNAL,
ename
POSITION(6:15) CHAR,
job
POSITION(17:25) CHAR,
mgr
POSITION(27:30) INTEGER EXTERNAL,
sal
POSITION(32:39) DECIMAL EXTERNAL,
comm
POSITION(41:48) DECIMAL EXTERNAL,
deptno
POSITION(50:51) INTEGER EXTERNAL,
hiredate
POSITION(52:60)
INTEGER EXTERNAL)
Notes:
- DISCARDFILE specifies a discard file named ULCASE4.DSC.
- DISCARDMAX specifies a maximum of 999 discards allowed
before
terminating
the run (for all practical purposes, this allows all discards).
- REPLACE specifies that if there is data in the table
being loaded, then
SQL*Loader should delete that data before loading new data.
- CONTINUEIF THIS specifies that if an asterisk is found
in column 1 of
the
current record, then the next physical record after that record should
be appended to it to from the logical record. Note that column 1 in
each
physical record should then contain either an asterisk or a non-data
value.
Data File
The datafile for this case, ULCASE4.DAT, is listed
below.
Note the asterisks in the first position and, though not visible, a new
line indicator is in position 20 (following "MA", "PR", and so on).
Note
that CLARK's commission is -10, and SQL*Loader loads the value
converting
it to a negative number.
*7782 CLARK
MANAGER 7839 2572.50
-10
2512-NOV-85
*7839 KING
PRESIDENT
5500.00
2505-APR-83
*7934 MILLER
CLERK 7782
920.00
2508-MAY-80
*7566 JONES
MANAGER 7839
3123.75
2517-JUL-85
*7499 ALLEN
SALESMAN 7698 1600.00 300.00 25
3-JUN-84
*7654 MARTIN
SALESMAN 7698 1312.50 1400.00
2521-DEC-85
*7658 CHAN
ANALYST 7566
3450.00
2516-FEB-84
*
CHEN ANALYST 7566
3450.00
2516-FEB-84
*7658 CHIN
ANALYST 7566
3450.00
2516-FEB-84
Rejected Records
The last two records are rejected, given two
assumptions.
If there is a unique index created on column EMPNO, then the record for
CHIN will be rejected because his EMPNO is identical to CHAN's. If
EMPNO
is defined as NOT NULL, then CHEN's record will be rejected because it
has no value for EMPNO.
Case
5: Loading Data in Multiple Tables
- Loading multiple tables
- Using SQL*Loader to break down repeating groups in a
flat file and load
the data into normalized tables -- one file record may generate
multiple
database rows
- Deriving multiple logical records from each physical
record
- Using a WHEN clause
- Loading the same field (EMPNO) into multiple tables
Control File
-- Loads
EMP records from first 23
characters
-- Creates
and loads PROJ records for each
PROJNO listed
-- for
each employee
LOAD DATA
INFILE
'ulcase5.dat'
BADFILE
'ulcase5.bad'
DISCARDFILE 'ulcase5.dsc'
1) REPLACE
2)
INTO TABLE emp
(empno
POSITION(1:4)
INTEGER EXTERNAL,
ename
POSITION(6:15)
CHAR,
deptno
POSITION(17:18)
CHAR,
mgr
POSITION(20:23)
INTEGER EXTERNAL)
2) INTO
TABLE proj
-- PROJ
has two columns, both not null:
EMPNO
and PROJNO
3) WHEN
projno != ' '
(empno
POSITION(1:4)
INTEGER EXTERNAL,
3)
projno
POSITION(25:27)
INTEGER EXTERNAL) -- 1st proj
3) INTO
TABLE proj
4) WHEN
projno != ' '
(empno
POSITION(1:4)
INTEGER EXTERNAL,
4)
projno
POSITION(29:31
INTEGER EXTERNAL) -- 2nd proj
2) INTO
TABLE proj
5) WHEN
projno != ' '
(empno
POSITION(1:4)
INTEGER EXTERNAL,
5) projno
POSITION(33:35) INTEGER
EXTERNAL)
-- 3rd proj
Notes:
- REPLACE specifies that if there is data in the tables
to be loaded (EMP
and PROJ), SQL*loader should delete the data before loading new rows.
- Multiple INTO clauses load two tables, EMP and PROJ.
The same set of
records
is processed three times, using different combinations of columns each
time to load table PROJ.
- WHEN loads only rows with non-blank project numbers.
When PROJNO is
defined
as columns 25...27, rows are inserted into PROJ only if there is a
value
in those columns.
- When PROJNO is defined as columns 29...31, rows are
inserted into PROJ
only if there is a value in those columns.
- When PROJNO is defined as columns 33...35, rows are
inserted into PROJ
only if there is a value in those columns.
Data File
1234
BAKER 10 9999 101
102
103
1234
JOKER 10 9999 777
888
999
2664
YOUNG 20 2893 425
abc
102
5321
OTOOLE 10 9999 321
55
40
2134
FARMER 20 4555 236 456
2414
LITTLE 20 5634 236
456
40
6542
LEE 10
4532
102 321 14
2849
EDDS xx
4555
294 40
4532
PERKINS 10 9999 40
1244
HUNT 11 3452
665
133 456
123
DOOLITTLE 12
9940
132
1453 MACDONALD 25
5532 200
Case
6: Loading Using the direct Path Load Method
- Use of the direct path load method to load and index
data
- How to specify the indexes for which the data is
pre-sorted.
- Loading all-blank numeric fields as null
- The NULLIF clause
- Note: Specify the name of the table into which you
want to load data;
otherwise,
you will see LDR-927. Specifying DIRECT=TRUE as a command-line
parameter
is not an option when loading into a synonym for a table.
In this example, field positions and datatypes are
specified
explicitly.
Control File
LOAD DATA
INFILE
'ulcase6.dat'
INSERT
INTO TABLE
emp
1) SORTED
INDEXES (empix)
2)(empno
POSITION(01:04) INTEGER EXTERNAL NULLIF
empno=BLANKS,
ename POSITION(06:15) CHAR,
job
POSITION(17:25)
CHAR,
mgr
POSITION(27:30)
INTEGER EXTERNAL NULLIF mgr=BLANKS,
sal
POSITION(32:39)
DECIMAL EXTERNAL NULLIF sal=BLANKS,
comm POSITION(41:48)
DECIMAL
EXTERNAL NULLIF comm=BLANKS,
deptno
POSITION(50:51) INTEGER
EXTERNAL
NULLIF deptno=BLANKS)
Notes:
- The SORTED INDEXES clause identifies
indexes:presorting data:case study
the indexes on which the data is sorted. This clause indicates that the
datafile is sorted on the columns in the EMPIX index. This clause
allows
SQL*Loader to optimize index creation by eliminating the sort phase for
this data when using the direct path load method.
- The NULLIF...BLANKS clause specifies that the column
should be loaded
as
NULL if the field in the datafile consists of all blanks
Case
7: Extracting Data from a formatted Report
- Using SQL*Loader with an INSERT trigger
- Use of the SQL string to manipulate data
- Different initial and trailing delimiters
- Use of SYSDATE
- Use of the TRAILING NULLCOLS clause
- Ambiguous field length warnings
Note: This example creates a trigger that
uses the last value of unspecified fields.
Data File
The following listing of the report shows the data to
be
loaded:
Today's
Newly Hired Employees
Dept
Job
Manager
MgrNo Emp Name EmpNo Salary (Comm)
----
-------- -------- -----
--------
----- --------- ------
20
Salesman
Blake
7698 Shepard 8061
$1,600.00
(3%)
Falstaff 8066 $1,250.00 (5%)
Major 8064 $1,250.00
(14%)
30
Clerk
Scott
7788 Conrad 8062 $1,100.00
Ford 7369
DeSilva 8063 $800.00
Manager
King
7839 Provo 8065
$2,975.00
Insert Trigger
In this case, a BEFORE INSERT trigger is required to
fill
in department number, job name, and manager's number when these fields
are not present on a data line. When values are present, they should be
saved in a global variable. When values are not present, the global
variables
are used.
The INSERT trigger and the package defining the global
variables is:
CREATE OR REPLACE
PACKAGE uldemo7 AS --
Global
Package Variables
last_deptno NUMBER(2);
last_job
VARCHAR2(9);
last_mgr
NUMBER(4);
END
uldemo7;
/
CREATE OR REPLACE
TRIGGER uldemo7_emp_insert
BEFORE INSERT ON
emp
FOR EACH ROW
BEGIN
IF :new.deptno
IS NOT NULL THEN
uldemo7.last_deptno :=
:new.deptno;
-- save value for later
ELSE
:new.deptno :=
uldemo7.last_deptno;
-- use last valid value
END IF;
IF :new.job IS
NOT NULL THEN
uldemo7.last_job := :new.job;
ELSE
:new.job := uldemo7.last_job;
END IF;
IF :new.mgr IS
NOT NULL THEN
uldemo7.last_mgr := :new.mgr;
ELSE
:new.mgr := uldemo7.last_mgr;
END IF;
END;
/
Note: The phrase FOR EACH ROW is
important.
If it was not specified, the INSERT trigger would only fire once for
each
array of inserts because SQL*Loader uses the array interface.
Control File
LOAD DATA
INFILE
'ULCASE7.DAT'
APPEND
INTO TABLE
emp
1)
WHEN (57) = '.'
2) TRAILING
NULLCOLS
3)
(hiredate SYSDATE,
4)
deptno
POSITION(1:2)
INTEGER EXTERNAL(3)
5)
NULLIF deptno=BLANKS,
job
POSITION(7:14)
CHAR TERMINATED BY WHITESPACE
6)
NULLIF job=BLANKS "UPPER(:job)",
7)
mgr
POSITION(28:31)
INTEGER EXTERNAL
TERMINATED BY WHITESPACE, NULLIF mgr=BLANKS,
ename
POSITION(34:41)
CHAR
TERMINATED BY WHITESPACE "UPPER(:ename)",
empno POSITION(45)
INTEGER EXTERNAL
TERMINATED BY WHITESPACE,
sal
POSITION(51)
CHAR TERMINATED BY WHITESPACE
8)
"TO_NUMBER(:sal,'$99,999.99')",
9)
comm INTEGER
EXTERNAL
ENCLOSED BY '(' AND '%'
":comm * 100"
)
Notes:
- The decimal point in column 57 (the salary field)
identifies a line
with
data on it. All other lines in the report are discarded.
- The TRAILING NULLCOLS clause causes SQL*Loader to
treat any fields that
are missing at the end of a record as null. Because the commission
field
is not present for every record, this clause says to load a null
commission
instead of rejecting the record when only six fields are found instead
of the expected seven.
- Employee's hire date is filled in using the current
system date.
- This specification generates a warning message because
the specified
length
does not agree with the length determined by the field's position. The
specified length (3) is used.
- Because the report only shows department number, job,
and manager when
the value changes, these fields may be blank. This control file causes
them to be loaded as null, and an RDBMS insert trigger fills in the
last
valid value.
- The SQL string changes the job name to uppercase
letters.
- It is necessary to specify starting position here. If
the job field and
the manager field were both blank, then the job field's TERMINATED BY
BLANKS
clause would cause SQL*Loader to scan forward to the employee name
field.
Without the POSITION clause, the employee name field would be
mistakenly
interpreted as the manager field.
- Here, the SQL string translates the field from a
formatted character
string
into a number. The numeric value takes less space and can be printed
with
a variety of formatting options.
- In this case, different initial and trailing
delimiters pick the
numeric
value out of a formatted field. The SQL string then converts the value
to its stored form.
Case
8: Loading Partitioned Tables
- Partitioning of data
- Explicitly defined field positions and datatypes.
- Loading using the fixed record length option
Control File
LOAD DATA
1) INFILE
'ulcase10.dat' "fix 129"
BADFILE 'ulcase10.bad'
TRUNCATE
INTO TABLE lineitem
PARTITION (ship_q1)
2)
(l_orderkey
position
(1:6) char,
l_partkey
position (7:11) char,
l_suppkey
position (12:15) char,
l_linenumber
position
(16:16) char,
l_quantity
position (17:18) char,
l_extendedprice position
(19:26)
char,
l_discount
position (27:29) char,
l_tax
position (30:32) char,
l_returnflag
position
(33:33) char,
l_linestatus
position
(34:34) char,
l_shipdate
position (35:43) char,
l_commitdate
position
(44:52) char,
l_receiptdate
position
(53:61) char,
l_shipinstruct position
(62:78) char,
l_shipmode
position (79:85) char,
l_comment
position (86:128) char)
Notes:
- Specifies that each record in the datafile is of fixed
length (129
characters
in this example). See Input Data and Datafiles.
- Identifies the column name and location of the data in
the datafile to
be loaded into each column.
Table Creation
In order to partition the data the lineitem table is
created
using four (4) partitions according to the shipment date:
Create table lineitem
(l_orderkey
number,
l_partkey
number,
l_suppkey
number,
l_linenumber
number,
l_quantity
number,
l_extendedprice
number,
l_discount
number,
l_tax
number,
l_returnflag
char,
l_linestatus
char,
l_shipdate
date,
l_commitdate
date,
l_receiptdate
date,
l_shipinstruct
char(17),
l_shipmode
char(7),
l_comment
char(43)
)
partition by range
(l_shipdate)
(
partition ship_q1
values less than
(TO_DATE('01-APR-1996',
'DD-MON-YYYY'))
tablespace p01,
partition ship_q2
values less than
(TO_DATE('01-JUL-1996',
'DD-MON-YYYY'))
tablespace p02,
partition ship_q3
values less than
(TO_DATE('01-OCT-1996',
'DD-MON-YYYY'))
tablespace p03,
partition ship_q4
values less than
(TO_DATE('01-JAN-1997',
'DD-MON-YYYY'))
tablespace p04
)
Input Data File
The datafile for this case, ULCASE8.DAT, is listed
below.
Each record is 129 characters in length. Note that five(5) blanks
precede
each record in the file.
1 151978511724386.60
7.04.0NO09-SEP-6412-FEB-9622-MAR-96DELIVER
IN PERSONTRUCK iPBw4mMm7w7kQ zNPL i261OPP
1 2731
73223658958.28.09.06NO12-FEB-9628-FEB-9620-APR-96TAKE
BACK RETURN MAIL 5wM04SNyl0AnghCP2nx lAi
1 3370 3713 810210.96
.1.02NO29-MAR-9605-MAR-9631-JAN-96TAKE
BACK RETURN REG AIRSQC2C 5PNCy4mM
1 5214
46542831197.88.09.06NO21-APR-9630-MAR-9616-MAY-96NONE
AIR Om0L65CSAwSj5k6k
1 6564
6763246897.92.07.02NO30-MAY-9607-FEB-9603-FEB-96DELIVER
IN PERSONMAIL CB0SnyOL PQ32B70wB75k 6Aw10m0wh
1 7403 160524 31329.6
.1.04NO30-JUN-9614-MAR-9601
APR-96NONE FOB C2gOQj OB6RLk1BS15 igN
2 8819 82012441659.44
0.08NO05-AUG-9609-FEB-9711-MAR-97COLLECT
COD AIR O52M70MRgRNnmm476mNm
3 9451 721230
41113.5.05.01AF05-SEP-9629-DEC-9318-FEB-94TAKE
BACK RETURN FOB 6wQnO0Llg6y
3 9717
1834440788.44.07.03RF09-NOV-9623-DEC-9315-FEB-94TAKE
BACK RETURN SHIP LhiA7wygz0k4g4zRhMLBAM
3 9844 1955 6
8066.64.04.01RF28-DEC-9615-DEC-9314-FEB-94TAKE
BACK RETURN REG AIR6nmBmjQkgiCyzCQBkxPPOx5j4hB 0lRywgniP1297
Case
9: Loading LOB Files (CLOBs)
- Adding a CLOB column called RESUME to the table emp.
- Using a FILLER field (RES_FILE).
- Loading multiple LOBFILEs into the emp table.
Control File
LOAD DATA
INFILE *
INTO TABLE EMP
REPLACE
FIELDS TERMINATED BY ','
(
EMPNO INTEGER EXTERNAL,
ENAME CHAR,
JOB CHAR,
MGR INTEGER
EXTERNAL,
SAL DECIMAL
EXTERNAL,
COMM DECIMAL EXTERNAL,
DEPTNO INTEGER EXTERNAL,
1) RES_FILE
FILLER CHAR,
2) "RESUME"
LOBFILE (RES_FILE) TERMINATED BY EOF
NULLIF RES_FILE = 'NONE'
)
BEGINDATA
7782,CLARK,MANAGER,7839,2572.50,,10,ulcase91.dat
7839,KING,PRESIDENT,,5500.00,,10,ulcase92.dat
7934,MILLER,CLERK,7782,920.00,,10,ulcase93.dat
7566,JONES,MANAGER,7839,3123.75,,20,ulcase94.dat
7499,ALLEN,SALESMAN,7698,1600.00,300.00,30,ulcase95.dat
7654,MARTIN,SALESMAN,7698,1312.50,1400.00,30,ulcase96.dat
7658,CHAN,ANALYST,7566,3450.00,,20,NONE
Notes:
- This is a filler field. The filler field is assigned
values from the
datafield
to which it is mapped.
- RESUME is loaded as a CLOB. The LOBFILE function is
used to specify the
name of the field that specifies name of the file which contains the
data
for the LOB field.
This note explains how to use the functions TRIM,
TO_CHAR/TO_NUMBER,
and user defined functions in connection with SQL*Loader using the
following
example:
-- Create the table
DROP TABLE TEST;
CREATE TABLE TEST
(
ID
NUMBER PRIMARY KEY,
FIRST_NAME VARCHAR2(20),
LAST_NAME VARCHAR2(20) NOT NULL,
DEPARTMENT VARCHAR2(20) NOT NULL,
SALARY
NUMBER(10,2)
NOT NULL,
BONUS
NUMBER(10,2),
DESCRIPTION VARCHAR2(50),
TOTAL
NUMBER(10,2)
);
-- Create the user defined functions (used by SQL*Loader)
CREATE OR REPLACE FUNCTION GET_BONUS (dept IN VARCHAR2)
RETURN NUMBER AS
retval NUMBER(10,2);
BEGIN
retval := NULL;
if upper (dept) in ('CLERK', 'SALESMAN') then
if to_char (sysdate, 'YYYY') = '2002' then
retval := 9.2;
else
retval := 7.88;
end if;
elsif upper (dept) = 'ANALYST' then
retval := 18.7;
elsif upper (dept) = 'MANAGER' then
retval := -5.92;
end if;
return (retval);
END;
/
CREATE OR REPLACE FUNCTION CALC_SAL (sal IN NUMBER, bon IN
NUMBER)
RETURN NUMBER AS
retval NUMBER(10,2);
BEGIN
if bon is null then
retval := sal;
else
retval := round (sal + (sal * bon / 100), 2);
end if;
return (retval);
END;
/
The table TEST will be loaded with this control file:
LOAD DATA
INFILE *
INTO TABLE TEST
TRUNCATE
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
ID
SEQUENCE,
FIRST_NAME "TRIM
(:FIRST_NAME)",
-- 1)
LAST_NAME "TRIM ('x' FROM
:LAST_NAME)",
-- 2)
DEPARTMENT,
SALARY "TO_NUMBER
(:SALARY,
'99999D99',
-- 3)
'NLS_NUMERIC_CHARACTERS='',.''')",
BONUS "GET_BONUS
(:DEPARTMENT)",
-- 4)
DESCRIPTION "'Salary: '||:SALARY||' ->
'||
-- 5)
DECODE (GET_BONUS (:DEPARTMENT), NULL, 'No bonus',
'Bonus: '||TO_CHAR (GET_BONUS (:DEPARTMENT), 'S09D90'))",
TOTAL "CALC_SAL
(TO_NUMBER (:SALARY,
'99999D99',
-- 6)
'NLS_NUMERIC_CHARACTERS='',.'''), GET_BONUS (:DEPARTMENT))"
)
BEGINDATA
" Martin ";"xxxxxxxSmithxxx";CLERK;2459,25
" David ";"xxxxxxxAllenxxx";SALESMAN;4563,9
" Brad
";"xxxxxxxWardxxxx";SALESMAN;4815,81
" Marvin ";"xxxxxxxJonesxxx";MANAGER;9765,33
" Dean
";"xxxxxxxMartinxx";SALESMAN;4214,56
" John
";"xxxxxxxBlakexxx";MANAGER;10333,87
" Clark ";"xxxxxxxGablexxx";MANAGER;11011,11
" Scott ";"xxxxxxxTigerxxx";ANALYST;6865,88
" Ralph
";"xxxxxxxKingxxxx";PRESIDENT;18955,45
" Tina
";"xxxxxxxTurnerxx";SALESMAN;5324,44
" Bryan ";"xxxxxxxAdamsxxx";CLERK;1899,48
" Jesse ";"xxxxxxxJamesxxx";CLERK;2288,99
" John
";"xxxxxxxFordxxxx";ANALYST;7564,83
" John
";"xxxxxxxMillerxx";CLERK;1865,93
1) TRIM deletes the leading/trailing blanks in the column
FIRST_NAME
(i.e. " Martin " becomes "Martin")
2) TRIM deletes the leading/trailing 'x' characters in the column
LAST_NAME
(i.e. "xxxxxxxSmithxxx" becomes "Smith")
3) TO_NUMBER shows that the format of the numbers in the column SALARY
is in the form: 99999D99. That means max. 5 digits integer with max. 2
digit
post-decimal positions. The decimal separator is ','.
If the format is not specified, then the records are not loaded
(ORA-1722
invalid number, if NLS_NUMERIC_CHARACTERS = '.,')
4) The column BONUS is calculated with the user defined function
GET_BONUS.
The Function expects an input parameter, DEPARTMENT (VARCHAR2), and
returns the value, BONUS (NUMBER(2,2))
5) The column DESCRIPTION is a composition of the information from
the previous columns. The Function DECODE checks if a bonus is
available
to the department. If no bonus is available, then the message 'No
bonus'
will be printed. The new thing here is the function TO_CHAR. This
function
modifies the format of the BONUS in this form: sign, 2 integer digits
with
leading zeros, decimal separator, 2 post-decimal positions with
trailing
zeros.
6) The column TOTAL is calculated with the user defined function
CALC_SAL
(the BONUS, if available, is applied to the SALARY)
The result after the loading procedure looks like this in the table
TEST:
SQL> select * from test;
ID
FIRST_NAME
LAST_NAME
DEPARTMENT
SALARY
---------- -------------------- --------------------
--------------------
----------
1
Martin
Smith
CLERK
2459.25
2
David
Allen
SALESMAN
4563.9
3
Brad
Ward
SALESMAN
4815.81
4
Marvin
Jones
MANAGER
9765.33
5
Dean
Martin
SALESMAN
4214.56
6
John
Blake
MANAGER
10333.87
7
Clark
Gable
MANAGER
11011.11
8
Scott
Tiger
ANALYST
6865.88
9
Ralph
King
PRESIDENT
18955.45
10
Tina
Turner
SALESMAN
5324.44
11
Bryan
Adams
CLERK
1899.48
12
Jesse
James
CLERK
2288.99
13
John
Ford
ANALYST
7564.83
14
John
Miller
CLERK
1865.93
This small example shows how to call a stored function. It is possible
to call any built-in or user-defined function
during load process. Usually it's done for date columns, when
non-default date format must be used, however user-defined
function(s) may be called to perform some application logic.
The syntax for such calls is quite simple:
LOAD DATA
INFILE *
APPEND
INTO TABLE dept (
deptno POSITION(01:02) INTEGER
EXTERNAL,
dname POSITION(03:16) CHAR
"LDR_PCK.NAME(:dname, :deptno)",
loc POSITION(17:29) CHAR
)
BEGINDATA
21Dep Loc
22Dep Loc
Here
LDR_PCK.NAME is name of package
and function,
:dnname and
:deptno
are parameters.
When calling user-defined functions we must remember that only
conventional path may be used. An
attempt to use
direct path will cause an
error:
SQL*Loader-00417 SQL String (on column column_name) not allowed
in direct path.
The OPTIONS clause is useful when you usually invoke a control file
with the same set of options, or when the command line and all its
arguments becomes very long. This clause allows you to specify runtime
arguments in the control file rather than on the command line.
SKIP = n |
-- Number of logical records to skip (DEFAULT 0) |
LOAD = n |
-- Number of logical records to load (DEFAULT all) |
ERRORS = n |
-- Number of errors to allow (DEFAULT 50) |
ROWS = n |
-- Number of rows in conventional path bind array (DEFAULT
64) |
BINDSIZE = n |
-- Size of conventional path bind array in bytes |
SILENT = {HEADER | FEEDBACK | ERROR
| DISCARDS | ALL } |
|
-- Suppress messages during run |
For example:
OPTIONS (BINDSIZE=10000, SILENT=(ERRORS, FEEDBACK) )
Values specified on the command line override values specified in
the control file. With this precedence, the OPTIONS keyword in the
control file established default values that are easily changed from
the command line.
Continuing Interrupted
Loads
If SQL*Loader runs out of space for data rows or index entries, the
load is discontinued. (For example, the table might reach its maximum
number of extents.) Discontinued loads can be continued after more
space is made available.
When a load is discontinued, any data already loaded remains in the
tables, and the tables are left in a valid state. SQL*Loader's log file
tells you the state of the tables and indexes and the number of logical
records already read from the input data file. Use this information to
resume the load where it left off.
For example:
SQLLOAD / CONTROL=FAST1.CTL SKIP=345
CONTINUE\_LOAD DATA statement is used to continue a discontinued
direct path load involving multiple tables with a varying number of
records to skip. For more information on this command, see chapter 6 of
``ORACLE7 Server Utilities Users Guide''.
Identifying Data Files
To specify the file containing the data to be loaded, use the INFILE
or INDDN keyword, followed by the filename. A filename specified on the
command line overrides the first INFILE or INDDN statement in the
control file. If no filename is specified, the filename defaults to the
control filename with an extension or filetype of DAT.
Loading into
Non-Empty Database Tables
SQL*Loader does not update existing records, even if they have null
columns. If the tables you are loading already contain data, you have
three choices for how SQL*Loader should proceed:
INSERT - This is the default option. It requires the table to be
empty before loading. SQL*Loader terminates with an error if the table
contains rows.
APPEND - If data already exists in the table, SQL*Loader appends
the new rows to it; if data doesn't already exist, the new rows are
simply loaded.
REPLACE - All rows in the table are deleted and the new data is
loaded. This option requires DELETE privileges on the table.
You can create one logical record from multiple physical records
using CONCATENATE and CONTINUEIF. See chapter 6 of ``ORACLE7 Server
Utilities Users Guide''.
Loading Logical
Records into Tables
The INTO TABLE clause allows you to tell which table you want to
load data into. To load multiple tables, you would include one INTO
TABLE clause for each table you wish to load.
The INTO TABLE clause may continue with some options for loading
that table. For example, you may specify different options (INSERT,
APPEND, REPLACE) for each table in order to tell SQL*Loader what to do
if data already exists in the table.
The WHEN clause appears after the table name and is followed by one
or more field conditions. For example, the following clause indicates
that any record with the value ``q'' in the fifth column position
should be loaded:
WHEN (5) = 'q'
A WHEN clause can contain several comparisons as long as each is
preceded by AND. Parentheses are optional but should be used for
clarity with multiple comparisons joined by AND. For example:
WHEN (DEPTNO = '10') AND (JOB = 'SALES')
To evaluate the WHEN clause, SQL*Loader first determines the values
of all the fields in the record. Then the WHEN clause is evaluated. A
row is inserted into the table only if the WHEN clause is true.
When the control file specifies more fields for a record than are
present in the record, SQL*Loader must determine whether the remaining
(specified) columns should be considered null, or whether an error
should be generated. TRAILING NULLCOLS clause tells SQL*Loader to treat
any relatively positioned columns that are not present in the record as
null columns. For example, if the following data
10 Accounting
is read with the following control file
INTO TABLE dept
TRAILING NULLCOLS
( deptno CHAR TERMINATED BY " ",
dname CHAR TERMINATED BY WHITESPACE,
loc CHAR TERMINATED BY WHITESPACE )
and the record ends after DNAME, then the remaining LOC field is set
to null. Without the TRAILING NULLCOLS clause, an error would be
generated, due to missing data.