Oracle SQL*Loader and External Tables
This post gives a brief introduction to the two oracle’s technologies for loading external data into a database tables – SQL*Loader (SQLLDR) and External Table and later provides some guidelines on when to chose what.
SQL*Loader loads data from external files into tables in the Oracle database. SQL*Loader uses two primary files: the datafile, which contains the information to be loaded, and the control file, which contains information on the format of the data, the records and fields within the file, the order in which they are to be loaded, and even, when needed, the names of the multiple files that will be used for data.
When executed, SQL*Loader will automatically create a log file and a “bad” file. The log file records the status of the load, such as the number of rows processed and the number of rows committed. The “bad” file will contain all the rows that were rejected during the load due to data errors, such as nonunique values in primary key columns. Within the control file, you can specify additional commands to govern the load criteria. If these criteria are not met by a row, the row will be written to a “discard” file. The log, bad, and discard files will by default have the extensions .log, .bad, and .dsc, respectively.
It has two modes of operation:
• Conventional path: SQLLDR will employ SQL inserts on our behalf to load data.
• Direct path: SQLLDR does not use SQL in this mode; it formats database blocks directly.
The direct path load allows you to read data from a flat file and write it directly to formatted database blocks, bypassing the entire SQL engine, undo generation and, optionally, redo generation at the same time. Parallel direct path load is among the fastest ways to go from having no data to a fully loaded database.
External tables were first introduced in Oracle9i Release 1. Put simply, they allow us to treat an operating system file as if it is a read-only database table. External tables have limits—
1. No DML operations are allowed.
2. Indexes cannot be created. The lack of indexes on external tables does not have to be aTnegative factor in application performance. Queries of external tables complete very quickly, even though a full table scan is required with each access. There is I/O involved, but modern I/O systems use caching and RAID techniques to significantly reduce the performance penalty associated with repeated full scans of the same file.
3. Virtual columns cannot be added.
4. Constraints cannot be added.
When you create an external table, you specify its type. Each type of external table has its own access driver that provides access parameters unique to that type of external table. The access driver ensures that data from the data source is processed so that it matches the definition of the external table.
The default type for external tables is ORACLE_LOADER, which lets you read table data from an external table and load it into a database. Oracle Database also provides the ORACLE_DATAPUMP type, which lets you unload data (that is, read data from a table in the database and insert it into an external table) and then reload it into an Oracle database.
To access external files from within Oracle, you must first use the create directory command to define a directory object pointing to the external files’ location. After this, execute the create table command with the organization external clause.
create directory BOOK_DIR as 'e:\oracle\external'; grant read on directory BOOK_DIR to practice; grant write on directory BOOK_DIR to practice; create table BOOKSHELF_EXT (Title VARCHAR2(100), Publisher VARCHAR2(20), CategoryName VARCHAR2(20), Rating VARCHAR2(2)) organization external (type ORACLE_LOADER default directory BOOK_DIR access parameters (records delimited by newline fields terminated by "~" (Title CHAR(100), Publisher CHAR(20), CategoryName CHAR(20), Rating CHAR(2))) location ('bookshelf_dump.lst'));
External Tables vs SQL*Loader
Some of the key functionality features that external tables have over SQLLDR in my experience are as follows:
• The ability to use complex WHERE conditions to selectively load data. SQLLDR has a WHEN clause to select rows to load, but you are limited to using only AND expressions and expressions using equality—no ranges (greater than, less than), no OR expressions, no IS NULL, and so on.
• The ability to MERGE data. You can take an operating system file full of data and update existing database records from it.
• The ability to perform efficient code lookups. You can join an external table to other database tables as part of your load process.
• Easier multitable inserts using INSERT. Starting in Oracle9i, an INSERT statement can insert into one or more tables using complex WHEN conditions. While SQLLDR can load into multiple tables, it can be quite complex to formulate the syntax.
• A shallower learning curve for new developers. SQLLDR is “yet another tool” to learn, in addition to the programming language, the development tools, the SQL language, and so on. As long as a developer knows SQL, he can immediately apply that knowledge to bulk data loading, without having to learn a new tool (SQLLDR).
SQLLDR should be chosen over external tables in the following three situations:
• You have to load data over a network—in other words, when the input file is not on the database server itself. Because the access driver is part of the Oracle software, only the files accessible to the database can be access as external tables.
• Multiple users must concurrently work with the same external table processing different input files.
• You have to work with LOB types. External tables do not support LOBs.