Multi Table Inserts and MERGE


Multitable inserts allow a single INSERT INTO .. SELECT statement to conditionally, or non-conditionally, insert into multiple tables. This statement reduces table scans and PL/SQL code necessary for performing multiple conditional inserts compared to previous versions. It's main use is for the ETL process in data warehouses where it can be parallelized and/or convert non-relational data into a relational format.

Unconditional Multi-table Inserts

There are actually two types of multi-table insert statements: conditional and unconditional. Let’s look at the latter first, we’ll return to conditional multi-insert statements a little later.

Unconditional multi-table insert statement
There are a few things that I’d like you to notice:

  1. The ALL keyword is mandatory with unconditional multi-table insert statements.
  2. You can have as many pairs of the insert-into clause and values clause as you require. (If you had only one it wouldn’t be a multi-table insert, would it?)
  3. The source subquery is mandatory.

INSERT ALL
INTO table1 (col1, col2, col3) values (source_table.col1, source_table.col2, source_table.col3)
INTO table2 (col1, col2, col4) values (source_table.col1, source_table.col2, source_table.col4)
INTO table3(col1, col5, col6) values (source_table.col1, source_table.col5, source_table.col6)
SELECT * FROM source_table;

The above statement will take the relevant columns from source_table and insert them into table1, table2 andtable3. Without multi-table inserts, this would have taken three separate insert statements.

And that’s that for unconditional multi-table inserts; nice and straightforward.

Oh, by the way, if you’re finding that requirement for a source subquery a little restrictive, here’s a nice little trick for you: select from dual.

INSERT ALL
INTO table1 (col1, col2, col3) values (1, 'Mexico', sysdate + 1)
INTO table2 (col1, col2, col4) values (2, 'Tonga', sysdate + 2)
INTO table3 (col1, col4, col6) values (3, 'Norway', sysdate + 3)
SELECT * FROM dual;
Conditional Multi-table Inserts

Conditional multi-table inserts are cleverer than their unconditional cousins and, at first glance, look a lot more complicated. Trust me, they aren’t. Just imagine an Insert statement and a Case statement got drunk together and, nine months later, had a baby; it’d probably look a lot like a conditional multi-table insert. Here’s the syntax:

Let’s dive straight in with an example. Imagine we created 3 new tables – Top_earners, mid_earners and low_earners – and we wanted to populate them from the standard EMP table, based on which tax bracket each member of staff falls into.

INSERT ALL
WHEN (sal >= 3000) THEN
INTO top_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
WHEN (sal < 3000 AND sal >= 1500) THEN
INTO mid_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
WHEN (sal < 1500) THEN
INTO low_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
SELECT * FROM emp;

Taking another look at our statement, I realise that we can tweak it a little, using the ELSE clause.

INSERT ALL
WHEN (sal >= 3000) THEN
INTO top_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
WHEN (sal < 3000 AND sal >= 1500) THEN
INTO mid_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
ELSE
INTO low_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
SELECT * FROM emp;

The final piece of the jigsaw puzzle that we need to talk about is the FIRST keyword that can be used in conjunction with conditional multi-table insert statements (but not with unconditional multi-table inserts).

If you begin your statement with INSERT FIRST instead of INSERT ALL, Oracle will analyse each WHEN condition (in order, from the top) and once it has found a condition that resolves to TRUE it will carry out that insert and not analyse any of the subsequent WHEN conditions. It will do this for each row returned by the source subquery.

Is that clear? No? I didn’t think so. Let me see if I can make it a little clearer by rewriting our EMP example using INSERT FIRST. Remember that we want to segregate our employees based on how much money they earn.

INSERT FIRST
WHEN (sal >= 3000) THEN
INTO top_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
WHEN (sal >= 1500) THEN
INTO mid_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
WHEN (sal > 0 ) THEN
INTO low_earners (emp_id, emp_name, emp_sal) VALUES (empno, ename, sal)
SELECT * FROM emp;

The above statement would have made no sense with an INSERT ALL. King, whose salary is 5000, would have ended up being inserted into all three tables (since 5000 is greater than 3000, and 5000 is greater than 1500, and 5000 is greater than nothing). However, with INSERT FIRST, Oracle executes the insert statement associated with the first condition that is true, and ignores all subsequent ones. So when we get to Blake, who earns 2850, the top condition will resolve to false and be ignored, the second condition will resolve to true and so his details will be inserted into the mid_earners table; and even though the last condition would have resolved to true too, the database does not even glance at it.

Restrictions

I did say that there were a few restrictions that you’d have to bear in mind, didn’t I? Fortunately, they’re nothing major. However:

  1. You should not use sequences in multi-table insert statements.
  2. You cannot use a multi-table insert to write to a remote table.
  3. You can only use multi-table inserts with tables. Not views, materialized or otherwise.
  4. The sum of all the columns in the various insert-into clauses of a multi-table insert statement must not exceed 999.
  5. You cannot use a TABLE collection expression in a multi-table insert statement.


MERGE

The MERGE statement was introduced in Oracle 9i to conditionally insert or update data depending on its presence, a process also known as an "upsert". The MERGE statement reduces table scans and can perform the operation in parallel if required.

Syntax

Consider the following example where data from the HR_RECORDS table is merged into the EMPLOYEES table.

MERGE INTO employees e
    USING hr_records h
    ON (e.id = h.emp_id)
  WHEN MATCHED THEN
    UPDATE SET e.address = h.address
  WHEN NOT MATCHED THEN
    INSERT (id, address)
    VALUES (h.emp_id, h.address);

The source can also be a query.

MERGE INTO employees e
    USING (SELECT * FROM hr_records WHERE start_date > ADD_MONTHS(SYSDATE, -1)) h
    ON (e.id = h.emp_id)
  WHEN MATCHED THEN
    UPDATE SET e.address = h.address
  WHEN NOT MATCHED THEN
    INSERT (id, address)
    VALUES (h.emp_id, h.address);

Performance

The MERGE statement is optimized for merging sets of data, rather than single rows, as shown in the example below.

Create the following test tables. The source table contains all the rows from the ALL_OBJECTS view, while the destination table contains approximately half of the rows.

CREATE TABLE source_tab AS
SELECT object_id, owner, object_name, object_type
FROM   all_objects;

ALTER TABLE source_tab ADD (
  CONSTRAINT source_tab_pk PRIMARY KEY (object_id)
);

CREATE TABLE dest_tab AS
SELECT object_id, owner, object_name, object_type
FROM   all_objects WHERE ROWNUM <= 25000;

ALTER TABLE dest_tab ADD (
  CONSTRAINT dest_tab_pk PRIMARY KEY (object_id)
);

EXEC DBMS_STATS.gather_table_stats(USER, 'source_tab', cascade=> TRUE);
EXEC DBMS_STATS.gather_table_stats(USER, 'dest_tab', cascade=> TRUE);

The following code compares the performance of four merge operations. The first uses the straight MERGE statement. The second also uses the MERGE statement, but in a row-by-row manner. The third performs an update, and conditionally inserts the row if the update touches zero rows. The fourth inserts the row, then performs an update if the insert fails with a duplicate value on index exception.

SET SERVEROUTPUT ON
DECLARE
  TYPE t_tab IS TABLE OF source_tab%ROWTYPE;
  
  l_tab   t_tab;
  l_start NUMBER;
BEGIN

  l_start := DBMS_UTILITY.get_time;
  
  MERGE INTO dest_tab a
    USING source_tab b
    ON (a.object_id = b.object_id)
    WHEN MATCHED THEN
      UPDATE SET
        owner       = b.owner,
        object_name = b.object_name,
        object_type = b.object_type
    WHEN NOT MATCHED THEN
      INSERT (object_id, owner, object_name, object_type)
      VALUES (b.object_id, b.owner, b.object_name, b.object_type);

  DBMS_OUTPUT.put_line('MERGE        : ' || 
                       (DBMS_UTILITY.get_time - l_start) || ' hsecs');

  ROLLBACK;

  l_start := DBMS_UTILITY.get_time;

  SELECT *
  BULK COLLECT INTO l_tab
  FROM source_tab;
    
  FOR i IN l_tab.first .. l_tab.last LOOP
    MERGE INTO dest_tab a
      USING (SELECT l_tab(i).object_id AS object_id,
                    l_tab(i).owner AS owner,
                    l_tab(i).object_name AS object_name,
                    l_tab(i).object_type AS object_type
             FROM dual) b
      ON (a.object_id = b.object_id)
      WHEN MATCHED THEN
        UPDATE SET
          owner       = b.owner,
          object_name = b.object_name,
          object_type = b.object_type
      WHEN NOT MATCHED THEN
        INSERT (object_id, owner, object_name, object_type)
        VALUES (b.object_id, b.owner, b.object_name, b.object_type);
  END LOOP;

  DBMS_OUTPUT.put_line('ROW MERGE    : ' || 
                       (DBMS_UTILITY.get_time - l_start) || ' hsecs');

  ROLLBACK;

  l_start := DBMS_UTILITY.get_time;

  SELECT *
  BULK COLLECT INTO l_tab
  FROM source_tab;
    
  FOR i IN l_tab.first .. l_tab.last LOOP
    UPDATE dest_tab SET
      owner       = l_tab(i).owner,
      object_name = l_tab(i).object_name,
      object_type = l_tab(i).object_type
    WHERE object_id = l_tab(i).object_id;
    
    IF SQL%ROWCOUNT = 0 THEN
      INSERT INTO dest_tab (object_id, owner, object_name, object_type)
      VALUES (l_tab(i).object_id, l_tab(i).owner, l_tab(i).object_name, l_tab(i).object_type);
    END IF;
  END LOOP;

  DBMS_OUTPUT.put_line('UPDATE/INSERT: ' || 
                       (DBMS_UTILITY.get_time - l_start) || ' hsecs');

  ROLLBACK;

  l_start := DBMS_UTILITY.get_time;

  SELECT *
  BULK COLLECT INTO l_tab
  FROM source_tab;
    
  FOR i IN l_tab.first .. l_tab.last LOOP
    BEGIN
      INSERT INTO dest_tab (object_id, owner, object_name, object_type)
      VALUES (l_tab(i).object_id, l_tab(i).owner, l_tab(i).object_name, l_tab(i).object_type);
    EXCEPTION
      WHEN DUP_VAL_ON_INDEX THEN
        UPDATE dest_tab SET
          owner       = l_tab(i).owner,
          object_name = l_tab(i).object_name,
          object_type = l_tab(i).object_type
        WHERE object_id = l_tab(i).object_id;
    END;    
  END LOOP;

  DBMS_OUTPUT.put_line('INSERT/UPDATE: ' || 
                       (DBMS_UTILITY.get_time - l_start) || ' hsecs');

  ROLLBACK;
END;
/
MERGE	     : 119 hsecs
ROW MERGE    : 1453 hsecs
UPDATE/INSERT: 1280 hsecs
INSERT/UPDATE: 2443 hsecs


MERGE Statement Enhancements in Oracle Database 10g


Test Table

The following examples use the table defined below.

CREATE TABLE test1 AS
SELECT *
FROM   all_objects
WHERE  1=2;

Optional Clauses

The MATCHED and NOT MATCHED clauses are now optional making all of the following examples valid.

-- Both clauses present.
MERGE INTO test1 a
  USING all_objects b
    ON (a.object_id = b.object_id)
  WHEN MATCHED THEN
    UPDATE SET a.status = b.status
  WHEN NOT MATCHED THEN
    INSERT (object_id, status)
    VALUES (b.object_id, b.status);

-- No matched clause, insert only.
MERGE INTO test1 a
  USING all_objects b
    ON (a.object_id = b.object_id)
  WHEN NOT MATCHED THEN
    INSERT (object_id, status)
    VALUES (b.object_id, b.status);

-- No not-matched clause, update only.
MERGE INTO test1 a
  USING all_objects b
    ON (a.object_id = b.object_id)
  WHEN MATCHED THEN
    UPDATE SET a.status = b.status;

Conditional Operations

Conditional inserts and updates are now possible by using a WHERE clause on these statements.

-- Both clauses present.
MERGE INTO test1 a
  USING all_objects b
    ON (a.object_id = b.object_id)
  WHEN MATCHED THEN
    UPDATE SET a.status = b.status
    WHERE  b.status != 'VALID'
  WHEN NOT MATCHED THEN
    INSERT (object_id, status)
    VALUES (b.object_id, b.status)
    WHERE  b.status != 'VALID';

-- No matched clause, insert only.
MERGE INTO test1 a
  USING all_objects b
    ON (a.object_id = b.object_id)
  WHEN NOT MATCHED THEN
    INSERT (object_id, status)
    VALUES (b.object_id, b.status)
    WHERE  b.status != 'VALID';

-- No not-matched clause, update only.
MERGE INTO test1 a
  USING all_objects b
    ON (a.object_id = b.object_id)
  WHEN MATCHED THEN
    UPDATE SET a.status = b.status
    WHERE  b.status != 'VALID';

DELETE Clause

An optional DELETE WHERE clause can be added to the MATCHED clause to clean up after a merge operation. Only those rows in the destination table that match both the ON clause and the DELETE WHERE are deleted. Depending on which table the DELETE WHEREreferences, it can target the rows prior or post update. The following examples clarify this.

Create a source table with 5 rows as follows.

CREATE TABLE source AS
SELECT level AS id,
       CASE
         WHEN MOD(level, 2) = 0 THEN 10
         ELSE 20
       END AS status,
       'Description of level ' || level AS description
FROM   dual
CONNECT BY level <= 5;

SELECT * FROM source;

        ID     STATUS DESCRIPTION
---------- ---------- -----------------------
         1         20 Description of level 1
         2         10 Description of level 2
         3         20 Description of level 3
         4         10 Description of level 4
         5         20 Description of level 5

5 rows selected.

SQL>

Create the destination table using a similar query, but this time with 10 rows.

CREATE TABLE destination AS
SELECT level AS id,
       CASE
         WHEN MOD(level, 2) = 0 THEN 10
         ELSE 20
       END AS status,
       'Description of level ' || level AS description
FROM   dual
CONNECT BY level <= 10;

SELECT * FROM destination;

         1         20 Description of level 1
         2         10 Description of level 2
         3         20 Description of level 3
         4         10 Description of level 4
         5         20 Description of level 5
         6         10 Description of level 6
         7         20 Description of level 7
         8         10 Description of level 8
         9         20 Description of level 9
        10         10 Description of level 10

10 rows selected.

SQL>

The following MERGE statement will update all the rows in the destination table that have a matching row in the source table. The additional DELETE WHERE clause will delete only those rows that were matched, already in the destination table, and meet the criteria of the DELETE WHERE clause.

MERGE INTO destination d
  USING source s
    ON (s.id = d.id)
  WHEN MATCHED THEN
    UPDATE SET d.description = 'Updated'
    DELETE WHERE d.status = 10;

5 rows merged.

SQL>

SELECT * FROM destination;

        ID     STATUS DESCRIPTION
---------- ---------- -----------------------
         1         20 Updated
         3         20 Updated
         5         20 Updated
         6         10 Description of level 6
         7         20 Description of level 7
         8         10 Description of level 8
         9         20 Description of level 9
        10         10 Description of level 10

8 rows selected.

SQL>

Notice there are rows with a status of "10" that were not deleted. This is because there was no match between the source and destination for these rows, so the delete was not applicable.

The following example shows the DELETE WHERE can be made to match against values of the rows before the update operation, not after. In this case, all matching rows have their status changed to "10", but the DELETE WHERE references the source data, so the status is checked against the source, not the updated values.

ROLLBACK;

MERGE INTO destination d
  USING source s
    ON (s.id = d.id)
  WHEN MATCHED THEN
    UPDATE SET  d.description = 'Updated',
                d.status = 10
    DELETE WHERE s.status = 10;

5 rows merged.

SQL>
  
SELECT * FROM destination;

        ID     STATUS DESCRIPTION
---------- ---------- -----------------------
         1         10 Updated
         3         10 Updated
         5         10 Updated
         6         10 Description of level 6
         7         20 Description of level 7
         8         10 Description of level 8
         9         20 Description of level 9
        10         10 Description of level 10

8 rows selected.

SQL>

Notice, no extra rows were deleted compared to the previous example.

By switching the DELETE WHERE to reference the destination table, the extra updated rows can be deleted also.

ROLLBACK;

MERGE INTO destination d
  USING source s
    ON (s.id = d.id)
  WHEN MATCHED THEN
    UPDATE SET  d.description = 'Updated',
                d.status = 10
    DELETE WHERE d.status = 10;

5 rows merged.

SQL>
  
SELECT * FROM destination;

        ID     STATUS DESCRIPTION
---------- ---------- -----------------------
         6         10 Description of level 6
         7         20 Description of level 7
         8         10 Description of level 8
         9         20 Description of level 9
        10         10 Description of level 10

5 rows selected.

SQL>