SQL - UNION Operator


The SQL UNION operator is used to combine data from multiple tables by eliminating duplicate rows (if any).

To use the UNION operator on multiple tables, all these tables must be union compatible. And they are said to be union compatible if and only if they meet the following criteria −

  • The same number of columns selected with the same datatype.
  • These columns must also be in the same order.
  • They need not have same number of rows.

Once these criterion are met, the UNION operator returns the rows from multiple tables as a resultant table which is void of all duplicate values from these tables.

Note − The column names in the final result set will be based on the column names selected in the first SELECT statement. If you want to use a different name for a column in the final result set, you can use an alias in the SELECT statement.

Syntax

The basic syntax of a UNION operator is as follows −

SELECT column1 [, column2 ]
FROM table1 [, table2 ]
[WHERE condition]

UNION

SELECT column1 [, column2 ]
FROM table1 [, table2 ]
[WHERE condition]

Here, the given condition could be any given expression based on your requirement.

UNION on a Single Field

If we want to use UNION to combine the result sets of two or more SELECT statements on a single field, we can simply include that field in the SELECT statement of each query. The UNION operator will automatically remove any duplicate values in the final result set.

Note − It's important to note that when using UNION on a single field, the column names in the result set will be determined by the column name in the first SELECT statement. Therefore, you may need to use an alias in the SELECT statement to ensure that the column name is meaningful for the final result set.

Example

Assume we have created a table with name CUSTOMERS in SQL database using CREATE TABLE statement as shown below −

CREATE TABLE CUSTOMERS (
   ID INT NOT NULL,
   NAME VARCHAR (20) NOT NULL,
   AGE INT NOT NULL,
   ADDRESS CHAR (25),
   SALARY DECIMAL (18, 2),       
   PRIMARY KEY (ID)
);

Following query inserts values into this table using the INSERT statement −

insert INTO CUSTOMERS VALUES(1, 'Ramesh', 32, 'Ahmedabad', 2000.00);
insert INTO CUSTOMERS VALUES(2, 'Khilan', 25, 'Delhi', 1500.00);
insert INTO CUSTOMERS VALUES(3, 'kaushik', 23, 'Kota', 2000.00);
insert INTO CUSTOMERS VALUES(4, 'Chaitali', 25, 'Mumbai', 6500.00);
insert INTO CUSTOMERS VALUES(5, 'Hardik', 27, 'Bhopal', 8500.00);
insert INTO CUSTOMERS VALUES(6, 'Komal', 22, 'MP', 4500.00);
insert INTO CUSTOMERS VALUES(7, 'Muffy', 24, 'Indore', 10000.00);

If we verify the contents of the CUSTOMERS table using the SELECT statement, we can observe the inserted records as shown below −

Table 1 − CUSTOMERS Table is as follows.

SELECT * from CUSTOMERS;
+----+----------+-----+-----------+----------+
| ID | NAME     | AGE | ADDRESS   | SALARY   |
+----+----------+-----+-----------+----------+
|  1 | Ramesh   |  32 | Ahmedabad |  2000.00 |
|  2 | Khilan   |  25 | Delhi     |  1500.00 |
|  3 | kaushik  |  23 | Kota      |  2000.00 |
|  4 | Chaitali |  25 | Mumbai    |  6500.00 |
|  5 | Hardik   |  27 | Bhopal    |  8500.00 |
|  6 | Komal    |  22 | MP        |  4500.00 |
|  7 | Muffy    |  24 | Indore    | 10000.00 |
+----+----------+-----+-----------+----------+

Now, creating the second table ORDERS using CREATE TABLE statement as shown below

CREATE TABLE ORDERS (
   OID INT NOT NULL,
   DATES DATETIME NOT NULL,
   CUSTOMER_ID INT NOT NULL,
   AMOUNT INT NOT NULL,      
   PRIMARY KEY (OID)
);

Following query inserts values into this table using the INSERT statement −

insert INTO ORDERS VALUES(102, '2009-10-08 00:00:00', 3, 3000);
insert INTO ORDERS VALUES(100, '2009-10-08 00:00:00', 3, 1500);
insert INTO ORDERS VALUES(101, '2009-11-20 00:00:00', 2, 1560);
insert INTO ORDERS VALUES(103, '2008-05-20 00:00:00', 4, 2060);

If we verify the contents of the ORDERS table using the SELECT statement, we can observe the inserted records as shown below −

Table 2 − ORDERS Table is as follows.

SELECT * from ORDERS;
+-----+---------------------+-------------+--------+
|OID  | DATES               | CUSTOMER_ID | AMOUNT |
+-----+---------------------+-------------+--------+
| 100 | 2009-10-08 00:00:00 |           3 |   1500 |
| 101 | 2009-11-20 00:00:00 |           2 |   1560 |
| 102 | 2009-10-08 00:00:00 |           3 |   3000 |
| 103 | 2008-05-20 00:00:00 |           4 |   2060 |
+-----+---------------------+-------------+--------+

Using the following query, let us try to combine the SALARY and AMOUNT columns from CUSTOMERS and ORDERS respectively since these columns have similar datatypes.

SELECT SALARY
FROM CUSTOMERS
UNION
SELECT AMOUNT
FROM ORDERS

Output

Output of the above query is as follows −

+----------+
| SALARY   |
+----------+
|  2000.00 |
|  1500.00 |
|  6500.00 |
|  8500.00 |
|  4500.00 |
| 10000.00 |
|  1560.00 |
|  3000.00 |
|  2060.00 |
+----------+

UNION on Multiple Fields

When we use UNION on multiple fields, the number and order of the fields in each SELECT statement must match. Also, the data types of the fields in each SELECT statement must be compatible for the UNION to work correctly. If the data types are not compatible, you may need to use conversion functions such as CAST or CONVERT to ensure that the data types match.

Example

As the CUSTOMERS and ORDERS tables are not union-compatible individually, let us first join these two tables into a bigger table using Left Join and Right Join. The joined tables retrieved will have same number of columns with same datatypes, becoming union compatible. Now, these tables are combined using UNION query shown below −

SELECT  ID, NAME, AMOUNT, DATES
FROM CUSTOMERS
LEFT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID
UNION
SELECT  ID, NAME, AMOUNT, DATES
FROM CUSTOMERS
RIGHT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;

Output

This would produce the following result −

+------+----------+--------+---------------------+
| ID   | NAME     | AMOUNT | DATES               |
+------+----------+--------+---------------------+
|    1 | Ramesh   |   NULL | NULL                |
|    2 | Khilan   |   1560 | 2009-11-20 00:00:00 |
|    3 | kaushik  |   1500 | 2009-10-08 00:00:00 |
|    3 | kaushik  |   3000 | 2009-10-08 00:00:00 |
|    4 | Chaitali |   2060 | 2008-05-20 00:00:00 |
|    5 | Hardik   |   NULL | NULL                |
|    6 | Komal    |   NULL | NULL                |
|    7 | Muffy    |   NULL | NULL                |
+------+----------+--------+---------------------+

UNION with WHERE clause

We can use the WHERE clause with UNION operator to filter the results of each SELECT statement before combining them.

Syntax

Following is the syntax for using the WHERE clause with UNION operator −

SELECT column1, column2, column3
FROM table1
WHERE column1 = 'value1'
UNION
SELECT column1, column2, column3
FROM table2
WHERE column1 = 'value2';

Example

In the following query we are trying to retrieve the ids of the customers from both the table where id is greater than 5 and 2 from the 'CUTOMERS' and 'ORDERS' tables respectively −

SELECT ID, SALARY
FROM customers
WHERE ID > 5
UNION
SELECT CUSTOMER_ID, AMOUNT
FROM orders
WHERE CUSTOMER_ID > 2

Output

Following is the result produced −

+----+----------+
| ID | SALARY   |
+----+----------+
|  3 |  1500.00 |
|  3 |  3000.00 |
|  4 |  2060.00 |
|  6 |  4500.00 |
|  7 | 10000.00 |
+----+----------+

UNION with ORDER BY clause

When we use UNION with ORDER BY clause, it combines the sorted result sets of all SELECT statements and produces a single sorted result set.

Example

In here, we are trying to retrieve the ids of the customers from both the table where id is greater than 5 and 2 from the 'CUTOMERS' and 'ORDERS' tables respectively, sorted low to high from their salary −

SELECT ID, SALARY
FROM customers
WHERE ID > 5
UNION
SELECT CUSTOMER_ID, AMOUNT
FROM orders
WHERE CUSTOMER_ID > 2
ORDER BY SALARY;

Output

Following is the output of the above query −

+----+----------+
| ID | SALARY   |
+----+----------+
|  3 |  1500.00 |
|  4 |  2060.00 |
|  3 |  3000.00 |
|  6 |  4500.00 |
|  7 | 10000.00 |
+----+----------+

Note − It's important to note that the ORDER BY clause in a UNION statement applies to the entire result set, not just the last SELECT statement.

UNION with Aliases

We can use aliases in the SELECT statement of UNION operator to give a table or column a temporary name, which can be useful when working with multiple tables or columns with similar names.

When using UNION with aliases, it's important to note that the column aliases are determined by the first SELECT statement. Therefore, if you want to use different aliases for the same column in different SELECT statements, you need to use column aliases in all SELECT statements to ensure consistent column names in the final result set.

Syntax

Following is the syntax for using Union with Aliases −

SELECT column1 AS alias1, column2 AS alias2
FROM table1
UNION
SELECT column3 AS alias1, column4 AS alias2
FROM table2;

Example

The following query retrieves all the IDs from both tables, along with an indication of whether each id is of the customer or the order −

SELECT ID, 'customer' AS type
FROM CUSTOMERS
UNION
SELECT OID, 'order' AS type
FROM ORDERS;

Output

Following is the output produced −

+-----+----------+
| ID  | type     |
+-----+----------+
|   1 | customer |
|   2 | customer |
|   3 | customer |
|   4 | customer |
|   5 | customer |
|   6 | customer |
|   7 | customer |
| 100 | order    |
| 101 | order    |
| 102 | order    |
| 103 | order    |
+-----+----------+

There are two other operators which are like the UNION operator.

  • SQL INTERSECT Operator − This is used to combine two SELECT statements, but returns rows only from the first SELECT statement that are identical to a row in the second SELECT statement.

  • SQL EXCEPT Operator − This combines two SELECT statements and returns rows from the first SELECT statement that are not returned by the second SELECT statement.

Advertisements