SQL - Self Join


Self Join, just like its name suggests, is a type of join that combines the records of a table with itself.

Have you ever needed to compare two columns of the same table? For instance, suppose an organization chooses Secret Santas among its employees based on colors. Each employee is assigned one color, and each employee also picks a color from the pool of colors. The employee to whom the picked color is assigned becomes the picker's Secret Santa.

As we can see in the figure below, the color assigned to each employee and the color each employee picked are entered into a table. The table is then joined to itself over the color columns to find each employee's Secret Santa.

[Figure: Self Join]
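The Secret Santa scenario can be sketched in SQL as follows. This is only an illustration: the SECRET_SANTA table, its column names and the three sample employees are hypothetical and are not part of the original example.

```sql
-- Hypothetical table: each employee has one assigned color and one picked color.
CREATE TABLE SECRET_SANTA (
   EMPLOYEE VARCHAR (20) NOT NULL,
   ASSIGNED_COLOR VARCHAR (10) NOT NULL,
   PICKED_COLOR VARCHAR (10) NOT NULL,
   PRIMARY KEY (EMPLOYEE)
);

-- Hypothetical sample data.
INSERT INTO SECRET_SANTA (EMPLOYEE, ASSIGNED_COLOR, PICKED_COLOR)
VALUES ('Asha', 'Red', 'Blue'),
       ('Bala', 'Blue', 'Green'),
       ('Chitra', 'Green', 'Red');

-- Alias p is the employee who picks; alias s is the employee
-- to whom the picked color is assigned.
SELECT p.EMPLOYEE AS EMPLOYEE, s.EMPLOYEE AS SANTA
FROM SECRET_SANTA p, SECRET_SANTA s
WHERE p.PICKED_COLOR = s.ASSIGNED_COLOR;
```

With this sample data, the query pairs Asha with Bala, Bala with Chitra, and Chitra with Asha.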

Self Join in SQL

The SQL Self Join is used to join a table to itself as if the table were two tables. To carry this out, at least one reference to the table is given a temporary name (an alias) in the SQL statement.

A self join is most often written as an inner join. It is used when two columns of the same table need to be compared, typically to establish a relationship between the rows of that table. A common case is a table that contains both a Primary Key and a Foreign Key that refers back to that same Primary Key.

In the examples that follow, the table is listed twice in the FROM clause and the join condition is given in the WHERE clause. The same condition can also be written with the explicit JOIN ... ON syntax used by other joins.
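As an illustration of a table that holds both a Primary Key and a Foreign Key pointing to it, consider the sketch below. The EMPLOYEES table and its MANAGER_ID column are assumptions made for this example only.

```sql
-- Hypothetical table: MANAGER_ID refers back to ID in the same table.
CREATE TABLE EMPLOYEES (
   ID INT NOT NULL,
   NAME VARCHAR (20) NOT NULL,
   MANAGER_ID INT,
   PRIMARY KEY (ID),
   FOREIGN KEY (MANAGER_ID) REFERENCES EMPLOYEES (ID)
);

-- Alias e is the employee row; alias m is the row of that employee's manager.
SELECT e.NAME AS EMPLOYEE, m.NAME AS MANAGER
FROM EMPLOYEES e, EMPLOYEES m
WHERE e.MANAGER_ID = m.ID;
```

Employees whose MANAGER_ID is NULL have no matching manager row, so they do not appear in the result of this query.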

Syntax

Following is the basic syntax of Self Join in SQL −

SELECT column_name(s)
FROM table1 a, table1 b
WHERE a.common_field = b.common_field;

Here, the condition in the WHERE clause can be any expression that suits your requirement; it does not have to be an equality.

Example

A self join needs only one table, so let us create a CUSTOMERS table containing customer details such as name, age, address and salary.

CREATE TABLE CUSTOMERS (
   ID INT NOT NULL,
   NAME VARCHAR (20) NOT NULL,
   AGE INT NOT NULL,
   ADDRESS CHAR (25),
   SALARY DECIMAL (18, 2),
   PRIMARY KEY (ID)
);

Now insert values into this table using the INSERT statement as follows −

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (1, 'Ramesh', 32, 'Ahmedabad', 2000.00 );

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (2, 'Khilan', 25, 'Delhi', 1500.00 );

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (3, 'Kaushik', 23, 'Kota', 2000.00 );

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (4, 'Chaitali', 25, 'Mumbai', 6500.00 );

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (5, 'Hardik', 27, 'Bhopal', 8500.00 );

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (6, 'Komal', 22, 'MP', 4500.00 );

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (7, 'Muffy', 24, 'Indore', 10000.00 );

The table will contain the following records −

+----+----------+-----+-----------+----------+
| ID | NAME     | AGE | ADDRESS   | SALARY   |
+----+----------+-----+-----------+----------+
|  1 | Ramesh   |  32 | Ahmedabad |  2000.00 |
|  2 | Khilan   |  25 | Delhi     |  1500.00 |
|  3 | Kaushik  |  23 | Kota      |  2000.00 |
|  4 | Chaitali |  25 | Mumbai    |  6500.00 |
|  5 | Hardik   |  27 | Bhopal    |  8500.00 |
|  6 | Komal    |  22 | MP        |  4500.00 |
|  7 | Muffy    |  24 | Indore    | 10000.00 |
+----+----------+-----+-----------+----------+

Now, let us join this table to itself using the following self join query. Our aim is to relate the customers to one another on the basis of their earnings: every customer is paired with each customer who earns more. The WHERE clause supplies this condition.

SELECT a.ID, b.NAME as EARNS_HIGHER, a.NAME as EARNS_LESS, a.SALARY as LOWER_SALARY
FROM CUSTOMERS a, CUSTOMERS b
WHERE a.SALARY < b.SALARY;

Output

The resultant table lists every pair in which one customer earns less than another. The ID column holds the ID of the lower-earning customer −

+----+--------------+------------+--------------+
| ID | EARNS_HIGHER | EARNS_LESS | LOWER_SALARY |
+----+--------------+------------+--------------+
|  2 | Ramesh       | Khilan     |      1500.00 |
|  2 | Kaushik      | Khilan     |      1500.00 |
|  6 | Chaitali     | Komal      |      4500.00 |
|  3 | Chaitali     | Kaushik    |      2000.00 |
|  2 | Chaitali     | Khilan     |      1500.00 |
|  1 | Chaitali     | Ramesh     |      2000.00 |
|  6 | Hardik       | Komal      |      4500.00 |
|  4 | Hardik       | Chaitali   |      6500.00 |
|  3 | Hardik       | Kaushik    |      2000.00 |
|  2 | Hardik       | Khilan     |      1500.00 |
|  1 | Hardik       | Ramesh     |      2000.00 |
|  3 | Komal        | Kaushik    |      2000.00 |
|  2 | Komal        | Khilan     |      1500.00 |
|  1 | Komal        | Ramesh     |      2000.00 |
|  6 | Muffy        | Komal      |      4500.00 |
|  5 | Muffy        | Hardik     |      8500.00 |
|  4 | Muffy        | Chaitali   |      6500.00 |
|  3 | Muffy        | Kaushik    |      2000.00 |
|  2 | Muffy        | Khilan     |      1500.00 |
|  1 | Muffy        | Ramesh     |      2000.00 |
+----+--------------+------------+--------------+
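The same join can also be expressed with the explicit JOIN ... ON syntax. The following is a sketch that assumes the CUSTOMERS table created above; it should return the same 20 pairs, though the order of the rows may differ.

```sql
-- Pair each customer (alias a) with every customer (alias b)
-- who earns strictly more, using INNER JOIN ... ON instead of WHERE.
SELECT a.ID, b.NAME AS EARNS_HIGHER, a.NAME AS EARNS_LESS, a.SALARY AS LOWER_SALARY
FROM CUSTOMERS a
INNER JOIN CUSTOMERS b
ON a.SALARY < b.SALARY;
```

Writing the condition in the ON clause keeps the join logic separate from any additional row filters that might later be placed in a WHERE clause.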

Self Join with ORDER BY Clause

Furthermore, after joining a table with itself using self join, the records in the combined table can be sorted using the ORDER BY clause. The sort is ascending by default; adding DESC after a column name sorts in descending order instead.

Syntax

Following is the syntax for it −

SELECT column_name(s)
FROM table1 a, table1 b
WHERE a.common_field = b.common_field
ORDER BY column_name;

Example

Let us use the same table Customers shown below −

+----+----------+-----+-----------+----------+
| ID | NAME     | AGE | ADDRESS   | SALARY   |
+----+----------+-----+-----------+----------+
|  1 | Ramesh   |  32 | Ahmedabad |  2000.00 |
|  2 | Khilan   |  25 | Delhi     |  1500.00 |
|  3 | Kaushik  |  23 | Kota      |  2000.00 |
|  4 | Chaitali |  25 | Mumbai    |  6500.00 |
|  5 | Hardik   |  27 | Bhopal    |  8500.00 |
|  6 | Komal    |  22 | MP        |  4500.00 |
|  7 | Muffy    |  24 | Indore    | 10000.00 |
+----+----------+-----+-----------+----------+

The following query joins the CUSTOMERS table with itself using a condition in the WHERE clause, and then arranges the records in ascending order of the specified column using the ORDER BY clause.

SELECT a.ID, b.NAME as EARNS_HIGHER, a.NAME as EARNS_LESS, a.SALARY as LOWER_SALARY
FROM CUSTOMERS a, CUSTOMERS b
WHERE a.SALARY < b.SALARY
ORDER BY a.SALARY;

As we can see above, the records are arranged by the SALARY column of the lower-earning customer (alias a).

Output

The resultant table is displayed as follows −

+----+--------------+------------+--------------+
| ID | EARNS_HIGHER | EARNS_LESS | LOWER_SALARY |
+----+--------------+------------+--------------+
|  2 | Ramesh       | Khilan     |      1500.00 |
|  2 | Kaushik      | Khilan     |      1500.00 |
|  2 | Chaitali     | Khilan     |      1500.00 |
|  2 | Hardik       | Khilan     |      1500.00 |
|  2 | Komal        | Khilan     |      1500.00 |
|  2 | Muffy        | Khilan     |      1500.00 |
|  3 | Chaitali     | Kaushik    |      2000.00 |
|  1 | Chaitali     | Ramesh     |      2000.00 |
|  3 | Hardik       | Kaushik    |      2000.00 |
|  1 | Hardik       | Ramesh     |      2000.00 |
|  3 | Komal        | Kaushik    |      2000.00 |
|  1 | Komal        | Ramesh     |      2000.00 |
|  3 | Muffy        | Kaushik    |      2000.00 |
|  1 | Muffy        | Ramesh     |      2000.00 |
|  6 | Chaitali     | Komal      |      4500.00 |
|  6 | Hardik       | Komal      |      4500.00 |
|  6 | Muffy        | Komal      |      4500.00 |
|  4 | Hardik       | Chaitali   |      6500.00 |
|  4 | Muffy        | Chaitali   |      6500.00 |
|  5 | Muffy        | Hardik     |      8500.00 |
+----+--------------+------------+--------------+
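Several rows in this output share the same LOWER_SALARY, and ORDER BY a.SALARY alone does not fix their relative order, so it may vary between runs or database systems. A minimal sketch, assuming the same CUSTOMERS table, adds further sort keys to make the order fully determined:

```sql
-- Sort by the lower salary first, then break ties by the IDs of both customers.
-- Each (a.ID, b.ID) pair occurs only once, so the resulting order is unambiguous.
SELECT a.ID, b.NAME AS EARNS_HIGHER, a.NAME AS EARNS_LESS, a.SALARY AS LOWER_SALARY
FROM CUSTOMERS a, CUSTOMERS b
WHERE a.SALARY < b.SALARY
ORDER BY a.SALARY, a.ID, b.ID;
```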

Besides the salary column, the records can also be sorted by other columns, for example alphabetically by name or numerically by customer ID.
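For instance, sorting by name and then by customer ID could look like the sketch below, which again assumes the CUSTOMERS table from the earlier example.

```sql
-- Sort alphabetically by the higher earner's name,
-- then numerically by the lower earner's ID.
SELECT a.ID, b.NAME AS EARNS_HIGHER, a.NAME AS EARNS_LESS, a.SALARY AS LOWER_SALARY
FROM CUSTOMERS a, CUSTOMERS b
WHERE a.SALARY < b.SALARY
ORDER BY b.NAME, a.ID;
```

Appending DESC to a sort key, as in ORDER BY b.NAME DESC, a.ID, reverses the direction of that key only.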
