SQL - DISTINCT Keyword


The SQL DISTINCT keyword is used with the SELECT statement to eliminate duplicate records and fetch only unique ones.

A table may contain the same value, or the same combination of values, in many rows. When querying such data, it often makes more sense to fetch each unique value once instead of fetching every duplicate.

The basic syntax of the DISTINCT keyword is as follows −

SELECT DISTINCT column1, column2, ..., columnN FROM table_name;

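Here, column1, column2, etc. are the columns from which we want to retrieve unique values, and table_name is the name of the table containing the data. When several columns are listed, DISTINCT applies to the combination of their values, not to each column separately.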

DISTINCT keyword on a single column

We can use the DISTINCT keyword on a single column to retrieve all unique values in that column, i.e. with duplicates removed. This is often used to get a summary of the distinct values in a particular column or to eliminate redundant data.

Example

Assume we have created a table named CUSTOMERS in an SQL database using the CREATE TABLE statement as shown below −

CREATE TABLE CUSTOMERS (
   ID INT NOT NULL,
   NAME VARCHAR (20) NOT NULL,
   AGE INT NOT NULL,
   ADDRESS CHAR (25),
   SALARY DECIMAL (18, 2),
   PRIMARY KEY (ID)
);

The following queries insert values into this table using the INSERT statement −

INSERT INTO CUSTOMERS VALUES (1, 'Ramesh', 32, 'Ahmedabad', 2000.00);
INSERT INTO CUSTOMERS VALUES (2, 'Khilan', 25, 'Delhi', 1500.00);
INSERT INTO CUSTOMERS VALUES (3, 'kaushik', 23, 'Kota', 2000.00);
INSERT INTO CUSTOMERS VALUES (4, 'Chaitali', 25, 'Mumbai', 6500.00);
INSERT INTO CUSTOMERS VALUES (5, 'Hardik', 27, 'Bhopal', 8500.00);
INSERT INTO CUSTOMERS VALUES (6, 'Komal', 22, 'MP', 4500.00);
INSERT INTO CUSTOMERS VALUES (7, 'Muffy', 24, 'Indore', 10000.00);

If we verify the contents of the CUSTOMERS table using the SELECT statement, we can observe the inserted records as shown below −

SELECT * FROM CUSTOMERS;
+----+----------+-----+-----------+----------+
| ID | NAME     | AGE | ADDRESS   | SALARY   |
+----+----------+-----+-----------+----------+
|  1 | Ramesh   |  32 | Ahmedabad |  2000.00 |
|  2 | Khilan   |  25 | Delhi     |  1500.00 |
|  3 | kaushik  |  23 | Kota      |  2000.00 |
|  4 | Chaitali |  25 | Mumbai    |  6500.00 |
|  5 | Hardik   |  27 | Bhopal    |  8500.00 |
|  6 | Komal    |  22 | MP        |  4500.00 |
|  7 | Muffy    |  24 | Indore    | 10000.00 |
+----+----------+-----+-----------+----------+

First, let us see how the following SELECT query returns the duplicate salary records −

SELECT SALARY FROM CUSTOMERS
ORDER BY SALARY;

This produces the following result, where the salary 2000.00 appears twice because two customers in the original table earn that amount −

+----------+
| SALARY   |
+----------+
|  1500.00 |
|  2000.00 |
|  2000.00 |
|  4500.00 |
|  6500.00 |
|  8500.00 |
| 10000.00 |
+----------+

Now, let us use the DISTINCT keyword with the above SELECT query and then see the result −

SELECT DISTINCT SALARY FROM CUSTOMERS ORDER BY SALARY;

Output

This would produce the following result where we do not have any duplicate entry −

+----------+
| SALARY   |
+----------+
|  1500.00 |
|  2000.00 |
|  4500.00 |
|  6500.00 |
|  8500.00 |
| 10000.00 |
+----------+
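The comparison above can also be reproduced without a database server. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory SQLite database. SALARY is declared as REAL here, which is an assumption made because SQLite has no fixed-point DECIMAL type, and the helper names load_customers and salaries are hypothetical:

```python
import sqlite3

# The seven rows from the CUSTOMERS table used in this tutorial.
ROWS = [
    (1, 'Ramesh', 32, 'Ahmedabad', 2000.00),
    (2, 'Khilan', 25, 'Delhi', 1500.00),
    (3, 'kaushik', 23, 'Kota', 2000.00),
    (4, 'Chaitali', 25, 'Mumbai', 6500.00),
    (5, 'Hardik', 27, 'Bhopal', 8500.00),
    (6, 'Komal', 22, 'MP', 4500.00),
    (7, 'Muffy', 24, 'Indore', 10000.00),
]


def load_customers():
    """Create an in-memory CUSTOMERS table and fill it with ROWS."""
    conn = sqlite3.connect(":memory:")
    # Assumption: SALARY is REAL because SQLite lacks a DECIMAL type.
    conn.execute(
        "CREATE TABLE CUSTOMERS ("
        " ID INT NOT NULL, NAME VARCHAR(20) NOT NULL, AGE INT NOT NULL,"
        " ADDRESS CHAR(25), SALARY REAL, PRIMARY KEY (ID))"
    )
    conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def salaries(conn, distinct):
    """Return the SALARY column in ascending order, optionally de-duplicated."""
    keyword = "DISTINCT " if distinct else ""
    query = f"SELECT {keyword}SALARY FROM CUSTOMERS ORDER BY SALARY"
    return [row[0] for row in conn.execute(query)]


conn = load_customers()
all_salaries = salaries(conn, distinct=False)
unique_salaries = salaries(conn, distinct=True)

print(all_salaries)     # 2000.0 is listed twice
print(unique_salaries)  # 2000.0 is listed once
```

Only the keyword differs between the two queries, so any difference in the two lists comes from DISTINCT alone.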

DISTINCT keyword on multiple columns

We can also use the DISTINCT keyword on multiple columns to retrieve all unique combinations of values across those columns, i.e. removing redundant records in a table. This is often used to get a summary of the distinct values in multiple columns or to eliminate redundant data based on multiple criteria.

Example

In the following query, we retrieve all unique combinations of the customers' age and salary using the DISTINCT keyword −

SELECT DISTINCT AGE, SALARY FROM CUSTOMERS ORDER BY AGE;

Output

We can see that the age 25 appears twice in the result set. Each combination of 25 with its salary is unique, so both rows are included. The same holds for the salary 2000.00, which is paired with two different ages −

+-----+----------+
| AGE | SALARY   |
+-----+----------+
|  22 |  4500.00 |
|  23 |  2000.00 |
|  24 | 10000.00 |
|  25 |  1500.00 |
|  25 |  6500.00 |
|  27 |  8500.00 |
|  32 |  2000.00 |
+-----+----------+
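In the result above no row is actually removed, because every (AGE, SALARY) pair in the table is already unique. The sketch below shows a pair being collapsed. It assumes Python's built-in sqlite3 module, a reduced schema holding only the columns needed, and one hypothetical extra row (ID 8) that repeats the age and salary of ID 2. SALARY is added as a second sort key only to make the row order deterministic:

```python
import sqlite3

# Reduced, assumed schema: only the columns needed for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMERS (ID INT PRIMARY KEY, AGE INT NOT NULL, SALARY REAL)"
)
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?)",
    [
        (1, 32, 2000.00), (2, 25, 1500.00), (3, 23, 2000.00),
        (4, 25, 6500.00), (5, 27, 8500.00), (6, 22, 4500.00),
        (7, 24, 10000.00),
        (8, 25, 1500.00),  # hypothetical row: same AGE and SALARY as ID 2
    ],
)

all_pairs = conn.execute(
    "SELECT AGE, SALARY FROM CUSTOMERS ORDER BY AGE, SALARY"
).fetchall()
unique_pairs = conn.execute(
    "SELECT DISTINCT AGE, SALARY FROM CUSTOMERS ORDER BY AGE, SALARY"
).fetchall()

print(len(all_pairs), len(unique_pairs))
for age, salary in unique_pairs:
    print(age, salary)
```

The pair (25, 1500.0) is returned once even though two rows contain it, while (25, 6500.0) is kept because it differs in SALARY.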

DISTINCT keyword with COUNT() function

Using the DISTINCT keyword with the COUNT() function will produce the count of unique values in a column of a table.

Syntax

Following is the syntax for using the DISTINCT keyword with COUNT() function −

SELECT COUNT(DISTINCT column_name) FROM table_name WHERE condition;

Here, column_name is the name of the column whose unique values we want to count, table_name is the name of the table that contains the data, and condition is an optional filter applied to the rows before counting. NULL values are not counted.

Example

In the following query, we count the number of distinct ages of the customers −

SELECT COUNT(DISTINCT AGE) AS UniqueAge FROM CUSTOMERS;

Output

Following is the result produced −

+-----------+
| UniqueAge |
+-----------+
|         6 |
+-----------+
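To see how this differs from a plain COUNT(), and how the optional WHERE condition from the syntax fits in, the following sketch may help. It assumes Python's built-in sqlite3 module and a reduced schema with only the ID and AGE columns; the threshold of 24 is an arbitrary value chosen for illustration:

```python
import sqlite3

# Reduced, assumed schema: only ID and AGE are needed to count ages.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID INT PRIMARY KEY, AGE INT NOT NULL)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?)",
    [(1, 32), (2, 25), (3, 23), (4, 25), (5, 27), (6, 22), (7, 24)],
)

# COUNT(AGE) counts every row; COUNT(DISTINCT AGE) counts each age once.
total_ages, unique_ages = conn.execute(
    "SELECT COUNT(AGE), COUNT(DISTINCT AGE) FROM CUSTOMERS"
).fetchone()

# With a WHERE condition, rows are filtered first and then counted.
unique_ages_over_24 = conn.execute(
    "SELECT COUNT(DISTINCT AGE) FROM CUSTOMERS WHERE AGE > ?", (24,)
).fetchone()[0]

print(total_ages, unique_ages, unique_ages_over_24)
```

The age 25 occurs in two rows, which is why the distinct count is one lower than the plain count.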

DISTINCT keyword with NULL values

In SQL, when a column contains NULL values, DISTINCT treats all of them as duplicates of one another. The result set therefore includes NULL only once, no matter how many rows hold it.

Example

Consider the following CUSTOMERS table having NULL values in its SALARY column −

+----+----------+-----+-----------+----------+
| ID | NAME     | AGE | ADDRESS   | SALARY   |
+----+----------+-----+-----------+----------+
|  1 | Ramesh   |  32 | Ahmedabad |  2000.00 |
|  2 | Khilan   |  25 | Delhi     |  1500.00 |
|  3 | kaushik  |  23 | Kota      |  2000.00 |
|  4 | Chaitali |  25 | Mumbai    |     NULL |
|  5 | Hardik   |  27 | Bhopal    |  8500.00 |
|  6 | Komal    |  22 | MP        |     NULL |
|  7 | Muffy    |  24 | Indore    | 10000.00 |
+----+----------+-----+-----------+----------+

Now, let us retrieve the distinct salaries of the customers using the following query −

SELECT DISTINCT SALARY FROM CUSTOMERS ORDER BY SALARY;

Output

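Following is the output of the above query. The two NULL salaries collapse into a single NULL row. It is listed first here, but the position of NULL in sorted output varies between database systems −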

+----------+
| SALARY   |
+----------+
|     NULL |
|  1500.00 |
|  2000.00 |
|  8500.00 |
| 10000.00 |
+----------+
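The NULL behaviour can be checked with a short sketch. It assumes Python's built-in sqlite3 module, where SQL NULL is represented as None, and a reduced schema with only the ID and SALARY columns. SQLite sorts NULL before other values in ascending order, which matches the output shown above:

```python
import sqlite3

# Reduced, assumed schema: only ID and SALARY; None is stored as SQL NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID INT PRIMARY KEY, SALARY REAL)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?)",
    [
        (1, 2000.00), (2, 1500.00), (3, 2000.00), (4, None),
        (5, 8500.00), (6, None), (7, 10000.00),
    ],
)

# DISTINCT keeps a single NULL even though two rows contain it.
distinct_salaries = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT SALARY FROM CUSTOMERS ORDER BY SALARY"
    )
]

# COUNT(DISTINCT ...) skips NULL, so it is one lower than the row count above.
counted = conn.execute(
    "SELECT COUNT(DISTINCT SALARY) FROM CUSTOMERS"
).fetchone()[0]

print(distinct_salaries)
print(counted)
```

SELECT DISTINCT returns five rows including the NULL, whereas COUNT(DISTINCT SALARY) reports only the four non-NULL values.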