Inner Join Tutorial in Snowflake SQL ๐
An inner join is a type of SQL join that returns only the rows that have a match in both tables. It is the most common type of join and is used to combine data from two tables where there is a common column between them.
Inner Join Syntax in Snowflake SQL ๐
Before looking at the syntax, the general steps to perform an inner join are the following:
Step 1: Specify the Columns Specify the columns that you want to retrieve from the tables.
Step 2: Specify the First Table Specify the first table that you want to retrieve data from.
Step 3: Specify the Second Table Specify the second table that you want to join with the first table.
Step 4: Specify the Join Condition Specify the condition for joining the two tables.
Now that we know the general steps, let’s look at the syntax
SELECT column_name(s)
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;
Example of an Inner Join in Snowflake SQL ๐
Let’s say we have two tables, customers
and orders
, and we want to retrieve the customer names and their corresponding order details.
Table: customers
customer_id | name |
---|---|
1 | John Smith |
2 | Jane Doe |
3 | Bob Brown |
Table: orders
order_id | customer_id | order_date |
---|---|---|
1 | 1 | 2022-01-01 |
2 | 1 | 2022-01-15 |
3 | 2 | 2022-02-01 |
4 | 3 | 2022-03-01 |
We join the customers
table with the orders
table on the customer_id
column. The resulting table will only include rows where there is a match in both tables.
SELECT c.name, o.order_date
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id;
Result
name | order_date |
---|---|
John Smith | 2022-01-01 |
John Smith | 2022-01-15 |
Jane Doe | 2022-02-01 |
Bob Brown | 2022-03-01 |
Advanced examples ๐
Joining on multiple columns
Data tables in real life can be complex and require joins on multiple columns. It is very easy to do an inner join with multiple columns by just updating the join condition with multiple column names. The below query performs a join based on customer_id and country.
SELECT *
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id AND c.country = o.country;
Joining multiple tables
There area also scenarios in the real world where we must join across multiple tables. We just need to add extra condition for each required table. The below query performs a join with customers and orders and then with products.
SELECT *
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id
INNER JOIN products p
ON o.product_id = p.product_id;
Best practices and extra tips ๐
- Always make sure to specify the
ON
clause or you will run a Cartesian product! Cartesian products require a lot of computation and the end result is usually not what you are looking for. - Always double check that your are joining on the right columns, or your join result will result in incorrect data.
- Aliases are optional but they greatly improve readability of Snowflake queries especially if you are joining across multiple tables. Make sure to use alias names that make sense and are related to the table it is aliasing.
Inner Join Exercises ๐
Exercise 1
Perform an inner join on the employees
and departments
tables to retrieve the employee names and their corresponding department names.
employees
table
employee_id | name | department_id |
---|---|---|
1 | John Doe | 2 |
2 | Jane Smith | 1 |
3 | Jim Brown | 3 |
departments
table
department_id | department_name |
---|---|
1 | HR |
2 | IT |
3 | Sales |
-- your code here
Exercise 2
Retrieve the customer names and their corresponding order details for customers who have placed more than one order.
customers
table
customer_id | name |
---|---|
1 | John Doe |
2 | Jane Smith |
3 | Jim Brown |
4 | Alice Johnson |
orders
table
order_id | customer_id | order_date |
---|---|---|
101 | 1 | 2023-01-15 |
102 | 2 | 2023-02-20 |
103 | 1 | 2023-03-10 |
104 | 3 | 2023-04-05 |
105 | 4 | 2023-05-12 |
106 | 2 | 2023-06-18 |
-- your code here
Exercise 3
Perform an inner join on the products
and suppliers
tables to retrieve the product names and their corresponding supplier names.
products
table
product_id | product_name | supplier_id |
---|---|---|
1 | Laptop | 1 |
2 | Smartphone | 2 |
3 | Tablet | 1 |
4 | Printer | 3 |
suppliers
table
supplier_id | supplier_name |
---|---|
1 | Supplier A |
2 | Supplier B |
3 | Supplier C |
-- your code here
Answers ๐
Exercise 1 ๐
SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id;
name | department_name |
---|---|
John Doe | IT |
Jane Smith | HR |
Jim Brown | Sales |
Exercise 2 ๐
SELECT c.name, o.order_date
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id
GROUP BY c.name
HAVING COUNT(o.order_id) > 1;
name | order_date |
---|---|
John Doe | 2023-01-15 |
John Doe | 2023-03-10 |
Jane Smith | 2023-02-20 |
Jane Smith | 2023-06-18 |
Exercise 3 ๐
SELECT p.product_name, s.supplier_name
FROM products p
INNER JOIN suppliers s
ON p.supplier_id = s.supplier_id;
product_name | supplier_name |
---|---|
Laptop | Supplier A |
Smartphone | Supplier B |
Tablet | Supplier A |
Printer | Supplier C |
Conclusion ๐
Inner joins are a fundamental concept in SQL and are used to combine data from multiple tables. By following the guidelines and examples outlined in this tutorial, you should now be comfortable performing inner joins in Snowflake SQL. Remember to practice with exercises and to always specify the ON
clause to ensure that you are joining on the correct columns.