How to Perform Inner Join in Snowflake

ยท 843 words ยท 4 minute read

Inner Join Tutorial in Snowflake SQL ๐Ÿ”—

An inner join is a type of SQL join that returns only the rows that have a match in both tables. It is the most common type of join and is used to combine data from two tables where there is a common column between them.

Inner Join Syntax in Snowflake SQL ๐Ÿ”—

Before looking at the syntax, the general steps to perform an inner join are the following:

Step 1: Specify the Columns Specify the columns that you want to retrieve from the tables.

Step 2: Specify the First Table Specify the first table that you want to retrieve data from.

Step 3: Specify the Second Table Specify the second table that you want to join with the first table.

Step 4: Specify the Join Condition Specify the condition for joining the two tables.

Now that we know the general steps, let’s look at the syntax

SELECT column_name(s)
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;

Example of an Inner Join in Snowflake SQL ๐Ÿ”—

Let’s say we have two tables, customers and orders, and we want to retrieve the customer names and their corresponding order details.

Table: customers

customer_id name
1 John Smith
2 Jane Doe
3 Bob Brown

Table: orders

order_id customer_id order_date
1 1 2022-01-01
2 1 2022-01-15
3 2 2022-02-01
4 3 2022-03-01

We join the customers table with the orders table on the customer_id column. The resulting table will only include rows where there is a match in both tables.

SELECT c.name, o.order_date
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id;

Result

name order_date
John Smith 2022-01-01
John Smith 2022-01-15
Jane Doe 2022-02-01
Bob Brown 2022-03-01

Advanced examples ๐Ÿ”—


Joining on multiple columns

Data tables in real life can be complex and require joins on multiple columns. It is very easy to do an inner join with multiple columns by just updating the join condition with multiple column names. The below query performs a join based on customer_id and country.

SELECT *
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id AND c.country = o.country;

Joining multiple tables

There area also scenarios in the real world where we must join across multiple tables. We just need to add extra condition for each required table. The below query performs a join with customers and orders and then with products.

SELECT *
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id
INNER JOIN products p
ON o.product_id = p.product_id;

Best practices and extra tips ๐Ÿ”—

  • Always make sure to specify the ON clause or you will run a Cartesian product! Cartesian products require a lot of computation and the end result is usually not what you are looking for.
  • Always double check that your are joining on the right columns, or your join result will result in incorrect data.
  • Aliases are optional but they greatly improve readability of Snowflake queries especially if you are joining across multiple tables. Make sure to use alias names that make sense and are related to the table it is aliasing.

Inner Join Exercises ๐Ÿ”—

Exercise 1

Perform an inner join on the employees and departments tables to retrieve the employee names and their corresponding department names.

employees table

employee_id name department_id
1 John Doe 2
2 Jane Smith 1
3 Jim Brown 3

departments table

department_id department_name
1 HR
2 IT
3 Sales
-- your code here

Exercise 2

Retrieve the customer names and their corresponding order details for customers who have placed more than one order.

customers table

customer_id name
1 John Doe
2 Jane Smith
3 Jim Brown
4 Alice Johnson

orders table

order_id customer_id order_date
101 1 2023-01-15
102 2 2023-02-20
103 1 2023-03-10
104 3 2023-04-05
105 4 2023-05-12
106 2 2023-06-18
-- your code here

Exercise 3

Perform an inner join on the products and suppliers tables to retrieve the product names and their corresponding supplier names.

products table

product_id product_name supplier_id
1 Laptop 1
2 Smartphone 2
3 Tablet 1
4 Printer 3

suppliers table

supplier_id supplier_name
1 Supplier A
2 Supplier B
3 Supplier C
-- your code here

Answers ๐Ÿ”—

Exercise 1 ๐Ÿ”—

SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id;
name department_name
John Doe IT
Jane Smith HR
Jim Brown Sales

Exercise 2 ๐Ÿ”—

SELECT c.name, o.order_date
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id
GROUP BY c.name
HAVING COUNT(o.order_id) > 1;
name order_date
John Doe 2023-01-15
John Doe 2023-03-10
Jane Smith 2023-02-20
Jane Smith 2023-06-18

Exercise 3 ๐Ÿ”—

SELECT p.product_name, s.supplier_name
FROM products p
INNER JOIN suppliers s
ON p.supplier_id = s.supplier_id;
product_name supplier_name
Laptop Supplier A
Smartphone Supplier B
Tablet Supplier A
Printer Supplier C

Conclusion ๐Ÿ”—

Inner joins are a fundamental concept in SQL and are used to combine data from multiple tables. By following the guidelines and examples outlined in this tutorial, you should now be comfortable performing inner joins in Snowflake SQL. Remember to practice with exercises and to always specify the ON clause to ensure that you are joining on the correct columns.