Understanding SQL and Relational Databases — Part 3

How to utilize filtering, ordering, and grouping features when using SQL.

Introduction

To refresh our memory, the word“query” alludes to any SQL statement that recovers information from your database. This was discussed in Part 2, where a simple query was composed utilizing a SELECT statement. You know how to recover single units of information, or rows, with a query like the one below, which will choose every row from the titles table. But, before we do that, we must make sure to connect to our database.

import sqlite3 
conn = sqlite3.connect('pubs_data.db')
cur = conn.cursor()
cur.execute("""SELECT * FROM titles;""").fetchall()
https://www.google.com/url?sa=i&url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fsql%2Fssms%2Fvisual-db-tools%2Fdesign-database-diagrams-visual-database-tools&psig=AOvVaw1vSvc7oA0_z716qiIlgK7p&ust=1592517304324000&source=images&cd=vfe&ved=0CAIQjRxqFwoTCPCmyoCFiuoCFQAAAAAdAAAAABAD

Consider the possibility that you needed to choose the most expensive Sci-Fi book. Or all of the books that are Sci-Fi. Or all the Sci-Fi books that have the word ‘Star’ in it. Data storage wouldn’t be very valuable if you couldn’t see, examine, and control that specific information. Fortunately, SQL is a really useful asset for doing just that. In this post, we’ll go through executing a bunch of common and useful SQL clauses.

Using the WHERE Clause

The WHERE clause filters query results using some sort of condition. You can use WHERE to filter by one or multiple conditions. Examples are below:

To select a specific title:

cur.execute("""SELECT * FROM titles 
WHERE title = 'Zed';""").fetchall()

To select multiple titles:

cur.execute("""SELECT * FROM titles 
WHERE title = 'Zed' OR title = 'We';""").fetchall()

Using the ORDER BY and LIMIT Clauses

Two other clauses that you can use to refine your searches are the ORDER BY and LIMIT clauses. ORDER BY permits you to sort the outcomes by a specific condition. For instance, you could sort by the title column if you wanted to get your search in alphabetical order.

*Note* ORDER BY is set to ascending, or ASC, by default. If you needed to sort by the inverse direction, use the keyword DESC in conjunction with ORDER BY.

The LIMIT clause is commonly the last argument in a SQL query and basically confines the output to a set number of results. LIMIT was used in the code in Part 2 of this series.

Simple use of ORDER BY:

cur.execute("""SELECT * FROM titles ORDER BY pubdate;""").fetchall()

Using ORDER BY and LIMIT together:

cur.execute("""SELECT * FROM titles 
ORDER BY pubdate DESC LIMIT 1;""").fetchone()

*Note* If you would just like the first result (or only result, if applicable) in your query, you can utilize the fetchone() method. Rather than bringing a list of all the results, it will return the first/only output. You can utilize this with or without using LIMIT 1.

Detailed example of selecting specific columns:

cur.execute("""SELECT title, type, price
FROM titles
WHERE (type = 'Sci-Fi' OR type = 'Fantasy') AND (price >= 20.00)
ORDER BY price DESC
LIMIT 15;""")

Other Modifiers

BETWEEN

As we’ve been seeing, having the option to sort and select certain information is very important. Now, let’s suppose you need to choose all of the books that were printed between 2017 and 2019. To make this query, you can use BETWEEN.

cur.execute("""SELECT title FROM titles 
WHERE pubdate BETWEEN ('2017') AND
('2019');""").fetchall()

Aggregate Functions

SQL also has what are called aggregate functions. These are SQL statements that can do math stuff, such as getting the mean of a column’s values, recovering the smallest and biggest values from a column, adding values in a column, or counting various records that meet certain conditions. Below is a code example to find the lowest price book:

cur.execute("""SELECT MIN(price) FROM titles;""").fetchall()

Also, one of the most convenient aggregate functions that should be noted is GROUP BY. As you can probably assume, GROUP BY organizes your results by a given column. For example, let’s say you wanted to pull all the Sci-Fi books from your dataset that were published in 2019, but want to group them by price. It would look like this:

cur.execute("""SELECT title, type, price, pubdate FROM titles 
WHERE (type = 'Sci-Fi') AND (pubdate = '2019')
GROUP BY price;""").fetchall()

Conclusion

SQL is a very common and very in-demand tool for data scientists. It can be intimidating at first — I know it was for me as a beginning data science student. But like with any other skill, you will get better with practice. So keep at it! I hope these blog posts were helpful, and thank you for taking the time to read them.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store