Understanding SQL and Relational Databases — Part 1

A brief overview of SQL

SQL (pronounced “sequel”), which is an acronym for “Structured Query Language”, was created by IBM in the 1970s and remains very popular today. There is a range of versions of SQL, including MySQL, SQLite, and PostgreSQL. Each of these versions has unique qualities, such as specific functions or keywords for a certain task. However, they all have the same essential structures, including keywords like SELECT to query databases and the same general database design. Connecting to and querying SQL databases will be covered in greater detail in Part 2 of this series.

SQL databases are an assortment of tables and are the root container for information. Each table is like a .csv file or a spreadsheet inside an Excel workbook. The tables are two-dimensional items with specific columns and any related entries are composed into rows. Below is an example diagram of a database structure. This diagram will be the “database” used in all 3 parts of this series.

Image for post
Image for post

In the diagram, every rectangle is a table, with the table’s name recorded at the top. For this database, we have seven tables: publishers, titles, employee, pub_info, discounts, stores, and sales. Underneath every one of the table names we have a rundown of the different column names related with that table. For instance, the publishers table has five columns: pub_id, pub_name, city, state, and country.

SQL is intended to work with relational data. Relational data is just bits of data that are identified with one another. In the diagram above, information in the publishers table has some relationship to information in the titles table, showing that a specific publisher might be related with a particular title. Similarly, certain discounts are related with specific stores.

You may note that a few of these column names have a key-like image before them. This shows this is the primary key for that table. A primary key is a one-of-a-kind identifier for a table. In other words, there must be unique values for this column entry. For example, city would not be the best decision for a primary key because it’s possible for multiple cities to have the same name.

If you look carefully, you’ll see that the columns that are the primary key for one table can also show up on different tables. This is called a foreign key, or otherwise known as the primary key for an alternate (or “foreign”) table. This is the core idea of how data in various tables are related in a relational database. In the event that you were given a particular pub_id and afterward were provided a list of data from pub_info that incorporated the pub_id, you could figure out which logo a publisher has by coordinating the primary and foreign keys together.

Thank you for taking the time to read this post and I hope that it was helpful & informative! If you’re looking for more information on SQL, specifically on how to connect to and query SQL databases, check out part 2.

Aspiring Data Scientist — Recent Graduate of Flatiron School’s Online Data Science Bootcamp

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store