top of page
Search
Writer's pictureJerry Zhu

SQL CSE 50 Reference Sheet

Updated: Feb 18, 2021

I saw that many people struggled to learn SQL in my CSE 50 class, myself included. As it is a business-oriented class, it can be a little overwhelming getting used to two languages and a new IDE (Python, SQL, and Jupyter Notebooks) without any experience. As a Computer Science student, I put this resource together to hopefully help organize the main commands that will be used in SQL with SQLalchemy. I have tried to simplify the sheet to be as simple as possible to reference when coding in SQL, and I have also included conceptualized tables that will hopefully make it easier to visualize. If anyone reading has questions, you can always contact me and I will try my best to sort it out. The secret to learning and mastering a computer language is to just try it out yourself, so be sure to get jupyter lab running. Thanks for reading!


This reference sheet may be updated with more commands as time goes on.


Table of Contents:

Section 1: Commands to initialize SQL into Python

Section 2: Creating and Inputting Data

Section 3: Querying and Fetching Data

Section 4: JOIN commands


Section 1: Commands to initialize SQL into Python


!pip install ipython-sql

# Installs SQL into our Python environment

import sqlalchemy

# imports frameworks of SQL

engine = sqlalchemy.create_engine(‘sqlite:///new.db’)

# Creates SQL Engine

con = engine.raw_connection()

# establishes a connection to SQL

cursor = con.cursor()

# creates a cursor so we can edit and query the database

!pip install pandas

# installs pandas into our enviroment

import pandas as pd

# pandas is a python library for data handling, in SQL, often used to import CSV files



Section 2: Creating and Inputting Data


After inputting any data, make sure to run con.commit(), or else the table will not be updated!


cursor.execute(​"CREATE TABLE table_name (first_name TEXT, last_name TEXT);"​)

Creates a table with name “table_name”, with columns “first_name” and “last_name”

TEXT indicates the type of data each column holds respectively

Examples of possible types of data:

TEXT

BOOLEAN

INTEGER

Conceptually, this is what the table looks like after running the code:





The table is empty as we have not inputted any data yet


cursor.execute("INSERT INTO table_name VALUES ('Tony', 'Quill');")

Inserts values into the table titled, “table_name.” with VALUES of ‘Tony’ into the first column and ‘Quill’ into the second column. Make sure the data types are the same as expected to avoid any issues.


Conceptually, this is what our table would look like after this command:


for row in cursor.execute("SELECT * FROM table_name;"):

print(row)


These two lines of code will print out all of the data in the table “table_name.” SELECT * FROM names is the SQL command for fetching all of the rows in “table_name.” The for loop calls the print function for every row in the table. After running this code with the before displayed table, the output will be as follows:


(‘Tony’ , ’Quill’)


As there is only one row in the table, it only prints once.


data = pd.read_csv('spreadsheet.csv')

This line of code will import a CSV file names ‘spreadsheet.csv’. Make sure to run the line of code “import pandas as pd” as pandas is required to use this function. The variable data” now holds the imported data from ‘spreadsheet.csv’


data.to_sql('table_name', con, if_exists='append', index=False)

This line of code will create a new table called ‘flights’.

  • The first parameter is the name of the new table

  • The second parameter notes which connection we used, remember the line of code we had before

con = engine.raw_connection()

  • The third parameter tells what the line of code to do if there is a table that is already named ‘table_name’. In this case, it will just add onto the previous one.

  • The fourth parameter determines if another column will be added that stores the indices of each row


q = "DROP TABLE IF EXISTS table_name;"

cursor.execute(q)

These lines of code will “drop” aka delete the table if the table exists. As always, make sure to commit changes after they are made.


Section 3: Querying and Fetching Data

In this section, each explanation for lines of code will include an output, so you can conceptually see the function in action. All of these examples will be fetching from the following table:




for row in cursor.execute("SELECT * FROM table_name LIMIT 3;"):

print(row)


These two lines of code will print out all of the items in “table_name”, but the LIMIT keyword will make it so that only 3 values are printed. The output will be as follows:


(‘Tony’ , ’Quill’, 1000, ‘YouTube’)

(‘John’ , ’Smith’,17, ‘Instagram’)

(‘Frank’ , ‘Appleseed’, 31, ‘YouTube’)


for row in cursor.execute("PRAGMA table_info('table_name');"):

print(row)


These two lines of code will print out the name of each column in a data table, with corresponding data types (TEXT, BOOLEAN, INTEGER, etc.)


command = "SELECT First_Name, Platform FROM table_name;"

for row in cursor.execute(command):

print(row)


These lines of code will only print the columns “Depart_Time” “Origin” and “Aircraft,” from table “table_name”. As explained before, SELECT * FROM will select and print every column, as the “*” character denotes that it should print every column. By replacing this star with specific columns, it will print those instead. The output will be as follows:


(‘Tony’ , ‘YouTube’)

(‘John’ , ‘Instagram’)

(‘Frank’ , ‘YouTube’)

(‘Carolina’ , ‘YouTube’)

(‘Elise’ , ‘Instagram’)


q= "SELECT * FROM table_name ORDER BY Followers DESC;"

for row in cursor.execute(q):

print(row)


These lines of code will print out all columns in “table_name” ordered by column “duration” in descending order. To print in ascending order, use ASC instead of DESC. The output will be as follows:


(‘Carolina’ , ’Turner’, 7850, ‘YouTube’)

(‘Tony’ , ’Quill’, 1000, ‘YouTube’)

(‘Elise’ , ‘Mendez’, 357, ‘Instagram’)

(‘Frank’ , ‘Appleseed’, 31, ‘YouTube’)

(‘John’ , ’Smith’,17, ‘Instagram’)


q= """SELECT * FROM table_name WHERE Platform = 'Instagram’;"""

for row in cursor.execute(q):

print(row)


These lines of code will print out all of the values in which the column “Platform” has a value of ‘Instagram’. The output is as follows:


(‘John’ , ’Smith’,17, ‘Instagram’)

(‘Elise’ , ‘Mendez’, 357, ‘Instagram’)


q= "SELECT First_Name, Followers FROM table_name WHERE Platform = ‘YouTube’ ORDER BY Followers DESC LIMIT 2;"

for row in cursor.execute(q):

print(row)


Multiple fetching functions can be combined by stringing them together in the SQL command. In this case, the line of code will return the two columns “First_Name” and “Followers”, ordering by “Followers” in descending order, limiting their “Platform” to YouTube, and limiting the return to 2 items. This may be quite confusing, so it may be easier to take a look at the output:


(‘Carolina’ , 7850)

(‘Tony’ , 1000)



Section 4: JOIN commands

Join commands can combine multiple tables into one, optimizing retrieval from tables and queries. For this section, we will use the following two tables to help conceptualize what these commands look like:


Table Name: table_1


Table Name: table_2


q = "SELECT Last_name, Enrolled FROM table_1 LEFT JOIN table_2 ON table_1.ID = table_2.ID"

for row in cursor.execute(q):

print(row)


This command will return the Last Name and Enrollment status of people, despite the fact that the Enrollment status and Last_Name are stored in different tables. The way that the code does this is by using the ID number from each table to “connect” the tables using a Left Join. A left join takes all of the data from table_1 and the intersection it has with table_2. The output will be as follows:


(‘Quill’ , 1)

(‘Turner’ , 0)

(‘Mendez’ , 1)

('Smith', None)


Keep in mind that booleans will often return a value of 1 when true and a value of 0 when false, this is how the computer stores boolean values.


q = "SELECT Last_name, Enrolled FROM table_1 INNER JOIN table_2 ON table_1.ID = table_2.ID"

for row in cursor.execute(q):

print(row)

This command is very similar to the Left Join, however, instead of taking all elements from table_1 and the intersection with table_2, Inner Join will only return the intersection of both tables. In this scenario, this will lead to the Inner Join to be a true subset of Left Join. The output will be as follows:


(‘Quill’ , 1)

(‘Turner’ , 0)

(‘Mendez’ , 1)


q = "CREATE TABLE IF NOT EXISTS table_name (ID_Number INTEGER PRIMARY KEY, Last_name TEXT);"

cursor.execute(q)

con.commit()


This command creates a table with a primary key. Although this is not necessarily a join command, learning what a primary key does is useful when dealing with multiple tables. Ideally, every table should have a form of a primary key. A primary key should be a piece of data that is unique to each element in the array, such as an ID_Number. A Last_Name would be a bad key as multiple people can have the same last name. If you would like to use two columns to create a Primary Key, the syntax is as follows: PRIMARY KEY(column_1, column_2)




0 comments

Comments


bottom of page