SQL Projects

On this page, you'll find projects that I've completed using SQL!
Please click on a project title to view my code on GitHub.

Investigating Sakila DVD Rental Database

Barplot of most rented actors

In this project, I query the Sakila DVD Rental database, which holds information about a company that rents DVDs.

Project Information

My aim is to gain an understanding of the customer base and to answer the questions listed below:

  • Who were the top 10 most-rented actors of 2006?
  • How many rentals are missing from each category at the Woodridge store?
  • How much did the top 20 districts each spend?
  • For the 15 top-spending districts, by how much does each outperform the preceding district?

While the goal of this project is to investigate the database and create visuals answering the questions listed above, it is also an opportunity to showcase what I've learned as part of the Nanodegree program. Some skills I would like to draw attention to are my ability to join many tables, use window functions, write Common Table Expressions (CTEs), and perform calculations with the help of logical operators.

The querying phase was performed using PostgreSQL, and the resulting tables were saved as .csv files. The visuals were then created using Google Sheets and Slides.
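
As a rough illustration of the kind of query behind the last question above, the sketch below combines a CTE with the LAG window function. It is not the project's actual code: it runs on SQLite through Python's sqlite3 module rather than PostgreSQL, and the payment table, its columns, and the sample rows are hypothetical stand-ins rather than the real Sakila schema or results. It also assumes that "preceding district" means the next-lower spender.

```python
import sqlite3

# Hypothetical toy data (district, amount); NOT taken from the Sakila database.
payments = [
    ("Alberta", 50.0),
    ("Alberta", 30.0),
    ("Texas", 60.0),
    ("Texas", 15.0),
    ("QLD", 40.0),
]

# CTE + window function. Window functions need SQLite 3.25 or newer.
QUERY = """
WITH district_totals AS (              -- CTE: total spend per district
    SELECT district, SUM(amount) AS total_spent
    FROM payment
    GROUP BY district
)
SELECT district,
       total_spent,
       -- difference from the next-lower spender (NULL for the lowest one)
       total_spent - LAG(total_spent) OVER (ORDER BY total_spent) AS gap
FROM district_totals
ORDER BY total_spent DESC;
"""


def district_gaps(rows):
    """Load rows into an in-memory table and return (district, total, gap) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE payment (district TEXT, amount REAL)")
        conn.executemany("INSERT INTO payment VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


for row in district_gaps(payments):
    print(row)
```

In PostgreSQL the same CTE and LAG syntax should carry over, with the toy table replaced by joins across the relevant Sakila tables and a LIMIT to keep only the top districts.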

Recent Projects

December 10, 2022

Employee Analysis & Predictions

In this project, I identify the top three factors that contribute to employee turnover, supported by evidence from the analysis, robust experimentation, and appropriate visualization. I also want to build a model that accurately predicts employee salary.

December 10, 2022

Home Sale Price Predictions

Build a model that estimates how the SalePrice of a house is related to the square footage of its living area, and whether the SalePrice depends on which of 3 neighborhoods of interest the house is located in. Next, build the most predictive model, using as many variables as necessary, for the sale prices of homes in all of Ames, Iowa.

October 22, 2022

Beer Analysis project with R

Assume that the audience is the CEO and CFO of Budweiser (your client), that they have had only one class in statistics, and that they have indicated you cannot take more than 7 minutes of their time. They have hired you to address 9 questions/items.

Project Spotlight

Disaster Response Pipeline

Create an NLP pipeline to help people in need

Create a machine learning/NLP pipeline to categorize disaster events and build a model that classifies messages sent during disasters.

Once these messages are classified, they can be sent to the appropriate disaster relief agency. The dataset, provided by Figure Eight, is used to build a model that classifies disaster messages, while the web app lets a respondent enter a new message and get classification results across several categories.