Welcome to My Data Story!

Learn about how I got my start in data science!

SMU DS, Class of 2024

I am currently a Data Science graduate student at Southern Methodist University in Dallas, Texas. I am part of the class of 2024 and have met many diligent students who share my interests. Not only have I had the privilege of having these wonderful people as classmates, but some have also become very good friends of mine. Being a part of this cohort has been an eye-opener for me as I learn increasingly advanced statistical methods to apply to the data science field. During this time, I have learned how to create better models to gain insights and make predictions from data.

How did all this start?

As a History undergraduate, I always thought that I was going to become a professor specializing in American History. That was until I discovered the process of Exploratory Data Analysis in 2021, during the pandemic. I was intrigued by the process and by how much useful information I could obtain from it. I used what I had learned on YouTube to scrape data from various websites with the BeautifulSoup package in Python and support my thesis on radical attacks within the modern European Union. The data included migration patterns, poverty rate per year, unemployment levels, and more. By the time I graduated, my interest in data had become obsessive, which led me to sign up for courses at Udacity, where I learned the basics of data modeling, engineering, and dashboard creation.

As a Marketing Consultant at Stiddle, I gather, clean, and analyze data to create visualizations, with the goal of presenting actionable insights to the executive team. Moreover, I've had the opportunity to create an NLP pipeline that extracts, transforms, and loads data for analysis to determine the type of content to put out. Since then, I've only become more obsessed with the field, and I plan to focus on Natural Language Processing, more specifically Speech Recognition. Though I was a History major, I took many math classes as an undergraduate, such as Linear Algebra and Vector Calculus, and wanted to put this knowledge to good use. As a result, I decided to enroll in SMU's Data Science program, where I have deepened my understanding of statistics with concepts such as simple and multiple linear regression in SAS and R. As I continue to learn more, I plan on applying what I've learned to every aspect of my professional life.

AT&T cohort

Update: August 15th, 2023

As of August 15th, 2023, I’ve officially finished my internship at AT&T as a Data Scientist. During my time there, I met many amazing people from all walks of life and expanded my horizons and knowledge more than I could’ve ever imagined. Moreover, I made precious connections and some of the best friends anyone could ask for!

At AT&T, I had the pleasure of being mentored by Rajeev Garg, the Director of Data Insights, and Naveen Murthy, a Principal Tech Staff focused on Big Data. Although I struggled a lot during the project, I am grateful for the hardship because I learned so much about my chosen field, Natural Language Processing, not to mention the telecommunications industry. I learned about new tokenization and word embedding techniques and new models that I know I’ll use extensively in the future. Most importantly, I learned how to work with big data, how to put the business’s needs first, and how rapidly the field I’ve chosen evolves. There’s always something new to learn every day!

While it was an arduous journey, I wouldn’t trade it for anything because only through hardship can there be growth. I feel that I’ve grown a lot, not only as a data scientist but also as a person. I discovered many of my strengths along with my weaknesses. This was a summer of growth unlike anything I’ve experienced!

Thank you, AT&T! A special thank you to all my co-interns for making me feel welcome in Dallas!

Recent Projects

December 10, 2022

Employee Analysis & Predictions

In this project, I identify the top three factors that contribute to employee turnover, supported by evidence from the analysis, robust experimentation, and appropriate visualizations. The project also aims to build a model that accurately predicts employee salary.

December 10, 2022

Home Sale Price Predictions

Build a model that estimates how the SalePrice of a house is related to the square footage of its living area, and whether the SalePrice depends on which of 3 neighborhoods of interest the house is located in. Next, build the most predictive model, with as many predictors as necessary, for the sale prices of homes in all of Ames, Iowa.
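As a rough illustration of the first question, the sketch below fits a separate least-squares line of sale price on living area for each neighborhood. This is not the project's actual code: the function names, the neighborhood labels "A" and "B", and every number in toy_rows are made up for the example. Fitting one line per neighborhood gives the same intercepts and slopes as a single regression with a full neighborhood-by-area interaction, so comparing the fitted slopes is one simple way to see whether the price per square foot appears to differ by neighborhood.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_by_neighborhood(rows):
    """Group (neighborhood, living_area, sale_price) rows and fit one line per group."""
    groups = defaultdict(lambda: ([], []))
    for neighborhood, living_area, sale_price in rows:
        xs, ys = groups[neighborhood]
        xs.append(living_area)
        ys.append(sale_price)
    return {name: fit_line(xs, ys) for name, (xs, ys) in groups.items()}


# Hypothetical data for illustration only; not taken from the Ames dataset.
toy_rows = [
    ("A", 1000, 120000), ("A", 1500, 160000), ("A", 2000, 200000),
    ("B", 1000, 100000), ("B", 1500, 175000), ("B", 2000, 250000),
]

fits = fit_by_neighborhood(toy_rows)
for name, (intercept, slope) in sorted(fits.items()):
    print(f"{name}: intercept={intercept:.0f}, dollars per sq ft={slope:.1f}")
```

On real data, a formal comparison would also need standard errors and a test of the interaction terms, which this sketch leaves out.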

October 22, 2022

Beer Analysis project with R

The audience is the CEO and CFO of Budweiser (the client), who have had only one class in statistics and have indicated that the presentation cannot take more than 7 minutes of their time. They have hired me to address 9 questions.

Project Spotlight

Disaster Response Pipeline

Create an NLP pipeline to help people in need

Create a machine learning/NLP pipeline to categorize disaster events and build a model to classify messages that are sent during disasters.

By classifying these messages, we can route each one to the appropriate disaster relief agency. The dataset, provided by Figure Eight, is used to build a model that classifies disaster messages, while the web app is where a respondent can input a new message and get classification results in several categories.
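To make "classification results in several categories" concrete, here is a minimal multi-label sketch in plain Python. It is a stand-in and not the project's pipeline: it trains one small naive Bayes classifier per category, and the category names, the training messages, and all function and class names are hypothetical. None of the example messages come from the Figure Eight dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class BinaryNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing for one yes/no category."""

    def fit(self, docs, labels):
        self.counts = {True: Counter(), False: Counter()}
        self.n_docs = {True: 0, False: 0}
        for tokens, label in zip(docs, labels):
            self.counts[label].update(tokens)
            self.n_docs[label] += 1
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, tokens, label):
        total_tokens = sum(self.counts[label].values())
        total_docs = sum(self.n_docs.values())
        score = math.log((self.n_docs[label] + 1) / (total_docs + 2))
        for token in tokens:
            if token in self.vocab:  # ignore words never seen in training
                score += math.log(
                    (self.counts[label][token] + 1) / (total_tokens + len(self.vocab))
                )
        return score

    def predict(self, tokens):
        return self._log_score(tokens, True) > self._log_score(tokens, False)


def train_multilabel(examples, categories):
    """Train one binary classifier per category from (message, label set) pairs."""
    docs = [tokenize(message) for message, _ in examples]
    return {
        category: BinaryNaiveBayes().fit(
            docs, [category in labels for _, labels in examples]
        )
        for category in categories
    }


def classify(models, message):
    """Return every category whose classifier fires for this message."""
    tokens = tokenize(message)
    return [category for category, model in models.items() if model.predict(tokens)]


# Hypothetical categories and messages, for illustration only.
CATEGORIES = ["water", "food", "shelter"]
toy_examples = [
    ("we need clean drinking water", {"water"}),
    ("no water or food in the village", {"water", "food"}),
    ("people are hungry please send food", {"food"}),
    ("my house collapsed we need shelter", {"shelter"}),
    ("families need tents and shelter tonight", {"shelter"}),
    ("the road is blocked by a fallen tree", set()),
]

models = train_multilabel(toy_examples, CATEGORIES)
print(classify(models, "we need drinking water"))
```

Because each category gets its own yes/no decision, a single message can land in several categories at once, or in none, which is what allows it to be forwarded to more than one relief agency.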