Olivia Lund's Data Science Portfolio Project

I'm a computer science graduate from California State University, Chico.

This site hosts the deliverables for the main project of CSCI 385 - Data Science, taught by Dr. Kevin Buffardi, which I took in the fall of 2019.

The portfolio project explores data sets connected by a common subject or theme. It applies the theory and principles of data science, implemented with the statistical methods available in the R programming language.

Project Deliverables

Part 1: Organizing and Exploring

This section of the project focuses on identifying a large and varied dataset that can support interesting analyses, and on cleaning it up using the principles of tidy data.

For this section of the project, I chose a dataset of daily meteorological reports from New Delhi, India, spanning 1996 to 2014.
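As a rough illustration of the kind of tidy-data cleanup described above, here is a minimal base R sketch. It is not the project's code: the data frame, the column names (`date_raw`, `temp_c`, `humidity`), and the missing-value markers are all invented for the example.

```r
# Hypothetical raw daily readings; names and values are invented for illustration.
raw <- data.frame(
  date_raw = c("19961101", "19961102", "19961103"),
  temp_c   = c("30", "-9999", "28"),
  humidity = c("27", "32", "N/A"),
  stringsAsFactors = FALSE
)

tidy <- raw

# Treat sentinel strings as missing before converting each column to numeric.
for (col in c("temp_c", "humidity")) {
  values <- tidy[[col]]
  values[values %in% c("-9999", "N/A", "")] <- NA
  tidy[[col]] <- as.numeric(values)
}

# Parse the date string into a Date so each row is one dated observation.
tidy$date <- as.Date(tidy$date_raw, format = "%Y%m%d")
tidy$date_raw <- NULL
```

The result has one row per day and one typed column per variable, which is the shape that later modeling steps generally expect.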

Part 1 code and report

Part 2: Analyzing and Predicting

This section of the project focuses on predictive models. It uses an R library to train a machine learning model on a portion of the data, then assesses how accurately the model predicts the values in the rest of the dataset. It also brings in a second dataset to provide a more robust understanding of the subject.
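The train-then-assess workflow can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data is synthetic, the 80/20 split ratio is arbitrary, and a plain linear model from `lm` stands in for whichever library and model the project actually used.

```r
set.seed(385)  # fixed seed so the random split is reproducible

# Synthetic stand-in data; this is not the project's dataset.
n <- 200
weather <- data.frame(humidity = runif(n, 20, 90))
weather$temp_c <- 40 - 0.2 * weather$humidity + rnorm(n, sd = 2)

# Hold out 20% of the rows for testing and train on the remaining 80%.
test_idx <- sample(seq_len(n), size = floor(0.2 * n))
train <- weather[-test_idx, , drop = FALSE]
test  <- weather[test_idx, , drop = FALSE]

# Fit on the training rows only, then predict the held-out rows.
model <- lm(temp_c ~ humidity, data = train)
predicted <- predict(model, newdata = test)

# Root mean squared error on rows the model never saw during fitting.
rmse <- sqrt(mean((test$temp_c - predicted)^2))
```

Because the error is measured on rows excluded from fitting, `rmse` gives a fairer estimate of predictive accuracy than the error on the training rows would.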

For this section of the project, I added data from a spreadsheet of daily New Delhi air quality reports from the year 2001. I combined this with the prior model to attempt to predict air quality in addition to weather patterns.
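One common way to line up two daily datasets is to join them on their date column. The sketch below assumes both tables carry a `date` column; the `temp_c` and `pm10` columns and all values are hypothetical and are not taken from the project's data.

```r
# Hypothetical daily weather and air quality tables; values are invented.
weather <- data.frame(
  date   = as.Date(c("2001-01-01", "2001-01-02", "2001-01-03")),
  temp_c = c(14, 15, 13)
)
air <- data.frame(
  date = as.Date(c("2001-01-02", "2001-01-03", "2001-01-04")),
  pm10 = c(210, 185, 240)
)

# Inner join: keep only the days that appear in both tables.
combined <- merge(weather, air, by = "date")
```

An inner join drops days missing from either source. Passing `all.x = TRUE` to `merge` would instead keep every weather row and fill the unmatched air quality values with `NA`.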

Part 2 code and report

Part 3: Results and Operationalization

This section of the project emphasizes reflecting on the experience of making the project and exploring its potential for positive social impact. It also involves revising and correcting the work from the previous parts based on peer and instructor feedback.

In this section of the project, I performed a cursory analysis of the project's effectiveness and explored what operationalizing it would look like and what effects that could have.

Part 3 code and report