Predicting Hospital Readmission Using Tidymodels

I wanted to share one of my project presentations from Statistical Learning II, the last class of my master’s program. For this project, I built a decision support tool that uses patient record data to predict the probability of hospital readmission. I built the tool in Shiny, and it features an XGBoost model that I constructed using the tidymodels collection of R packages. In the presentation below, I briefly discuss the steps I took while building the tool, including data pre-processing, feature engineering, and model building.

Highlights from rstudio::conf(2022)!

I had the pleasure of attending the 2022 RStudio conference last week in Washington, D.C. As always, it was filled with incredible workshops taught by highly esteemed instructors, fascinating presentations from R users all over the globe, and, of course, great company! [Photo caption: Reunited with my good friend and former professor, Dr. Shannon Pileggi.] I wanted to take a moment to reflect on some of the highlights from my experience at the conference:

By Noelle Pablo

August 1, 2022

Principal Components Analysis in R: College Sports

Last fall, I took a multivariate statistics course at the University of Kansas as part of the Applied Statistics, Analytics, and Data Science graduate program. One of the topics covered in the course was principal components analysis (PCA). PCA is commonly used when (1) a data set contains a large number of numerical variables, and (2) those variables are strongly correlated with each other. After conducting a PCA, the correlated variables can be replaced by a smaller number of uncorrelated variables, known as the principal components.
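
As a rough illustration of that idea (not the analysis from the post itself), here is a minimal sketch using base R's `prcomp()`. The built-in `mtcars` data and the four columns selected are stand-ins assumed purely for demonstration; the college sports data is not used here.

```r
# Illustrative sketch only: mtcars is a stand-in for the college sports data.
# Pick a few numeric variables that are strongly correlated with each other.
vars <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Inspect the pairwise correlations among the original variables.
print(round(cor(vars), 2))

# Run PCA on centered, standardized variables.
pca <- prcomp(vars, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each principal component.
print(summary(pca))

# The component scores are uncorrelated with each other.
scores <- pca$x
print(round(cor(scores), 2))
```

If the first one or two components account for most of the variance in `summary(pca)`, they can be carried forward in place of the original correlated variables.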

An Investigation of the Bechdel Test Using Logistic Regression

Last spring, I took a categorical data analysis course at the University of Kansas as part of the Applied Statistics, Analytics, and Data Science graduate program. For our final project, we were asked to find a data set, form a research question, and conduct a series of analyses to answer that research question. While looking for inspiration, I found a data set on the Tidy Tuesday weekly data project containing Bechdel test ratings for films released between 1970 and 2013. The data set was originally used in a FiveThirtyEight article titled “The Dollar-And-Cents Case Against Hollywood’s Exclusion of Women.”

By Noelle Pablo

January 30, 2022

Olympics Shiny App

During summer 2021, I took a data visualization and acquisition course at the University of Kansas as part of the Applied Statistics, Analytics, and Data Science graduate program. The final project of the course was to visualize real-world data interactively by creating an R Shiny app. While brainstorming ideas for my Shiny app, I found a historical Olympic Games data set from the Tidy Tuesday weekly data project, which is hosted in the R for Data Science GitHub repository.

By Noelle Pablo

November 21, 2021