Xiaohan (Aria) Wang

Data Scientist


About Me


Hi! I'm Aria.

I love traveling through the maze of overwhelming data and uncovering hidden patterns and relationships - yes, I'm a data scientist. I fell in love with data and analytics when I started my undergrad at UCLA as a Statistics major. It was love at first sight. I was fascinated by how data science helps solve problems and inform decisions. I'm fortunate to have found my passion early and to have been doing what I love ever since.

I'm also a vlogger. I take and share videos of the special moments I experience with my family and friends. I started recording vlogs in 2018, when I realized that I always feel nostalgia for the present before it becomes the past. Because of this "anticipatory nostalgia," I have a greater appreciation for what I have and what I love. I cherish every present moment, and I look forward to remembering it through videos, which let me look back and relive it.

It's great to meet you! Please feel free to check out my GitHub and YouTube pages. If you have any questions, or if you just want to say hi, email me at xiaohanwang2020@u.northwestern.edu.

Experience



Spiegel Research Center

Research Assistant

  • Built end-to-end machine learning solutions for predicting customer churn and win-back for WEHCO Media. Developed the data cleaning and feature engineering pipeline in PySpark. Achieved an AUC of 0.80 using logistic regression models in R
  • Extracted patterns in the reading engagement behaviors of WEHCO customers, and provided actionable insights to support WEHCO’s shift from an advertising-based strategy to reader-based revenue models
  • Evaluated the effectiveness of newsletters and pricing strategies in improving customer retention and lifetime value. Supported WEHCO’s efforts to increase customer stickiness by investigating the reliability of current engagement metrics
survival analysis, reader engagement, customer behavior, customer LTV, journalism, media, PySpark, R

HSBC Bank

Graduate Student Consultant

Northwestern MSiA industry practicum project

  • Created an open-source SQL data repository that can be leveraged for existing revenue analysis within HSBC
  • Detected market dynamics from geospatial data with Lasso and Random Forest models. Identified groups of branches with consolidation/closure potential using K-means clustering. Created an ArcGIS visualization dashboard
open-source data, predictive modeling, Lasso, Random Forest, k-means, Python, SQL, ArcGIS

Acumen, LLC

Statistical Programmer Intern

  • Improved the definitions of control and risk windows for the FDA’s real-time influenza surveillance of a rare syndrome by using SAS to analyze its presence in diagnoses from two decades of historical Medicare claims
  • Optimized the prediction of beneficiaries’ choices of pharmacy chains for prescription refills by identifying key demographic factors influencing beneficiaries’ decisions. Improved the classification accuracy by 17% by implementing Random Forest models
classification, machine learning, public health, FDA, R, SAS

Selected Projects


Mask On - Clinical/N95 Face Mask Detection

Coffee Bean Supplier Recommender

Predicting "Match" in a Speed Dating Experiment

Text-Based Emotion Classification

Network of Amazon Co-Purchased Products

Spotify Top Songs - Data Visualization with D3.js

Education


Northwestern University

M.S. in Analytics

Expected coursework: Predictive Analytics, Data Visualization, Data Mining, Databases, Text Analytics, Big Data Analytics (Hadoop & Spark), Analytics Value Chain (Data Science Pipeline), Deep Learning, Data Warehousing, Social Network Analysis

University of California, Los Angeles

B.S. in Statistics & B.S. in Mathematics/Economics

Relevant coursework: Statistical Programming, Regression Modeling, Time Series Analysis, Monte Carlo Methods, Machine Learning, Optimization, Stochastic Processes, Linear Algebra, Econometrics Theory
