Jeffrey Wu


Interactive Resume

Profile

About me

I am a recent graduate of Boston University and an aspiring data scientist. I double majored in statistics and computer science, giving me a strong theoretical and programming background for developing and evaluating machine learning and statistical models. I enjoy the journey of cleaning and exploring data to make sense of the unknown, especially when it comes to NLP and sports analysis.


Details

Name:
Jeffrey Wu
Age:
22 years
Location:
Vienna, VA, United States, Earth

Employment

BU Computer Science Department

Sept 2017 - Jan 2019

Grader
– Collaborated with 2 professors and 3 other graders to grade assignments for a class of 200+ students in a timely manner
– Formulated rubrics for exercises involving discrete mathematics and combinatorial structures
Boston, MA

BU Initiative for Literacy Development

Sept 2015 - Jan 2019

Tutor
– Collaborated with 4 peers to design and execute constructive reading and math activities and lessons in different classroom settings
– Promoted literacy and mathematics development in 3rd through 5th grade classrooms
– Supported instructors in assessing students’ reading and mathematics proficiency
Boston, MA | https://www.bu.edu/seo/students/build/

Education

Boston University

September 2015 - January 2019

BA in Computer Science and Statistics
Concentration in Statistics
Cum Laude
Cum. GPA: 3.64/4.0

Coursework

Computation
  • Intro to Algorithms
  • Data Mining
  • Intro to Databases
  • Theory of Computation
  • Concepts of Programming Languages
  • Intro to Artificial Intelligence
  • Advanced Database Applications
  • Computational Tools for Data Science
  • Machine Learning
Math/Statistics
  • Multivariate Calculus
  • Linear Algebra
  • Discrete Mathematics
  • Applied Statistics
  • Abstract Algebra
  • Linear Models
  • Probability
  • Analysis of Variance
  • Time Series and Forecasting
  • Mathematical Statistics
Boston, MA

Skills


Languages

  • Java (5+ years)
  • Python (3 years)
  • R (3 years)
  • SQL (2 years)
  • NoSQL (2 years)
  • SAS (2 years)
  • HTML(5) (Familiar)
  • CSS (Familiar)
  • Haskell (Familiar)
  • JavaScript (Familiar)
  • C/C++ (Familiar)

Libraries and Tools

  • Excel (5+ years)
  • JMP (4 years)
  • Scikit-learn (3 years)
  • Pandas (3 years)
  • Matplotlib/Seaborn (3 years)
  • NumPy (3 years)
  • SciPy (3 years)
  • Keras (2 years)
  • TensorFlow (2 years)
  • PyTorch (2 years)
  • AWS (Familiar)
  • Tableau (Familiar)
  • MapReduce (Familiar)
  • Hadoop/Pig (Familiar)

Models/Algorithms

  • Data Cleaning
  • Statistical Analysis
  • Regression
  • Data Visualization
  • Model Evaluation
  • Classification
  • Decision Trees
  • Random Forest
  • k-Means
  • Natural Language Processing
  • Clustering
  • Neural Networks
  • Naive Bayes
  • k-Nearest Neighbor
  • Support Vector Machines
  • Gaussian Mixture Models
  • Hierarchical Clustering
  • Boosting/Bagging

Projects

Brazilian Name Recognition Sept 2018 – Dec 2018

– Collaborated with a teammate to develop and train a decision forest and an LSTM neural network in Python using TensorFlow to classify whether a given full name is Brazilian
– Achieved an accuracy of 0.842 with the decision forest and 0.905 with the neural network on a test set of 12,000 names

Tools Used: Python, TensorFlow, Neural Networks, Decision Forest

Movie Recommendation System Feb 2018 – Mar 2018

– Extracted features from IMDb movie reviews using Python’s NLTK, Gensim, and Scikit-learn libraries
– Implemented the k-Means++ algorithm in Python to cluster IMDb movies by their summaries and review content, then recommended movies based on those a user previously liked

Tools Used: Python, NLTK, Gensim, Scikit-learn, k-Means++

Regression Modelling Bike Sharing Patterns in Boston Oct 2017 – Dec 2017

– Collaborated with a team of 4 to develop and evaluate a linear regression model in R for predicting daily bike rental counts from 2011 Boston bike share and weather data
– Identified outliers and significant predictors to optimize the model in R

Tools Used: R, Linear Regression

Predicting NBA Player Efficiency from College Data Mar 2017 – May 2017

– Developed a web scraper in Python to pull NBA and college basketball data from ESPN
– Conducted data preparation using Python and outlier detection using JMP
– Facilitated a team of 3 to develop and evaluate a linear regression model in JMP for predicting professional basketball efficiency from college team and individual statistics

Tools Used: Python, BeautifulSoup, JMP