Jeffrey Wu

Profile

About me

I am a recent graduate of Boston University and an aspiring data scientist. I double majored in statistics and computer science, giving me a strong theoretical and programming background for developing and evaluating machine learning and statistical models. I enjoy the journey of cleaning and exploring data to make sense of the unknown, especially when it comes to NLP and sports analysis.

Details

Name:
Jeffrey Wu
Age:
22 years
Location:
Vienna, VA, United States, Earth

Employment

BU Computer Science Department

Sept 2017 - Jan 2019

Grader
– Collaborated with 2 professors and 3 other graders to grade assignments for a class of 200+ students in a timely manner
– Formulated rubrics for exercises involving discrete mathematics and combinatorial structures

Boston, MA

BU Initiative for Literacy Development

Sept 2015 - Jan 2019

Tutor
– Collaborated with 4 peers to design and execute constructive reading and math activities and lessons in different classroom settings
– Promoted literacy and mathematics development in 3rd through 5th grade classrooms
– Supported instructors in assessing students’ reading and mathematics proficiency

Boston, MA | https://www.bu.edu/seo/students/build/

Education

Boston University

September 2015 - January 2019

BA in Computer Science and Statistics
Concentration in Statistics
Cum Laude
Cum. GPA: 3.64/4.0

Coursework

Computation

Intro to Algorithms
Data Mining
Intro to Databases
Theory of Computation
Concepts of Programming Languages
Intro to Artificial Intelligence
Advanced Database Applications
Computation Tools for Data Science
Machine Learning

Math/Statistics

Multivariate Calculus
Linear Algebra
Discrete Mathematics
Applied Statistics
Abstract Algebra
Linear Models
Probability
Analysis of Variance
Time Series and Forecasting
Mathematical Statistics

Boston, MA

Skills

Languages

Java (5+ years)
Python (3 years)
R (3 years)
SQL (2 years)
NoSQL (2 years)
SAS (2 years)

HTML(5) (Familiar)
CSS (Familiar)
Haskell (Familiar)
JavaScript (Familiar)
C/C++ (Familiar)

Libraries and Tools

Excel (5+ years)
JMP (4 years)
Scikit-learn (3 years)
Pandas (3 years)
Matplotlib/Seaborn (3 years)
NumPy (3 years)
SciPy (3 years)

Keras (2 years)
TensorFlow (2 years)
PyTorch (2 years)
AWS (Familiar)
Tableau (Familiar)
MapReduce (Familiar)
Hadoop/Pig (Familiar)

Models/Algorithms

Data Cleaning
Statistical Analysis
Regression
Data Visualization
Model Evaluation
Classification
Decision Trees
Random Forest
k-Means
Natural Language Processing

Clustering
Neural Networks
Naive Bayes
k-Nearest Neighbor
Support Vector Machines
Gaussian Mixture Models
Hierarchical Clustering
Boosting/Bagging

Projects

Brazilian Name Recognition Sept 2018 - Dec 2018

– Collaborated with a teammate to develop and train a decision forest and an LSTM neural network in Python using TensorFlow to classify whether a given full name is Brazilian
– Achieved an accuracy of 0.842 with the decision forest and 0.905 with the neural network on a test set of 12,000 names

Tools Used: Python, TensorFlow, Neural Networks, Decision Forest

Movie Recommendation System Feb 2018 – Mar 2018

– Extracted features from IMDb movie reviews using Python’s NLTK, Gensim, and Scikit-learn libraries
– Implemented the k-Means++ algorithm in Python to cluster IMDb movies by their summaries and review content, then recommended movies based on a user's previously liked movies

Tools Used: Python, NLTK, Gensim, Scikit-learn, k-Means++

Regression Modelling Bike Sharing Patterns in Boston Oct 2017 – Dec 2017

– Collaborated with a team of 4 to develop and evaluate a linear regression model in R for predicting daily bike rental counts using Boston bike share and weather data from 2011
– Identified outliers and significant predictors to optimize the model in R

Tools Used: R, Linear Regression

Predicting NBA Player Efficiency from College Data Mar 2017 – May 2017

– Developed a web scraper in Python to pull NBA and college basketball data from ESPN
– Conducted data preparation using Python and outlier detection using JMP
– Facilitated a team of 3 to develop and evaluate a linear regression model in JMP for predicting professional basketball efficiency from college team and individual statistics

Tools Used: Python, BeautifulSoup, JMP
