Education

Boston University
M.S. in Applied Data Analytics
University of Michigan
B.S. in Mathematics

About Me

Hello! Welcome to a showcase of my data science, machine learning and statistics projects.

My passion is using data to our advantage: optimizing, explaining the past, and predicting the future. I focus on statistical testing, machine learning, and exploratory data analysis. I am dedicated to problem-solving, drawing on creativity and expertise to navigate obstacles efficiently and precisely.

In my free time, I pursue the same interests, combining creativity with engineering through sewing and crocheting.

Skills

PYTHON
TENSORFLOW
SKLEARN
PYSPARK
PANDAS
NUMPY
PLOTLY
R
SQL
C++
GOOGLE CLOUD PLATFORM (GCP)
GOOGLE ANALYTICS
WEBFLOW
AIRTABLE
KUBERNETES
EXCEL

Projects

TED Talk Summarization
python
natural language processing
  • Leveraged Wav2Vec and tokenizers to convert speech to text with 89% character accuracy
  • Implemented various algorithms (e.g., BERT, KeyBERT, spaCy) to extract keywords and keyphrases
Brain MRI Tumor Classification
python
tensorflow
computer vision
  • Developed a Vision Transformer (ViT) model to detect tumor presence from brain MRI scans with 88.83% accuracy
  • Utilized overfitting prevention techniques and hyperparameter tuning for model optimization
Flight Analytics
python
spark
big data
machine learning
Google cloud platform
  • Built machine learning models (regression & classification) for flight delay prediction on 60M rows of data
  • Utilized PySpark to parallelize tasks and deployed the model on GCP with an 8.8 GB dataset
Uber vs. Lyft Prices
r
statistical testing
exploratory data analysis
  • Applied statistical hypothesis testing (t-tests, ANCOVA) to identify significant price differences
  • Translated statistical findings into easily understandable content with visualizations
Mental Health Importance Classification
python
machine learning
  • Employed 5 ML algorithms and 5 dimensionality reduction methods, creating 25 models to predict employees' views on the importance of mental health
  • Evaluated models using performance metrics, with the best model at an ROC AUC of 0.92
Image Resize
C++
computer vision
  • Authored a computer vision algorithm in C++ that detects and removes unimportant pixels in images and resizes them without noticeable distortion
  • Implemented a unit testing framework to detect bugs and memory leaks, ensuring seamless use of the program
Post Classification
C++
natural language processing
  • Composed a C++ program using NLP and ML techniques that reads blog posts and determines their subjects, allowing for efficient classification of past and future posts
  • Utilized binary search trees for the efficient storing and searching of elements, optimizing program run time
Chicago Traffic Accidents Analysis
r
statistical testing
exploratory data analysis
  • Performed data cleaning, transformation and analysis with the use of regression models in R to study the correlation of different variables with traffic accidents, determining leading causes
  • Worked with the tidyverse and ggplot packages for data analysis and data visualization, translating technical content to easily digestible visuals for a large audience
Salary Prediction
python
machine learning
statistical testing
  • Trained machine learning models (logistic regression, random forest, k-NN, Naïve Bayes) with optimized hyperparameters to predict salaries
  • Executed thorough data standardization and label encoding, with Chi-square test dimensionality reduction, for seamless prediction with higher accuracy

Experience

Builders Patch • FinTech Startup
Product Manager & Data Scientist
New York, NY
January 2024 - Present
  • Orchestrated end-to-end product lifecycles, bridging technical concepts between product and engineering
  • Drove feature prioritization through cross-functional collaboration and client engagement
  • Executed SQL queries for data collection, generating marketing data visualizations in Python
  • Implemented a rule-based algorithm to standardize data for comparative data analysis
Data Analyst
May - December 2023
  • Led full-cycle development of a public data dashboard, including exploratory data analysis and the automation of data aggregation and visualization in Python to highlight key insights
  • Initiated the weekly automation of data entry in Webflow CMS collections, streamlining site maintenance
  • Served as the go-to product expert, troubleshooting and resolving roadblocks
CIMB Niaga • Banking
Advanced Analytics Intern
Jakarta, Indonesia
June - July 2022
  • Scripted automation in R, saving 100 hrs/mo. of manual data cleaning effort
  • Performed SQL-based analysis, delivering actionable insights for data-driven decision-making
  • Communicated data and findings, leveraging visual reports to capitalize on marketing campaigns
CONCAT Inc. • Social Media Startup
Strategy Intern
Seoul, South Korea
May - August 2021
  • Conducted competitor analysis, presenting findings to CEO, resulting in successful seed funding
  • Improved UI/UX with informed insights based on current trends for target demographic
