JustinAliData/README.md

Hi, I'm Justin Ali

Data scientist with a decade of analytical experience — from real-time signals intelligence in the U.S. Air Force to building end-to-end ML pipelines today.

Houston, TX · LinkedIn · justin.ali.data@gmail.com


About

I'm a Springboard Data Science Fellow and a candidate for an MS in Management Information Systems with a Certificate in Data Analytics at Texas Southern University (May 2026). Before pivoting to data science, I spent six years as a U.S. Air Force Airborne Cryptologic Linguist, translating and analyzing classified intelligence in real time. That work taught me how to extract clear signal from messy, high-stakes data under tight deadlines, and I bring the same precision to the problems I work on now.

I'm most interested in roles that involve forecasting, classification, and turning ambiguous business questions into reproducible ML pipelines.

Featured projects

| Project | What it does | Stack |
| --- | --- | --- |
| Flight Departure Delay Prediction | LightGBM classifier tuned with Bayesian optimization; improved AUC over baseline while cutting compute vs. grid search. | Python, LightGBM, scikit-optimize, stratified k-fold CV |
| Texas Electricity Demand Forecasting | Time-series forecasting of hourly electricity demand across ERCOT regions. | Python, Pandas, time-series ML |
| U.S. Lower 48 Energy Usage | End-to-end data wrangling and EDA across 48 states to surface regional consumption patterns. | Python, Pandas, NumPy, Matplotlib |
| Customer Segmentation (K-Means) | K-Means + PCA segmentation pipeline with silhouette scoring and stakeholder-ready profiles. | Python, scikit-learn, PCA |
| Olympic Athletes — Data Storytelling | 120 years of Olympic data turned into a narrative on gender representation and sport dominance. | Pandas, Matplotlib, Seaborn |
| Logistic Regression Pipeline | Reproducible binary classification workflow with GridSearchCV, ROC-AUC, and full methodology docs. | scikit-learn, GridSearchCV |

Tech stack

  • Languages: Python · SQL
  • ML: scikit-learn · LightGBM · Random Forest · Logistic Regression · K-Means · Bayesian Optimization
  • Data & stats: Pandas · NumPy · SciPy · feature engineering · hypothesis testing · time-series analysis
  • Viz: Matplotlib · Seaborn · Power BI · Tableau
  • Tools: Jupyter · Git/GitHub · GridSearchCV · PCA · cross-validation

Background

  • MS, Management Information Systems + Certificate in Data Analytics — Texas Southern University (May 2026)
  • Springboard Data Science Career Track — Fellow (May 2026)
  • BS, Business Administration — Bellevue University
  • U.S. Air Force — Airborne Cryptologic Linguist (2009–2015)

Get in touch

I'm currently looking for data scientist roles. The best way to reach me is by email at justin.ali.data@gmail.com or via LinkedIn.

Popular repositories

  1. uk-house-price-eda Public

    UK House Price Index (Average price) — London Datastore

    Jupyter Notebook

  2. DataScienceGuidedCapstone Public archive

    Forked from springboard-curriculum/DataScienceGuidedCapstone

    Guided Capstone with Springboard

    Jupyter Notebook

  3. capstone-two-electricity-demand-texas Public

    Time-series forecasting of hourly electricity demand in ERCOT regions.

  4. weather-api-data-wrangling Public

    API data wrangling project using Open-Meteo

    Jupyter Notebook

  5. us-energy-usage-eda Public

    End-to-end data wrangling and EDA of energy consumption across the U.S. lower 48 states.

    Jupyter Notebook

  6. country-club-case-study Public

    Using SQL/Python on a country club database

    Jupyter Notebook