Data scientist with a decade of analytical experience — from real-time signals intelligence in the U.S. Air Force to building end-to-end ML pipelines today.
Houston, TX · LinkedIn · justin.ali.data@gmail.com
I'm a Springboard Data Science Fellow and a candidate for an MS in Management Information Systems with a Certificate in Data Analytics at Texas Southern University (May 2026). Before pivoting to data science, I spent six years as a U.S. Air Force Airborne Cryptologic Linguist, translating and analyzing classified intelligence in real time. That work taught me how to extract clear signal from messy, high-stakes data under tight deadlines, and I bring the same precision to the problems I work on now.
I'm most interested in roles that involve forecasting, classification, and turning ambiguous business questions into reproducible ML pipelines.
| Project | What it does | Stack |
|---|---|---|
| Flight Departure Delay Prediction | LightGBM classifier tuned with Bayesian optimization; improved AUC over the baseline while using less compute than grid search. | Python, LightGBM, scikit-optimize, stratified k-fold CV |
| Texas Electricity Demand Forecasting | Time-series forecasting of hourly electricity demand across ERCOT regions. | Python, Pandas, time-series ML |
| U.S. Lower 48 Energy Usage | End-to-end data wrangling and EDA across 48 states to surface regional consumption patterns. | Python, Pandas, NumPy, Matplotlib |
| Customer Segmentation (K-Means) | K-Means + PCA segmentation pipeline with silhouette scoring and stakeholder-ready profiles. | Python, scikit-learn, PCA |
| Olympic Athletes — Data Storytelling | 120 years of Olympic data turned into a narrative on gender representation and sport dominance. | Pandas, Matplotlib, Seaborn |
| Logistic Regression Pipeline | Reproducible binary classification workflow with GridSearchCV, ROC-AUC, and full methodology docs. | scikit-learn, GridSearchCV |

| Area | Skills |
|---|---|
| Languages | Python · SQL |
| ML | scikit-learn · LightGBM · Random Forest · Logistic Regression · K-Means · Bayesian Optimization |
| Data & stats | Pandas · NumPy · SciPy · feature engineering · hypothesis testing · time-series analysis |
| Viz | Matplotlib · Seaborn · Power BI · Tableau |
| Tools | Jupyter · Git/GitHub · GridSearchCV · PCA · cross-validation |

- MS, Management Information Systems + Certificate in Data Analytics — Texas Southern University (May 2026)
- Springboard Data Science Career Track — Fellow (May 2026)
- BS, Business Administration — Bellevue University
- U.S. Air Force — Airborne Cryptologic Linguist (2009–2015)
I'm currently looking for data scientist roles. The best way to reach me is by email at justin.ali.data@gmail.com or through LinkedIn.