Projects

Browse selected personal projects. Click a card to view a short summary, visuals, and links.

Computer Vision | Phenophase Image Analysis (ResNet-50 + GANs) | Dec 2023 | GitHub
Detect leaf phenophase from PhenoCam images and forecast start and end of season (SOS/EOS) across sites, with augmentation for rare phases.
  • ResNet-50 classifier; GAN-generated images to offset scarce data
  • Cross-site generalization beyond single-camera tuning
  • Calendar-date SOS/EOS estimates with confidence bands
Data Engineering | YouTube Data Pipeline with Apache Airflow | Oct 2023 | GitHub
Config-driven ETL from YouTube API to S3/Snowflake with Airflow orchestration.
  • Incremental loads, retries, schema checks, logs
  • Idempotent upserts; downstream content analytics
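A minimal sketch of what an idempotent upsert can look like, so that replaying a batch after a retry leaves the table unchanged. It uses Python's built-in sqlite3 as a local stand-in for Snowflake (where the same idea is usually written as a MERGE statement). The table name, column names, and sample rows are hypothetical and not taken from the project.

```python
import sqlite3


def upsert_videos(conn, rows):
    """Insert new videos or update existing ones, keyed on video_id.

    Replaying the same batch is a no-op, and an older snapshot never
    overwrites a newer one (hypothetical rule for this sketch).
    """
    conn.executemany(
        """
        INSERT INTO videos (video_id, title, view_count, fetched_at)
        VALUES (:video_id, :title, :view_count, :fetched_at)
        ON CONFLICT(video_id) DO UPDATE SET
            title = excluded.title,
            view_count = excluded.view_count,
            fetched_at = excluded.fetched_at
        WHERE excluded.fetched_at >= videos.fetched_at
        """,
        rows,
    )
    conn.commit()


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE videos ("
    "video_id TEXT PRIMARY KEY, title TEXT, view_count INTEGER, fetched_at TEXT)"
)
batch = [
    {"video_id": "a1", "title": "Intro", "view_count": 100, "fetched_at": "2023-10-01"},
    {"video_id": "b2", "title": "Demo", "view_count": 250, "fetched_at": "2023-10-01"},
]
upsert_videos(conn, batch)
upsert_videos(conn, batch)  # simulated retry: must not create duplicates
upsert_videos(
    conn,
    [{"video_id": "a1", "title": "Intro", "view_count": 180, "fetched_at": "2023-10-02"}],
)
rows = conn.execute(
    "SELECT video_id, view_count FROM videos ORDER BY video_id"
).fetchall()
```

Keying the write on a stable identifier is what makes retries safe: the second call with the same batch changes nothing, and the later snapshot only updates the row it refers to.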
Analytics | House Price Profiler on Snowflake | Oct 2023 | GitHub
60k+ listings scraped and standardized; modeled price drivers and sensitivity.
  • Flattening nested JSON made queries ~40% faster; all listings geocoded
  • Answered 11 key business questions
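As an illustration of the flattening step, here is a small sketch that turns a nested listing record into a single-level row suitable for a columnar table. The project may well have done this in SQL inside Snowflake instead; the function name, separator, and sample record below are assumptions made for the example.

```python
def flatten(record, parent_key="", sep="_"):
    """Recursively flatten nested dicts into one level of column-like keys."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Descend into nested objects, carrying the key path along.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical listing record, for illustration only.
listing = {
    "id": 7,
    "price": 350000,
    "address": {"city": "Tucson", "geo": {"lat": 32.22, "lon": -110.97}},
}
flat_listing = flatten(listing)
```

With every attribute in its own top-level column, queries can filter and aggregate without parsing the nested structure on each read.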
ML | Breast Cancer Prediction (LR vs. NN) | Oct 2023 | GitHub
Compare classical and deep approaches for early detection on tabular features.
  • Logistic regression: 92.9% accuracy; Keras neural network: 97.3% accuracy
  • SMOTE for imbalance; calibrated probabilities
DE & BI | Uber Data Analytics (GCS · Mage · BigQuery · Looker) | Aug 2023 | GitHub
End-to-end pipeline to BI with KPI queries returning in seconds.
  • Stakeholder dashboard for demand peaks & supply gaps
ML | Credit Card Fraud Detection | Aug 2023 | GitHub
Imbalanced classification with SMOTE and a model comparison (decision tree, logistic regression, random forest, naive Bayes).
  • Best model reached ~99% accuracy, a gain of ~10% after rebalancing
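The core idea behind SMOTE can be sketched in a few lines: each synthetic minority sample is a random interpolation between a real minority sample and one of its nearest minority neighbors. This is a simplified, standard-library illustration, not the project's code (which presumably used an existing library); the function name, parameters, and sample points are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbors of the base point within the minority class.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Hypothetical minority-class feature vectors, for illustration only.
fraud_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(fraud_points, n_new=6)
```

Oversampling of this kind is normally applied to the training split only, so that evaluation still reflects the original class balance.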
Regression | Salary Prediction (Gradient Descent) | Jul 2023 | GitHub
From baseline to tuned GD with strong MSE reduction and clear diagnostics.
  • MSE reduced from 91.2% to 6.3% with feature scaling and step-size tuning
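To show why scaling and step size matter, here is a minimal sketch of batch gradient descent for one-feature linear regression on a standardized input. The data, learning rate, and step count are made up for illustration and do not reproduce the project's figures.

```python
def standardize(xs):
    """Scale a feature to zero mean and unit variance."""
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
    return [(x - mean) / std for x in xs], mean, std


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, lr=0.1, steps=500):
    """Fit w and b by batch gradient descent on the MSE loss."""
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: years of experience vs. salary in thousands.
years = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
salary = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0]

x_scaled, x_mean, x_std = standardize(years)
baseline_mse = mse(x_scaled, salary, 0.0, 0.0)  # untrained model
w, b = gradient_descent(x_scaled, salary)
final_mse = mse(x_scaled, salary, w, b)
```

With the feature standardized, both parameters see the same curvature, so a single moderate step size converges quickly; on raw, unscaled features the same step size can oscillate or diverge.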

Research

Work in progress from the ARID Lab at the University of Arizona. Click to view a brief, non-confidential summary.

Causal Inference | Insurance at Birth & Infant Outcomes (EHR, multi-site) | Sep 2025
Causal/logistic modeling to assess payer-type effects on infant survival using multi-site EHR.
  • Disparities: uninsured/self-pay at highest risk; Medicaid associated with a ~10% survival gain; private insurance the strongest protection (~70%)
  • Reproducible pipelines for subgroup analyses and equity-focused reporting
Health Analytics | Healthcare Utilization & Guideline Adherence | Aug 2025
ML/statistical models on 50k+ records to evaluate compliance and long-horizon utilization patterns.
  • Identified 3 distinct utilization patterns + top 5 predictors of continuous care
  • Analyzed 10-year trajectories across pediatric/adult cohorts; equity-focused insights and scalable sequence models