Hello there!

I’m a data scientist focused on making messy data usable—clean pipelines, clear metrics, and results people can act on.

A recent meeting started with three dashboards and no agreement. I combined provider data, county context, and outcomes into one reproducible pipeline—and one honest chart. The debate ended, and the team aligned on the first fix. That’s my north star: reliable data, simple communication, practical action.
  • 500k+ multi-site EHR/Medicaid rows standardized into an OMOP-style model
  • ~60% reduction in prep time via streamlined ETL & automated geocoding
  • 25% improvement in clinical-text extraction accuracy with LLM-assisted QA

Education

University of Arizona
Master of Science, Data Science · Tucson, AZ · Dec 2023 · GPA: 4.0/4.0
Selected courses
  • Machine Learning & Predictive Modeling
  • Causal Inference & Experimental Design
  • Databases (SQL, BigQuery) & Data Engineering
  • Cloud & Workflow Orchestration (Airflow)
  • NLP for Clinical Text
  • Statistical Computing with R/Python
Focus
Reproducible pipelines (Python/R/SQL), privacy-aware analytics
Capstone
Clinical ETL → OMOP-style model, geocoding automation
Recognition
GPA 4.0/4.0
University of Engineering and Management
Bachelor of Technology, Electrical Engineering · Jaipur, India · May 2017 · GPA: 7.66/10 (magna cum laude)
Representative courses
  • Signals & Systems; Control Theory
  • Digital Logic & Microprocessors
  • Power Systems & Machines
  • Numerical Methods; Probability & Stats
  • Programming (C/C++), MATLAB
Theme
Systems thinking → transition to data science
Honors
Magna cum laude

Career Roadmap

Foundation: Systems thinking, control, and computation. This shaped the discipline I bring to messy data and complex pipelines today.
Engineering: Built enterprise ETL and data products at scale (Teradata/Informatica → Python/Unix). Learned reliability, SLAs, and clear handoffs.
Analysis: Formal training in ML, causal inference, and data engineering. Focused on reproducible Python/R/SQL workflows and privacy-aware analytics.
Research: Prototyped pipelines, experiments, and dashboards that made results more actionable for teams.
Impact: Standardized 500k+ EHR/Medicaid rows to an OMOP-style model, automated geocoding, and shipped dashboards that informed care decisions.


Thanks for visiting! Feel free to explore and connect.