Physicist turned data scientist, focused on applying machine learning to problems in health, medicine, and nutrition.
I'm drawn to projects where the analysis has real stakes: understanding disease, improving diagnostics, or uncovering how lifestyle affects biology. Currently seeking junior data science roles.
- MRI Brain Tumor Classification - Deep learning pipeline for classifying brain MRI scans into four tumor categories using ResNet18. Emphasis on minimizing missed diagnoses through class-weighted training, early stopping, and threshold-optimized inference.
- Healthcare Data Warehouse - Built a medallion architecture (bronze → silver → gold) data warehouse on PostgreSQL using CMS Medicare synthetic claims data (DE-SynPUF, 2.3M beneficiaries, 11M+ claims). Bronze layer loads raw CSVs; silver layer implements type casting, deduplication, and code decoding; gold layer focuses on dimensional modeling and analytics. Demonstrates SQL proficiency, data quality practices, and ETL pipeline design.
- American Gut Project Microbiome Analysis - Analyzing the effects of coffee consumption on gut microbiome composition. Combines biology domain knowledge with statistical analysis.
- Oura Menstrual Cycle Phase Detection - Analyzing personal data collected from the Oura ring to detect ovulation from temperature readings and to flag luteal phase lengths that are irregular relative to population statistics. Combines reproductive health knowledge with statistical analysis.
Languages: Python, SQL, C++
ML/DL: PyTorch, TensorFlow, Scikit-learn
Data: Pandas, NumPy, Matplotlib, Seaborn
Tools: Git, Jupyter, Docker, AWS
- CNNs Cats and Dogs - Binary image classification with CNNs
- Modern Data Warehouse & Analytics - SQL data warehouse design and analytics pipeline
- Mitigating Costs And Preventing Casualties From Global Terrorism - Tableau dashboard to visualize trends in terrorism

