
Technical Data Scientist specialising in building scalable data pipelines and machine learning models, with a strong focus on health-related analytics. Experienced in Python, SQL, model development, data validation, and cloud-based processing. Skilled in transforming raw multimodal datasets into reliable insights to support experimentation, evaluation, and product decision-making. Passionate about responsible AI, safety, reproducibility, and data-driven systems that improve user wellbeing and healthcare outcomes.
Languages and Frameworks: Python, SQL, pandas, NumPy, scikit-learn, PyTorch, TensorFlow
Machine Learning: Classification, NLP, embeddings, model optimisation, feature engineering
Pipelines and Data Engineering: ETL and ELT design, data cleaning, orchestration concepts, validation, monitoring
Cloud and Big Data: Hadoop, Spark, familiarity with Databricks, data lake and warehousing concepts
Visual Analytics: Power BI, Tableau, Matplotlib, Seaborn
Governance and Safety: GDPR, Collibra, data privacy, data quality frameworks, documentation
Version Control and Collaboration: GitHub, JIRA, Confluence, CI principles
Health NLP Prediction Model
Loan and Market Outcome Modelling
Airbnb Market Insights Dashboard
Animal Classification using Transfer Learning
Responsible AI, healthcare analytics, model safety, pipeline optimisation, continuous learning, volunteer work, reading, and football