Data Scientist with extensive experience in Python, R, SQL, and machine learning frameworks such as TensorFlow and PyTorch. Proven expertise in predictive modelling, NLP, deep learning, and statistical analysis. Adept at data cleaning, feature engineering, and building ETL pipelines. Skilled in visualisation tools including Power BI, Tableau, Matplotlib, Seaborn, and Plotly. Proficient in AWS services such as EC2, S3, and Lambda, as well as Google App Engine. Demonstrated ability to analyse performance data for injury prediction and player monitoring. Experienced in handling financial transactions and personalising services based on customer needs. Multilingual professional with strong IT support skills and product expertise.

Natural Language Processing (NLP) Project - Sequence Classification, University of Surrey
- Implemented sequence classification for abbreviation and long-form detection using the PLOD dataset, sourced from biomedical literature.
- Developed and fine-tuned BioBERT and bidirectional LSTM models for token classification, using the BIO (Begin, Inside, Outside) tagging format to label abbreviations (AC) and long forms (LF).
- Handled datasets of over 50,000 labelled tokens from Hugging Face's dataset repository, ensuring efficient model training and evaluation with no duplicated instances.
- Applied Named Entity Recognition (NER) techniques to biomedical text and classified sequence elements for abbreviation detection in scientific articles.

Cloud-Native API for Financial Risk Analysis, University of Surrey
- Designed and implemented a cloud-native system for risk assessment and profitability analysis of trading strategies using Monte Carlo methods.
- Developed the front end on Google App Engine (GAE) and the back-end logic on AWS Lambda and EC2 for scalable, parallel processing of trading signals in financial time series.
- Integrated Amazon S3 for persistent storage of past results, enabling easy retrieval and audit of risk and profitability calculations.
- Used OHLC data from Yahoo Finance, detecting trading patterns such as Three White Soldiers and Three Black Crows to assess market risk with confidence intervals.
- Exposed user-interactive API endpoints for initialisation, analysis, and real-time cost estimates based on cloud usage.
Tools: Python, AWS S3, AWS Lambda, Google App Engine, SQL, Power BI

Predictive Employee Attrition Models, University of Surrey
- Collected, cleaned, and preprocessed employee data in R, ensuring high-quality input for predictive modelling.
- Developed predictive employee attrition models using data analysis and machine learning techniques in R to help organisations identify potential employee turnover.
- Analysed employee data to uncover key factors contributing to attrition, such as performance, satisfaction, and work-life balance.
- Presented the findings in a comprehensive presentation covering the problem, methodology, and solutions; the results provided actionable insights for improving employee retention strategies.
- Translated complex datasets into actionable insights using Power BI and Tableau for technical and non-technical audiences.
Tools: R, Scikit-learn, Pandas, Power BI, Tableau

Protein Localization Prediction, University of Surrey
- Designed a predictive model for E. coli protein localization using features extracted from the UC Irvine Machine Learning Repository dataset.
- Applied classification techniques to identify protein locations within the cell, enhancing understanding of biological data modelling.
Tools: Microsoft Excel, Microsoft Visual Studio Code

FC Barcelona Performance Analysis: Pre & Post Messi Era
- Developed machine learning models using Random Forest Regressor to analyse FC Barcelona's performance before and after the Messi era.
- Predicted goal-scoring patterns from match statistics such as shots, corners, and fouls.
- Identified factors through which Messi's presence may have affected FC Barcelona's performance.
- Evaluated models using MAE, RMSE, and R² scores, revealing a drop in predictive accuracy after Messi's departure.
Tools: Jupyter Notebook, GitLab, Excel