Raihan Ahmed

Summary

I am a data scientist who is passionate about applying data science concepts and techniques to real-world problems, and who sees AI as a tool to aid people, not replace them. I offer a strong foundation in analytical thinking and problem solving. I am well versed in Python and SQL and have extensive knowledge of statistics and machine learning. As an end-to-end data scientist, I work from the ground up on all my projects, from the data build to the visualisation dashboards.

Overview

7 years of professional experience
5 years of post-secondary education
1 certification

Work history

Data scientist

Openreach
London
01.2020 - Current

Built a churn model to predict the probability of a customer leaving:


Technical aspect:

  • Liaised with subject matter experts to locate and create the data via SQL scripts in BigQuery, then carried out EDA to find insights and patterns in the data and to guide my feature engineering.
  • Cleaned and prepped the data for modelling by checking data types, removing nulls and outliers, etc., using pandas and NumPy.
  • Prototyped a stacked model combining tree models from the LightGBM, CatBoost and XGBoost packages with an MLP model from the Keras package to predict the probability of a customer leaving.
  • Built a dashboard to display predictions, with explainability provided by the LIME and SHAP packages.
  • Worked with data engineers to ensure the productionisation of the data build.
  • Deployed and monitored the model.


Business aspect:

  • Enabled the business to target customers with a high probability of leaving.


Built a segmentation model to create customer cohorts:


Technical aspect:

  • Built a customer-specific dataset by joining and filtering many large data sources.
  • Carried out EDA to find insights and patterns, which were presented to non-technical stakeholders.
  • Cleaned and prepped the data for modelling by checking data types, removing nulls and outliers, etc., using pandas and NumPy.
  • Created unsupervised clustering models for numerical data types (k-means) and for categorical data types (k-modes).
  • Fed the model output and visualisations into a dashboard for stakeholders to see.


Business aspect:

  • The resulting cohorts were passed on to the commercial team for more tailored marketing and logistics.


Built a time-series model to forecast the number of faults coming into the business:


Technical aspect:

  • Created an SQL script to build the data for the model.
  • Explored the data using Jupyter notebooks and tested multiple models such as ARIMA, Prophet and LSTM.
  • Built a dashboard using Dash, hosted in a Docker container, to display the forecasts to stakeholders.
  • Built a Python package and pipelines to deploy the project on GCP, where it runs in an automated fashion.


Business aspect:

  • Modernised and automated a business process for the commercial team, moving it from Excel sheets and statistical models to databases and machine learning models.
  • Forecasts are now published in a day as opposed to two weeks.
  • The forecast is used to determine the number of engineers to hire, so the more accurate forecast has led to less resource wastage.


Built a classification model to predict propensity to fault and further faults:


Technical aspect:

  • Created an SQL script pulling data in from multiple sources to create a dataset.
  • Ran exploratory analysis ranging from visualisations to statistical analysis.
  • Created a pipeline to clean the data, i.e. remove nulls, remove outliers and set data types.
  • Created a pipeline to engineer new features as a result of EDA.
  • Built various types of machine learning models, such as ensemble tree models and neural networks, as well as a stacked model which gave the best precision and recall.
  • Built a layer of explainability around the model, giving the business direction on how to act on the predictions.


Business aspect:

  • Saved the business 10 million a year in costs by avoiding further engineer visits.
  • Enabled the correctly skilled engineer to be sent out to the customer.
  • Customer faults are fixed more quickly and efficiently.


Set up data science practices within the business:


  • Creating a framework for Python ML projects for the team to use
  • Migrating local projects to Git
  • Creating wiki tutorials for the team
  • Presenting AI concepts to non-technical audiences

Junior data scientist

Teradata
London
09.2018 - 12.2019

Building a model to predict price elasticity for a given product for a multinational retail client:


Technical aspect:

• Liaising with industry experts to determine which features should be used in our analytical dataset

• Cleaning the dataset i.e. dealing with missing data, transforming features

• Harnessing linear models to predict price elasticity, writing production R code and building R packages

• Creating tests to see if outputs match business requirements of the client

• Documenting methods used and presenting to both technical and non-technical clients

• Creating the production environment using Docker, which runs R scripts and writes outputs to a database


Business aspect:

• Allows the company to determine, in real time, the price for maximum profit and demand


Building a topic modeller for categorising tweets from a customer service account:


Technical aspect:

• Ingesting live tweets

• Creating a dataset of tweets

• Cleaning tweets i.e. removing emojis and special characters

• Analysis of tweets including word count and time analysis

• Preparation for modelling, i.e. lemmatising and count vectorising

• Creating a Latent Dirichlet Allocation model


Business aspect:

• Allowed the business to see the topics of the most frequently occurring current issues

Education

Master of Science - Data Science and Analytics

Brunel University London
Uxbridge
10.2017 - 09.2018

Bachelor of Science - Actuarial Science

University of East Anglia
Norwich
09.2012 - 07.2016

Skills

  • Machine learning
  • SQL programming
  • Google Cloud Platform
  • Deep learning
  • Time series analysis
  • Docker
  • Clustering
  • Presenting
  • Critical analysis
  • Complex problem-solving

Certification

Professional Program for Data Science, Microsoft

Affiliations

  • Travelling
  • Table tennis
  • Tech

Timeline

Data scientist

Openreach
01.2020 - Current

Junior data scientist

Teradata
09.2018 - 12.2019

Master of Science - Data Science and Analytics

Brunel University London
10.2017 - 09.2018

Bachelor of Science - Actuarial Science

University of East Anglia
09.2012 - 07.2016