
Ope Akinbola

London

Summary

Detail-oriented and thorough individual with strong problem-solving and critical-thinking skills. Committed to creating secure network architecture and developing solutions that limit access to protected data and programmes. Monitors computer virus reports and regularly updates virus protection systems.

Overview

8 years of professional experience
1 certification

Work history

Data Engineer

RICHO DIGITAL SERVICE
London
10.2022 - 08.2023
  • Migrated data and constructed ETL (extract, transform, load) workflows pulling information from on-premises sources, third-party APIs, and other platforms into a Synapse workspace, using Python, PySpark, and SQL
  • Created, built, and maintained scalable, automated, and informative data tables serving as the core input for models, reports, and dashboards
  • Designed best practices for seamless automation of data ingestion and data pipeline workflows in Azure Data Factory and Databricks to ensure continuous operation
  • Evaluated and maintained workflows and improved the effectiveness of data pipelines handling more than 50 terabytes of data daily
  • Engaged with business stakeholders to understand business needs and translate them into actionable reports
  • Key achievement: implemented automated ETL (extract, transform, load) procedures, simplifying data manipulation and cutting processing time by up to 50%
  • Enhanced existing ETL processes and SQL queries to optimize the weekly business report's performance.
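
The extract-transform-load pattern described above can be sketched as follows. The production pipelines used PySpark and SQL against a Synapse workspace; this is a minimal stdlib illustration of the same shape, with a mocked-up JSON source standing in for on-premises tables and APIs (all field names are hypothetical).

```python
import json

def extract(source: str) -> list:
    """Pull raw records from a (mocked) source system."""
    return json.loads(source)

def transform(records: list) -> list:
    """Normalise field types and drop incomplete rows."""
    return [
        {"id": r["id"], "amount": round(float(r["amount"]), 2)}
        for r in records
        if r.get("amount") is not None
    ]

def load(records: list, sink: list) -> int:
    """Append cleaned rows to the target table (a list stands in for Synapse)."""
    sink.extend(records)
    return len(records)

raw = '[{"id": 1, "amount": "10.5"}, {"id": 2, "amount": null}]'
table = []
loaded = load(transform(extract(raw)), table)
```

The row with a missing amount is filtered out during the transform step, so only one cleaned record reaches the sink.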

Data Engineer

Booking.com
London
05.2021 - 10.2022
  • Constructed a data pipeline that migrated and processed transactional data from an on-premises MySQL database into Azure using Synapse Analytics, incorporating 10 million rows of records and reducing manual workload by 30% monthly
  • Maintained data pipeline uptime of 97% while ingesting daily transactional data; evaluated workflows and increased the efficiency of data pipelines processing over 10 TB of data daily
  • Used Databricks notebooks to create tables, partitioned tables, join conditions, correlated subqueries, nested queries, and views for business application development
  • Utilized PySpark to distribute data processing across large datasets, improving ingestion and processing of data by 70%
  • Engaged with business stakeholders to understand business needs and translate them into actionable reports.

Data Engineer

Department of Health and Social Care
London
09.2020 - 04.2021
  • Created pipelines in Azure Data Factory using linked services, datasets, and pipelines to extract, transform, and load data from sources such as Blob Storage and Azure SQL Data Warehouse
  • Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming COVID-19 data to uncover insights
  • Designed an analytics dashboard in Power BI showing real-time updates and test modelling
  • The solution utilized Power BI, an enterprise gateway, and Azure SQL Data Warehouse
  • Used Databricks, Jupyter notebooks, and spark-shell to develop, test, and analyze Spark jobs before scheduling customized Spark jobs
  • Deployed and tested (CI/CD) developed code using Visual Studio Team Services
  • Worked in an agile development environment with two-week sprint cycles, dividing and organizing tasks
  • Used Azure DevOps as a SaaS-based code repository and for tracking agile/scrum workflows in application development.
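
Aggregating records from multiple file formats, as the Spark jobs above did for COVID-19 data, can be sketched with the stdlib alone. The real work ran in PySpark at scale; here CSV and JSON fragments with made-up regions and counts stand in for the source files.

```python
import csv
import io
import json
from collections import defaultdict

# Two "files" in different formats (contents are illustrative).
csv_src = io.StringIO("region,cases\nLondon,10\nLondon,5\n")
json_src = '[{"region": "Leeds", "cases": 7}]'

totals = defaultdict(int)
for row in csv.DictReader(csv_src):   # ingest the CSV source
    totals[row["region"]] += int(row["cases"])
for row in json.loads(json_src):      # ingest the JSON source
    totals[row["region"]] += row["cases"]
```

Both sources land in one aggregate keyed by region, the same shape a Spark `groupBy` would produce across mixed-format inputs.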

Data Engineer

Biffa
London
12.2018 - 09.2020
  • Maintained and created data flows using Data Factory, Stream Analytics, Data Lake, and HDInsight
  • Used Spark in HDInsight to process streaming data flows and stored the processed data in Data Lake
  • Created dashboards to analyze and view different relationships in the data using Microsoft Power BI
  • Improved and fixed bugs in existing pipelines and flows on Azure HDInsight
  • Built a data warehouse using Azure SQL Data Warehouse
  • Created and managed Kafka, Hadoop, and Spark clusters in HDInsight
  • Used Azure Databricks to aggregate data, create a data warehouse, and deploy work from notebooks into production
  • Created machine learning models using the scikit-learn library to predict waste volumes with random forest and linear regression
  • Used Python for data wrangling, e.g. pandas for merging, pivoting/spreading, and melting/gathering data into DataFrames
  • Performed data manipulation, analysis, and visualization using Python (pandas, matplotlib).
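
The pandas merging and melting named above can be sketched like this. The frame contents and column names are hypothetical stand-ins for the waste-volume data; the operations (`merge` to join on a key, `melt` to reshape wide quarterly columns into long format) are the ones the bullets describe.

```python
import pandas as pd

# Illustrative frames: site metadata and wide quarterly waste figures.
sites = pd.DataFrame({"site": ["A", "B"], "region": ["North", "South"]})
waste = pd.DataFrame({"site": ["A", "B"], "q1": [10, 20], "q2": [12, 18]})

merged = sites.merge(waste, on="site")            # join on the shared key
long = merged.melt(id_vars=["site", "region"],    # wide -> long reshape
                   value_vars=["q1", "q2"],
                   var_name="quarter", value_name="tonnes")
```

The long format, one row per site per quarter, is the shape a regression model or a matplotlib plot would consume directly.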

Data Engineer

DELIVEROO
London
07.2016 - 12.2018
  • Used AWS Kinesis to ingest data into Amazon S3
  • Created and maintained AWS data pipelines using Kinesis, EMR, and Amazon S3
  • Processed and analyzed stream and batch data using Spark on EMR
  • Processed unstructured and semi-structured data using EMR
  • Used AWS Glue to efficiently load and prepare data for analytics
  • Created dashboards for analyzing streaming data; configured the Kinesis Agent for Kinesis streams and Kinesis Firehose
  • Managed Amazon IAM roles and attached policies to roles according to requirements
  • Built a serverless architecture using AWS Lambda with Amazon S3 and Amazon Redshift.
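
The Lambda-with-S3 pattern above can be sketched as a plain handler function. The event below follows the standard shape of an S3 put notification; the bucket key and the idea of returning keys for a downstream Redshift load are illustrative assumptions, not details from the original project.

```python
import json

def handler(event, context=None):
    """Lambda-style handler: pull object keys out of an S3 event payload."""
    keys = [rec["s3"]["object"]["key"] for rec in event["Records"]]
    # In the real architecture these keys would drive a load into Redshift.
    return {"statusCode": 200, "body": json.dumps({"keys": keys})}

# Simulate the S3 put notification Lambda would receive.
event = {"Records": [{"s3": {"object": {"key": "orders/2018-01-01.csv"}}}]}
result = handler(event)
```

Keeping the handler a pure function of the event makes it testable locally without any AWS infrastructure.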

Hadoop developer

Pairview
London
04.2015 - 07.2016
  • Developed a data ingestion system using Spark Streaming and Kafka
  • Created a Kafka cluster on Confluent Cloud and produced data to Kafka topics on the cluster
  • Created Spark applications in Scala for batch processing of data and deployed them on a Cloudera cluster
  • Loaded data from databases into Hive, HBase, and HDFS using Sqoop
  • Developed Oozie workflows to analyze and solve the client's big data problems
  • Developed MapReduce applications using Hive and Pig to solve the client's big data problems
  • Managed databases including RDBMSs such as SQL Server and MySQL, and NoSQL databases such as Cassandra and MongoDB
  • Loaded semi-structured data into Hive and created Hive tables using HiveQL.
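
Hive and Pig queries like those above compile down to MapReduce jobs; the map-shuffle-reduce shape can be sketched on a toy word count (the input text is made up, and real jobs distribute these phases across a cluster).

```python
from collections import Counter
from itertools import chain

def mapper(line):
    """Map phase: emit (word, 1) pairs for each word in a line."""
    return [(word, 1) for word in line.split()]

def reducer(pairs):
    """Reduce phase: sum the counts for each key."""
    out = Counter()
    for word, n in pairs:
        out[word] += n
    return out

lines = ["big data big", "data pipeline"]
# chain.from_iterable plays the role of the shuffle, feeding all mapper
# output to the reducer.
counts = reducer(chain.from_iterable(mapper(l) for l in lines))
```

The same per-key aggregation is what a HiveQL `GROUP BY` expresses declaratively.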

Education

BSc (Honours) Computing Science

Staffordshire University

BTEC National Diploma, Applied Science

Lambeth college

Skills

  • Spark
  • PySpark
  • Scala
  • Python
  • SQL
  • Hadoop
  • Hive
  • MapReduce
  • Pig
  • Sqoop
  • Cloudera
  • HDP
  • S3
  • EMR
  • EC2
  • Kinesis
  • Elastic Search
  • HDInsight
  • Data Lake
  • Databricks
  • Cloud Storage
  • DataProc
  • Pub/Sub
  • SQL Server
  • MySQL
  • MS Access
  • Microsoft SQL Server
  • MongoDB
  • Cassandra
  • Tableau
  • Gliffy
  • Adobe Photoshop
  • JIRA
  • Power BI

Certification

  • SAP Certified Application Associate- Business Intelligence with SAP NetWeaver
  • BCS Certificate in Business Analysis Practice
  • Prince2 Foundation
  • GDPR Regulation and Compliance.

Affiliations

  • Football
