Usman Ghani Mughal

London, United Kingdom

Summary

Data Engineer with 4+ years of experience specializing in designing, building, and optimizing scalable, robust data pipelines with a strong focus on reliability, performance, and maintainability across diverse industries. Proficient in data ingestion, ETL/ELT workflows, and architecting resilient data solutions using frameworks like the Medallion Architecture. Skilled in Databricks, PySpark, Delta Lake, and building Spark-based pipelines on AWS EMR with Airflow. Experienced in CI/CD automation, OLTP/OLAP data modelling, Power BI dashboarding, and contributing to AI-driven initiatives.

Work history

Data Engineer

Cloud Enterprise Business Solutions (CEBS)

Islamabad/Pakistan

2023.12 - 2025.01

Company Overview: Client: Matas (Denmark’s largest health & beauty retailer)
Migrated Matas (D365 F&O) to Synapse Link in 2 months, reducing load time by 20%.
Built scalable ADF pipelines to ingest 1 TB/day of data into the CDP (ADLS Gen2), cutting latency by 40%, and optimized existing ADF pipelines, reducing runtime by 73% (from 30 minutes to 8 minutes).
Built CI/CD pipelines in Azure DevOps to deploy 50+ ADF pipelines, ensuring seamless production releases.
Developed and maintained a Medallion architecture with optimised PySpark in Databricks, leveraging Auto Loader, Unity Catalog, and Delta Live Tables for real-time and batch processing, and orchestrated 100+ workflows using Databricks Asset Bundles, deployed via CI/CD.

Big Data Engineer

Nowasys LTD

Islamabad/Pakistan

2023.02 - 2023.12

Company Overview: Client: Anteriad (Anteriad powers B2B with the industry’s leading data)
Developed 15 PySpark ingestion pipelines on AWS EMR, ingesting 25–30 TB daily into an S3 data lake, with workflows orchestrated via Airflow.
Developed and deployed a PySpark-based data quality (DQ) framework, reducing data quality errors by 99%.
Optimized Spark code and fine-tuned AWS EMR configurations, improving performance and resource utilization by 50%.
Implemented an automated backfill mechanism for batch pipelines, eliminating data loss.

Big Data Engineer

the ENTERTAINER

Lahore/Pakistan

2022.05 - 2023.02

Company Overview: The ENTERTAINER provides 2-for-1 deals on services from top brands in the Middle East.
Optimised Azure Synapse DWH to support cross-functional teams, reducing ad-hoc query time by 5% and dashboard reporting latency by 20%.
Defined a standardized data modeling approach (Kimball) for DWH; the approach now serves as a blueprint for 10+ data engineers across the analytics and data team.
Built and monitored 50+ ELT pipelines in Azure Data Factory, ingesting data into fact and dimension tables and implementing watermarking for incremental loads.
Developed PySpark pipelines on Databricks to process and transform 50M+ web/app logs per day, loading them into the DWH to improve recommendation system accuracy by 15%.

Data Engineer

Afiniti

Islamabad/Pakistan

2021.04 - 2022.05

Company Overview: Afiniti is a leading provider of customer experience (CX) artificial intelligence (AI).
Designed and implemented data pipelines for port, vehicle and broadband data serving both US and UK markets, processing 10M+ records daily.
Engineered and optimised processes using multiprocessing and multithreading, improving performance by 30% on 100K+ tasks.
Built a web scraping engine to gather data from online sources, reducing data acquisition time by 40%.
Provided guidance and support to AI teams, helping them understand and use third-party datasets effectively.

Education

MSc - Data and Data Science Technology

Northumbria University

Current

Bachelor of Science - Computer Science

COMSATS University Islamabad

2017.01 - 2021.01

Skills

Cloud technologies: Databricks, Azure DevOps
Data warehouses: Synapse SQL, Redshift, Teradata
Data lakes: ADLS Gen2, S3, HDFS
Databases: MySQL, SQL Server, MongoDB
Data formats: CSV, JSON, Parquet, Delta
Distributed computing: Spark
Streaming frameworks: Spark Structured Streaming, Kafka
ETL tools: Azure Data Factory
Programming languages: Python, Java, Scala, C
Data manipulation: Excel, Pandas, NumPy
Orchestration tools: Apache Airflow, Cron Jobs
Dashboards: Power BI, Tableau

Certification

Data Engineering (Nanodegree - Udacity)
Microsoft Azure Databricks for Data Engineering
Introduction to Big Data with Spark and Hadoop (Coursera | IBM)
ETL and Data Pipelines with Shell, Airflow and Kafka (Coursera | IBM)
Introduction to Bash Shell Scripting
Apache Spark Essential Training: Big Data Engineering (LinkedIn)
Advanced Python (LinkedIn)
Advanced SQL for Query Tuning and Performance Optimization (LinkedIn)
