
Experienced Data Engineer with expertise in ETL, data modelling, and medallion architecture across the Azure, Microsoft Fabric, and Databricks ecosystems. Proficient in batch and streaming processing with tools such as Fabric Data Factory, Eventhouse, Synapse, OneLake, Delta Live Tables, and Azure Data Factory. Skilled in Python, PySpark, and SQL. Adept at building CI/CD pipelines with Azure DevOps and deploying with Databricks Asset Bundles. Committed to optimising data workflows and enabling data-driven decision-making.
End-to-End Sales Azure Data Engineering Project (Azure | Databricks | CI/CD)
Built a production-grade Azure pipeline that ingests sales data from REST APIs via Azure Data Factory into ADLS and processes it through the Medallion Architecture in Databricks. Implemented Spark Structured Streaming with Auto Loader for incremental ingestion, Delta Live Tables for curated Gold datasets with SCD Type 2 dimensions, and full CI/CD using GitHub and Databricks Asset Bundles across Dev/Test/Prod environments.
Overall Project Impact
Link: End-to-End-Sales-Azure-Data-Engineering-Project-With-Databricks-AssetsBundle-CI-CD
Azure Data Engineering Project With CI/CD And Databricks Asset Bundles
Implemented an end-to-end Azure data pipeline using Azure SQL Database as the source, orchestrating incremental ingestion with Azure Data Factory into Azure Data Lake (Bronze layer). Processed and enriched data in Azure Databricks using Spark Structured Streaming, Auto Loader, and schema evolution to build the Silver layer. Created curated Gold layer datasets with Delta Live Tables (DLT), implementing SCD Type 2 for dimensions and SCD Type 1 upserts for facts, following the Medallion Architecture. Integrated CI/CD with Git to automate deployments across environments and delivered analytics-ready datasets using PySpark and SQL to Databricks SQL Warehouse and Azure Synapse Analytics.
Overall Project Impact
• End-to-End Automation: 100% of the pipeline automated from ingestion → transformation → delivery using Azure Data Factory, Databricks, and Delta Live Tables (DLT), with incremental ingestion and CDC ensuring zero duplication and full traceability.
• Data Quality & Reliability: Implemented SCD Type 2 for dimension tables and SCD Type 1 for fact tables, with data quality expectations validated on 100% of tables and full audit logging across ingestion and transformations.
• Scalability & Maintainability: Designed using the Medallion Architecture (Bronze → Silver → Gold) with automated schema evolution and CI/CD deployments via Azure DevOps, GitHub, and Databricks Asset Bundles across Dev → Test → Prod environments.
• Business Value & Analytics Enablement: Delivered curated datasets to Databricks SQL Warehouse, Synapse Analytics, and Power BI (via Partner Connect), enabling self-service analytics and accelerating decision-making while reducing manual reporting effort by ~2–3 hours per week.
Link: End-to-End-Sales-Azure-Data-Engineering-Project-With-Databricks-AssetsBundle-CI-CD
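The difference between the SCD Type 1 and SCD Type 2 handling mentioned above can be illustrated with a minimal sketch in plain Python. This is not the project's Delta Live Tables code; the `DimRow` schema, the `city` attribute, and the function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimRow:
    """One version of a dimension record (hypothetical schema)."""
    key: str
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version

    @property
    def is_current(self) -> bool:
        return self.valid_to is None


def apply_scd2(history: list[DimRow], key: str, city: str, as_of: date) -> list[DimRow]:
    """Return a new history with the change applied as SCD Type 2."""
    current = next((r for r in history if r.key == key and r.is_current), None)
    if current is not None and current.city == city:
        return list(history)  # attribute unchanged, so no new version is needed
    result = []
    for row in history:
        if row is current:
            # Close the old version instead of overwriting it.
            result.append(replace(row, valid_to=as_of))
        else:
            result.append(row)
    # Open a new current version that starts at the change date.
    result.append(DimRow(key=key, city=city, valid_from=as_of))
    return result


def apply_scd1(table: dict[str, str], key: str, city: str) -> dict[str, str]:
    """Return a new table with the change applied as SCD Type 1 (overwrite)."""
    updated = dict(table)
    updated[key] = city  # the previous value is lost
    return updated


history = [DimRow("C1", "Leeds", date(2024, 1, 1))]
history = apply_scd2(history, "C1", "York", date(2024, 6, 1))
```

In this sketch Type 2 keeps both versions of the record with validity dates, while Type 1 keeps only the latest value.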
Flights Azure Databricks Project
Developed an end-to-end data engineering solution built entirely on Azure Databricks, using Spark Structured Streaming for real-time data ingestion and processing. Used PySpark for scalable data transformations and built Delta Live Tables (DLT) pipelines to automate Slowly Changing Dimensions (SCDs) while enforcing data quality and consistency. Designed and delivered dimensional models that produced curated, analytics-ready datasets for downstream consumption.
Overall Project Impact
Link: Flights Azure Databricks End To End Data Engineering Project
Microsoft Fabric Data Engineering Project
Developed an end-to-end Microsoft Fabric pipeline with parameterized Data Factory ingestion, OneLake storage, and Fabric Notebooks for transformation. Implemented SCD Type 2 for historical tracking and star schema modelling in Fabric Data Warehouse. Delivered Power BI dashboards with email-based monitoring.
Overall Project Impact
Link: Microsoft Fabric Airbnb Data Engineering Project
Azure Databricks End-To-End Project with Azure DevOps
Delivered a scalable, automated data engineering solution using Azure Databricks, Azure Data Factory, and Azure Data Lake Storage, with real-time ingestion implemented through Spark Structured Streaming. Built reliable transformation pipelines incorporating SCD Type 1 (manual) and SCD Type 2 (Delta Live Tables), and applied Star Schema modelling with incremental data loading to support efficient analytics. Curated high-quality datasets across the bronze, silver, and gold layers, and published analytics-ready data to Azure Synapse Analytics and Databricks SQL Warehouse for BI and reporting.
Overall Project Impact
Link: https://www.jesseportfolio.co.uk/post/azure-databricks-end-to-end-dataengineering-project-with-azure-devops
Olympics Data Engineering Project with Azure DevOps
Built an end-to-end Azure and Databricks data pipeline using the Olympics 2024 dataset, designed around the Medallion Architecture (Bronze → Silver → Gold). Orchestrated data ingestion with Azure Data Factory into Azure Data Lake Storage and applied strong data governance using Unity Catalog for secure, centralized access control. Developed transformation pipelines with Delta Live Tables (DLT), implementing CDC and SCD Type 1 to manage incremental updates and ensure data consistency. Delivered curated gold-layer datasets to Databricks SQL Warehouse and Azure Synapse Analytics, enabling high-performance analytics and optimized reporting for tools such as Power BI.
Overall Project Impact
Link: jesseportfolio.co.uk/post/olympics-data-engineering-project-with-azure-devops