
Rija Fatima

Dublin, Ireland

Summary

Accomplished data professional with 3+ years of experience designing distributed data solutions, implementing scalable ETL pipelines, and enhancing data analytics frameworks. Extensive expertise in Python and SQL, with proficiency in big data tools including Apache Spark and Databricks. Adept at utilising tools such as Azure Data Factory and AWS Kinesis for efficient data processing. Experienced in data modelling, visualisation with Power BI, and version control through GitHub. Committed to leveraging advanced analytics to drive business insights and support strategic decision-making.

Overview

3 years of professional experience
5 years of post-secondary education
1 certification

Work History

Data Engineer II

Afiniti AI Ltd
Dublin, ROI
01.2023 - 01.2025
  • Managed complete data pipeline construction using Airflow, DBT, Python, SQL, and Power BI.
  • Used AWS Kinesis to improve ETL pipelines, achieving a 63% data ingestion boost.
  • Designed and implemented distributed data solutions using Azure Databricks and Apache Spark.
  • Designed scalable ETL workflows using Azure Data Factory and Synapse Analytics, enhancing productivity by 42%.
  • Enhanced data retrieval speed by 35% using Azure Data Lake, T-SQL, and PySpark.
  • Applied upsert and Type 2 slowly changing dimension (SCD) logic to data models, achieving 25% greater data accuracy.
  • Created and released more than 15 Power BI dashboards for data-driven decisions.
  • Engineered Python scripts to automate ETL processes, cutting manual effort by 50% and improving data accuracy.
  • Maintained robust CI/CD pipelines for improved delivery lifecycles.
  • Combined data using AWS Data Pipeline and PySpark for reliable ETL processes.
  • Optimised batch data ingestion in Python for Alteryx, HDFS, and AWS Redshift, following dimensional data modelling.
  • Performed feature engineering on Snowflake, improving model value by 5% through data acceleration with Databricks.
  • Integrated Genesys Cloud data via API, converting nested JSON to RDBMS for batch processing and data warehousing.

Data Engineer I

Afiniti AI Ltd
Dublin, ROI
04.2022 - 12.2022
  • Integrated end-to-end data pipeline solutions using Azure Data Factory and Databricks, ensuring seamless data transfer across 10+ datasets with zero downtime.
  • Optimised SQL scripts with window functions, reducing processing time by 40%.
  • Enhanced Azure infrastructure with RBAC, managed identities, and Azure AD integrations.
  • Implemented data preprocessing improvements in ADF, leading to a 35% accuracy increase.
  • Enhanced Azure Databricks clusters, cutting job execution time by 25% and lowering costs.
  • Deployed two queues as part of a three-person data engineering team, adding $2.5M/year to the portfolio and driving business growth and scalability.
  • Developed executive dashboards using Apache Spark for runtime data monitoring.
  • Improved data quality with rigorous validation techniques and error handling protocols.

Data Analytics Associate

Afiniti AI Ltd
Dublin, ROI
12.2021 - 04.2022
  • Conducted market analysis across Europe and North America to provide insight for the AI team, creating predictive metrics that improved AI model performance by 80%.
  • Designed real-time performance dashboards using Power BI and Apache Superset.
  • Structured data for AI teams' decision-making and predictive modelling.
  • Applied advanced statistical analytics and dimensional modelling to deploy AI-driven solutions on customer warehouse data.
  • Integrated data into GP and MS Azure Databricks using PySpark pipelines.

Education

Master of Science - Software Development

University of Glasgow
Glasgow, United Kingdom
09.2019 - 11.2020

Bachelor of Science - Physiology, Sport Science, and Nutrition

University of Glasgow
Glasgow, United Kingdom
09.2015 - 06.2019

Skills

  • Programming: Python, SQL, Java, Bash, Linux command line
  • Big Data Tools: Apache Spark, Databricks, Apache Airflow, DBT
  • ETL Tools: Azure Data Factory, AWS Glue, Talend
  • Data Manipulation and Analysis: Pandas, NumPy
  • Machine Learning & AI: Scikit-learn, TensorFlow, PyTorch
  • Version Control & CI/CD: Git, Docker, Kubernetes
  • Cloud Platforms: Azure (Data Lake, Synapse Analytics, Blob Storage), AWS (Glue, Kinesis, S3)
  • Databases: Snowflake, PostgreSQL, Greenplum, MySQL
  • Data Modeling: Dimensional Modeling, Normalized Modeling, Snowflake Schema
  • Data Visualization: Power BI, Apache Superset
  • Version Control: GitHub, Bitbucket, GitLab
  • Methodologies: Agile, CI/CD, Sprint Development

Certification


  • Azure DP-203: Data Engineer Associate (in progress)

Projects

  • Catalogued Data Pipeline on AWS: Developed a comprehensive real-time data pipeline to ingest, process, and visualise data using AWS services. Automated data ingestion from a local generator to Kinesis, transformed and stored data in S3 as Parquet files, catalogued data with Glue Crawler, enabled querying via Athena, and created dashboards with QuickSight.

Technologies Used: AWS S3, Kinesis Data Streams, Glue ETL, Glue Crawler, Athena, QuickSight, Python, Boto3

  • Sales Maturity Forecasting: Pioneered a sales maturity forecasting system to predict customer holdings 30 days after a call, ensuring compliance with the client's requirement of a 30-day maturity period. Predictions were stored in a Greenplum (PostgreSQL) database, and an interactive dashboard was created using Python libraries, providing real-time insights on current gain numbers and 30-day predictions to stakeholders.

Technologies Used: Python (pandas, matplotlib, seaborn, jinja2, dash), Greenplum (PostgreSQL)

  • 'Meal' Web Application: Developed a Django-based platform to assist users with grocery shopping, recipe management, and expiry date tracking. Integrated data scraping functionality to provide price-per-unit analysis and automated email notifications for expiry reminders.

Technologies Used: Python, Django, SQLite, BeautifulSoup, HTML, CSS, Bootstrap, PythonAnywhere

