Data Engineer with 4+ years of hands-on experience and up-to-date skills in state-of-the-art automation technologies for data management and integration, including ETL, data warehousing, BI development, SQL development, and Big Data.
Overview
4 years of professional experience
1 certification
Work History
Azure Data Engineer
Company - EPAM, Client - Thomson Reuters
08.2022 - Current
My roles and responsibilities encompassed designing, building, and maintaining end-to-end data pipelines on Azure
I actively participated in migrating on-premises data solutions to Azure and in optimizing the existing data infrastructure
My tasks included creating pipeline jobs, scheduling triggers, and mapping data flows in Azure Data Factory, as well as performing data transformation in Azure Databricks notebooks
Additionally, I developed and implemented ETL workflows, conducted data cleaning and engineering using Python packages, and used Kafka and Spark Streaming for real-time data processing and storage
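A minimal sketch of the kind of Kafka-to-storage Structured Streaming job referenced above; the broker, topic, storage account, and path names are hypothetical placeholders rather than client systems:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("txn-stream").getOrCreate()

# Subscribe to a Kafka topic (broker and topic names are illustrative)
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "transactions")
         .load()
)

# Kafka delivers the payload as bytes; cast it to string for downstream parsing
parsed = events.select(F.col("value").cast("string").alias("payload"))

# Persist the stream to ADLS Gen2 as parquet, with checkpointing for recovery
query = (
    parsed.writeStream.format("parquet")
          .option("path", "abfss://stream@account.dfs.core.windows.net/transactions/")
          .option("checkpointLocation", "abfss://stream@account.dfs.core.windows.net/_checkpoints/")
          .start()
)
query.awaitTermination()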
Responsibilities:
Developed and implemented technical solutions using Azure cloud components such as Azure Data Factory, Databricks, Azure Data Lake Storage (ADLS Gen2), Spark SQL, and PySpark
Extracted, transformed, and loaded data from source systems into Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and Azure Data Lake Analytics
Designed and deployed scalable data ingestion pipelines in Azure Data Factory to consume data from a variety of sources, including SQL databases, CSV files, and REST APIs
Developed an event-driven ETL process using Python functions in Azure Functions to transfer data from Azure Cosmos DB to Azure Synapse Analytics for specific events and use cases (see the first sketch after this section)
Designed and developed a scalable data warehouse using Azure Blob Storage, Data Lake, and Azure Synapse to store and manage large volumes of data
Created Power BI visualizations and dashboards per the requirements
Designed, built, and automated insightful business intelligence dashboards using Tableau to help stakeholders and clients plan future sales and marketing strategies
Implemented a one-time data migration of multi-state-level data from SQL Server to Snowflake using Python and SnowSQL (see the second sketch after this section)
Performed data massaging operations using DAX expressions in Power BI, and used the Power BI gateway to keep dashboards and reports up to date with on-premises data sources
Developed SSIS Packages to import data into MS SQL Server from multiple OLTP sources
Maintained source code in Git and GitHub repositories
Designed various Jenkins jobs for continuous integration and executed CI/CD pipelines with Jenkins
Developed operational, analytical dashboards and data visualization reports using Power BI
Designed and built Spark/PySpark-based Extract, Transform, Load (ETL) pipelines to migrate credit card transaction, account, and customer data into the enterprise Hadoop Data Lake
Developed SSIS packages using a Foreach Loop container in Control Flow to process all Excel files in a folder, a File System Task to move each file to an archive after processing, and an Execute SQL Task to insert transaction log data into a SQL table
Used PySpark and Spark SQL to clean, transform, and aggregate data with appropriate file formats and compression types before writing it to Azure Data Lake Storage (see the third sketch after this section)
Proficient in the Python programming language; developed web services in Python
Used Jira as the Scrum tool for the Scrum task board and worked on incidents
Worked within Waterfall and Scrum/Agile methodologies
Environment: Azure Data Lake, Azure Databricks, Azure Data Factory, DAX, SnowSQL, Snowflake, Python, Power BI, Tableau, Azure Blob Storage, T-SQL, Spark SQL, REST APIs, SSIS, Git, GitHub, MS SQL, OLAP, Jenkins, ETL, Azure Cosmos DB, CI/CD, Jira, Scrum, Waterfall, Agile
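First sketch: a minimal, illustrative version of the event-driven Cosmos DB-to-Synapse transfer step; the database, container, table, and environment-variable names are hypothetical, and in production this logic would run inside an Azure Function fired by the triggering event:

import os
import pyodbc
from azure.cosmos import CosmosClient

def transfer_new_orders():
    # Read the documents matching the triggering event from Cosmos DB
    cosmos = CosmosClient(os.environ["COSMOS_URL"], credential=os.environ["COSMOS_KEY"])
    container = cosmos.get_database_client("sales").get_container_client("orders")
    rows = [
        (doc["id"], doc["customerId"], doc["amount"])
        for doc in container.query_items(
            query="SELECT c.id, c.customerId, c.amount FROM c WHERE c.status = 'new'",
            enable_cross_partition_query=True,
        )
    ]
    # Bulk-insert the batch into the Synapse dedicated SQL pool over ODBC
    with pyodbc.connect(os.environ["SYNAPSE_CONN_STR"]) as conn:
        conn.cursor().executemany(
            "INSERT INTO dbo.Orders (id, customer_id, amount) VALUES (?, ?, ?)", rows
        )
        conn.commit()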
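Second sketch: one plausible shape of the one-time SQL Server-to-Snowflake migration; the table, warehouse, and credential names are placeholders, and the largest loads could equally go through SnowSQL COPY INTO statements:

import os
import pandas as pd
import pyodbc
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

# Extract one multi-state table from SQL Server (table name is illustrative)
src = pyodbc.connect(os.environ["MSSQL_CONN_STR"])
df = pd.read_sql("SELECT * FROM dbo.StateLevelMetrics", src)

# Load into Snowflake; write_pandas stages the frame and runs COPY INTO internally
snow = snowflake.connector.connect(
    account=os.environ["SF_ACCOUNT"],
    user=os.environ["SF_USER"],
    password=os.environ["SF_PASSWORD"],
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)
success, _, nrows, _ = write_pandas(snow, df, "STATE_LEVEL_METRICS", auto_create_table=True)
print(f"loaded={success} rows={nrows}")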
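Third sketch: a representative PySpark clean-transform-aggregate job writing snappy-compressed parquet to ADLS Gen2; the container, account, and column names are hypothetical:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-aggregation").getOrCreate()

# Read raw CSVs from the landing zone
raw = spark.read.option("header", True).csv("abfss://raw@account.dfs.core.windows.net/sales/")

# Clean and aggregate: dedupe, enforce types, drop bad rows, then roll up
daily = (
    raw.dropDuplicates(["order_id"])
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount").isNotNull())
       .groupBy("order_date", "region")
       .agg(F.sum("amount").alias("total_amount"))
)

# Write partitioned, snappy-compressed parquet to the curated zone
(daily.write.mode("overwrite")
      .option("compression", "snappy")
      .partitionBy("order_date")
      .parquet("abfss://curated@account.dfs.core.windows.net/sales_daily/"))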
Data Engineer
Company - EY, Client - Deutsche Bank
07.2020 - 08.2022
The bank's main activities are loan origination and deposit generation, and it offers the entire spectrum of personal and business banking services and products
My duties included creating, implementing, and maintaining all databases, including complex queries, triggers, and stored procedures
I also helped administer several bank databases in both development and production environments
Responsibilities:
Worked across all aspects of the software development life cycle (SDLC): analysis, planning, development, testing, implementation, and post-production analysis
Developed and implemented technical solutions using Azure cloud components such as Azure Data Factory, Databricks, Azure Data Lake Storage (ADLS Gen2), Spark SQL, and PySpark
Participated in meetings with business stakeholders to demonstrate working solutions
Designed ETL packages dealing with different data sources (SQL Server, Flat Files) and loaded the data into target data sources by performing different kinds of transformations using SQL Server Integration Services (SSIS)
Helped individual teams set up their repositories in Bitbucket, maintain their code, and configure jobs that make use of the CI/CD environment
Imported data from on-premises sources to Azure using Azure Data Factory
Used stored procedures to load data into the data warehouse and handled configuration, logging, and auditing
Wrote queries to transform the data
Worked with deployments from Dev to UAT, and then to Prod
Worked with visualization tools: Power BI and Excel (formulas, pivot tables, charts) along with DAX commands
Defined the data warehouse (star and snowflake schemas), fact tables, cubes, dimensions, and measures using SQL Server Analysis Services
Used the Git version control system for repository access and coordination with CI tools
Developed complex T-SQL queries, stored procedures, triggers, and views to extract, manipulate, and transform data for reporting and analysis
Involved in loading data from the Linux file system to the Hadoop Distributed File System (HDFS) and setting up Hive, Pig, HBase, and Sqoop on Linux/Solaris operating systems
Developed Power BI reports and dashboards from multiple data sources (SQL, SSAS, CSV, Flat files, Azure SQL Database) using data blending
Wrote Spark SQL queries and Python scripts to design the solutions and implemented them using PySpark (see the sketch after this section)
Worked on Jira tickets for new features, improvements, and issues with bugs or defects
Worked in short release cycles, following Agile and Scrum methodology
Well versed in UNIX and Linux platforms; compiled code and fixed issues on both
Environment: Azure Data Factory, Databricks, Azure Data Lake Storage (ADLS Gen2), Spark SQL, PySpark, ETL, SSIS, CI/CD, Power BI, Excel, DAX, Git, T-SQL, Linux, Hadoop, Hive, Pig, HBase, Sqoop, SSAS, CSV, Azure SQL, Spark, Jira, Agile, Scrum, UNIX
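A minimal sketch of combining Spark SQL with PySpark as described above; the paths, view, and column names are hypothetical placeholders rather than bank systems:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("loan-summary").getOrCreate()

# Register curated loan data as a temporary view (path is illustrative)
loans = spark.read.parquet("/data/curated/loans/")
loans.createOrReplaceTempView("loans")

# Express the business rule in Spark SQL, then act on the result with PySpark
summary = spark.sql("""
    SELECT branch_id,
           COUNT(*)       AS n_loans,
           SUM(principal) AS total_principal
    FROM loans
    WHERE status = 'ACTIVE'
    GROUP BY branch_id
""")
summary.write.mode("overwrite").parquet("/data/reports/loan_summary/")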
Education
Master of Science - International Business Management
Heriot-Watt University
06.2024
Skills
Azure Services: Azure Data Factory, Airflow, Azure Databricks, Logic Apps, Function App, Snowflake, Azure DevOps
Languages: SQL, PL/SQL, Python, HiveQL, Scala
DevOps: Azure DevOps, CI/CD, Git
Big Data Technologies: MapReduce, Hive, Python, PySpark, Scala, Kafka, Spark Streaming, Oozie, Sqoop, Zookeeper
Bug Tracking Tool: Jira
Databases: MS SQL, Azure SQL DB, Azure Synapse, MS Excel, MS Access, Oracle 11g/12c, Cosmos DB
NoSQL Databases: Cassandra, MongoDB
Scripting Languages: Python, Shell Scripting, PowerShell
Methodology: Agile, Scrum, Waterfall
Operating Systems: Linux, Unix, Windows
Azure Data Solutions: Azure Data Factory, SQL Server, SQL Databases, SQL Data Warehouse, Azure Databricks, Azure Synapse Analytics, and Azure Cosmos DB
Certification
Microsoft Certified: Azure Data Engineer Associate, November 2024