Durga Pothula

Stevenage, Hertfordshire

Summary

  • Around 5 years of experience as an Azure Data Engineer, spanning the design and implementation of ETL processes. Proficient in developing scalable data pipelines on the Azure cloud platform, with strong expertise in data integration, transformation, and data quality assurance. Committed to optimizing data workflows to enhance business intelligence and analytics.
  • Experience migrating on-premises SQL databases to Azure Data Lake, Azure Synapse Analytics, Azure SQL Database, Azure Databricks, and Azure SQL Data Warehouse (Dedicated SQL Pool).
  • Experienced in using Azure services such as Azure Data Factory, Azure Databricks, and Azure Synapse Analytics to develop reliable data pipelines, carry out advanced data analysis, and support data-driven decision-making.
  • Proficient in writing Bash scripts, SQL scripts, and PL/SQL scripts, with extensive expertise in relational databases such as MySQL, Oracle, and SQL Server.
  • Experience in SQL Server Reporting Services, Power BI (dashboard reports), Crystal Reports, and Tableau against MS SQL Server.
  • Proficient in SQL across several dialects, including MySQL, PostgreSQL, Redshift, SQL Server, Oracle, and Snowflake.
  • Highly skilled in utilizing Hadoop, HDFS, Map-Reduce, Hive, and Spark SQL for efficient ETL tasks, real-time data processing, and analytics.
  • Experienced in developing and running ETL (Extract, Transform, Load) processes with Informatica PowerCenter, ensuring accurate and timely data transfer.
  • Basic ADF administration knowledge, including installing the Integration Runtime (IR), creating services such as Logic Apps and Azure Data Lake Storage, and granting access to ADLS via a service principal.
  • Experience migrating SQL databases using Azure SQL Managed Instance, Microsoft Visual Studio, Azure Data Factory, and SSIS.
  • Strong data modelling skills, using SQL and PL/SQL query-based methods to derive results for downstream consumers including engineering teams, data scientists, and stakeholders.
  • Managed and approved database access, and migrated on-premises databases to Azure Data Lake Store using Azure Data Factory.
  • Created and managed ETL/ELT processes using Azure Data Factory for data movement and transformation.
  • Gained knowledge of the Azure cloud platform (HDInsight, Data Lake, Databricks, Blob Storage, Data Factory, Synapse, SQL DB, DWH and Data Storage Explorer).
  • In-depth familiarity with NoSQL databases including MongoDB, HBase, and Cassandra, as well as PostgreSQL.
  • Familiarity with Power BI and Tableau for creating interactive visualizations and reports based on processed data.
  • Strong experience using PySpark, Scala, Python, and advanced SQL to automate ETL processes ranging from simple to highly complex, using various triggers.
  • Experience in Data Warehouse/Data Mart, OLTP, and OLAP implementations, combined with project scoping, analysis, requirements gathering, data modelling, effort estimation, ETL and ELT design, development, system testing, implementation, and production support.
  • Created and executed data storage strategies using Azure services, including Azure Data Lake Storage, Azure SQL Database, and Azure Cosmos DB.
  • Experienced in creating and managing data pipelines using Azure Data Factory and Azure Databricks.
  • Experience using issue-tracking systems such as JIRA and version control systems such as Git and SVN.
  • Knowledge of Agile approaches such as Scrum; familiar with setting up and using Splunk applications on Linux and UNIX systems.
  • Strong experience with UNIX/LINUX environments and shell scripts.

Overview

5 years of professional experience
1 Certification

Work History

AZURE DATA ENGINEER

CGI
Bangalore, KA
03.2021 - 12.2022

Description: Created and maintained dynamic dashboards and reports to enable accurate decision-making, and collaborated across business areas to ensure accurate and compliant data. Used SQL, Python, Tableau, and Power BI to apply advanced analytics, uncover insights, and improve operations, communicating results clearly through data visualization.

Responsibilities:

  • Developed pipeline components such as linked services and datasets, and wrote dynamic code in Azure Data Factory.
  • Designed and set up an enterprise Data Lake to provide support for diverse use cases including analytics, processing, storage, and reporting of large, rapidly changing data.
  • Created Azure Data Factory (ADF) pipelines using Azure Blob Storage and performed ETL using Azure Databricks.
  • Used a version control system for repository access and coordination with CI/CD tools.
  • Extracted, transformed, and loaded data from source systems into Azure data storage services using a combination of Azure Data Factory, Spark SQL, Databricks, and Azure Data Lake Storage (ADLS).
  • Worked with NoSQL databases such as HBase; imported data from MySQL, processed it using Hadoop tools, and exported it to the Cassandra NoSQL database.
  • Automated ETL workflows, including extraction from HTTPS data sources and orchestration of downstream processing.
  • Drove data extraction from a JSON configuration file, passing parameters to the Azure Data Factory pipeline so that the solution emphasizes automation and scalability and paves the way for future expansion (see the sketch at the end of this entry).
  • Demonstrated building an end-to-end Extract, Transform, Load (ETL) cloud solution on Microsoft Azure.
  • Developed and tested ETL mappings in Informatica to ensure accurate data transformation and alignment with business objectives.
  • Confirmed the accuracy and functionality of the ETL process through thorough testing, demonstrating successful integration of a new field across multiple data layers.
  • Led the adoption of an Agile project management approach for the Transactional Data Warehouse (TDW), prioritizing iterative development and continuous improvement and transforming project delivery.
  • Held regular reviews and demos that increased stakeholder participation and allowed timely revisions based on user feedback and changing requirements.
  • Used smoke testing to validate acceptance criteria for all delivered stories, ensuring their quality and deployment readiness.
  • Educated team members on the new reporting format and encouraged knowledge sharing, helping ensure a smooth transition to the changes.

Environment: Azure Data Factory (ADF pipelines), Azure Data Lake (ADLS), Azure Databricks, Azure Data Storage, Spark SQL, ETL, CI/CD, NoSQL, HBase, MySQL, Hadoop, Cassandra, JSON, Microsoft Azure, Transactional Data Warehouse (TDW), Stakeholders, Smoke Testing.
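
Illustrative sketch (not from the original project): a minimal Python example of triggering a parameterized Azure Data Factory pipeline run from a JSON configuration file, as referenced above. The resource names, the config file, and the pipeline parameter keys are hypothetical; it assumes the azure-identity and azure-mgmt-datafactory packages.

import json

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Load subscription, factory, and pipeline parameters from a JSON config file
# (hypothetical file and keys) so that new sources need no code changes.
with open("etl_config.json") as f:
    config = json.load(f)

adf_client = DataFactoryManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id=config["subscription_id"],
)

# Start the pipeline, passing the config-driven parameters through to ADF.
run = adf_client.pipelines.create_run(
    resource_group_name=config["resource_group"],
    factory_name=config["data_factory"],
    pipeline_name=config["pipeline"],
    parameters=config["pipeline_parameters"],
)
print(f"Started pipeline run: {run.run_id}")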

DATA ENGINEER

CAPGEMINI
Bangalore, KA
05.2018 - 03.2021

Description: The client is a bank whose main activities are loan origination and deposit generation, offering the full spectrum of personal and business banking products and services. Duties included creating, implementing, and maintaining all databases, including complex queries, triggers, and stored procedures, and assisting with the administration of several bank databases in both development and production environments.

Responsibilities:

  • Developed data pipelines using Azure stack components such as Azure Data Factory, Azure Data Lake, Azure Databricks, Azure Synapse Analytics, and Azure Key Vault for analytics.
  • Designed and implemented database solutions in Azure SQL Data Warehouse and Azure SQL Database.
  • Utilized the Power BI reporting tool to showcase novel data visualization techniques and sophisticated reporting approaches to the team.
  • Managed the creation and development of data schemas, ETL pipelines using Python and MySQL stored procedures, and Jenkins automation.
  • Developed and enhanced Snowflake tables, views, and schemas to enable effective data retrieval and storage for reporting and analytics requirements.
  • Implemented large-scale ETL over customer data on Azure with Data Factory, delivering a high-performance, optimized solution.
  • Used Cloudera, HDFS, MapReduce, Hive, Hive UDFs, Pig, Sqoop, and Spark to analyse large and important datasets.
  • Created Spark applications using Python, PySpark, and Spark SQL to extract, transform, and aggregate data from various file formats, revealing insights into client usage patterns (see the sketch at the end of this entry).
  • Deployed a Windows Kubernetes cluster with Azure Container Service (ACS) from the Azure CLI and used Kubernetes and Docker as the runtime environment of the CI/CD system to build, test, and deploy.
  • Worked with NoSQL databases including HBase; imported data from MySQL, processed it using Hadoop tools, and exported it to the Cassandra NoSQL database.
  • Developed bespoke interactive reports, workbooks, and dashboards using Tableau.
  • Created Python scripts to call the Cassandra REST API, applied several transformations, and moved the data into Spark.
  • Worked on Spark/Scala and Python regular expression (regex) projects in the Hadoop/Hive environment using Linux and Windows as big data resources.

Environment: Azure Stack, Azure Data Factory, Azure Data Lake, Azure Databricks, Azure Synapse Analytics, Azure Key Vault, Azure SQL Data Warehouse, Azure SQL, Power BI, ETL, Python, MySQL, Jenkins, Snowflake, Cloudera, HDFS, MapReduce, Hive, Pig, Sqoop, Spark, PySpark, Spark SQL, Kubernetes, Azure Container Service (ACS), Azure CLI, Docker, CI/CD, NoSQL, HBase, Hadoop, Cassandra, Cassandra REST API, Scala, Linux, Windows, and Big Data.
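
Illustrative sketch (not from the original project): a minimal PySpark example of the extract/transform/aggregate pattern referenced above, combining CSV and JSON usage files and summarizing daily client activity. The paths, column names, and aggregation are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("client-usage-aggregation").getOrCreate()

cols = ["client_id", "session_id", "event_ts"]

# Extract: the same logical events arrive as CSV and JSON files.
csv_df = spark.read.option("header", "true").csv("/data/usage/csv/")
json_df = spark.read.json("/data/usage/json/")

# Align both sources on a common set of string columns before combining.
usage = (
    csv_df.select([F.col(c).cast("string") for c in cols])
    .unionByName(json_df.select([F.col(c).cast("string") for c in cols]))
)

# Transform: parse the event timestamp and derive a usage date.
usage = usage.withColumn("usage_date", F.to_date(F.to_timestamp("event_ts")))

# Aggregate: daily events and distinct sessions per client, for reporting.
daily_usage = (
    usage.groupBy("client_id", "usage_date")
    .agg(
        F.count("*").alias("events"),
        F.countDistinct("session_id").alias("sessions"),
    )
)
daily_usage.write.mode("overwrite").parquet("/data/curated/daily_usage/")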

Education

Master of Science - Data Science

University of Hertfordshire
Hertfordshire, UK
07.2024

Skills

Azure Data Lake, Synapse, Azure Stream Analytics, Azure Event Hub, Azure Databricks, Azure Data Factory (ADF), and Cosmos DB

SQL Server 2005/2008/2012/2014/2016

Cloud platform: Microsoft Azure, DevOps

CI/CD DevOps Tools: Git, GitHub, and Bitbucket

Data Modelling Tools: Dimensional Data Modelling (Star Schema, Snowflake Schema)

NoSQL Databases: Cosmos DB, HBase, Cassandra, MongoDB

Databases: MySQL, SQL Server, SSIS, SSAS, SSRS, OLTP, OLAP

Big Data Tools: Apache Kafka, Apache Spark, Airflow, Hive, Sqoop

Methodologies: AGILE, Scrum, Waterfall

Operating Systems: Windows, Linux, UNIX

Certification

  • Data Analyst Associate
  • SQL Associate

Timeline

AZURE DATA ENGINEER

CGI
03.2021 - 12.2022

DATA ENGINEER

CAPGEMINI
05.2018 - 03.2021

Master of Science - Data Science

University of Hertfordshire