
Srikanth Jaggari

Milton Keynes, RDG

Summary

Senior Data Engineer with extensive experience (12+ years) designing and implementing scalable, secure data architectures on AWS Cloud (S3, Glue, Lambda, Athena, EMR) and Big Data platforms. Possesses a strong command of Python, Java, and Scala to create high-performance data pipelines that optimize data delivery and empower data teams with efficient solutions. Proven ability to automate workflows with custom tools and scripts, accelerating data-driven decision making for the organization.

Overview

12 years of professional experience

Work History

Senior Data Engineer

Funding Circle
UK
09.2021 - Current
  • Data Platform Architecture: Designed and built in-house data tools, data lake architecture, and robust data ETL pipelines (AWS Glue, EMR, Athena, Airflow, Kubernetes) using generative AI for optimization. Managed and optimized AWS infrastructure (S3, Glue, EMR, Lambda, IAM) for Data Analysts, Engineers, and Scientists.
  • Data Modeling & Automation: Optimized DBT for data modeling and version control, leveraged Terraform (IaC) for automated infrastructure provisioning, and utilized cutting-edge tools like GitHub Actions for streamlined deployments.
  • Data Processing & Quality: Redesigned and developed robust streaming and batch data pipelines by utilizing Apache Spark, Hudi on EMR, and Airflow to ensure high-quality data processing.
  • Monitoring & Alerting: Established comprehensive monitoring and alerting systems (CloudWatch, PagerDuty, Grafana) to ensure timely detection and response to potential issues.
  • Machine Learning Integration: Integrated SageMaker models into pipelines for real-time insights and collaborated with data scientists on model operationalization.
  • Security & Open Source: Enhanced security measures and encryption standards (KMS) to safeguard sensitive data in the data lake. Open Source Contributor: Made significant contributions to the Apache Hudi project and remains an active community member.

Big Data Engineer

Santander
UK
09.2016 - 08.2021
  • Data Ingestion & Transformation: Extracted data from various sources (Kafka, databases, AWS S3). Used Python, PySpark, and Spark SQL for data enrichment and complex transformations.
  • Real-Time & Batch Processing: Restructured streaming, near real-time (NRT), and batch big data applications using Spark and Kafka to enable real-time analysis and batch data preparation.
  • Data Warehousing & Delivery: Created "golden datasets" by collecting, pre-processing, and enriching data for audits. Pushed transformed data to various destinations (Kafka, Elasticsearch, HDFS, AWS S3, NiFi).
  • Data Visualization & Monitoring: Created Kibana dashboards and Canvas visualizations based on user requirements. Automated scheduling and monitoring of frameworks using Rundeck with built-in alerts.
  • Workflow Orchestration & Reporting: Scheduled and coordinated Spark and Hive frameworks using Oozie workflows and coordinator jobs. Developed internal reporting tools to generate comprehensive statistical reports.

Big Data Engineer

IMS Health (IQVIA)
India
06.2015 - 08.2016
  • Data Warehousing & Transformation: Fetched data from data warehouses using Data Virtualization servers and transformed it using Spark.
  • Data Virtualization: Participated in creating Virtualized Data Sources (VDBs) in JBoss Data Virtualization for external data access.
  • Data Movement & Persistence: Used Spark SQL for data filtering, transformations, and classification. Persisted transformed data back to external data sources via Data Virtualization servers.
  • Cluster Management, Job Monitoring & Logging: Contributed to setting up Spark clusters with HDFS and Hive. Logged job status in both a SQL Server table and a log file for comprehensive tracking.

Big Data Engineer

ITC Infotech
India
02.2015 - 06.2015
  • Built a Spark Streaming application to stream IoT (Internet of Things) data from Kafka, enrich it as required, and push it to Hive and HDFS.
  • Leveraged Spark SQL to load data from Hive and perform various filtering and transformations on input data.
  • Analyzed business requirements and designed Spark applications to perform specific analytics tasks.
  • Delivered Spark application outputs that met business needs and enabled various downstream analytics use cases.

Big Data Developer

C-DAC R&D
India
03.2012 - 01.2015
  • Election Commission Project: Developed a text analysis application using Spark to identify violations of the Election Commission of India's code of conduct.
  • Data Cleaning & Reporting: Built MapReduce applications for data cleansing and masking, generating reports in Java, and storing them in HBase for further analysis.
  • System Design & Integration: Contributed to system design and architecture for big data applications, integrating updated modules/plugins into existing systems.
  • Hadoop Cluster Management: Optimized Hadoop clusters by tuning configurations and scaling resources.
  • Ontology & Search Relevance Improvement: Constructed taxonomies and frameworks for ontologies, implemented a relevancy tuning model for search engines and algorithms to enhance search results using open-source tools, and created helper applications such as document tagging to accelerate ontology-based search functionality.

Education

Master of Computer Applications - Computer Engineering Technology

JNTU (Jawaharlal Nehru Technological University)
India
09.2011

Skills

  • Cloud Technologies (AWS): S3, Glue, Lambda, EMR/EMR Serverless, Athena, Lake Formation, CloudFormation, CloudTrail, CloudWatch, EventBridge, Step Functions, SNS, SQS, IAM, SageMaker, KMS
  • ELK Stack: Elasticsearch, Logstash, Kibana, Watchers, Filebeat, Cerebro, Comrade, Dashboards, Canvas
  • Big Data Ecosystems: Spark, Spark Streaming, Spark SQL, PySpark, Hadoop, HDFS, MapReduce, YARN, Kafka, Hive, NiFi, Flume, Oozie, HBase, Pig, Solr, ZooKeeper, Hortonworks, Cloudera
  • Programming Languages & Scripting: Python, Scala, Java, C, Shell script, Ansible, YAML
  • Build & Automation Tools: Drone, GitHub Actions, CircleCI, GitHub, Jenkins, Rundeck, Maven, Backstage, Cookiecutter
  • Databases (DBMS): MySQL, PostgreSQL, Oracle, Data Virtualization
  • Others: Airflow, DBT, Terraform, DataHub, Tableau, Grafana, Docker, IaC (Infrastructure as Code), PagerDuty, Apache Hudi, Apache Iceberg, Confluent Cloud, Kubernetes, Generative AI (ChatGPT, Gemini)

Accomplishments

  • Data Platform Optimization & Cost Savings: Delivered cost savings exceeding $160K annually by optimizing the data platform, using Generative AI and Terraform to design and implement high-performance, scalable systems.
  • Real-Time Transformation & Analytics: Led the development of near real-time data pipelines, cutting data availability latency from 24 hours to near real time and enabling faster analytics and data-driven decision making.
  • Big Data Expertise & Security: Applied in-depth knowledge of big data technologies such as Spark, Kafka, and Elasticsearch, and implemented robust data security practices to safeguard sensitive information.
  • Open-Source Contribution & Collaboration: Actively contributed to the Apache Hudi open-source project, demonstrating a commitment to the broader data engineering community.
  • Data Quality Champion: Championed data quality by implementing DBT for data modeling, resulting in a 50% reduction in data errors. Ensured efficient and reliable data solutions for cross-functional teams, fostering successful collaboration.
