Deep Doradla

Milton Keynes, UK

Summary

  • Solutions Architect and Data Engineer with over 10 years of experience in the technology space, specialising in data and cloud technologies. Certified and experienced Hadoop and AWS architect with a keen interest in expanding skills to Azure and GCP.
  • Demonstrable history of working in Banking, Financial Services and Enterprise Software.
  • Developed Spark jobs in AWS Glue, EMR, Databricks and Hortonworks/Cloudera.
  • Created data lake and data warehouse solutions on Hadoop, S3, Apache Hive and AWS Athena.
  • Designed and developed real-time applications using Spark and Flink that produce to and consume from Kafka.
  • Used Java, Scala and Python to develop Spark, Flink and Airflow jobs, depending on customer requirements.
  • Led big data migration projects from on-premises to AWS using services such as DMS, S3, Glue, Athena, EC2 and RDS.
  • Experienced with DevOps and orchestration tools such as Alluxio, Kubernetes, Docker, Ansible, Airflow, Jenkins and Git.
  • Managed key customer relationships, working with product, engineering, executive management and other stakeholders to deliver competent data solutions and data engineering practices.
  • Experienced in project management, including team leadership, customer-centric consulting, client conflict resolution, problem solving, producing technical architectures, and delivering presentations and demos that showcase the value of proposed solutions.

Overview

12 years of professional experience
1 year of post-secondary education
2 Certifications

Work History

Senior Solutions Engineer/Architect

Alluxio
03.2022 - Current

Alluxio, spun out of UC Berkeley, brings data closer to compute in on-premises and cloud environments.

  • Lead pre-sales and post-sales activities in the EMEA region, managing accounts and their renewals.
  • Collaborated with sales teams, owned technical presentations and converted prospects from POC to production implementations on cloud, hybrid and on-premises infrastructure.
  • Designed innovative solutions by drawing on common patterns from reference architectures, significantly reducing egress and operational costs.
  • Successfully managed and improved relationships with challenging accounts, increasing trust and satisfaction.
  • Helped Product and Engineering teams prioritise the product roadmap per customer needs and aligned it with company goals.
  • Deployed Alluxio clusters on bare metal and EKS (Elastic Kubernetes Service); debugged and enhanced Helm charts.
  • Led the development of use cases, architecture design and cluster sizing, offering best practices and tuning Alluxio clusters to handle high levels of concurrent requests.
  • Developed Spark jobs on both AWS EMR and Databricks for POCs, demonstrating the performance benefits of Alluxio over direct HDFS or AWS S3 access when consuming terabytes of data (see the sketch after this list).
  • Work closely with customers to tune Alluxio, increase cache hit ratios and optimise Spark jobs, ultimately reducing cloud data access and egress costs.
  • Contributed to internal process improvements, created account overview plans and enhanced support and upgrade procedures.
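
A minimal sketch of the kind of POC comparison described in the bullets above, assuming the Alluxio client jar is on the Spark classpath; bucket, host and column names are illustrative:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("alluxio-poc").getOrCreate()

# Direct S3 access: every cold read pays S3 latency and potential egress cost.
df_s3 = spark.read.parquet("s3a://example-bucket/warehouse/events/")

# The same dataset mounted under the Alluxio namespace (default master port 19998);
# hot data is served from cache co-located with the compute cluster.
df_alluxio = spark.read.parquet("alluxio://alluxio-master:19998/warehouse/events/")

# A representative aggregation for comparing wall-clock times between the two paths.
df_alluxio.groupBy("event_type").count().show()
```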

Senior Big Data Engineer/Architect

BGC Partners, UK
07.2019 - 03.2022

At BGC, I worked as an architect, data engineer and administrator, producing architectures for low-latency systems that let brokers see trades and orders and that met regulatory requirements.

  • Designed Hadoop data lake and data warehouse solutions on Hortonworks/Cloudera Hive at terabyte scale, consumed by BI, audit and other teams to generate reports and audit for anomalies.
  • Collaborated with business users and created data models for landing, staging and enriched data in the data warehouse.
  • Helped initiate data governance practices, such as creating a business glossary and data catalogues and defining privacy rules.
  • Designed a metadata model for Kafka messages to enable lineage tracking for auditing purposes.
  • Developed batch and streaming jobs using Spark and Flink (see the sketch after this list).
  • Tuned and stabilised the Spark Thrift Server for efficient ad-hoc queries.
  • Successfully executed the migration of the data infrastructure from HDP 2.6 to 3.1, including all data pipelines, Kafka connectors, Spark batch and streaming applications, Docker containers, Elasticsearch, Kibana and other standalone services.
  • Developed and managed Kafka connectors for consuming from Solace, producing to Kafka topics and writing to HDFS.
  • Created various Jenkins jobs for building and deploying platform components using GitLab and Ansible playbooks.
  • Designed and deployed a highly available Airflow 2 cluster in Docker, with scheduler HA, multiple Celery executors, Redis HA and Postgres HA.
  • Developed and managed Docker images and containers for Airflow, Redis, Jenkins, Elasticsearch, Kibana, Logstash and standalone Java services.
  • Managed multiple environments (QA, Prod and DR), ensuring services ran smoothly as expected.
  • Liaised with both data sources and customers to add new feeds to the platform and helped resolve any issues with it.
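
A minimal sketch of a streaming job of the kind described above, using Spark Structured Streaming to move Kafka messages into HDFS; broker, topic and path names are illustrative assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Subscribe to a Kafka topic (broker and topic names are illustrative).
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "trades")
          .option("startingOffsets", "latest")
          .load())

# Kafka delivers key/value as binary; cast the payload for downstream parsing.
parsed = events.select(col("key").cast("string"), col("value").cast("string"))

# Append to HDFS as Parquet, with checkpointing for fault-tolerant output.
query = (parsed.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/landing/trades")
         .option("checkpointLocation", "hdfs:///checkpoints/trades")
         .outputMode("append")
         .start())

query.awaitTermination()
```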

Senior Big Data Consultant

Cloudwick Technologies
01.2016 - 06.2019

Hortonworks Consultant

Lloyds Bank
  • Ensured the smooth running of both streaming and batch pipelines using Kafka, Spark Streaming and Hive.
  • Assisted various teams with sourcing and ingestion from different sources using Spark and Hive pipelines, debugging issues and tuning and optimising existing pipelines.
  • Conducted training sessions on Spark best practices, including deploying Spark jobs on YARN and how technologies such as Hive and HDFS work together.
  • Played a key role in helping sourcing and ingestion teams migrate from Spark 1.6 to Spark 2.2 on HDP 2.6.4, ensuring a seamless transition.
  • Developed and deployed Spark direct streaming applications, effectively consuming data from Kafka and ingesting it into HDFS (see the sketch after this list).
  • Increased Spark Streaming throughput and improved SLAs significantly, from a few hours to under an hour.
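
A minimal sketch of a receiverless ("direct") Spark Streaming job of the kind described above, using the Spark 1.6/2.2-era DStream API; broker, topic and path names are illustrative:

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="kafka-direct-to-hdfs")
ssc = StreamingContext(sc, batchDuration=60)  # 60-second micro-batches

# Direct stream: executors read partitions straight from Kafka, no receivers.
stream = KafkaUtils.createDirectStream(
    ssc,
    topics=["transactions"],
    kafkaParams={"metadata.broker.list": "broker1:9092"})

def save_batch(batch_time, rdd):
    # Write each non-empty micro-batch to a time-stamped HDFS directory.
    if not rdd.isEmpty():
        path = ("hdfs:///data/landing/transactions/"
                + batch_time.strftime("%Y%m%d%H%M%S"))
        rdd.map(lambda kv: kv[1]).saveAsTextFile(path)  # keep message values

stream.foreachRDD(save_batch)

ssc.start()
ssc.awaitTermination()
```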

Big Data Engineer

The Guardian
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Reviewed and optimised Spark applications, successfully reducing the run time of a major job from 45 to 27 minutes.
  • Analysed and optimised various Athena queries and provided best practices for using Athena on terabytes of data.
  • Integrated the Netflix S3 committer into existing Spark jobs, significantly improving write speeds on S3 and overcoming the limitations of S3 rename operations.
  • These optimisations resulted in substantial cost savings on EMR, reducing expenses by a few thousand dollars per month.

Cloud Architect, Project Management

SMS (Smart Metering Systems)

On-premises to cloud (AWS) migration: developed and automated ETL pipelines on top of an S3 data lake, along with reporting.

  • Developed Spark jobs on AWS Glue using Python, converting thousands of lines of stored procedures into Spark transformations and ingesting the results into the AWS S3 enriched zone, which business users then consumed for reporting (see the sketch after this list).
  • Created data models for the data warehousing solution.
  • Helped migrate data from on-premises MSSQL and MySQL systems to S3 using AWS DMS.
  • Managed a team of four, designing and delegating tasks across the team.
  • Ran the scrum meetings, coordinated with the customer and set weekly expectations.
  • Played a key role in successfully automating and deploying the complete solution.
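
A minimal sketch of a Glue job of the kind described above, showing a stored-procedure step re-expressed as a Spark transformation; database, table, bucket and column names are illustrative:

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
from pyspark.sql import Window, functions as F

# Standard Glue job bootstrap.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw extracts landed by DMS (database/table names are illustrative).
raw = glue_context.create_dynamic_frame.from_catalog(
    database="raw_zone", table_name="meter_readings").toDF()

# Example transformation: keep the latest reading per meter per day,
# the kind of logic previously buried in a stored procedure.
w = (Window.partitionBy("meter_id", F.to_date("read_ts"))
           .orderBy(F.col("read_ts").desc()))
enriched = (raw.withColumn("rn", F.row_number().over(w))
               .filter("rn = 1").drop("rn")
               .withColumn("read_date", F.to_date("read_ts")))

# Write partitioned Parquet to the enriched zone for reporting.
(enriched.write.mode("overwrite")
         .partitionBy("read_date")
         .parquet("s3://example-bucket/enriched/meter_readings/"))

job.commit()
```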

Data Engineering Consultant

Nordea Bank

Created ETL pipelines to consume data from multiple sources, transform it into harmonised data and produce monthly account statements for retail and corporate customers.

  • Worked as part of a large team following a strict agile methodology.
  • Designed, developed and optimised multiple ETL pipelines using Spark and Scala on Cloudera, working with data formats such as CSV, JSON, Avro and Parquet.
  • Created data models in Hive and used partitioning and bucketing to optimise reads (see the sketch after this list).
  • Sourced data from legacy systems and ingested the processed data into Hadoop.
  • Managed and automated the development lifecycle using Ansible, Jenkins and Git.
  • Designed multiple integration and unit tests for data quality and correctness.
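
A minimal sketch of the partitioning and bucketing approach described above (shown in PySpark for consistency with the other sketches, though this project used Scala); table, path and column names are illustrative:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("statement-model")
         .enableHiveSupport()
         .getOrCreate())

# Harmonised statement data produced by the upstream pipelines.
statements = spark.read.parquet("hdfs:///data/harmonized/statements")

# Partition by statement month so a monthly run scans a single partition,
# and bucket by account_id to avoid shuffles in account-level joins.
(statements.write
    .partitionBy("statement_month")
    .bucketBy(64, "account_id")
    .sortBy("account_id")
    .mode("overwrite")
    .saveAsTable("dw.account_statements"))
```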

Big Data Architect

British Library (Hortonworks Consultant)
  • Created and deployed a Hadoop architecture in a no-internet, high-security zone.
  • Helped the customer understand the capabilities of Hadoop and its ecosystem.
  • Installed and configured Atlas for tag-based policies, data governance and data lineage.
  • Deployed multiple security components: Kerberos for authentication, Ranger for authorisation, Knox for perimeter-level security and SSO, SSL for encryption of data in transit, and Ranger KMS for encryption of data at rest.
  • Used Ansible for automation and configuration management.
  • Created ETL pipelines using Spark and Hive that processed terabytes of data per day.

Big Data Consultant

Insurance Company
  • Deployed and managed HDP and HDF clusters, secured them with Kerberos, Ranger, Knox and SSL, and integrated them with AD.
  • Upgraded HDP from 2.5 to 2.6 with minimal impact on production workloads.
  • Configured Kerberos-AD, Ranger-AD, Ambari-AD and SSSD-AD integrations.
  • Trained a team of data scientists on Apache Spark for their ETL requirements.
  • Developed and deployed a custom NiFi processor using Java and Maven.
  • Designed a solution for visualising and monitoring the status of various NiFi processor groups on Kibana using Elasticsearch.

Big Data Consultant

UKDA (United Kingdom Data Archive)
  • Brought into this project after earlier unsuccessful deployments; managed customer expectations, led the deployment to success and followed it with renewals.
  • Installed and configured Hadoop clusters in a hybrid architecture connected over VPN (two HDP clusters, one on AWS and one on-premises, plus one HDF NiFi cluster).
  • Successfully performed upgrades on the Hortonworks stack without disrupting production workloads.
  • Configured Kerberos-AD, Ambari-AD, Ranger-AD, Knox-AD, AWS-AD and SSSD-AD integrations, enabling single sign-on for users and simplifying authentication.
  • Resolved issues in the infrastructure, security and ETL pipelines.
  • Used Ansible for configuration management and automating cluster deployment.
  • Other responsibilities included regular maintenance, ongoing integrations and working alongside developers and data scientists to provide the support they needed on the cluster.
  • Created data models and ETL pipelines using Spark to load data into Hive.
  • Used NiFi processors (SFTP, PutHDFS, S3, etc.) to ingest data into HDFS and S3 as required.

Technology Manager/Co-Founder

SmartLoyal, UK
08.2012 - 12.2015
  • Grew the team from 2 to 15 and led product design, architecture, development and releases, resulting in a successful product launch.
  • Developed the CMS and APIs for mobile applications (iOS and Android).
  • Automated deployments using Ansible, Jenkins and GitHub on dedicated Linux servers, installing the LAMP stack, geolocation modules and other components.

UI Developer

Devathon
02.2012 - 05.2012

Education

Master of Science - Business Information Technology Systems

University of Strathclyde
Glasgow
09.2010 - 10.2011

Skills

Big Data Expertise – Databricks, Alluxio, Spark, Spark SQL, Spark Streaming, Flink, Hive, HDFS, Sqoop, MapReduce, Kerberos, Ranger, Knox, Oozie, NiFi, Kafka, Zeppelin, Apache Iceberg, Delta Lake

Certification

Hortonworks Certified Apache Hadoop Administrator

Timeline

Senior Solutions Engineer/Architect

Alluxio
03.2022 - Current

Senior Big Data Engineer/Architect

BGC Partners
07.2019 - 03.2022

Senior Big Data Consultant

Cloudwick Technologies
01.2016 - 06.2019

Technology Manager/Co-Founder

SmartLoyal
08.2012 - 12.2015

UI Developer

Devathon
02.2012 - 05.2012

Master of Science - Business Information Technology Systems

University of Strathclyde
09.2010 - 10.2011

Hortonworks Consultant

Lloyds Bank

Big Data Engineer

The Guardian

Cloud Architect, Project Management

SMS (Smart Metering Systems)

Data Engineering Consultant

Nordea Bank

Big Data Architect

British Library (Hortonworks Consultant)

Big Data Consultant

Insurance Company

Big Data Consultant

UKDA (United Kingdom Data Archive)