Amarjeet Mishra

Slough, Berkshire

Summary

I am a GCP-certified Professional Data Engineer, Machine Learning and Deep Learning enthusiast, and entrepreneur with 4 years of early-stage business experience and 6+ years of industry experience across Big Data platforms, including cloud platforms.
• Design architecture, development & deployment
• Extensive experience in multiple life cycle development projects, including gathering business requirements, scope definition, analysis of source systems, and design of data strategies for both transactional and analytical systems.
• Architecting & Modeling Data Integrity

• Hands-on experience with major AWS components such as S3, Lambda, Glue, Redshift, Athena, and SageMaker; GCP services such as Cloud Storage, Dataproc, and BigQuery; and Hadoop ecosystem tools such as Spark, HDFS, Hive, HBase, ZooKeeper, Sqoop, Oozie, and Flume, as well as Kafka and Python. Develop scalable and reliable data solutions to move data across systems from multiple sources in real-time as well as batch modes.

Overview

years of professional experience

Work history

Lead Data Engineer

Insight International (UK) Ltd

London

05.2022 - Current

Gather requirements from various stakeholders, such as Finance and Risk Management, for Lloyds Bank (client).
Knowledge of all phases of Agile with a good understanding of System Study, Design, Client Interaction, Coordination, Development, and Implementation of data product build projects.
Processing Big Data using Hadoop, GCP, Spark, and other big data tools.
Actively involved in the Group Data Model, with a holistic view of data.
Data cataloging and lineage using Collibra and manual methods.
Ensure the movement of data, with the required transformations, between different layers in EDH and GCP from various sources.
Active participant in the architecture team for design decisions.
Applied parallelization and other optimizations on Spark nodes to boost the efficiency of ETL/ELT tasks in the Hadoop ecosystem.
Deep knowledge of incremental imports, partitioning, and bucketing in Hive and Spark SQL for optimization.
Created Hive tables with static and dynamic partitioning strategies and processed data using HQL and Scala Spark programs.
Produced synthetic data, ingested data from files into tables, and processed data for data products using a Scala-based framework on a Dataproc cluster (Hive tables).
Professional experience in using Python and PySpark.
Created BigQuery tables and migrated data from Hive to BigQuery.
Set up a CI/CD pipeline using Jenkins and UCD to automate deployment to higher environments.
Implemented various automation processes that reduced manual work by 80% using NLP and built-in Python modules.
Built a data product from batch data, analyzing the data from scratch.

Big Data Engineer - Trainee

ITC

London

01.2022 - 05.2022

Created AWS CloudFormation templates to create infrastructure in the cloud
Populated a Data Lake using AWS Kinesis from various data sources such as S3
Processed data stored in S3 using AWS Lambda, Glue, Redshift and AWS Athena
Developed ETL jobs in AWS Glue to extract data from S3 buckets and load it into the data mart in Amazon Redshift
Authored AWS Lambda functions to run Python scripts in response to events in S3
Used Amazon EMR for processing Big Data with tools like Hadoop, Spark, and Hive
Executed Hadoop/Spark jobs on AWS EMR using programs and data stored in S3 buckets
Implemented optimizations in Spark nodes and improved the performance of the Spark Cluster
Orchestrated workflows in Apache Airflow to run ETL pipelines using tools in AWS
Worked with AWS Lambda functions for event-driven processing using AWS boto3 module in Python
Used Spark, Spark SQL, and Spark Streaming for data analysis and processing. Implemented Spark using Scala and SparkSQL for faster testing and processing of data
Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers
Worked with a Kafka cluster that used a schema to send structured data via micro-batching

Client Interface Manager

ProPhoenixsoft Pvt Ltd

Bengaluru

01.2017 - 01.2020

Driving the business strategy of Microsoft, Google, Dell, HP, Lenovo, and Vodafone as part of the One Commercial Partner Organization
Working in a defined customer territory across IT and non-IT clients and aligning with the respective teams to drive territory sales
Exploring data generated by the Microsoft Data Team and visualizing it using Python Streamlit and Tableau
Using ML models to target potential customers and passing the results to the Lead-gen Team
Actively involved in developing a fintech app called Paypro and a replica of TikTok called ALAP

Software Engineer

Reverie Language Technologies

Bengaluru

01.2015 - 01.2017

Created EC2 instances and configured auto-scaling. Designed and developed ETL jobs to extract data from AWS S3 and load it into Amazon Redshift
Maintaining the database and loading tables from the MySQL database
Performed exploratory data analysis in Python using Pandas
Wrote SQL queries to identify revenue-generating customers, response rates, and other requirements
Building web apps using Streamlit and deploying them to production on Heroku
Actively involved in finalizing client requirements, designing solutions, coordinating with the Testing and QA teams, deploying, and interacting with clients to resolve their issues

Operational Manager

Khlonitrix soft pvt Ltd

Hyderabad

01.2013 - 01.2014

Installed and configured software
Designed and managed software projects & websites for clients.