KANISHK SAXENA

London

Summary

MSc Data Science graduate from the University of Glasgow with 5 years of experience as a Software Developer and Data Engineer at multinational companies. Committed, self-motivated, and enthusiastic professional who contributes to team success through hard work and strong technical, analytical, and communication skills. Experienced in working with third-party vendors, stakeholders, and cross-functional teams to extract valuable insights. Passionate about continuous learning and professional growth, and about staying current in the fast-moving field of data science.

Overview

5 years of professional experience

Work History

Associate Technology L2

Publicis Sapient

07.2021 - 07.2022

Total Wine & More

Worked on a client project enabling e-commerce support for customers to buy wines and beverages conveniently from home. The platform provided extensive options for beverage selection and offered fulfilment choices such as in-store shopping, curbside pickup, and home delivery.

Built and extended TWM with SAP's Hybris e-commerce suite. Collaborated with the team to develop and manage the Content Management System (CMS). Modularized the project, focusing on the Cart Module, order handling, basket updates, and tax calculations, with unit testing of each module. Also contributed to technical solutioning, sprint planning, estimation, and retrospectives.

Data Engineer

Infoobjects

01.2020 - 06.2021

Worked on Scheduler, a microservice that extended the functionality of Apache Airflow to orchestrate and manage the transmission of data from various source banks to their destination through an ETL pipeline.
Used SQLAlchemy as the Object-Relational Mapping (ORM) layer, GraphQL, and AWS services including EC2 for scalability, S3 for storage, Aurora, and EKS for Kubernetes. Implemented authorization on the project (JWT, and Facebook using OAuth2). Used AWS CloudFront to increase security by converting raw links into secured links that restrict access.

Systems Engineer

Infosys

07.2018 - 01.2020

Data Migration

A Spring Batch project divided into three main jobs: process the data, manipulate the data, and persist the data in another database. Migrated data for customers enrolled under banks from an MS Access database and CSV files to a UDB database. Developed mainly using Java, Spring, and Hibernate.

System Engineer Trainee

Infosys

01.2018 - 07.2018

Monochrome-to-Color Image Conversion System

Conducted extensive research and development on tools and deep learning technologies for colorizing black-and-white images. Used a Convolutional Neural Network (CNN) architecture with different sets of layers (pooling, linear, and fully connected layers). Implemented transfer learning to improve feature extraction.

Education

B.Tech - Computer Science And Engineering

SRM University, India

04.2001 -

MSc - Data Science

University of Glasgow, Scotland, UK

04.2001 -

Skills

JavaScript, ReactJS, HTML, CSS, Bootstrap, Spring Boot, Spring MVC

RoBERTa, Web crawling, Apache Spark, Predictive Modelling, Risk Assessment

NLP, Data Science, Machine Learning, Regression, Clustering, Deep Neural Networks, Convolutional Neural Networks

MySQL Server, SQLAlchemy, PostgreSQL, Kubernetes, AWS S3

GitHub, BitBucket, GitLab, Scrum/Agile Methodology, OOP Concepts

JIRA, Jenkins, Docker, OOD Principles, Maven, Microservices

Flask, GraphQL, Airflow, Swing, JDBC, Hibernate

Timeline

Associate Technology L2 - Publicis Sapient

07.2021 - 07.2022

Data Engineer - Infoobjects

01.2020 - 06.2021

Systems Engineer - Infosys

07.2018 - 01.2020

System Engineer Trainee - Infosys

01.2018 - 07.2018

SRM University - B.Tech, Computer Science And Engineering

04.2001 -

University of Glasgow - MSc, Data Science

04.2001 -

Languages

Hindi
English

ML | AI Projects

Predictive Modelling; Risk Assessment | Quadratic Weighted Kappa, Dense Neural Networks, Ensemble Methods

Aimed to categorize individuals into risk levels based on personal and health data by combining traditional machine learning models with neural networks in a multi-tier stacking and ensemble pipeline. Implemented binary classification, multi-class classification, and regression tasks using neural networks and traditional models. Used the Quadratic Weighted Kappa metric for evaluation and optimized the model's performance by applying offsets calculated with the fmin Powell method.

Topic Modelling | Python, NLP, nltk, sklearn, SciPy, spaCy, Seaborn, pyLDAvis

Analyzed tweet data for short-text topic modelling and performed long-text topic modelling by clustering the same tweets using a single-pass approach. Generated topics for both datasets and conducted a comparative performance analysis between the two models.

Big Data analysis | Java, Apache Spark

Developed a batch-based text search and filtering pipeline in Apache Spark. The pipeline processes a large collection of text documents and a set of user-defined queries and retrieves the top 10 documents. This was achieved by implementing tokenization, applying a ranking model, and resolving duplicates based on textual distance between document titles (using a provided comparison function). Processed 10 GB of data in 50 seconds.

Deep learning Cancer Cell Classification | CNN, Ray parameter tuning, torchvision, Captum

Trained and compared two deep neural networks that take 100x100 pixel images of cell nuclei and classify whether the tissue is normal or cancerous. One approach was to segment the cell nuclei out of these images and classify them into different cell types. Compared two models: a small custom ConvNet, and an existing torchvision model pre-trained on ImageNet and further trained on the nucleus data, essentially applying transfer learning.

Text Processing Techniques and Their Applications | NLP, HuggingFace, RoBERTa, Transformers

Text-as-Data project to build and evaluate the performance of different classifiers, including a dummy classifier, SVC, random forest, and logistic regression. Implemented k-means clustering and built a BERT-style deep learning approach that uses HuggingFace's 'feature-extraction' pipeline with the 'roberta-base' model to encode text documents for improved classification accuracy.

Projects

https://github.com/KanishkSaxena/data-analysis/tree/main

