KANISHK SAXENA

London

Summary

Master's graduate in Data Science from the University of Glasgow with 5 years of experience as a Software Developer and Data Engineer at multinational companies. Committed, self-motivated, and enthusiastic professional who contributes to team success through hard work and strong technical, analytical, and communication skills. Experienced in engaging with third-party vendors, stakeholders, and cross-functional teams to extract valuable insights. Dedicated to staying at the forefront of data science, with a passion for continuous learning and professional growth.

Overview

4 years of professional experience

Work History

Associate Technology L2

Publicis Sapient
07.2021 - 07.2022

Total Wine & More


Worked on a client project enabling e-commerce so that customers can buy wines and beverages conveniently from home. The platform offered an extensive beverage selection and fulfilment options such as in-store shopping, curbside pickup, and home delivery.

  • Built and extended TWM with SAP's Hybris e-commerce suite. Collaborated with the team to develop and manage the Content Management System (CMS). Modularized the project, focusing on the Cart Module, order handling, basket updates, and tax calculations, with unit testing of the modules. Also contributed to technical solutioning, sprint planning, estimation, and retrospectives.

Data Engineer

Infoobjects
01.2020 - 06.2021
  • Worked on Scheduler, a microservice that extended Apache Airflow to orchestrate and manage the transmission of data from various source banks to their destination via an ETL pipeline.
  • Used SQLAlchemy as the object-relational mapper (ORM), GraphQL, and AWS services including EC2 for scalability, S3 for storage, Aurora, and EKS for Kubernetes. Implemented authorization on the project (JWT, Facebook login using OAuth2). Used AWS CloudFront for increased security, converting raw links to secured links to restrict access.

Systems Engineer

Infosys
07.2018 - 01.2020

Data Migration

  • A Spring Batch project divided into three main jobs: processing the data, manipulating the data, and persisting the data in another database. Migrated the data of customers enrolled under banks from an MS Access database and from CSV files to a UDB database. Developed mainly using Java, Spring, and Hibernate.

System Engineer Trainee

Infosys
01.2018 - 07.2018

Monochrome-to-Color Image Conversion System

  • Conducted extensive research and development on tools and deep learning technologies for colorizing black-and-white images. Used a convolutional neural network (CNN) architecture with different sets of layers (pooling, linear, and fully connected layers). Implemented transfer learning to improve feature extraction.

Education

B.Tech - Computer Science And Engineering

SRM University, India
04.2001 -

MSc - Data Science

University of Glasgow, Scotland, UK
04.2001 -

Skills

    JavaScript, ReactJS, HTML, CSS, Bootstrap, Spring Boot, Spring MVC

Timeline

Associate Technology L2 - Publicis Sapient
07.2021 - 07.2022
Data Engineer - Infoobjects
01.2020 - 06.2021
Systems Engineer - Infosys
07.2018 - 01.2020
System Engineer Trainee - Infosys
01.2018 - 07.2018
SRM University - B.Tech, Computer Science And Engineering
04.2001 -
University of Glasgow - MSc, Data Science
04.2001 -

Languages

  • Hindi
  • English

ML | AI Projects

Predictive Modelling; Risk Assessment | Quadratic Weighted Kappa, Dense Neural Networks, Ensemble Methods

  • Aimed to categorize individuals into risk levels based on personal and health data, combining traditional machine learning models with neural networks in a multi-tier stacking and ensemble pipeline. Implemented binary classification, multi-class classification, and regression tasks using neural networks and traditional models. Used the Quadratic Weighted Kappa metric for evaluation and improved the model's performance by applying an offset calculated with the fmin Powell method.

Topic Modelling | Python, NLP, nltk, sklearn, SciPy, spaCy, Seaborn, pyLDAvis

  • Analyzed the data for short-text topic modelling and performed long-text topic modelling by clustering the same tweets using a single-pass approach. Generated topics for both datasets and conducted a comparative performance analysis of the two models.

Big Data analysis | Java, Apache Spark

  • Developed a batch-based text search and filtering pipeline in Apache Spark. The pipeline processes a large collection of text documents and a set of user-defined queries, and retrieves the top 10 documents. This was achieved by implementing tokenization, applying a ranking model, and resolving duplicates based on the textual distance between document titles (using a provided comparison function). Processed 10 GB of data in 50 seconds.

Deep learning Cancer Cell Classification | CNN, Ray parameter tuning, torchvision, Captum

  • Trained and compared two deep neural networks that take 100x100-pixel images of cell nuclei and classify whether the tissue is normal or cancerous. One approach was to segment out the nuclei of cells from these images and classify them into different cell types. Compared two models: a small custom ConvNet, and an existing torchvision model that was pre-trained on ImageNet and further trained on the nucleus data, essentially using transfer learning.

Text processing techniques and their applications | NLP, HuggingFace, RoBERTa, Transformers

  • Text-as-Data project to build and evaluate the performance of different classifiers, including a dummy classifier, SVC, random forest, and logistic regression. Implemented k-means clustering and built a BERT-based deep learning approach, using HuggingFace's 'feature-extraction' pipeline with the 'roberta_base' model to encode text documents for improved classification accuracy.

Projects

https://github.com/KanishkSaxena/data-analysis/tree/main
