SANTHOSH KUMAR PALWAI

LIVERPOOL

Summary

Seasoned Data Scientist with a background in designing and implementing data-driven solutions for a range of business challenges. Experience includes predictive modeling, artificial intelligence, machine learning, and big data technologies. Strengths include analytical thinking, problem-solving, and the ability to translate complex data into actionable insights. Previous work has resulted in significant improvements in decision-making processes and business performance.

Overview

6 years of professional experience

Work History

Data Scientist

Department for Work and Pensions
Liverpool
01.2023 - Current
  • Engineered robust data ingestion and augmentation pipelines with Python libraries like Pandas, NumPy, and OpenCV, employing techniques such as rotation, flipping, zooming, and brightness adjustments to process diverse vehicle images under varying conditions.
  • Enhanced model performance through systematic hyperparameter tuning with Keras Tuner, integrating advanced strategies like dropout, batch normalization, and adaptive learning rate scheduling to maximize accuracy and efficiency.
  • Leveraged advanced NLP methodologies, integrating web scraping with BeautifulSoup and text preprocessing with NLTK to extract, clean, and analyze large volumes of customer review data for actionable retail insights.
  • Developed custom Named Entity Recognition (NER) models using spaCy and implemented topic modeling techniques (LDA, NMF) to extract critical entities and uncover hidden patterns in unstructured text, supporting data-driven decision-making.
  • Orchestrated scalable deployment using Docker and Kubernetes, coupled with real-time performance monitoring via TensorBoard and dynamic visualizations with Matplotlib and Seaborn, ensuring production-grade solutions and strategic operational insights.

Data Scientist

Turing
Hyderabad
07.2019 - 09.2021
  • Standardized data across extensive resources like PCR and PVR portals for precise model development applications.
  • Engineered machine learning models to classify paragraphs and extract key lessons from evaluation reports using RAG techniques, enhancing insight accuracy.
  • Integrated and fine-tuned large language models (LLMs) for domain-specific lesson classification, topic modeling, and Q&A generation, employing prompt engineering and few-shot learning to minimize hallucinations.
  • Designed and optimized semantic search and retrieval mechanisms for accurate document chunking and context-aware response generation.
  • Developed a U-Net-based segmentation model to precisely localize car damage regions, streamlining automated vehicle inspection processes.
  • Applied image augmentation techniques (rotation, flipping, zooming, and brightness adjustments) to improve model generalization and prevent overfitting in damage detection.
  • Optimized deep learning models through hyperparameter tuning (learning rate, batch size, layer configurations) to boost overall accuracy and efficiency.
  • Constructed efficient data pipelines for seamless integration between preprocessing, model training, and validation across multiple projects.
  • Leveraged NLP and text-mining techniques to analyze customer reviews for sentiment classification and named entity extraction, generating actionable retail insights.
  • Collaborated with cross-functional teams and utilized Agile best practices to align AI-driven solutions with strategic business objectives, supporting data-driven decision-making.

Education

MBA - International Business Management

Heriot-Watt University
Edinburgh, Scotland
12-2022

Bachelor of Engineering - Information Technology

Osmania University
Hyderabad, India
05-2019

Skills

  • Programming Languages: Python, R, Scala, PySpark, Go, Node.js
  • Data Analysis & Visualization: Pandas, NumPy, Matplotlib, Seaborn, Dash, Excel, Power BI, Tableau
  • Databases: SQL Server, PostgreSQL, MySQL, MongoDB, Neo4j
  • Machine Learning Frameworks: scikit-learn, PyCaret, H2O.ai, Azure ML, Statsmodels
  • Deep Learning Frameworks and Models: TensorFlow, PyTorch, Keras, Transformers, CNN architectures, U-Net
  • Natural Language Processing (NLP) and Web Scraping: NLTK, spaCy, BeautifulSoup
  • LLM Integration and Generative AI: LangChain, LlamaIndex, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) systems, prompt engineering, and few-shot learning
  • Computer Vision & Image Processing: OpenCV (cv2), image augmentation techniques
  • Big Data & Distributed Processing: PySpark
  • Deployment, Orchestration & Monitoring: Docker, Kubernetes, TensorBoard
  • Agile methodologies and data pipelines: Sprint planning, iterative model development, and robust data pipeline construction for end-to-end model lifecycle management

Timeline

Data Scientist

Department for Work and Pensions
01.2023 - Current

Data Scientist

Turing
07.2019 - 09.2021

MBA - International Business Management

Heriot-Watt University

Bachelor of Engineering - Information Technology

Osmania University