Supratik Datta

Lead AI Engineer/Architect

Summary

Highly accomplished and results-oriented Lead AI Engineer with over 13 years of progressive experience in designing, developing, and deploying cutting-edge AI/ML solutions. Experienced in leading cross-functional teams, driving innovation, and delivering tangible business value. Demonstrated success in architecting and implementing scalable AI/ML pipelines, optimizing performance, and migrating legacy systems to modern cloud-based platforms such as GCP, AWS, and Azure. Proficient in a wide range of AI/ML technologies, including Generative AI, Agentic AI, and MLOps practices.

Most recently ideated and built, using frameworks such as LangGraph and Pydantic AI, a robust and scalable agent (the first in the bank) in the IB division of Deutsche Bank, capable of understanding and responding to diverse modalities.

Overview

14 years of professional experience
3 languages

Work History

Assistant Vice President, Lead Data and AI Engineer

Deutsche Bank
11.2022 - Current
  • Led a 15+ member cross-functional team (internals and vendors) to conceptualize and deploy "dbSherpa," the bank's first serverless agentic copilot on GCP for the CIB Trade Surveillance compliance department.
  • Delivered immense business and industry impact, resulting in a 70% reduction in investigation analyst man-hours across three critical workflows (FX, FRO, FI-Wash). The dbSherpa product won two major industry awards (Innovation World Cup and Best Cloud Project of the Year at the TDI Google Annual Awards) and was featured in Bloomberg.
  • Engineered an innovative workflow automation system that captures analyst click-stream actions to automatically generate Standard Operating Procedures (SOPs) and process maps, which are then used to dynamically build and configure targeted agentic behaviors.
  • Architected robust multimodal NL-Query Agents capable of interacting with structured (Oracle/BigQuery), unstructured (documents with images), and semi-structured (Solr) data to generate diverse outputs including text, document-based images, graphs, audio, and PDF reports.
  • Designed advanced agentic architectures, conceptualizing specific "Insight" and "Scenario" agents within dbSherpa. Utilized LangChain for advanced context engineering, building complex scenarios as discrete agent skills.
  • Architected intelligent dynamic model-switching (routing) pipelines to optimize latency and compute costs based on context complexity. Leveraged open-weights models hosted internally via Ollama, routing simpler, high-volume tasks to faster (flash) small language models (SLMs) while reserving high-parameter reasoning models (Pro/Thinking) for complex, multi-step analytical workflows.
  • Implemented robust memory solutions, both custom-built and based on mem0, to strengthen context engineering and improve the agent harness.
  • Utilized "nanoclaw" to create specialized mini-agents, optimizing memory management and significantly reducing storage space requirements to enhance systemic reasoning and efficiency, leveraging open-source models such as GLM 5 and Kimi K2.5 via llama.cpp and Ollama.
  • Developed Model Context Protocol (MCP) servers to seamlessly integrate live market news (Bloomberg, Reuters) and real-time communications data.
  • Built a low-latency, serverless GCP Cloud Run environment from scratch, collaborating with DevOps to establish CI/CD pipelines, dockerize all agents, and manage comprehensive GCP service provisioning, including API/IP whitelisting. Managed a dedicated GKE cluster with GPU provisioning to support the internal model hosting.
  • Drove rigorous LLMOps and performance optimization by implementing comprehensive Evaluation (Evals) frameworks. Optimized retrieval strategies using Agentic RAG and targeted model fine-tuning.
  • Navigated complex enterprise governance by securing full approvals from the Model Risk Management (MoRM) board, Architecture Review Board, central AI team, and SDO for the novel agentic design. Implemented robust guardrails and strict memory context management to ensure deterministic outputs and compliance.
  • Championed cross-functional integration, working closely with UX/UI teams to seamlessly embed the AI application into a revamped Trade Surveillance dashboard, and regularly presenting strategic solutions to Department Heads and CTOs to showcase the transformational value of Agentic AI.
  • Conducted POCs to evaluate and implement next-generation trade surveillance systems using advanced AI techniques, with both code-based frameworks (ADK, LangGraph, CrewAI, Pydantic AI) and no-code tools (n8n, make.com).

Data and AI Manager

British Telecom UK
06.2021 - 11.2022
  • Led the design and implementation of the migration of data and assets from AWS to GCP, architecting a comprehensive and scalable solution
  • Established and enforced data engineering best practices, ensuring consistency, quality, and maintainability of data pipelines
  • Built and optimized AWS Glue Jobs, addressing performance bottlenecks and ensuring data integrity
  • Created and managed infrastructure artifacts using Terraform for both AWS and GCP environments
  • Developed and implemented streaming applications on AWS Kinesis and replicated them using Cloud Pub/Sub in GCP, ensuring seamless data flow during migration
  • Translated and migrated managed workflow jobs from AWS to Airflow on GCP, Glue jobs to Databricks, and Athena queries to BigQuery
  • Explored and implemented hybrid cloud solutions to bridge the gap between AWS and GCP during the migration process
  • Implemented elements of GCP MLOps, leveraging Vertex AI for model training and deployment and Cloud Build for CI/CD pipelines for ML models

Senior Data/AI Engineer

Fractal
08.2017 - 06.2021

Sky UK Ltd

  • (Oct 2019 – Jun 2021): Architected and built data marts and warehouses on GCP, migrating from SAP IQ and implementing robust data pipelines in Airflow
  • Developed an ingestion framework to capture and process data from multiple sources with varying frequencies
  • Implemented GCP MLOps practices, including versioning models in the Vertex AI Model Registry and using Kubeflow pipelines for ML workflows
  • Set up continuous deployment to Google Kubernetes Engine (GKE) using Jenkins and GCP Cloud Build.

Mars UK Ltd

  • (Mar 2018 – Oct 2019): Led the development of a scalable recommendation system in Azure using Azure ML, Databricks, and Power BI
  • Implemented Azure MLOps, leveraging Azure Machine Learning Service for model management and deployment, and Azure DevOps for CI/CD
  • Employed a variety of ML techniques, including Random Forest, K-means clustering, Holt-Winters time series forecasting, and Matrix Factorization
  • Optimized ML models using Spark MLlib for feature selection and clustering, and designed a Datastore in Hive connected with Power BI

Mondelez Inc

  • (Aug 2017 – Feb 2018): Architected an advanced analytics platform on AWS, migrating datasets from Hadoop to S3 and Redshift
  • Designed and implemented Redshift tables and optimized data ingestion and transformation processes
  • Orchestrated CI/CD pipelines using GitLab and Jenkins

Big Data/Machine Learning Lead Developer

Infosys Ltd
09.2015 - 08.2017

Anthem Inc

  • (Aug 2015 – Aug 2017): Migrated SAS applications to Scala-based Spark jobs, designing data lake solutions using Apache Kudu
  • Implemented ETL workflows and predictive models using Python and Spark MLlib
  • Ported existing ML code from R to Python and scaled it using Spark ML
  • Built and optimized predictive models using Python libraries (pandas, scikit-learn, numpy), performing exploratory analysis and feature engineering
  • Implemented ensemble methods such as GBM and XGBoost, and used Airflow for orchestration

Big Data Developer

Tech Mahindra Ltd
05.2012 - 08.2015

AT&T

  • (Aug 2014 – Jun 2015): Developed data processing pipelines for prospective client data, implementing real-time streaming using Spark and Kafka
  • Automated processing using Azkaban and Docker, and worked on 7-node (test) and 23-node (production) Hadoop clusters

British Telecom Italia

  • (May 2012 – Aug 2014): Extracted data from Oracle 11g to HDFS using Sqoop, creating and optimizing Sqoop jobs for incremental loads
  • Developed Oozie workflows for ETL and managed the Hadoop cluster using Cloudera Manager

Education

Bachelor of Technology - Electrical And Electronics Engineering

Camellia Institute of Technology
Kolkata, India
04.2001 -

Skills

Generative AI

Agentic AI (LangChain, LangGraph, Pydantic AI, pgvector)

Scikit-learn

Spark MLlib

TensorFlow

PyTorch

Kubeflow pipelines

MLflow

SageMaker

Azure ML

Vertex AI MLOps

Azure MLOps

CI/CD for ML models

GCP (Dataproc, BigQuery, Cloud Composer, Vertex AI)

AWS (EMR, Glue, Athena)

Azure (Databricks, Data Lake)

Hadoop

Hive

Spark

Kafka

Python

Scala

SQL

Java

Shell Scripting

Cloud Build

Jenkins

GitLab

Docker

Kubernetes

Terraform

Oracle

Teradata

Redshift

BigQuery

Disclaimer

I, Supratik Datta, declare that the above information is true and accurate to the best of my knowledge.

Personal Information

  • Date of Birth: 07/23/89
  • Nationality: Indian
