
Researcher and software engineer with a PhD in NLP, working at the intersection of research and systems building. Focused on evaluating and improving the interpretability of transformer-based large language models, using an extensible diagnostic framework I developed during my PhD. Experienced in researching and building end-to-end data pipelines for large-scale multilingual text, as well as interactive dashboards that make research outputs usable by clients. I have also evaluated the robustness of customer-service chatbots by developing semi-automated LLM-as-a-judge evaluation workflows that support faster iteration and production monitoring. Overall, my focus is on translating research into practical tools for clients and product teams.
Programming & Tools
Python, SQL, Hugging Face, Transformer-based models, API-based LLMs, Docker
Machine Learning & NLP
Natural Language Processing, Multilingual NLP, Representation Learning, Transformer Models, Named Entity Recognition, Text Classification, Machine Translation, Natural Language Inference, Zero-shot Learning
Model Evaluation & Analysis
Model Evaluation, Diagnostic Analysis, Representation Analysis, Robustness Testing, Error Analysis, Interpretability, LLM-as-a-Judge Evaluation
Data & Pipelines
Large-scale Text Processing, Semantic Mapping, Clustering, Embedding-based Retrieval, Data Annotation Workflows, Reproducible Pipelines
Visualisation & Reporting
Interactive Dashboards, Analytical Reporting, Visual Analytics