Pejvak Moghimi

Machine Learning Research Scientist
London

Summary

I am a Machine Learning Research Scientist with 5+ years of experience in applied deep learning and 7+ years developing data science pipelines in Python. I developed a high-performance data science pipeline for processing billions of genomic sequences; reimplemented and enhanced existing analytical software, improving memory efficiency and speed 1,000-fold; and created deployable deep neural network models that achieved state-of-the-art results while adhering to responsible-AI principles. I was awarded the prestigious Wellcome Trust ISSF fellowship to continue high-impact PhD research. I value collaboration and knowledge sharing, and take the initiative in community building.

Overview

8 years of professional experience
9 years of post-secondary education

Work History

Postdoctoral Fellow

Wellcome Sanger Institute
Hinxton, England
10.2022 - 02.2023
  • Computational/ML lead of the Prenatal Skin Cell Atlas project within the Human Cell Atlas global consortium for 3D mapping of the human body at single-cell resolution.
  • Discovered the signalling and gene regulatory network for angiogenesis in prenatal skin and the top 25 relevant genes.
  • Troubleshot, refactored and reproduced the results of >10,000 lines of code in 3 weeks to generate 50 publication-ready figures.
  • Analysed large spatial and single-cell genomics data.
  • Made significant contributions to the research and the article being submitted to Science.
  • Lead contributor to the team's GitHub repository: created a clean, reproducible and well-documented repository with an ML-based interactive graphical map of this large project.
  • Established groundwork for reproducible data science.
  • Updated department leads on research findings and reported notable shortfalls and advances on a weekly basis.
  • Collaborated with biology domain experts to advance research and gain deeper understanding of topics.
  • Reviewed and analysed historical and emerging literature to develop and practice new technologies.
  • Contributed to and influenced discussions, seminars and lectures at Human Cell Atlas projects' events.

ISSF Postdoctoral Research Fellow

Institute of Molecular Biology - Birkbeck University
London, England
04.2022 - 10.2022
  • Awarded ISMB fellowship for potential to continue delivering high-impact research in machine learning following successful demonstration of PhD results.
  • Developed transformer-based LLM-type models of immune receptor molecules to augment NLP-based deep noisy long-tailed regression modelling of population-wide antibody commonality.
  • Implemented self-supervised and transfer learning approaches for improving downstream supervised deep regression modelling.
  • Led 4, and actively contributed to 1, open-source GitHub projects.
  • Optimized previously published algorithms, such as Sumrep, and developed a highly efficient graph-theoretical Python package capable of processing hundreds of millions of sequences on personal laptops.
  • Adhered to responsible-AI principles to create a model ready for clinical use.
  • Achieved the most granular model of antibody commonality, as well as state-of-the-art predictive performance, with my deep conformal regressor model, which quantifies epistemic and aleatoric uncertainty.

Teaching Assistant

Birkbeck University
London
10.2017 - 01.2019
  • Tutored 30 students on 3 modules of the “Bioinformatics with Systems Biology” MSc degree.
  • Statistics: tutored 13 practical sessions on the highlighted topics of resampling and subsampling techniques, statistics with R, and unsupervised ML.
  • Genomics: tutored 13 practical sessions on the highlighted topics of divide-and-conquer algorithms, dynamic programming and Hidden Markov Models.
  • BioComputing with Python: tutored 13 practical sessions on many computer science and Python programming topics.

Computational Biologist

University of York
York
06.2017 - 09.2017

Following the success of my ongoing computational genomics research in the group, Prof. Thomas hired me to develop the function-discovery pipeline further (see Integrated Masters - research). I expanded the Python-based database-querying pipeline I had developed and used it to explore the genomes of Gram-positive bacteria, discovering exporters of novel natural antibiotic compounds that could serve as potential drug targets.

BBSRC Summer Internship

University of York
York
06.2016 - 09.2016

My first introduction to Python programming, software development and data science. I completed multiple online courses on these topics from world-class universities, such as MIT and Harvard, while receiving hands-on training and supervision from computer scientists and senior researchers in my group, as well as the head of the Genomics department.

I automated my phylogenomics pipeline, developed during my previous internship, in Python, which allowed me to access remote databases more readily and extend the pipeline to much larger and broader datasets. Notably, as part of an inter-departmental collaboration, I used this pipeline to analyse ABC transporters, among other more challenging transporter families, which resulted in the successful identification of a broader range of drug targets.

Harry Smith Vacation Internship

University of York
York
06.2015 - 06.2015

First introduction to computer science, algorithms, computational biology and genomics, frequentist vs. Bayesian statistical methods, statistical learning and mathematical inference.

Particularly focused on learning the foundations of the following within the context of the mathematical and statistical underpinnings of phylogenetic methods: optimisation algorithms, Monte Carlo sampling, maximum-likelihood estimation, Hidden Markov Models, MCMC algorithms, Bayesian inference, dynamic programming, traversal algorithms, search algorithms and multiple sequence alignment algorithms.

Extracted sequences from the BLAST database and developed a rigorous phylogenetics pipeline for inferring functional relationships among bacterial transporter molecules for drug target identification.

Education

Ph.D. - Applied Deep Learning

Institute of Molecular Biology - Birkbeck
London
09.2017 - 03.2022

Integrated Master’s (MBiol) - Computational Biology

University of York
York
09.2013 - 09.2017

Skills

Greenfield ML/DL models, HPC, Deep Learning, Data Science, Computational Biology

Software

Python, TensorFlow, Keras, PySpark, Dask, Modin, Polars, NumPy, SciPy, Pandas, scikit-learn, Graph-tool, NetworkX, Jupyter, Plotly, Matplotlib, Biopython

Bash, PyTorch, PyTorch Lightning, JAX, Powerlaw, MrBayes, BLAST, Vaex, RAPIDS

Cython, R, Bioconductor, Bokeh, Flax, SQL, CUDA

Interests

NLP, CV, LLM, Zero-shot Learning, Generative Learning, Supervised Learning, Self-Supervised Learning, Semi-Supervised Learning, Transformers, Deep Long-Tailed Learning, Deep Noisy Learning, Forecasting, Regression, Classification, VAE, GAN, Graph Neural Networks, Diffusion Models, Reinforcement Learning

Accomplishments

  • National-Level Biology Olympiad.
  • 2015 Harry Smith Vacation Studentship Award (12 weeks, £1,600).
  • 2016 BBSRC Summer Studentship Award (15 weeks, £2,600).
  • Master's Degree with Distinction.
  • Successful Completion of a Fully-Funded PhD.
  • 2022 ISSF Research Fellowship Award (6 months, £25,000).
  • Publications in High-Impact Peer-Reviewed Journals.
  • Founder of the LondonBioML society, pitched in the successful LiDO renewal grant (£70 million).
  • Oral presentations at ImmRep22 and other conferences.

Conferences

  • Microbiology Society (2016, Poster presentation): Presented the findings of the first version of my phylogenomics pipeline for drug-target discovery across MFS transporters of pathogenic strains of Salmonella. I showed that biological databases, such as BLAST, do not necessarily identify possible drug targets as homologs of known drug targets; however, integrating methods such as mine with these databases can help.
  • Microbiology Society (2017, Poster presentation): Presented the findings of the second version of my phylogenomics pipeline for pathogen-specific drug-target discovery, applied to a much wider range of molecules to further demonstrate the proof of principle to domain experts.
  • ImmRep22 (2022, Oral presentation): Presented the results of my deep noisy long-tailed regression modelling of population-wide antibody commonality to the foremost computational immunology experts in the world.

Publications

  • Olson BJ, Moghimi P, Schramm CA, Obraztsova A, Ralph D, Vander Heiden JA, Shugay M, Shepherd AJ, Lees W and Matsen FA IV (2019) "Sumrep: A Summary Statistic Framework for Immune Receptor Repertoire Comparison and Model Validation." Front. Immunol. 10:2533. doi: 10.3389/fimmu.2019.02533 (IF: 6.43).
  • Zareian N, Eremin O, Pandha H, Baird R, Verma C, Eremin J, Choy D, Hargreaves S, Moghimi P, Shepherd A, Lobo D, Eremin J, Kordasti S, Farzaneh F and Spicer J. (2022) "Phase 1 trial of a human telomerase reverse transcriptase (hTERT) vaccination strategy addressing T effector cells and immune-suppressor mechanisms." (Accepted for publication at NPJ Vaccines, IF: 9.4).
  • Moghimi P, Shepherd AJ (2023) "Deep noisy long-tailed regression modelling of a novel population-wide measure of antibody commonality." (Manuscript plus the open-source GitHub repository in preparation for submission to Nature Machine Intelligence, IF: 25.9).
  • Moghimi P (2023) "ParaGraphPy: An embarrassingly parallel cache-coherent Python package for pairwise-edit-distance network reconstruction and graph-theoretic metrics calculations of massive sequence data." (Manuscript plus the open-source GitHub repository in preparation for submission to Nature Methods - Brief Communications, IF:48.0).
  • Moghimi P (2023) "SummarGen: A High-Performance computing Python package for statistical summarisation of massive genomic sequence data." (In preparation for submission to Bioinformatics, IF:6.94).
  • Gopee H, Moghimi P, et al. (2023) "Prenatal human skin atlas reveals key insights into the regulation of morphogenesis and immunity." (in preparation for submission to Science, IF:63.7).
