Mohamed Ragab

Data Engineer & Research Fellow (Post-Doctoral)
Southampton

Summary

Detail-oriented team player with extensive experience in distributed systems and processing large knowledge graphs. Specialized in data engineering, particularly in translating SPARQL workloads to SQL. Positioned to make significant contributions to graph analytics and database projects. Strong technical proficiency ensures precision and innovation in complex data environments.

Overview

11 years of professional experience
7 Certifications
3 Languages

Work History

Postdoctoral Research Fellow

Southampton University
02.2023 - Current
  • Working on a research project to develop decentralized search capabilities on
    personal online data stores (pods) within the Solid framework.
  • Developing various RDF indexes, search algorithms, and query routing mechanisms to enable the decentralized query system.
  • Developed a fully automated pipeline for managing Solid services and
    IBM GaianDB services on a cluster of 50 VMs.

Knowledge Engineer

Southampton University (Uniworkforce)
06.2023 - 08.2023
  • Integrated maritime data from various data sources (CSV, XML, and JSON) into a unified knowledge graph. This provides a holistic view and a clean single source of truth for querying maritime data.

Data Engineer

Tartu University
07.2022 - 01.2023
  • Developed a CI/CD pipeline for building an API gateway that integrates various
    explainable AI microservices (written in Python and Java).
  • Skills: Kong API Gateway · Ansible · OpenAPI · Amazon EC2 · FastAPI · GitHub Actions.

Junior Research Fellow

Tartu University (Big Data Systems Group)
09.2018 - 01.2023
  • Project 1: Managing big graph datasets over Apache Spark-SQL. Converted RDF graph data to relational schemas, employed various partitioning techniques
    on an HDFS cluster, and managed a Hive data warehouse for extensive querying. Skills: Spark-SQL, HDFS, Scala, SPARQL & RDF. This research project resulted in 12 publications.
  • Project 2: "Minaret", a reviewer recommendation tool for scientific
    journals.
    Implemented Python web scraping to gather scholars’ data from contemporary
    academic platforms, and built scholar profiles in MongoDB. Skills: Django, MongoDB, Python web-scraping tools. This research project resulted in 1 publication.

Data Engineer

Ominva
03.2022 - 10.2022
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Analyzed complex data and identified anomalies, trends, and risks to provide useful insights to improve internal controls.
  • Automated 25 routine tasks using Python scripts, increasing team productivity and reducing manual errors.
  • Developed database architectural strategies at modeling, design, and implementation stages to address business or industry requirements.

Data Engineer & Researcher at SWRG Group

Semantic Web Research Group
09.2015 - 03.2018
  • Worked on a project for managing trust over academic social networks, building a
    data warehouse (DWH) that integrates data about scholars by querying several public academic APIs
    and datasets.
  • Skills: MySQL, Python, SPARQL.

Web Developer

ITIS
09.2013 - 03.2015
  • Coded websites using HTML, CSS, JavaScript, and jQuery.
  • Developed 3 healthcare websites using ASP.NET and C#. Skills: SQL, C#.
  • Determined coding requirements for specialized scripts.
  • Applied emerging technologies to update and maintain site applicability.

Education

Ph.D. - Computer Science

Tartu University
Tartu, Estonia
04.2001 -

Master of Science - Information Systems

Cairo University
Cairo, Egypt
04.2001 -

Bachelor of Science - Information Systems (Excellent With Honors)

Fayoum University
Fayoum, Egypt
04.2001 -

Skills

  • High-Level Languages: Python, Java, Scala

    Software

    Git

    Systems (Linux and macOS)

    Docker

    Apache Spark

    Apache Kafka

    HDFS

    Apache Hive

    Cloud (Azure, Amazon AWS)

    Certification

    IBM Data Engineering Coursera Specialization.

    Research experience

    • QSMS: Query streaming management system, Database group at Liris Lab. Nov. 2021 – Feb. 2022
      I worked on a project that aims to tame the high velocity of SPARQL query workloads, which requires a new paradigm of streaming applications. I developed a query management system for SPARQL query workloads on top of the RSP4J framework.
    • Bench-Ranking: Towards prescriptive analysis of big graph processing: the case of SparkSQL. Nov. 2019 – Sep. 2021
      In this project, we present a prescriptive analysis of the Apache Spark engine for processing large (RDF) graphs. The project led to several (13) publications in top-tier conferences and journals (CACM, SWJ, IEEE Big Data, DOLAP, ISWC, SBD, and others).
    • Stream Bakery: Towards a Streaming Linked Data Life-Cycle. Jan. 2020 – May 2020
      A collaborative research project between Tartu University and Politecnico di Milano. It describes the resources in detail and proposes a series of recipes for processing the published Linked Data streams. I processed the data in Apache Spark-SQL, and we published the work as a full paper at ISWC 2021.
    • Minaret: A framework for Scientific Reviewers Recommendation. Jul. 2019 – Oct. 2019
      I developed the MINARET framework, a recommendation framework that exploits the information available
      on modern scholarly websites to identify candidate reviewers relevant to the topic of a manuscript, exclude those with a potential conflict of interest, and rank the rest based on several configurable metrics. I published this work at EDBT 2019, Portugal.

    Research interests & Publications

    Big Data, Semantic Web, Graph Data Management. My publications are listed on my Google Scholar profile (https://scholar.google.com/citations?user=DFVt7CYAAAAJ&hl=en).

    Timeline

    IBM Data Engineering Coursera Specialization.

    08-2023

    Knowledge Engineer

    Southampton University (Uniworkforce)
    06.2023 - 08.2023

    Postdoctoral Research Fellow

    Southampton University
    02.2023 - Current

    IBM Continuous Integration and Continuous Delivery (CI/CD) specialization

    12-2022

    Data Engineer

    Tartu University
    07.2022 - 01.2023

    Big Data analytics with Apache Spark from Databricks academy.

    04-2022

    ETL and Data Pipelines with Shell, Airflow and Kafka

    04-2022

    Data Engineer

    Ominva
    03.2022 - 10.2022

    ETL and Data Pipelines with Shell, Airflow and Kafka

    03-2022

    Microsoft Azure for Data Engineering

    12-2021

    Junior Research Fellow

    Tartu University (Big Data Systems Group)
    09.2018 - 01.2023

    Data Engineer & Researcher at SWRG Group

    Semantic Web Research Group
    09.2015 - 03.2018

    Web Developer

    ITIS
    09.2013 - 03.2015

    ORACLE Database Administration Certificates.

    08-2013

    Ph.D. - Computer Science

    Tartu University
    04.2001 -

    Master of Science - Information Systems

    Cairo University
    04.2001 -

    Bachelor of Science - Information Systems (Excellent With Honors)

    Fayoum University
    04.2001 -