Mohamed Ragab

Data Engineer & Research Fellow (Post-Doctoral)

Southampton

Summary

Detail-oriented team player with extensive experience in distributed systems and processing large knowledge graphs. Specialized in data engineering, particularly in translating SPARQL workloads to SQL. Positioned to make significant contributions to graph analytics and database projects. Strong technical proficiency ensures precision and innovation in complex data environments.

Overview

years of professional experience

Certifications

Languages

Work History

Postdoctoral Research Fellow

Southampton University

02.2023 - Current

Working on a research project to develop decentralized search capabilities over personal online data stores (pods) within the Solid framework.
Developing RDF indexes, search algorithms, and query routing mechanisms that enable the decentralized query system.
Developed a fully automated pipeline for managing Solid services and IBM GaianDB services on a cluster of 50 VMs.

Knowledge Engineer

Southampton University (Uniworkforce)

06.2023 - 08.2023

Integrated maritime data from various sources (CSV, XML, and JSON) into a unified knowledge graph. This provides a holistic view and a clean single source of truth for querying maritime data.

Data Engineer

Tartu University

07.2022 - 01.2023

Developed a CI/CD pipeline for building an API gateway that integrates various explainable AI microservices (written in Python and Java).
Skills: Kong API Gateway · Ansible · OpenAPI · Amazon EC2 · FastAPI · GitHub Actions.

Junior Research Fellow

Tartu University (Big Data Systems Group)

09.2018 - 01.2023

Project 1: Managing big graph datasets over Apache Spark-SQL. Converted RDF graph data to relational schemas, applied various partitioning techniques on an HDFS cluster, and managed a Hive data warehouse for extensive querying. Skills: Spark-SQL, HDFS, Scala, SPARQL & RDF. This research project resulted in 12 publications.
Project 2: “Minaret”, a tool that recommends academic reviewers for scientific journals. Implemented Python web scraping to gather scholars’ data from contemporary academic platforms and built scholar profiles in MongoDB. Skills: Django, MongoDB, Python web-scraping tools. This research project resulted in 1 publication.

Data Engineer

Omniva

03.2022 - 10.2022

Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
Analyzed complex data and identified anomalies, trends, and risks to provide useful insights to improve internal controls.
Automated 25 routine tasks using Python scripts, increasing team productivity and reducing manual errors.
Developed database architectural strategies at modeling, design, and implementation stages to address business or industry requirements.

Data Engineer & Researcher at SWRG Group

Semantic Web Research Group

09.2015 - 03.2018

Worked on a project for managing trust over academic social networks, building a data warehouse (DWH) that integrates data about scholars by querying several public academic APIs and datasets.
Skills: MySQL, Python, SPARQL.

Web Developer

ITIS

09.2013 - 03.2015

Coded websites using HTML, CSS, JavaScript, and jQuery.
Worked on developing 3 healthcare websites using ASP.NET and C#. Skills: SQL, C#.
Determined coding requirements for specialized scripts.
Applied emerging technologies to update and maintain site applicability.

Education

Ph.D. - Computer Science

Tartu University

Tartu, Estonia

04.2001 -

Master of Science - Information Systems

Cairo University

Cairo, Egypt

04.2001 -

Bachelor of Science - Information Systems (Excellent With Honors)

Fayoum University

Fayoum, Egypt

04.2001 -

Skills

High-Level Languages: Python, Java, Scala

Low-Level & Scripting Languages: C, Bash

Databases: SQL databases (MySQL, PostgreSQL), SPARQL, NoSQL databases (MongoDB, Redis, and Neo4j)

Software

Git

Operating systems (Linux and macOS)

Docker

Apache Spark

Apache Kafka

HDFS

Apache Hive

Cloud (Azure, Amazon AWS)

Certification

IBM Data Engineering Coursera Specialization.

Research experience

QSMS: Query Streaming Management System, Database Group at LIRIS Lab. Nov. 2021 – Feb. 2022
I worked on a project that aims to tame the high velocity of SPARQL query workloads, which calls for a new paradigm of streaming applications. I developed a query management system for SPARQL query workloads on top of the RSP4J framework.
Bench-Ranking: Towards prescriptive analysis of big graph processing: the case of SparkSQL. Nov. 2019 – Sep. 2021
In this project, we present a prescriptive analysis of the Apache Spark engine for processing large RDF graphs. I published 13 papers from this project in top-tier conferences and journals (CACM, SWJ, IEEE Big Data, DOLAP, ISWC, SBD, and others).
Stream Bakery: Towards a Streaming Linked Data Life-Cycle. Jan. 2020 – May 2020
This is a collaborative research project between Tartu and Politecnico di Milano universities. It describes the resources in detail and proposes a series of recipes for processing published Linked Data streams. I processed the data in Apache Spark-SQL, and we published the work as a full paper at ISWC 2021.
Minaret: A framework for Scientific Reviewers Recommendation. Jul. 2019 – Oct. 2019
I developed MINARET, a recommendation framework that exploits the information available on modern scholarly websites to identify candidate reviewers relevant to the topic of a manuscript, exclude those with a potential conflict of interest, and rank the rest based on several configurable metrics. I published this work at EDBT 2019, Portugal.

Research interests & Publications

Big Data, Semantic Web, Graph Data Management. My publications are listed on my Google Scholar profile (https://scholar.google.com/citations?user=DFVt7CYAAAAJ&hl=en).

Timeline

IBM Data Engineering Coursera Specialization.

08-2023

Knowledge Engineer

Southampton University (Uniworkforce)

06.2023 - 08.2023

Postdoctoral Research Fellow

Southampton University

02.2023 - Current

IBM Continuous Integration and Continuous Delivery (CI/CD) specialization

12-2022

Data Engineer

Tartu University

07.2022 - 01.2023

Big Data Analytics with Apache Spark from Databricks Academy.

04-2022

ETL and Data Pipelines with Shell, Airflow and Kafka

04-2022

Data Engineer

Omniva

03.2022 - 10.2022

ETL and Data Pipelines with Shell, Airflow and Kafka

03-2022

Microsoft Azure for Data Engineering

12-2021

Junior Research Fellow

Tartu University (Big Data Systems Group)

09.2018 - 01.2023

Data Engineer & Researcher at SWRG Group

Semantic Web Research Group

09.2015 - 03.2018

Web Developer

ITIS

09.2013 - 03.2015

ORACLE Database Administration Certificates.

08-2013

Ph.D. - Computer Science

Tartu University

04.2001 -

Master of Science - Information Systems

Cairo University

04.2001 -

Bachelor of Science - Information Systems (Excellent With Honors)

Fayoum University

04.2001 -