A highly motivated professional with strong problem-solving skills and a willingness to learn, proficient in Python, SQL, R, and Excel. Experienced in using data analysis tools such as Scikit-learn, PyTorch, and Pandas to drive insights and support decision-making. Skilled in big data technologies including Apache Spark, Kafka, and Hadoop for efficient data processing. Adept at utilising Tableau for data visualisation and Salesforce for customer relationship management. Committed to resolving issues with resilience and adaptability while collaborating effectively using Microsoft Teams and Zoom.
Languages/Tools:
Libraries/Frameworks:
Big Data/Pipelines:
Other Tools:
Soft Skills:
Car Price Prediction with Ensemble Models {Summer 2025}
Developed a scalable regression pipeline on 400K+ AutoTrader listings with advanced feature engineering.
Improved prediction accuracy by 8\% using a Voting Regressor (XGBoost, Random Forest, Gradient Boosting).
Used SHAP and PDP plots to explain model predictions to non-technical audiences.
Impact: Enabled transparent, data-driven pricing insights in a large-scale automotive dataset.
\textbf{Melanoma Detection with CNN Architectures} \hfill \textit{Summer 2025}
\begin{itemize}
\item Fine-tuned 4 CNNs (ResNet, DenseNet, EfficientNet-B3, custom) using transfer learning and data augmentation.
\item Prepared a curated dataset of 3,924 dermoscopic images, ensuring clinical quality and class balance.
\item Achieved 95\% accuracy (F1-score) with EfficientNet-B3, outperforming baseline models.
\item \textit{Impact: Showcased the potential of deep learning for accurate, explainable melanoma screening support.}
\end{itemize}
\textbf{Big Data Streaming and Hypothesis Testing Pipeline} \hfill \textit{Spring 2025}
\begin{itemize}
\item Built a real-time pipeline using Kafka → HDFS → Spark (Scala + PySpark) for large-scale data ingestion.
\item Processed millions of Amazon reviews to explore the correlation between review length and rating using Spark SQL.
\item Validated a significant positive correlation through hypothesis testing and statistical analysis.
\item \textit{Impact: Delivered scalable real-time insights using enterprise tools to simulate production-grade analytics.}
\end{itemize}