1. Mushroom Classification and Analysis:
- Conducted a comprehensive analysis of the "Mushroom.csv" dataset, applying data cleaning and exploratory data analysis techniques to understand the attributes and relationships within the dataset.
- Developed and implemented classification models, including decision trees, random forests, and logistic regression, to predict the edibility of mushrooms, achieving 98.2% classification accuracy.
- Communicated project findings through effective data visualizations and documented the entire project process, methodologies, and results in a comprehensive report.
2. Deep Learning Model Development and Evaluation:
- Developed a deep learning model for ECG heartbeat classification, beginning with exploratory data analysis (EDA) of the dataset.
- Preprocessed the dataset to ensure compatibility with the chosen model architectures.
- Built, trained, and validated deep learning models, optimizing their performance through hyperparameter tuning and evaluating them with precision, recall, F1-score, and ROC and precision-recall (PR) curves.
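For reference, the classification metrics named above can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual evaluation code; the function name `precision_recall_f1` and the binary-label framing are assumptions made for the example.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one class treated as positive.

    Hypothetical helper for illustration; labels are assumed to be
    plain Python values comparable with ==.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Made-up labels, used only to exercise the function.
example_true = [1, 1, 0, 0, 1]
example_pred = [1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(example_true, example_pred)
```

For a multi-class task such as heartbeat classification, the same calculation would typically be repeated per class and then averaged.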
3. UK Violent Crime and Firearms Analysis:
- Analyzed crime data to verify claims made in a documentary about violent crime in the UK.
- Used publicly available datasets, including the UK Home Office's Street Level Crime Data and the English Indices of Deprivation.
- Employed Apache Spark on a cloud IaaS platform to efficiently process the data.
- Filtered the dataset for relevant crimes and assessed trends in violent crimes.
- Investigated the claim of higher firearms incidents per capita in Birmingham compared to other UK locations.
- Explored the association between firearms incidents and drug offenses.
- Created informative visualizations to support the analysis.
- Presented findings and analysis in a Jupyter Notebook.
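The per-capita comparison mentioned above comes down to normalizing incident counts by population. A minimal sketch, assuming a rate per 100,000 residents; the function name and the area figures below are hypothetical placeholders, not values from the project or from the Home Office data.

```python
def incidents_per_100k(incidents, population):
    """Return the incident rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents * 100_000 / population


# Placeholder figures for illustration only (not real statistics).
areas = {
    "Area A": {"incidents": 50, "population": 1_000_000},
    "Area B": {"incidents": 30, "population": 400_000},
}

# Rank areas by rate, highest first, so raw counts do not mislead.
ranked = sorted(
    ((name, incidents_per_100k(v["incidents"], v["population"])) for name, v in areas.items()),
    key=lambda item: item[1],
    reverse=True,
)
```

In this made-up example the area with fewer incidents still ranks first once population is taken into account, which is why the claim was assessed per capita.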
4. TRAVELNORTH Database Analysis and Implementation:
- Applied advanced data modeling, database design, and implementation techniques to address a complex data management problem within the TRAVELNORTH scenario.
- Used an Oracle database system to develop data warehousing and data analytics solutions.
- Analyzed, designed, and implemented a logical relational schema, ensuring data reliability and adherence to naming conventions.
- Populated the database with sample data and queried it using relational algebra and SQL.
- Explored aspects of object-relational and NoSQL database implementations for enhanced functionality.
- Produced a comprehensive report addressing professional, legal, ethical, and security considerations for TRAVELNORTH.
5. Data Warehousing and Data Mining:
- Improved performance in data warehousing tasks by designing and implementing optimized indexes.
- Created new materialized views to enhance query performance and user experience.
- Utilized database optimization techniques to rewrite queries and leverage materialized views.
- Conducted data mining analysis on the UnitedCreditCards dataset to predict credit card default payments.
- Developed predictive models using suitable algorithms and evaluated their capabilities.
- Provided recommendations to the UNITED FINANCE company based on the findings.
- Critically evaluated data quality and adherence to standards in the SH data warehouse and UnitedCreditCards dataset.
6. Dissertation: House Price Prediction Using Machine Learning Incorporating Non-Visual Parameters:
- Developed a machine learning framework to predict house prices by incorporating non-visual characteristics, such as socioeconomic factors, neighborhood features, transit accessibility, and environmental considerations.
- Implemented various machine learning algorithms, including Random Forest, XGBoost, and LightGBM, to build predictive models.
- Conducted feature engineering to extract meaningful insights from the data and improve model performance.
- Evaluated model performance using metrics like RMSE, R-squared, and MAE to assess accuracy and predictive power.
- Collaborated with a team to collect and preprocess relevant datasets, perform data exploration, and validate the models.
- Presented project findings to a panel of industry experts, highlighting the significance of non-visual factors in house price prediction.
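For reference, the regression metrics named above (RMSE, MAE, and R-squared) can be sketched as follows. This is a minimal plain-Python illustration under the standard definitions, not the dissertation's code; the function name `regression_metrics` and the sample prices are assumptions made for the example.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r_squared) for paired actual and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R-squared is undefined when all actual values are identical.
    r_squared = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return rmse, mae, r_squared


# Made-up prices (in thousands), used only to exercise the function.
rmse, mae, r_squared = regression_metrics([100, 200, 300], [110, 190, 300])
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture of how a price model misses.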