Portfolio
HarvardX PH125.9x: MovieLens Project
R project for partial completion of the edX course HarvardX PH125.9x: “Data Science: Capstone”.
The objective of this project was to build a movie recommendation system using the MovieLens dataset. The final model uses linear regression with matrix factorisation on the residuals, and achieves a root mean squared error (RMSE) of 0.782 when predicting the ratings (out of 5) of movies in the test set.
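For readers unfamiliar with the metric, the root mean squared error quoted above can be sketched as follows. This is a minimal illustration in Python rather than the project's R code; the function name `rmse` and the ratings are made up for the example and are not MovieLens data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up ratings, for illustration only (not MovieLens data).
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]

print(rmse(predicted, actual))
```

A lower value means the predicted ratings sit closer to the true ratings on average, with large misses penalised more heavily than small ones.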
HarvardX PH125.9x: Higgs dataset classification
R project for partial completion of the edX course HarvardX PH125.9x: “Data Science: Capstone”.
In this project, neural networks (via Keras in R) were used to classify particle collision events in the HIGGS dataset. The final network had three hidden layers of 2048 nodes each and achieved an area under the ROC curve (AUC) of 0.877. Using high-level (derived) features provided only a small improvement in AUC compared to using low-level features only.
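As a rough illustration of what the AUC figure measures, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. It is written in Python rather than the project's R/Keras code, the function name `roc_auc` is an assumption for this example, and the labels and scores are invented, not taken from the HIGGS dataset.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is quadratic in the data size, so it is only
    meant for small illustrative inputs.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented labels (1 = signal, 0 = background) and classifier scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]

print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.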
SimPy Examples
These short examples were created to teach discrete-event simulation to Electrical Engineering final-year project students at the City University of Hong Kong. The SimPy Python library was used for this purpose. SimPy is based on the use of Python generators and an Environment object that is shared between components of the simulation.
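The generator-plus-shared-environment idea can be sketched without the library itself. The code below is a toy, standard-library-only stand-in and not SimPy's API: `ToyEnvironment`, `clock`, and the convention of yielding a plain number as a delay are all assumptions made for this illustration (in SimPy proper, processes yield event objects such as `env.timeout(delay)`).

```python
import heapq


class ToyEnvironment:
    """Hypothetical stand-in for a simulation environment: a clock plus an event queue."""

    def __init__(self):
        self.now = 0
        self._queue = []
        self._counter = 0  # tie-breaker so heapq never has to compare generators

    def process(self, generator):
        """Register a generator to start running at the current time."""
        self._schedule(0, generator)

    def _schedule(self, delay, generator):
        heapq.heappush(self._queue, (self.now + delay, self._counter, generator))
        self._counter += 1

    def run(self, until):
        """Resume processes in time order until the clock reaches `until`."""
        while self._queue and self._queue[0][0] < until:
            self.now, _, generator = heapq.heappop(self._queue)
            try:
                delay = next(generator)  # run the process up to its next yield
            except StopIteration:
                continue  # the process finished; do not reschedule it
            self._schedule(delay, generator)
        self.now = until


def clock(env, name, tick, log):
    """A process: record the shared simulation time, then wait `tick` time units."""
    while True:
        log.append((name, env.now))
        yield tick


log = []
env = ToyEnvironment()
env.process(clock(env, "fast", 1, log))
env.process(clock(env, "slow", 2, log))
env.run(until=4)

for name, time in log:
    print(name, time)
```

Each process is an ordinary generator that pauses at `yield`, and the environment object is the only shared state: it owns the clock and decides which process resumes next.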