Projects

Here you can find a collection of problems, challenges, and frameworks, grouped by field, that I have solved or implemented over the last few years and can share publicly, at least in part. If you are interested in further details, don't hesitate to contact me privately.

Machine Learning

Deep Learning
  • Causal Motion Forecasting [Code: (50+ ), Paper: , Presentation: ]

    Official implementation for the paper 'Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective'.

  • Pneumonia Diagnosis [Code: , Paper: , Presentation: ]

    Two ideas towards a robust and adaptable diagnosis of pneumonia from chest X-ray images: (i) discard spurious features by replacing ERM with a robust training routine (i.e., IRM and v-REx); (ii) replace the standard deep neural network with a new modular architecture that encodes the invariant features (in a self-supervised fashion) and the style confounders separately.

  • New photovoltaic panels detection [Code: (20+ ), Report: ]

    Detecting available rooftop area from satellite images to install photovoltaic panels using a U-Net (FCNN) based model. Core project for 'Quantification of the available area for rooftop photovoltaic installation from overhead imagery using convolutional neural networks' by Castello et al. 2021.

  • Second order method for Deep Learning [Code: , Report: ]

    Analysis of the benefit of using a state-of-the-art second-order method (AdaHessian) for non-convex optimization, training a deep convolutional neural network (ResNet18) on the MNIST dataset and comparing it with traditional first-order methods.

  • Digits Comparison [Code: , Report: ]

    Comparison of different deep neural networks for predicting the inequality between two gray-scale images of handwritten digits from the MNIST database.

  • Mini Deep Learning Framework [Code: , Report: ]

    Design and implementation of a mini deep learning framework based only on PyTorch's tensor definition and elementary mathematical operations (without any preexisting deep learning toolbox).

  • Geographic Area Classifier [Code: ]

    Classification of satellite images into Residential, Industrial, and No Available Rooftop Area using deep convolutional neural networks.

Challenges
  • Generali S.p.a. - Data Challenge 2020 [Code: PRIVATE, Report: ]

    Supervised churn classification challenge on an insurance policy dataset provided by Generali Italia S.p.a. I proposed and implemented an original architecture that combines several binary classifiers in a hierarchical structure through ensemble learning. My model won the 1st prize in the qualitative ranking (best report and code) and produced the 2nd (7th) best prediction on the public (private) test set out of 280 participants.

  • EPFL - Higgs Boson Challenge 2020 [Code: , Report: ]

    Classification problem on the interaction of microparticles, based on a large dataset simulated by ATLAS (a CERN experiment) and proposed by the Machine Learning course (CS-433) at EPFL. In a team of three, we solved the problem by combining different data engineering steps, transformations, and classification algorithms, implemented only in NumPy (no other external Python library). With a prediction accuracy of 0.841 and an F1 score of 0.757, we secured 2nd place (among official submissions) out of 290+ teams in the final ranking on AIcrowd.

  • Oracle Labs and Politecnico di Milano - Graph Machine Learning Contest 2019 [Code: , Report: ]

    Multi-label vertex classification problem on a Protein-Protein Interactions dataset, provided by Politecnico di Milano at the conclusion of the High Performance and Graph Analytics course and in partnership with Oracle Labs. In a team of two, we studied and implemented several state-of-the-art techniques to generate a powerful embedding of the graph, and we solved the multi-label classification problem with the highest F1-score (1st prize).

  • Politecnico di Milano - Machine Learning for Networking Contest 2019 [Code: ]

    Supervised churn classification problem on a mobile cellular network operator dataset, provided by Politecnico di Milano at the conclusion of the Machine Learning for Networking course. By combining several preprocessing transformations and state-of-the-art classifiers, I won the 1st prize in the final Kaggle challenge for the highest accuracy and F1-score.

Reinforcement Learning
  • Lunar Lander (OpenAI Gym) [Code: ]

    Teaching an agent to play the Lunar Lander game from OpenAI Gym using policy gradient methods (REINFORCE and variants with an adaptive baseline).
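
    As a rough illustration of REINFORCE with an adaptive baseline (not the code from the linked repository), here is a minimal sketch on a toy two-armed bandit instead of Lunar Lander. The function names, the bandit environment, and all hyperparameters are assumptions made for this example; the baseline is simply a running average of observed rewards.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences (logits) into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(arm_means, episodes=3000, lr=0.1, baseline_lr=0.05, seed=0):
    """REINFORCE with a running-average baseline on a toy bandit.

    arm_means is a hypothetical stand-in for an environment: pulling arm a
    returns a noisy reward centred on arm_means[a].
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_means)  # policy parameters: one logit per action
    baseline = 0.0                  # adaptive baseline: running mean of rewards

    for _ in range(episodes):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = rng.gauss(arm_means[action], 1.0)

        # Subtracting the baseline reduces variance without biasing the gradient.
        advantage = reward - baseline
        for k in range(len(prefs)):
            # d/d prefs[k] of log pi(action) for a softmax policy.
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            prefs[k] += lr * advantage * grad_log_pi

        # Move the baseline towards the latest reward.
        baseline += baseline_lr * (reward - baseline)

    return softmax(prefs)


final_policy = train_reinforce([0.0, 1.0])
print(final_policy)
```

    In an episodic task such as Lunar Lander, the single reward would be replaced by the discounted return from each time step, and the logits by the output of a neural network policy.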


Causal Inference

  • Network Causal Tree [Code: , Paper: ]

    The Network Causal Tree R package introduces a machine learning method that uses tree-based algorithms and a Horvitz-Thompson estimator to assess the heterogeneity of treatment and spillover effects under clustered network interference. Causal inference studies typically assume no interference between individuals, but in real-world settings where individuals are connected through social, physical, or virtual ties, the effect of a treatment can spill over to other connected individuals in the network. Interference should therefore be taken into account to avoid biased estimates of treatment effects. Understanding the heterogeneity of treatment and spillover effects can help policy-makers scale up interventions, target strategies more effectively, and generalize treatment and spillover effects to other populations.

  • Bayesian Causal Forest with Instrumental Variable [Code: , Paper: ]

    The Bayesian Causal Forest with Instrumental Variable (BayesIV) R package introduces a Bayesian machine learning algorithm to draw interpretable inference on heterogeneous causal effects in the presence of imperfect compliance (e.g., under an irregular assignment mechanism).

  • Causal Rule Ensemble Algorithm [Code: (4000+ downloads), Website: , Paper: , Master Thesis: ]

    Identifying subgroups of a study population where a treatment or exposure has a notably larger or smaller effect on an outcome than the population average is crucial in the social and health sciences. While estimating the conditional average treatment effect (CATE) given a pre-specified set of covariates is a common approach, it only allows estimating causal effects on subgroups that the researchers have specified a priori. We propose the Causal Rule Ensemble (CRE) algorithm, a flexible and robust method for interpretable discovery and estimation of heterogeneous treatment effects in terms of decision rules.
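
    To make the Horvitz-Thompson estimator mentioned for Network Causal Tree more concrete, here is a minimal Python sketch of its inverse-probability-weighting idea (the package itself is written in R, and this is not its code). It ignores network exposure and tree splitting entirely, assumes the assignment probabilities are known, and uses made-up data and hypothetical function names.

```python
def horvitz_thompson_mean(outcomes, exposed, probs):
    """Estimate the population mean outcome under one exposure condition.

    outcomes[i]: observed outcome of unit i
    exposed[i]:  True if unit i was observed under the condition of interest
    probs[i]:    known probability that unit i receives that condition
    """
    n = len(outcomes)
    # Each observed unit is weighted by the inverse of its assignment probability.
    return sum(y / p for y, e, p in zip(outcomes, exposed, probs) if e) / n


def horvitz_thompson_effect(outcomes, treated, p_treat):
    """Difference of two Horvitz-Thompson means: treated minus control."""
    control = [not t for t in treated]
    p_control = [1.0 - p for p in p_treat]
    return (horvitz_thompson_mean(outcomes, treated, p_treat)
            - horvitz_thompson_mean(outcomes, control, p_control))


# Made-up data for illustration only.
outcomes = [3.0, 1.0, 4.0, 2.0]
treated = [True, False, True, False]
p_treat = [0.5, 0.5, 0.5, 0.5]

effect = horvitz_thompson_effect(outcomes, treated, p_treat)
print(effect)
```

    Under network interference, the binary treatment indicator would be replaced by a joint condition (individual treatment together with neighborhood exposure), each with its own assignment probability.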


Data Science

  • Robust Journey Planning [Code: , Presentation: ]

    Robust public transport route planner for the Zurich area in Switzerland. Given a desired arrival time, the route planner computes the fastest route between departure and arrival stops within a provided confidence tolerance expressed as interquartiles. Designed and implemented from scratch by extending the Connection Scan Algorithm (CSA).

  • PosTune [Code: , Report: PRIVATE]

    PosTune is a mobile app that uses deep learning to track your upper body posture in real time through your camera. Our hope is that PosTune's real-time feedback improves the user's posture, thereby reducing the pains and illnesses associated with poor posture.

  • Languages Influence Social Network Patterns: A Case Study [Code: , Website: ]

    Investigation of how usage behavior (activity, seasonality, etc.) differs across countries on Twitter.
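
    For readers unfamiliar with the Connection Scan Algorithm used in Robust Journey Planning, the basic earliest-arrival variant can be sketched as follows. This is only the textbook core, not the planner from the project: it scans forward from a departure time rather than backward from a desired arrival time, and it leaves out footpaths, transfer times, and delay uncertainty. The `Connection` tuple and the tiny timetable are hypothetical.

```python
from collections import namedtuple

# One timetable entry: a vehicle going from dep_stop to arr_stop without stopping.
Connection = namedtuple("Connection", "dep_stop arr_stop dep_time arr_time")


def earliest_arrival(connections, source, target, start_time):
    """Basic Connection Scan: earliest arrival time at target, or None.

    connections must be sorted by departure time, and every connection
    must arrive no earlier than it departs.
    """
    best = {source: start_time}  # earliest known arrival time per stop
    for c in connections:
        # Later connections cannot improve on an arrival we already have.
        if target in best and c.dep_time >= best[target]:
            break
        # A connection is usable if we can reach its departure stop in time.
        if best.get(c.dep_stop, float("inf")) <= c.dep_time:
            if c.arr_time < best.get(c.arr_stop, float("inf")):
                best[c.arr_stop] = c.arr_time
    return best.get(target)


# Hypothetical timetable, times in minutes, sorted by departure time.
timetable = [
    Connection("A", "B", 0, 10),
    Connection("A", "C", 5, 30),
    Connection("B", "C", 12, 20),
    Connection("C", "D", 25, 40),
]

arrival_at_d = earliest_arrival(timetable, "A", "D", 0)
arrival_at_c = earliest_arrival(timetable, "A", "C", 0)
print(arrival_at_d, arrival_at_c)
```

    A single pass over the sorted connections is enough because a connection can only be boarded if its departure stop was reached by an earlier one.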


Optimization

  • Second order method for Deep Learning [Code: , Report: ]

    Analysis of the benefit of using a state-of-the-art second-order method (AdaHessian) for non-convex optimization, training a deep convolutional neural network (ResNet18) on the MNIST dataset and comparing it with traditional first-order methods.

  • Metropolis-Hastings for Optimization [Code: , Report: ]

    Design of a roadmap to test a new 5G network while optimizing the maintenance cost of the new installations, using a Metropolis-Hastings-based algorithm.

  • Mathematical Programming for activity planning in an Oncology Day-Hospital [Code: PRIVATE, Report: ]

    BSc Thesis in Mathematical Engineering at Politecnico di Milano.
    Under the supervision of Prof. Giuliana Carello and in collaboration with Ospedale San Martino di Genova, I applied operations research to master chemotherapy planning and clinician rostering in a hospital outpatient cancer centre.
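
    As a generic illustration of how Metropolis-Hastings can be used for optimization (as in the 5G project above), here is a minimal sketch that searches for a low-cost binary vector. It is not the project's code: the cost function, the single-bit-flip proposal, and the fixed inverse temperature `beta` are all assumptions chosen to keep the example small.

```python
import math
import random


def metropolis_minimize(cost, n_bits, steps=5000, beta=2.0, seed=0):
    """Metropolis-Hastings search for a low-cost binary vector.

    Samples from a distribution proportional to exp(-beta * cost(x)) using a
    symmetric single-bit-flip proposal, and remembers the best state visited.
    """
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(n_bits)]
    current = cost(state)
    best_state, best_cost = state[:], current

    for _ in range(steps):
        i = rng.randrange(n_bits)
        state[i] ^= 1  # propose flipping bit i
        proposed = cost(state)
        # Accept with probability min(1, exp(-beta * (proposed - current))).
        if proposed <= current or rng.random() < math.exp(-beta * (proposed - current)):
            current = proposed
            if current < best_cost:
                best_state, best_cost = state[:], current
        else:
            state[i] ^= 1  # reject: undo the flip
    return best_state, best_cost


# Hypothetical toy cost: number of positions that differ from a hidden target.
target = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_cost(x):
    return sum(a != b for a, b in zip(x, target))


best_state, best_cost = metropolis_minimize(toy_cost, len(target))
print(best_state, best_cost)
```

    Occasionally accepting worse states is what lets the chain escape local minima; a larger `beta` makes the search greedier.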


Statistics

  • Discrete Data Analysis on anti-typhoid inoculation [Code: PRIVATE, Report: ]

    Study on anti-typhoid inoculation treatment efficacy in 1906-7 from a census of the army in the British Empire and colonies using Discrete Data Analysis.

  • ANalysis Of VAriance on moisture content in tree branches [Code: , Report: ]

    Study of which factors (among location of the cut, transpiration conditions, and species of tree) have a statistically significant effect on the moisture level, using ANalysis Of VAriance (ANOVA).

  • Handbook for coaches NCAA [Code: PRIVATE, Report: ]

    Statistical analysis of the defensive and offensive characteristics of a basketball team in the National Collegiate Athletic Association (NCAA) using Linear Regression, Logistic Regression, and standard Machine Learning classifiers (Random Forest and SVM).


Robotics

  • Homemade Lego Plotter [Code: PRIVATE, Presentation: ]

    A machine built with LEGO Mindstorms (NXT) that can plot real-valued functions of a real variable and geometric figures in the Cartesian plane.

© Copyright 2024 Riccardo Cadei.