Academic career


I am currently a PhD student in Data Science and Computation (XXXIV cycle) at the University of Bologna, funded by an INFN scholarship. My supervisors are Prof. Daniele Bonacorsi and Claudio Grandi.
I currently hold an association contract with INFN and with CERN as a User. I am a member of the CMS collaboration at the LHC accelerator at CERN.

About my PhD: you can find here the poster on my thesis project, with an overview of my first year, presented at the 1st seminar of the PhD school in Data Science and Computation, held in Bologna on 18/10/2019.



Master's degree in Nuclear and Subnuclear Physics
Awarded on 23/03/2018 at the University of Bologna, Bologna (Italy)
Grade: 110/110 with honours
Thesis entitled: Prototype of Machine Learning “as a Service” for CMS Physics in Signal vs Background discrimination
Supervisor: Prof. Daniele Bonacorsi
Co-Supervisors: Dr. Valentin Kuznetsov and Prof. Andrea Castro

Abstract:
This thesis aims to contribute to the construction of a Machine Learning “as a service” solution for CMS Physics needs, namely an end-to-end data service that serves trained Machine Learning models to the CMS software framework. The thesis contributes to this ambitious goal in two ways: first, with a proof of concept of a first prototype of such an infrastructure, and second, with a specific physics use case: Signal versus Background discrimination in the study of CMS all-hadronic top quark decays, performed with scalable Machine Learning techniques.

Direct download of the thesis here



Bachelor’s degree in Physics
Awarded on 25/09/2015 at the University of Bologna, Bologna (Italy)
Grade: 110/110 with honours
Thesis entitled: Predicting CMS datasets popularity with Machine Learning
Supervisor: Prof. Daniele Bonacorsi
Co-Supervisor: Dr. Valentin Kuznetsov


Abstract:
This thesis presents the design, development and exploitation of a supervised Machine Learning classification system that addresses a concrete need: predicting the popularity of CMS datasets on the Grid. The CMS experiment has completed its first data-taking period at the LHC (Run-1). After a long shutdown (LS1), CMS is now collecting data on p-p collisions at a centre-of-mass energy of 13 TeV in Run-2. The amount of experience collected in CMS computing operations during the last few years is enormous, and the volume of metadata in CMS database systems describing the operation of all the CMS workflows on all the Worldwide LHC Computing Grid Tiers is huge as well. Data mining of this information has rarely been attempted, but it is of crucial importance for a better understanding of how CMS achieved successful operations, and for reaching an adequate and adaptive model of CMS operations that allows detailed optimizations and, eventually, predictions of system behaviour. A Data Analytics project has been launched in CMS and, within this area of work, a specific activity on exploiting machine learning techniques to predict dataset popularity has been started as a pilot project. The popularity of a dataset is an important observable to predict, as controlling it would allow more intelligent data placement and large optimizations in storage utilization at all Tier levels, and would form the basis of a solid, self-tuning, adaptive dynamic data management system. This thesis describes the work done with a new pilot prototype called DCAFPilot, written entirely in Python, to tackle this kind of challenge.

Direct download of the thesis here



AND BEFORE THAT ...
I obtained my Scientific High School Diploma in July 2012, with a grade of 100/100, at the Liceo Scientifico G. Torelli in Fano (Marche, Italy).