A Sports Data Science TNA visit in Pisa
A TNA experience report by Bram Janssens from Ghent University, Belgium. The goal of the stay was to combine cycling analytics with geospatial analytics, as geospatial methods can substantially improve current solutions in the field.
From October 1 to November 30, 2023, I had the privilege of spending two months at the Knowledge Discovery and Data Mining Laboratory at the Consiglio Nazionale delle Ricerche (CNR) in Pisa, Italy. It has been a truly inspiring period.
The goal of the stay was to combine my previous work on cycling analytics with the extensive knowledge of geospatial analytics present here, as geospatial methods can substantially improve current solutions in the field.
The stay allowed me to develop interesting new projects, to meet several inspiring people, and to see some parts of beautiful Tuscany. The majority of my work here was in close collaboration with Dr. Luca Pappalardo, who has expertise in both sports data science and geospatial analytics.
I really enjoyed spending time outside my usual ‘home’ working environment and experiencing how a different workplace can spark new ideas. Specifically, we are currently developing two projects.
The first project focuses on publishing an open-source cycling analytics data set that will allow future researchers to combine very detailed information on both race courses and race results. We hope this will foster innovative new studies, as happened in other sports data science fields after similar data sets were published.
The data set enables several interesting applications, such as a data-driven grouping of races, which allows performance to be evaluated per specialization: a sprinter, a time trialist, and a climber should not be evaluated on the same set of races. Analyses on the data set can also be more individualized. For instance, in the paper we plan to submit together with the data set, we can already determine which races suit a given athlete more or less, based on previous performances. The figure below shows the races that suit Mark Cavendish the most and the least. Imagine the potential this holds for automated team rostering and race scheduling.
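To give a feel for what scoring race affinity from previous performances could look like, here is a minimal sketch in Python. It is not the method from our paper: the similarity-weighted average, the two course features, the `affinity` function, and all numbers are hypothetical and serve only as an illustration. The sketch assumes each race is described by features already scaled to the range 0 to 1 and that each past result is summarized as a score between 0 (poor) and 1 (excellent).

```python
import math


def affinity(past_races, candidate, bandwidth=0.3):
    """Estimate how well a candidate race suits a rider.

    past_races: list of (features, score) pairs, where features is a tuple
                of course descriptors scaled to [0, 1] and score is the
                rider's past performance in that race, also in [0, 1].
    candidate:  feature tuple of the race to evaluate.
    bandwidth:  hypothetical smoothing parameter; smaller values make only
                very similar past races count.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for features, score in past_races:
        # Past races with a similar course profile get a larger weight.
        distance = math.dist(features, candidate)
        weight = math.exp(-((distance / bandwidth) ** 2))
        weighted_sum += weight * score
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no past race is close enough to the candidate")
    return weighted_sum / weight_total


# Toy history of an imaginary sprinter (illustrative values, not real data).
# Features: (scaled elevation gain, scaled gradient of the final kilometres).
past = [
    ((0.1, 0.0), 0.9),  # flat race, strong result
    ((0.2, 0.1), 0.8),  # flat race, strong result
    ((0.8, 0.9), 0.1),  # mountain race, weak result
    ((0.9, 0.7), 0.2),  # mountain race, weak result
]

candidates = {
    "flat_stage": (0.15, 0.05),
    "mountain_stage": (0.85, 0.80),
}

# Rank candidate races from most to least suitable for this rider.
ranking = sorted(
    candidates,
    key=lambda name: affinity(past, candidates[name]),
    reverse=True,
)
print(ranking)
```

In this toy setting the flat stage ranks above the mountain stage, because the rider's strong past results all came from races with similar course profiles. A real analysis would need richer course descriptors and a properly validated model.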
The second project quantifies the influence teammates have on the performance of their team leaders. This is a difficult task in cycling analytics, because cycling is a hybrid between an individual and a team sport and only limited information is available on in-race events.
Besides these two more developed studies, I also hope that the stay will allow me to engage in research projects that I would otherwise be unable to pursue. I have learned a lot about geospatial and mobility analytics, as I had the opportunity to follow a course on this topic here.
Moreover, I met several people whose expertise complements my own, and I hope to stay in contact with all of them. I have already had several interesting discussions about potential future directions for my research, and I received useful feedback on earlier research of mine outside the scope of the project here.
I can recommend a TNA visit to anyone, as it has helped me a lot on a professional level. I also enjoyed my stay on a personal level, as I liked Pisa and its surroundings a lot. Each weekend offered the opportunity for a mini-holiday, and the evenings in the city were lovely too. The stay has certainly taught me a lot about myself and has improved my work-related mobility.
Figure: Race affinity for Mark Cavendish. Official race profiles of 2023 Tour of Oman Stage 1 (most favorable; left panel) and 2023 Giro d'Italia Stage 18 (least favorable; right panel). Our simple initial approach was already capable of detecting which types of races the sprinter Mark Cavendish would thrive in and which he would struggle in.