Methodology for Public Transport Mode Detection Using Telecom Big Data Sets: Case Study in Croatia
Determining the number of passengers using public transport services is a challenging and time-consuming task. It relies either on manual observations (e.g. manually counting passengers in vehicles or at stations) or on technical solutions, namely data from automatic fare collection (AFC) systems or automatic passenger counters (APC). AFC data provide an incomplete picture, while APC devices are in practice installed in only a small number of vehicles, if any. A new approach that applies data science principles to anonymized telecom-originated big data sets offers a smart, data-driven way of determining the use of public transport. Anonymized telecom big data sets represent the “digital breadcrumbs” that people leave while moving through the city. When paired with additional data sets (e.g. public transport timetables, locations of public transport stations, information on public transport lines), they can be used for modal split detection. In this paper, a new methodological approach is proposed that uses anonymized telecom big data sets and statistical modelling to identify possible public transport trips among all other trips. The methodology has been tested in a case study in the City of Rijeka and validated using ground truth data obtained from traditional sources.