26 February 2024

APC💖MPAYG merger: a data enhancement breakthrough

Fusing data sets offers detailed insights into the whole journey chain

Data is the new gold. For the public transport industry, mining this rich seam of information is essential for the planning and implementation of services and products that dovetail with passenger needs. But how exactly do public transport users get around? Who travels where and when? How many passengers have to wait for a connecting service, where do they wait and for how long? What are passengers' preferred routes? Merging automatic passenger count (APC) data with Mobile Pay As You Go (MPAYG) data from FAIRTIQ provides comprehensive and relevant answers to these and other questions, without a time-consuming and labour-intensive collection process.

For Joe Molloy, one of FAIRTIQ's most experienced data scientists and a member of the 'APC💖MPAYG' pilot team, "The merging of automatic passenger count data and Mobile Pay As You Go data is a game changer for transport planners." His colleague Simon Weber adds, "Until now, transport providers have not been able to capture every single link in a passenger's whole journey chain. Merging APC and FAIRTIQ MPAYG data gives us a much better understanding of how passengers move around the network."

The insights generated from the merged data provide planners with a precise and reliable basis for informed decisions on issues like route planning and the optimisation of transfer connections. Dashboards facilitate the observation and comparison of whole journey chains over time, which in turn makes it possible to visualise the effects of service and product changes and gauge the success of marketing campaigns. Merged APC and FAIRTIQ data can also provide a more detailed basis for revenue distribution decisions.

Extraction of maximum informational value with minimum effort  

Most transport companies and transport associations already have automatic passenger count devices installed in their vehicles and at different stops. APC data provide information on how many passengers get on and off at a given stop, but not on other key details, such as how they travelled to that point, what means of transport they will take for the next stage of their journey or how long they have to wait for the connecting service. To fill in these gaps, transport providers use passenger surveys, but these are time-consuming and suffer from sample-specific limitations.

Thanks to FAIRTIQ's innovative MPAYG-enabled check-in/check-out technology, it is now possible to observe whole journey chains in granular detail. The resulting data contain information on the type of transport taken and on waiting times, but they are limited to passengers who use the FAIRTIQ app.

The merging of the two data sets – APC and FAIRTIQ – exploits their strengths (quantity and high level of detail respectively) while cancelling out their weaknesses.

Spatial and temporal matching of data sets and scaling across the whole journey chain

The data is collated in three steps. First, the automatic passenger count data supplied by the transport company or transport association is read and analysed. Second, FAIRTIQ's data scientists compare this data with the data from the FAIRTIQ app in both the spatial and the temporal dimension. During the spatial matching process, one of the team's tasks is to check whether heavily frequented stops (observed in the APC data) are also frequented by a high number of FAIRTIQ app users. As for the temporal matching process, the FAIRTIQ team checks that the patterns of both data sets match. For example, weekday and seasonal fluctuations in passenger demand should look similar in the two sets.
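The spatial and temporal plausibility checks described above can be illustrated with a simple correlation test. This is only a minimal sketch, not FAIRTIQ's actual procedure: the stop names, the counts and the helper functions `pearson` and `match_score` are hypothetical, and the choice of the Pearson coefficient as the similarity measure is an assumption.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def match_score(apc_counts, app_counts):
    """Correlate counts over the keys (stops or weekdays) found in both sources."""
    keys = sorted(set(apc_counts) & set(app_counts))
    return pearson([apc_counts[k] for k in keys],
                   [app_counts[k] for k in keys])


# Hypothetical boardings per stop (spatial check); all numbers are invented.
apc_by_stop = {"A": 1200, "B": 800, "C": 300, "D": 100}
app_by_stop = {"A": 36, "B": 22, "C": 10, "D": 3}

# Hypothetical boardings per weekday (temporal check); all numbers are invented.
apc_by_day = {"Mon": 5000, "Tue": 5200, "Wed": 5100, "Sat": 2600, "Sun": 1900}
app_by_day = {"Mon": 150, "Tue": 160, "Wed": 152, "Sat": 80, "Sun": 55}

spatial_score = match_score(apc_by_stop, app_by_stop)
temporal_score = match_score(apc_by_day, app_by_day)
print(f"spatial match:  {spatial_score:.3f}")
print(f"temporal match: {temporal_score:.3f}")
```

A score close to 1 suggests that busy stops (or busy days) in the APC data are also busy in the app data. A low score would be a signal to inspect the data before moving on to the scaling step.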

In the third and final step, the FAIRTIQ data is scaled up to the APC level. Given that the FAIRTIQ app is not used by all public transport users (as yet), the FAIRTIQ data sets generally contain fewer journeys than the APC data sets. The scaling-up factors can vary between stops, depending on the discrepancy between the number of observations in the FAIRTIQ and APC data sets at a particular stop. A special algorithm performs the computations. Its aim is to identify a scaling-up factor for every whole journey chain so that the scaled FAIRTIQ data matches the patterns observed in the APC data.
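The article does not describe the scaling algorithm itself. To give a feel for the idea, the sketch below uses iterative proportional fitting (also known as raking), a standard technique swapped in here purely for illustration: each observed journey chain receives a weight, and the weights are adjusted stop by stop until the weighted boardings match the APC counts. The function names, the chains and the target counts are all hypothetical.

```python
def weighted_boardings(chains, weights):
    """Sum the chain weights per boarding stop."""
    totals = {}
    for chain, weight in zip(chains, weights):
        for stop in chain:
            totals[stop] = totals.get(stop, 0.0) + weight
    return totals


def rake_chain_weights(chains, apc_boardings, iterations=200):
    """Return one scaling-up factor per journey chain.

    chains        -- list of tuples, each holding the stops where the
                     passenger boarded a vehicle during one journey chain
    apc_boardings -- dict mapping stop -> boardings counted by APC
    """
    weights = [1.0] * len(chains)
    for _ in range(iterations):
        for stop, target in apc_boardings.items():
            current = weighted_boardings(chains, weights).get(stop, 0.0)
            if current == 0:
                continue  # no app observation at this stop, nothing to scale
            factor = target / current
            # Rescale every chain that boards at this stop.
            weights = [w * factor if stop in chain else w
                       for w, chain in zip(weights, chains)]
    return weights


# Hypothetical app observations: one chain with a transfer (A then B)
# and two single-leg journeys.
chains = [("A", "B"), ("A",), ("B",)]
# Hypothetical APC boardings per stop.
apc_boardings = {"A": 100, "B": 80}

weights = rake_chain_weights(chains, apc_boardings)
scaled = weighted_boardings(chains, weights)
print([round(w, 2) for w in weights])
print({stop: round(total, 2) for stop, total in scaled.items()})
```

Because the factors are attached to whole chains rather than to single boardings, a transfer journey such as A then B stays intact after scaling, which is what allows the scaled data to describe complete journey chains.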


Feedback from the pilot projects

Since the second quarter of 2023, FAIRTIQ has been trialling the procedure in three separate pilot projects with partners in Switzerland. The findings so far show that the data merge works and produces meaningful results. The sole prerequisite is that both data sources are readily available and readable; no additional hardware is needed. The informational value of APC data is improved by merging it with FAIRTIQ data. As regards MPAYG data collection, the take-away message is simple: the more FAIRTIQ app users the better. According to Joe Molloy, however, even with the FAIRTIQ app's 2–3% market share in startup regions, the merged data still generates meaningful results.

Remarkably few surprises

Anyone who works with data knows that synchronising data sets from different sources often throws up surprises. According to Simon Weber, one of the biggest surprises from the APC-FAIRTIQ data merge is how smooth the process has been: "Of course there are challenges, like incomplete APC data sets, which makes scaling-up more difficult. But we have always managed to find solutions so far." The data scientist is amazed at how well the network distribution of FAIRTIQ users matches the overall volume of travellers. His positive assessment is shared by FAIRTIQ's partners in the three Swiss pilot projects. All are impressed with the quality of the correlated data.

Real-time insights thanks to regular updates

The merged data can be used to produce complete origin–destination matrices. The results can be visualised clearly and practically on interactive origin–destination maps and dashboards, which can serve as an effective support tool for decision-makers and enable updates for the entire network on a quarterly or even weekly basis.
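As an illustration of how an origin–destination matrix can be derived from scaled journeys, the sketch below sums the scaling-up factors per origin and destination pair. The journey records, the stop names and the `od_matrix` helper are hypothetical and only show the aggregation step, not FAIRTIQ's dashboards or data model.

```python
from collections import defaultdict


def od_matrix(weighted_journeys):
    """Aggregate (origin, destination, weight) records into an OD matrix."""
    matrix = defaultdict(float)
    for origin, destination, weight in weighted_journeys:
        matrix[(origin, destination)] += weight
    return dict(matrix)


# Hypothetical scaled journeys: origin stop, destination stop, scaling-up factor.
journeys = [
    ("A", "C", 40.0),
    ("A", "C", 35.5),
    ("B", "C", 12.0),
    ("C", "A", 20.0),
]

od = od_matrix(journeys)
for (origin, destination), passengers in sorted(od.items()):
    print(f"{origin} -> {destination}: {passengers:.1f}")
```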

The MPAYG data from the FAIRTIQ app is updated daily. In contrast, there is often a time lag between the collection and delivery of APC data; automated access would shorten it. More frequent updates would provide planners and decision-makers with a constant stream of key information, almost in real time, on passenger behaviour across the network – and allow them to continuously and productively exploit "public transport's data goldmine" to deliver better passenger services and products.