Privacy-Preserving Data Fusion for Traffic State Estimation

Researchers from the National University of Singapore (NUS) have proposed a privacy-preserving data fusion algorithm for traffic state estimation (FedTSE) that facilitates collaboration and data sharing among multiple data owners, such as traffic authorities and mobility companies.

Transportation systems are undergoing rapid transformation, characterized by the expansion of ridesharing services and the development of connected automated vehicles. These paradigms have enormous potential to facilitate human-centred transportation operations, not only by improving traffic efficiency but also by ensuring responsible data usage (e.g., privacy protection, fairness, etc.).

In the current phase, researchers from the National University of Singapore (NUS) investigated traffic state estimation by merging trajectory data provided by these paradigms with traditional roadside detector data owned by traffic authorities. While such data fusion has been widely demonstrated by researchers and practitioners to significantly enhance traffic operations, existing algorithms rarely consider the fact that sharing such trajectory data can raise privacy concerns for both individual travellers (e.g., their origins and destinations) and mobility companies (e.g., their operation algorithms).

To address the privacy concerns, the researchers proposed a privacy-preserving data fusion algorithm for traffic state estimation, leveraging a combination of federated learning (FL) and traffic flow theory.

The aim is to accurately estimate key traffic states such as flow, density, and speed for better traffic operations, while safeguarding the privacy of the mobility companies that provide the trajectory data.

To achieve this, the researchers first adopted FL, which enables multiple parties to collaboratively train a model without exchanging private data. They formulated the traffic state estimation problem as a vertical FL problem and built their algorithm on a recently developed framework that reduces communication overhead through local gradient updates and integrates easily with graph neural networks.
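The vertical setting can be illustrated with a toy sketch. It is not the FedTSE algorithm: it assumes two parties holding different feature columns for the same samples, a plain linear model, and a hypothetical coordinator (`train_vertical`) that holds the labels. It omits graph neural networks, multiple local updates per round, and any encryption. All names and data are made up for illustration.

```python
import random


class Party:
    """One data owner: private features for the shared samples plus a local linear model."""

    def __init__(self, features, learning_rate=0.5):
        self._features = features  # private; never returned by any method
        self.weights = [0.0] * len(features[0])
        self.learning_rate = learning_rate

    def partial_prediction(self):
        # Shares one scalar per sample, not the feature values themselves.
        return [sum(w * x for w, x in zip(self.weights, row)) for row in self._features]

    def apply_residual(self, residual):
        # Gradient of half the mean squared error with respect to the local weights.
        n = len(residual)
        for j in range(len(self.weights)):
            grad = sum(r * row[j] for r, row in zip(residual, self._features)) / n
            self.weights[j] -= self.learning_rate * grad


def train_vertical(parties, labels, rounds=200):
    """Hypothetical coordinator: holds the labels and relays residuals to the parties."""
    for _ in range(rounds):
        partials = [party.partial_prediction() for party in parties]
        predictions = [sum(values) for values in zip(*partials)]
        residual = [p - y for p, y in zip(predictions, labels)]
        for party in parties:
            party.apply_residual(residual)
    # Report the final mean squared error of the joint model.
    partials = [party.partial_prediction() for party in parties]
    predictions = [sum(values) for values in zip(*partials)]
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)


# Synthetic, made-up data: the same 50 samples as seen by two different owners.
random.seed(0)
detector_features = [[random.uniform(-1, 1)] for _ in range(50)]
trajectory_features = [[random.uniform(-1, 1)] for _ in range(50)]
labels = [2.0 * d[0] + 3.0 * t[0] for d, t in zip(detector_features, trajectory_features)]

authority = Party(detector_features)   # stands in for a traffic authority
company = Party(trajectory_features)   # stands in for a mobility company
final_mse = train_vertical([authority, company], labels)
print(authority.weights, company.weights, final_mse)
```

Each party only ever sends per-sample partial predictions and receives residuals, so raw features stay local. Practical vertical FL systems typically add further safeguards on these exchanged values, which this sketch leaves out.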

Next, the researchers proposed a physics-informed FL approach that integrates traffic models with FL to improve data efficiency. This ensures the applicability of the proposed FedTSE in common traffic state estimation scenarios with limited ground-truth availability. Physics-informed deep learning integrates physical models into learning-based approaches to improve the data efficiency in the training process and/or to preserve the desired physical properties of the trained models.
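As a rough illustration of the physics-informed idea, the sketch below adds a penalty for violating a traffic flow constraint to an ordinary data-fitting loss. The article does not state which traffic model FedTSE uses; the discretized conservation law (change in density over time plus change in flow over space equals zero) is an assumption chosen for illustration, as are the function names, grid layout, and numbers.

```python
def conservation_residual(density, flow, dt, dx):
    """Mean squared violation of dk/dt + dq/dx = 0 on a space-time grid.

    density[t][i] and flow[t][i] are estimates at time step t and road cell i
    (for example veh/km and veh/h, with dt in hours and dx in km).
    """
    total, count = 0.0, 0
    for t in range(len(density) - 1):
        for i in range(1, len(density[t])):
            dk_dt = (density[t + 1][i] - density[t][i]) / dt
            dq_dx = (flow[t][i] - flow[t][i - 1]) / dx
            total += (dk_dt + dq_dx) ** 2
            count += 1
    return total / count


def physics_informed_loss(density, flow, observed, dt, dx, weight=1.0):
    """Data loss on the few observed cells plus a weighted physics penalty.

    observed maps (t, i) -> measured density; it may cover only a small
    part of the grid, which is where the physics term helps.
    """
    data_term = sum((density[t][i] - k) ** 2 for (t, i), k in observed.items())
    data_term /= len(observed)
    return data_term + weight * conservation_residual(density, flow, dt, dx)


# Made-up example: a steady, uniform state satisfies the constraint exactly.
steady_density = [[30.0] * 4 for _ in range(3)]
steady_flow = [[1200.0] * 4 for _ in range(3)]
observed = {(0, 1): 30.0}
print(physics_informed_loss(steady_density, steady_flow, observed, dt=0.01, dx=0.5))
```

Because the physics term is evaluated on every grid cell, not only on cells with measurements, it can guide training where ground truth is scarce. The `weight` parameter, a hypothetical tuning knob here, balances the two terms.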

Vertical federated learning: It enables data fusion between two entities without the need to disclose their raw data.

Traffic flow modelling: It ensures that the estimated traffic states satisfy key constraints defined by traffic flow dynamics.

HPC resources: The project was allocated 3,000 GPU hours from NSCC Singapore to train its models.

Case studies demonstrated that the models developed by the researchers can preserve the privacy of data owners with minimal impact on estimation performance, significantly outperforming baselines in which mobility companies share only partial data or no data because of privacy concerns. Moreover, with their privacy protected, mobility companies have a stronger incentive to participate actively in data fusion and to contribute higher-resolution data, thereby enhancing estimation accuracy.

On the practical side, the proposed strategy can help break down data silos in the transportation industry by promoting collaboration between stakeholders and allowing them to share data safely and with trust.

NSCC NewsBytes April 2024
