4.7 Article

Privacy-Preserving Federated Deep Reinforcement Learning for Mobility-as-a-Service

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TITS.2023.3317358

Keywords

Privacy-preserving machine learning; federated reinforcement learning; mobility-as-a-service; passenger behavior; secret sharing


This research proposes a privacy-preserving federated deep deterministic policy gradient (FDDPG) algorithm to improve the profitability and passenger satisfaction of mobility-as-a-service (MaaS). Experimental results show that the method can increase MaaS profit and passenger satisfaction by approximately 90% and 15%, respectively, while maintaining stable training under agent dropout. The approach and findings could enhance MaaS utility and promote passenger trust and participation in MaaS and other data-driven transportation systems.
Mobility-as-a-service (MaaS) is a new transport model that combines multiple transport modes in a single platform. Because passenger behavior changes dynamically with past experience, optimizing MaaS services calls for a reinforcement-based approach. Deep reinforcement learning (DRL) may improve passenger satisfaction by offering the most appropriate transport services based on individual passenger experiences and preferences. However, a centralized DRL method introduces a new privacy risk for the MaaS platform: information leakage will occur if the platform is not carefully designed with privacy-preserving mechanisms. In this paper, we propose a federated deep deterministic policy gradient (FDDPG) algorithm that maximizes passenger satisfaction and long-term MaaS profit while preserving privacy. We enforce an equally weighted experience sampling mechanism to prevent sampling bias, so that the solution quality of FDDPG is statistically equivalent to that of the centralized algorithm. During model training and inference, information is processed locally and only the gradients are shared, which prevents information leakage to semi-honest participants and eavesdroppers. A secure aggregation protocol suited to the dynamic nature of mobile agents is also used in the gradient-sharing step to protect the algorithm against inference attacks. We perform experiments on real-world and synthetic scenarios based on New York City. The results show that the proposed FDDPG can improve MaaS profit and passenger satisfaction by about 90% and 15%, respectively, and maintain stable training under agent dropout. Our approach and findings could enhance MaaS utility and facilitate passenger trust and participation in MaaS and other data-driven transportation systems.
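
The gradient-sharing step with secret sharing can be illustrated with a minimal sketch. This is not the paper's protocol: it is a generic additive secret-sharing average, and it does not handle agent dropout. The names `PRIME`, `SCALE`, `encode`, `decode`, `share`, and `secure_average` are hypothetical, as are the modulus and the fixed-point scale. Each agent splits its local gradient into random shares that sum to the gradient modulo a prime, so no single share reveals the gradient, while the sum of all shares still yields the aggregate.

```python
import random

# Assumptions for illustration only (not taken from the paper):
PRIME = 2**61 - 1   # modulus of the finite field used for the shares
SCALE = 10**6       # fixed-point scale used to map floats to integers


def encode(x):
    """Map a float to a field element using fixed-point encoding."""
    return round(x * SCALE) % PRIME


def decode(v):
    """Map a field element back to a float (upper half = negative values)."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def share(gradient, n_shares, rng):
    """Split a gradient vector into n_shares additive shares modulo PRIME."""
    shares = [[rng.randrange(PRIME) for _ in gradient]
              for _ in range(n_shares - 1)]
    last = []
    for k, g in enumerate(gradient):
        # The last share is chosen so that all shares sum to the encoded value.
        last.append((encode(g) - sum(s[k] for s in shares)) % PRIME)
    return shares + [last]


def secure_average(gradients, rng=None):
    """Average the agents' gradients while only ever combining shares."""
    if rng is None:
        rng = random.SystemRandom()  # OS-provided randomness for the masks
    n = len(gradients)
    dim = len(gradients[0])
    # Agent i creates one share for every agent j (including itself).
    all_shares = [share(g, n, rng) for g in gradients]
    # Agent j adds up the shares it received; each partial sum looks random.
    partial = [[sum(all_shares[i][j][k] for i in range(n)) % PRIME
                for k in range(dim)] for j in range(n)]
    # The aggregator only sees partial sums, never an individual gradient.
    total = [sum(p[k] for p in partial) % PRIME for k in range(dim)]
    return [decode(t) / n for t in total]


if __name__ == "__main__":
    local_gradients = [[0.5, -1.0], [1.5, 2.0], [-0.5, 0.25]]
    print(secure_average(local_gradients))
```

In this sketch every agent must deliver all of its shares; a protocol that tolerates dropout, as the abstract describes, would need an additional recovery mechanism such as threshold secret sharing.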
