Most transit agencies are trying to increase ridership, and to do so they seek to maintain or even improve their level of service. This is difficult because traffic congestion generally keeps increasing. As a result, bus travel times become longer and less reliable, which makes it harder to predict travel times and avoid bunching. Being able to accurately predict bus travel speeds, and to update these predictions with real-time information, could improve the quality and reliability of the information given to users and increase the effectiveness of control schemes.
In this work we implement and compare three machine learning methods (Artificial Neural Networks, Support Vector Machines and Bayes Networks) to predict bus travel speeds using real-time information about traffic conditions. The proposed algorithms are compared against two common approaches to predicting travel speeds. We apply traffic shockwave theory to select the predictors that feed our models. The input data for each model were speeds obtained and processed from GPS devices installed on each of the buses of Transantiago, the public transportation system of Santiago, Chile. Two types of speed were available: historical speed and real-time speed, each for a given route segment and period of the day. Our results show that machine learning algorithms can outperform naïve predictions that use either only historical data or only real-time data with a 10-min delay. In particular, the Artificial Neural Network (ANN) algorithm achieved the best results, with improvements of up to 23% in root mean square error (RMSE) compared with the best benchmark model, and up to 3.3% in RMSE compared with the second-best machine learning algorithm studied. Moreover, we validated our hypothesis that real-time data helps improve prediction accuracy, by up to 35% in RMSE.
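To make the comparison metric concrete, the following minimal sketch shows how the RMSE of the two naïve benchmarks and a relative RMSE improvement could be computed. It is only an illustration: the function names (`rmse`, `relative_improvement`) and all speed values are hypothetical and are not taken from the Transantiago data or from the implementation used in this work.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse, model_rmse):
    """Fractional reduction in RMSE of a model relative to a baseline."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Made-up speeds (km/h) for one route segment; NOT data from the study.
observed = [20.0, 18.0, 15.0, 17.0]
historical = [19.0, 19.0, 19.0, 19.0]  # historical speed for the period
delayed = [22.0, 20.0, 18.0, 15.0]     # real-time speed seen 10 min earlier

rmse_historical = rmse(historical, observed)
rmse_delayed = rmse(delayed, observed)

# The "best benchmark" is the naive predictor with the lower RMSE.
best_benchmark = min(rmse_historical, rmse_delayed)
print(f"historical-only RMSE: {rmse_historical:.3f}")
print(f"delayed real-time RMSE: {rmse_delayed:.3f}")
```

Under this definition, a reported improvement of 23% means that `relative_improvement(best_benchmark, model_rmse)` equals 0.23 for the model being evaluated.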