Leanna L. House


This paper develops machine learning models of bike availability in the San Francisco Bay Area Bike Share System at two levels: network and station. Investigating bike-sharing systems (BSSs) at the station level is the full problem, and it would provide policymakers, planners, and operators with the level of detail needed to make important decisions and draw conclusions. We used Random Forest and Least-Squares Boosting as univariate regression algorithms to model the number of available bikes at the station level. For the multivariate regression, we applied Partial Least-Squares Regression (PLSR) to reduce the number of prediction models needed and to reproduce the spatiotemporal interactions among stations at the network level. Although prediction errors were slightly lower for the univariate models, we found that the multivariate model results were promising for network-level prediction, especially in systems with a relatively large number of spatially correlated stations. Moreover, the station-level analysis suggested that demographic information and other environmental variables were significant factors in modeling bike availability in BSSs. We also demonstrated that the number of available bikes at a station at time t had a notable influence on the bike count models. Neighboring stations and the prediction horizon were also significant predictors, with 15 minutes being the most effective prediction horizon.
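The station-level setup summarized above pairs what is observed at time t (a station's own bike count and its neighbors' counts) with the count one prediction horizon later. The following is a minimal sketch of that data preparation step only, not the authors' code. The function name `build_lagged_rows`, the station IDs, and the counts are hypothetical, and the sketch assumes counts sampled every 15 minutes, so one step corresponds to a 15-minute horizon.

```python
from typing import Dict, List, Tuple


def build_lagged_rows(
    counts: Dict[str, List[int]],
    station: str,
    neighbors: List[str],
    horizon_steps: int,
) -> Tuple[List[List[int]], List[int]]:
    """Pair features observed at time t with the target at t + horizon_steps."""
    series = counts[station]
    features: List[List[int]] = []
    targets: List[int] = []
    # Stop early enough that the target index t + horizon_steps stays in range.
    for t in range(len(series) - horizon_steps):
        # Feature row: the station's own count at t, then each neighbor's count at t.
        row = [series[t]] + [counts[n][t] for n in neighbors]
        features.append(row)
        targets.append(series[t + horizon_steps])
    return features, targets


# Hypothetical counts, assumed to be sampled every 15 minutes.
counts = {
    "A": [10, 9, 7, 8, 6],
    "B": [3, 4, 6, 5, 7],
}

# One step ahead, i.e. a 15-minute horizon under the sampling assumption.
X, y = build_lagged_rows(counts, "A", ["B"], horizon_steps=1)
```

The rows in `X` and targets in `y` could then be passed to any univariate regressor, such as a random forest, with one model per station. A multivariate approach such as PLSR would instead stack the targets of all stations into a single response matrix.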



Publication Details

Date of publication:
September 20, 2020
Cornell University
Publication note:

Huthaifa I. Ashqar, Mohammed Elhenawy, Hesham A. Rakha, Mohammed H. Almannaa, Leanna House: Network and Station-Level Bike-Sharing System Prediction: A San Francisco Bay Area Case Study. CoRR abs/2009.09367 (2020)