International Choice Modelling Conference 2017

Andrew Bwambale, Charisma F Choudhury, Stephane Hess

Last modified: 28 March 2017


Route choice modelling plays a critical role in evaluating individuals’ attitudes towards route attributes, which helps planners to forecast transport network usage under different scenarios (1). Traditionally, data for route choice modelling has been collected by asking individuals to describe the routes taken for particular trips using paper or web-based questionnaires; however, such approaches are expensive, prone to reporting errors, and yield low response rates. These problems are particularly acute in Africa due to budget constraints. Recently, there has been increased advocacy for passive data collection using GPS sensors (e.g. smartphones); however, these are also expensive and data collection usually covers small samples (2). At the same time, there has been growing interest in the use of ubiquitous data to understand travel behaviour. In particular, mobile phone data has emerged as a promising source of mobility information due to the high penetration rate of mobile phones in several parts of the world, including Africa (3).

Previous studies have used mobile phone signalling data to capture and analyse route choice behaviour (e.g. 4). Signalling data contains roughly continuous mobile phone positions associated with handover and location area update events, which occur more regularly than calling and texting events. Such data is, however, not readily available because network operators usually discard it due to storage capacity constraints. By contrast, network operators usually maintain large-scale call detail records (CDRs) for billing purposes. These records, however, only contain approximate locations associated with users’ calling and texting events, resulting in discontinuous location information. Such discontinuities make CDRs unsuitable for inferring the routes taken for short trips: the probability that users will travel from origin to destination without making calls or sending texts is high, which makes it difficult to capture the intermediate locations needed for route identification. Moreover, the location accuracy of CDRs is not sufficient for identifying local trips within network cell areas. Users are, however, more likely to make calls or send texts during long-distance trips, which makes it easier to capture intermediate locations for route identification between regions of interest. Whereas methods of using CDRs to infer the routes taken for long-distance trips have been proposed (5), no study has yet used such data to investigate the factors affecting route choice behaviour.
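
The contrast between short and long trips can be illustrated with a simple worked equation. This is only a sketch under an assumption that is not part of the study: a user's calling and texting events are taken to follow a homogeneous Poisson process with rate $\lambda$ events per hour, and the value $\lambda = 0.5$ is hypothetical.

```latex
% Assumption (illustrative only): events arrive as a Poisson process with rate \lambda.
% Probability that a trip of duration T produces no CDR observation:
\[
  P(\text{no event during } T) = e^{-\lambda T}
\]
% With a hypothetical \lambda = 0.5 events per hour:
\[
  T = 0.5\,\text{h}: \; e^{-0.25} \approx 0.78,
  \qquad
  T = 5\,\text{h}: \; e^{-2.5} \approx 0.08
\]
```

Under these assumed values, a half-hour trip would leave no trace in the CDRs about 78% of the time, whereas a five-hour trip would go unobserved only about 8% of the time.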

In this study we analyse how route choice for long-distance trips is affected by route variables such as road characteristics (pavement condition and type), terrain (gradient), perceptions of road safety (accident rates), adjacent land use, the number of metropolitan areas traversed by the road, route length, and travel time. We assume that individuals travelling between any two regions of interest are more likely to choose the route that offers them the highest utility. We define the utility of each alternative route as a function of the route variables mentioned above and use discrete choice models to estimate the parameters associated with each variable. Since route choice involves a large number of alternatives, which are sometimes difficult to identify, we assume that choice sets are latent and vary across individuals. Based on these assumptions, we propose a new technique for choice set generation prior to model estimation and compare it with existing choice set generation strategies (6).
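
As an illustration of how a utility defined on route variables translates into choice probabilities, the following minimal sketch uses a multinomial logit with a linear-in-parameters utility. The abstract only states that discrete choice models are used, so the logit form is an assumption here, and the route names, attribute names and coefficient values are all hypothetical. Latent choice sets are not modelled in this sketch.

```python
import math

def route_utilities(routes, beta):
    """Linear-in-parameters utility: V_i = sum_k beta_k * x_ik."""
    return {name: sum(beta[k] * x[k] for k in beta) for name, x in routes.items()}

def mnl_probabilities(utilities):
    """Multinomial logit probabilities: P_i = exp(V_i) / sum_j exp(V_j)."""
    v_max = max(utilities.values())  # subtract the maximum for numerical stability
    exp_v = {name: math.exp(v - v_max) for name, v in utilities.items()}
    total = sum(exp_v.values())
    return {name: e / total for name, e in exp_v.items()}

# Hypothetical attributes and coefficients, for illustration only;
# none of these values come from the study.
routes = {
    "route_A": {"length_km": 250.0, "travel_time_h": 4.0, "paved": 1.0},
    "route_B": {"length_km": 220.0, "travel_time_h": 4.5, "paved": 0.0},
}
beta = {"length_km": -0.01, "travel_time_h": -0.8, "paved": 0.5}

probs = mnl_probabilities(route_utilities(routes, beta))
print(probs)
```

In an actual estimation the coefficients in `beta` would be unknown and fitted by maximum likelihood to the observed route choices; here they are fixed only to show the mapping from attributes to probabilities.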

We implement this study using data from Orange’s Data for Development (D4D) challenge collected from Senegal (7). There are three datasets in total, all based on the CDRs of over 9 million Orange subscribers in Senegal covering the whole of 2013. These are: (1) aggregate hourly antenna-to-antenna traffic; (2) cell-level mobility data for random sub-samples of approximately 300,000 users on a rolling fortnight basis for a year; and (3) arrondissement-level mobility data for random sub-samples of approximately 150,000 users on a rolling fortnight basis for a year. Our study relies on the second dataset. We first define the origin and destination regions of interest and then assign the cells located within each region’s boundary to the region. We extract users who are detected to have travelled directly between the regions of interest. From these, we only consider users whose estimated travel time is within the realistic range for the route. We define the cells traversed by the alternative routes between the regions of interest and distinguish between the cells shared by alternative routes and those unique to each route. We track each individual’s calling and texting events associated with the cells unique to each route to identify the most probable route taken. We then extract the variables associated with each route alternative using secondary data sources such as national statistics, land use maps and Google Maps, and estimate a route choice model incorporating bias correction (8), since the estimation sample only contains individuals who own and frequently use their mobile phones. We analyse the signs and relative magnitudes of the parameters to evaluate the effect of each variable and investigate the predictive performance of the model using a validation sample. The results of the study demonstrate the potential of CDRs for route choice studies.
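
The travel time filter and the route identification step described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the cell identifiers, the travel time bounds and the rule of assigning a trip to the route with the most events in its unique cells are all assumptions made for the example.

```python
from collections import Counter

def travel_time_ok(depart_h, arrive_h, min_h, max_h):
    """Keep a trip only if its estimated duration lies within a plausible range."""
    return min_h <= (arrive_h - depart_h) <= max_h

def most_probable_route(event_cells, route_cells):
    """Assign a trip to the route whose unique cells recorded the most events.

    event_cells: cell ids of the user's call/text events during the trip.
    route_cells: dict mapping route name -> set of cell ids the route traverses.
    Returns None if no event falls in a unique cell, or if the top routes tie.
    """
    # A cell is unique to a route if no other route traverses it.
    unique = {}
    for name, cells in route_cells.items():
        others = set().union(*(c for n, c in route_cells.items() if n != name))
        unique[name] = cells - others

    # Count events per route, ignoring cells shared between routes.
    hits = Counter()
    for cell in event_cells:
        for name, cells in unique.items():
            if cell in cells:
                hits[name] += 1

    if not hits:
        return None
    ranked = hits.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous trip: equal evidence for two routes
    return ranked[0][0]

# Hypothetical example: cells 1 and 4 are shared, the others are route-specific.
route_cells = {"north": {1, 2, 3, 4}, "south": {1, 5, 6, 4}}
print(most_probable_route([1, 2, 3, 4], route_cells))
print(most_probable_route([1, 4], route_cells))
print(travel_time_ok(8.0, 12.5, min_h=3.0, max_h=6.0))
```

Trips for which the function returns `None` carry no usable evidence about the route taken; how such trips are treated in the study is not stated in the abstract.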


1. Prato, C.G., Route choice modeling: past, present and future research directions. Journal of Choice Modelling, 2009. 2(1): p. 65-100.

2. Bierlaire, M., J. Chen, and J. Newman, Modeling route choice behavior from smartphone GPS data. Report TRANSP-OR, 2010. 101016: p. 2010.

3. GSMA Intelligence. The Mobile Economy 2015. 2015 [cited 26 July 2016]; Available from:

4. Schlaich, J., Analyzing route choice behavior with mobile phone trajectories. Transportation Research Record: Journal of the Transportation Research Board, 2010(2157): p. 78-85.

5. Doyle, J., et al., Utilising mobile phone billing records for travel mode discovery. 2011.

6. Bovy, P.H., On modelling route choice sets in transportation networks: a synthesis. Transport Reviews, 2009. 29(1): p. 43-68.

7. de Montjoye, Y.-A., et al., D4D-Senegal: the second mobile phone data for development challenge. arXiv preprint arXiv:1407.4885, 2014.

8. Heckman, J.J., Sample selection bias as a specification error (with an application to the estimation of labor supply functions). 1977, National Bureau of Economic Research Cambridge, Mass., USA.

