Workshop 2: Working with repeated choice data
Last modified: 8 July 2011
Abstract
A growing majority of choice modelling applications now use data in which multiple choices are captured for each individual, with such data coming primarily from stated choice surveys. Whatever the type of data, the presence of multiple choice observations for each respondent can have significant advantages. First of all, it is clearly cheaper to collect multiple observations from the same respondent than to collect a single observation from each of multiple respondents. More importantly, however, multiple responses allow a model to analyse choices by the same respondent across a range of different settings, which permits the analyst to identify statistically the differences between individuals as well as the differences between choice situations for the same respondent. This is not possible with cross-sectional data. While modellers have been quick to appreciate the advantages of panel data, there has been only limited work on adapting existing model structures to this kind of data. Indeed, the majority of choice modelling tools were developed to analyse a single choice per individual in cross-sectional data, whereas with panel data each respondent faces multiple choice situations.
The aim of this workshop is to discuss appropriate modelling approaches to deal with the specific characteristics of repeated choice data. The two presentations in this workshop present contrasting views.
The first presentation focuses on how the existing cross-sectional framework should be adapted to estimate sample-level models on repeated choice data. It addresses correlation across choices for the same respondent, heterogeneity across respondents and across choices for the same respondent, correction of standard errors when using cross-sectional estimation techniques, and the potential impact of behavioural phenomena such as fatigue, boredom and learning.
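To make the standard-error issue concrete, the following is a minimal sketch, not taken from the presentation, of one common correction: a pooled binary logit is estimated as if the data were cross-sectional, and its standard errors are then recomputed with a cluster-robust (sandwich) estimator that sums score contributions within each respondent. The simulated data, the function names and the choice of a binary logit are all assumptions made for illustration.

```python
import numpy as np

def fit_binary_logit(X, y, n_iter=50, tol=1e-10):
    """Pooled binary logit by Newton-Raphson, ignoring the panel structure."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                           # score of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X    # negative Hessian
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def classical_and_cluster_se(X, y, ids, beta):
    """Return (classical SE, respondent-clustered sandwich SE)."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    bread = np.linalg.inv((X * (p * (1.0 - p))[:, None]).T @ X)
    scores = X * (y - p)[:, None]                      # one score row per choice
    _, inv = np.unique(ids, return_inverse=True)
    cluster_scores = np.zeros((inv.max() + 1, X.shape[1]))
    np.add.at(cluster_scores, inv, scores)             # sum scores per respondent
    meat = cluster_scores.T @ cluster_scores
    robust_cov = bread @ meat @ bread
    return np.sqrt(np.diag(bread)), np.sqrt(np.diag(robust_cov))

# Simulated panel (assumed): 200 respondents, 8 binary choices each,
# with respondent-specific tastes that induce correlation across choices.
rng = np.random.default_rng(0)
n_resp, n_tasks = 200, 8
ids = np.repeat(np.arange(n_resp), n_tasks)
X = rng.normal(size=(n_resp * n_tasks, 2))             # attribute differences
beta_i = np.array([1.0, -0.5]) + rng.normal(scale=0.8, size=(n_resp, 2))
utility_diff = np.sum(X * beta_i[ids], axis=1)
y = (rng.random(n_resp * n_tasks) < 1.0 / (1.0 + np.exp(-utility_diff))).astype(float)

beta_hat = fit_binary_logit(X, y)
se_classical, se_cluster = classical_and_cluster_se(X, y, ids, beta_hat)
print("estimates:   ", beta_hat)
print("classical SE:", se_classical)
print("clustered SE:", se_cluster)
```

The correction leaves the point estimates unchanged and only adjusts their estimated precision. It does not model the heterogeneity or the correlation itself, which is a separate question.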
The second presentation shows how the presence of multiple observations for each individual can be exploited to estimate individual-specific models, rather than sample-level models. Two methods are presented. The first approach is ordinary least squares regression combined with parameter scaling. The second approach is a modified maximum likelihood estimation technique. Results are presented for simulated and empirical data sets. The performance of these single-individual models is compared with that of individual-level parameters taken from the matrix of posterior means produced by hierarchical Bayes techniques. Performance is compared in terms of both parameter recovery and out-of-sample holdout prediction. Implications of sample size, number of choice tasks per individual, and the complexity of the underlying data generating process are discussed.
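The abstract does not say how the least squares estimates are scaled, so the following is only one possible reading, sketched under stated assumptions and not the presenters' method. For a single respondent, the 0/1 choice indicator is regressed on the attributes by ordinary least squares to obtain a direction for the taste parameters. A single positive scale factor is then chosen by a grid search that maximises the multinomial logit log-likelihood along that direction. The function names, the grid and the simulated respondent are all hypothetical.

```python
import numpy as np

def ols_direction(X, chosen):
    """OLS of the 0/1 choice indicator on attributes for one respondent.

    X has shape (T, J, K): T tasks, J alternatives, K attributes.
    chosen has shape (T,) and holds the index of the chosen alternative.
    """
    T, J, K = X.shape
    y = np.zeros((T, J))
    y[np.arange(T), chosen] = 1.0
    # Centre within each task so that no intercept is needed.
    Xc = (X - X.mean(axis=1, keepdims=True)).reshape(T * J, K)
    yc = (y - 1.0 / J).reshape(T * J)
    b, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return b

def mnl_loglik(X, chosen, beta):
    """Multinomial logit log-likelihood of one respondent's choices."""
    v = X @ beta                                   # utilities, shape (T, J)
    v = v - v.max(axis=1, keepdims=True)           # numerical stabilisation
    logp = v - np.log(np.exp(v).sum(axis=1, keepdims=True))
    return logp[np.arange(len(chosen)), chosen].sum()

def scale_by_grid(X, chosen, b, grid=np.logspace(-2, 3, 200)):
    """Pick the scale (assumed grid) that maximises the logit log-likelihood."""
    ll = np.array([mnl_loglik(X, chosen, lam * b) for lam in grid])
    return grid[np.argmax(ll)]

# One simulated respondent (assumed): 20 tasks, 3 alternatives, 2 attributes.
rng = np.random.default_rng(1)
T, J, K = 20, 3, 2
X = rng.normal(size=(T, J, K))
beta_true = np.array([1.0, -1.0])
chosen = np.argmax(X @ beta_true + rng.gumbel(size=(T, J)), axis=1)

b = ols_direction(X, chosen)
lam = scale_by_grid(X, chosen, b)
beta_scaled = lam * b
print("OLS direction:  ", b)
print("chosen scale:   ", lam)
print("scaled estimate:", beta_scaled)
```

A bounded grid is used rather than an unconstrained optimiser because, with few tasks per respondent, the least squares direction can predict every choice correctly, in which case the likelihood-maximising scale is unbounded.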
Full Text: PPTX