International Choice Modelling Conference 2015

Introducing Non-Normality of Latent Psychological Constructs in Choice Modeling with an Application to Bicyclist Route Choice
Chandra R Bhat, Subodh Dubey, Kai Nagel

Last modified: 11 May 2015


Economic choice modeling has continually seen improvements and refinements in specification, partly because of the availability of new techniques to estimate models. One such development is the incorporation of random taste heterogeneity (i.e., taste variations in response to explanatory variables) across decision makers using discrete (non-parametric) or continuous (parametric) or mixture (combination of discrete and continuous) random distributions for model coefficients. Such a specification also leads to correlations across alternative utilities when one or more random coefficients appear in the utility specifications of multiple alternatives. A second development is the explicit consideration of latent psychological constructs (such as attitudes, perceptions, values and beliefs) within the context of economic choice models, which has the advantage (over the random taste heterogeneity approach) that it imparts more structure to the underlying choice process based on theoretical concepts and notions drawn from the psychology field. Additionally, it provides the opportunity to efficiently introduce random taste variations and the concomitant correlations across alternative utilities. This second development, commonly referred to as integrated choice and latent variable (ICLV) models, may be viewed as a variation of the traditional structural equation methods (SEMs) to accommodate an unordered-response outcome.

Another area of intense research in the recent past, but originating more from the statistical field, is the consideration of non-normal distributions in modeling data.  This has been spurred by the increasing presence of multi-dimensional data that potentially exhibit non-normal features such as asymmetry, heavy tails, and even multimodality. Parametric approaches to accommodate non-normality span the gamut from finite mixtures of normal distributions to skew-normal distributions (and more general skew-elliptical distributions) to mixtures of skew-normal distributions (and mixtures of more general skew-elliptical distributions). Many recent studies use either a multivariate skew-normal or a skew-t distribution as the basis for accommodating non-normality, with different proposals on how to characterize these skew distributions (see Lee and McLachlan (2013) for a recent review and synthesis of the many different proposals). However, it is well recognized now that the underlying basis for all of the different proposals for the multivariate skew-normal distribution originates in the pioneering work of Azzalini and Dalla Valle (1996).
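
For reference, the density introduced by Azzalini and Dalla Valle (1996) for a standardized d-dimensional skew-normal vector is commonly written as shown below. The symbols $\Omega$ (a correlation matrix) and $\alpha$ (a shape vector) follow that common notation and do not appear in this abstract; the parameterization adopted in the paper itself may differ.

```latex
f(z) = 2\,\phi_d(z;\Omega)\,\Phi\!\left(\alpha^{\top} z\right), \qquad z \in \mathbb{R}^d
```

Here $\phi_d(\cdot;\Omega)$ is the d-variate normal density with zero mean and correlation matrix $\Omega$, and $\Phi$ is the univariate standard normal distribution function. Setting $\alpha = 0$ recovers the multivariate normal density, so normality is nested as a special case.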

In the current paper, we bring together the two developments discussed above: the ICLV model structure and the treatment of non-normality through a multivariate skew-normal (MSN) distribution specification. In particular, we allow the latent constructs in the ICLV model to be skew-normal. After all, there is no theoretical basis for specifying these constructs as normal (as is typically assumed in the literature); thus, there is substantial appeal in a more general non-normal specification whose shape is then determined empirically. To our knowledge, this is the first such ICLV model proposed in the econometric literature, and it has several important features. First, it recognizes the very real possibility that latent variables are non-normally distributed after conditioning on exogenous variables. Incorrectly imposing normality will, in general, lead to econometrically inconsistent and inefficient estimation in all of the ICLV model components. Second, our proposal to include non-normality exploits the latent factor structure of the ICLV model. That is, our approach constitutes a flexible, yet very efficient approach (through dimension-reduction) to accommodate a multivariate non-normal structure across all indicator and outcome variables through the specification of a much lower-dimensional multivariate skew-normal distribution for the structural errors. Third, taste variations (i.e., heterogeneity in sensitivity to response variables) can also be introduced efficiently and in a non-normal fashion through interactions of explanatory variables with the latent variables. Thus, for example, in a bicyclist route choice model, bicyclists who are more safety conscious (say a latent variable) than their peers may be more sensitive to motorized traffic volumes and on-street parking. By interacting safety consciousness with exogenous variables corresponding to motorized traffic volumes and on-street parking, we then allow non-normal taste variation in response to both these exogenous attributes, but originating from a single skew-normal distribution associated with the safety consciousness latent variable. Fourth, the MSN distribution that we use has properties that make it well suited for incorporation into the ICLV model. Finally, the MSN distribution has specific properties that enable the use of Bhat's (2011) maximum approximate composite marginal likelihood (MACML) inference approach for estimation of the resulting skew-normal ICLV (or SN-ICLV) model. This substantially simplifies the estimation approach because the dimensionality of integration in the composite marginal likelihood (CML) function that needs to be maximized to obtain a consistent estimator (under standard regularity conditions) for the SN-ICLV model parameters is independent of the number of latent variables and the number of ordinal indicator variables in the model system.
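
The interaction idea in the third point can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's estimation code: the latent variable is drawn from a univariate standard skew-normal using the well-known stochastic representation z = delta*|u0| + sqrt(1 - delta^2)*u1, and the value of `delta` and all four coefficient constants are hypothetical numbers chosen for illustration.

```python
import math
import random


def sample_skew_normal(delta, n, rng):
    """Draw n standard skew-normal variates using the representation
    z = delta*|u0| + sqrt(1 - delta^2)*u1, with u0, u1 independent N(0, 1)."""
    scale = math.sqrt(1.0 - delta * delta)
    draws = []
    for _ in range(n):
        u0 = rng.gauss(0.0, 1.0)
        u1 = rng.gauss(0.0, 1.0)
        draws.append(delta * abs(u0) + scale * u1)
    return draws


def skewness(xs):
    """Sample skewness: third central moment over variance to the power 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)

# Hypothetical latent "safety consciousness" scores, one per simulated bicyclist.
safety = sample_skew_normal(delta=0.9, n=50_000, rng=rng)

# Hypothetical base sensitivities and interaction terms (assumed values only).
BASE_TRAFFIC, INTERACT_TRAFFIC = -0.5, -0.3
BASE_PARKING, INTERACT_PARKING = -0.2, -0.1

# Individual-specific sensitivities: both are affine in the same latent variable,
# so both inherit its non-normal shape from a single skew-normal source.
beta_traffic = [BASE_TRAFFIC + INTERACT_TRAFFIC * s for s in safety]
beta_parking = [BASE_PARKING + INTERACT_PARKING * s for s in safety]

print("skewness of latent variable:    ", round(skewness(safety), 3))
print("skewness of traffic sensitivity:", round(skewness(beta_traffic), 3))
print("skewness of parking sensitivity:", round(skewness(beta_parking), 3))
```

Because both sensitivities are linear functions of one latent variable, they share the same skewness (sign-flipped here by the negative interaction terms) and are perfectly correlated in this toy setup. A full model would add further error terms and estimate the coefficients rather than fix them.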

The proposed SN-ICLV model is applied to model bicyclists' route choice behavior. In this study, two latent variables - pro-bicycle attitude and safety consciousness in the context of traffic crashes - are specified to moderate the effect of route attributes in bicyclist route choice decisions. A stated preference methodology using a web-based survey of Texas bicyclists provides the route choice data to implement the SN-ICLV model. The results show that individual-specific observed variables impact route choice through the latent constructs we develop and not directly, providing substantial support for the ICLV model structure and the specification used in the paper. Importantly, the results show evidence for non-normality in the latent constructs, with the proposed SN-ICLV model soundly rejecting the traditional ICLV model (with normal latent constructs) and a multinomial probit model (with unstructured heterogeneity in the influence of unobserved factors on the sensitivity to route attributes) based on data fit considerations.  Further, the results suggest that the most unattractive features of a bicycle route are long travel times (for commuters), heavy motorized traffic volume,  absence of a continuous bicycle facility, and high parking occupancy rates and long lengths of parking zones along the route.



Azzalini, A., Dalla Valle, A., 1996. The multivariate skew-normal distribution. Biometrika 83(4), 715-726.

Bhat, C.R., 2011. The maximum approximate composite marginal likelihood (MACML) estimation of multinomial probit-based unordered response choice models. Transportation Research Part B 45(7), 923-939.

Lee, S., McLachlan, G.J., 2013. Finite mixtures of multivariate skew t-distributions: Some recent and new results. Statistics and Computing 24(2), 181-202.
