International Choice Modelling Conference, International Choice Modelling Conference 2015

Incorporating a Multiple Discrete-Continuous Outcome in the Generalized Heterogeneous Data Model: Application to Residential Self-Selection Effects Analysis in an Activity Time-use Behavior Model
Chandra R Bhat, Sebastian Astroza, Subodh Dubey

Last modified: 11 May 2015


The joint modeling of multiple outcomes is of substantial interest in several fields. In econometric terminology, this jointness may arise because of the impact (on the multiple choice outcomes) of common underlying exogenous observed variables, or common underlying exogenous unobserved variables, or a combination of the two. If the jointness among the outcomes arises solely from observed exogenous factors, then the analyst may proceed by modeling each outcome independently. However, when one or more unobserved factors affect the multiple outcomes, modeling the outcomes independently results in inefficient estimation of the covariate effects for each outcome, because such an approach fails to borrow information from the other outcomes (Teixeira-Pinto and Harezlak, 2013). More importantly, if some of the endogenous outcomes are used to explain other endogenous outcomes, and the outcomes are not modeled jointly in the presence of unobserved exogenous variable effects, the result is inconsistent estimation of the effects of one endogenous outcome on another (see Bhat and Guo, 2007, and Mokhtarian and Cao, 2008).
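The inconsistency described above can be illustrated with a small simulation. The following is a minimal sketch under assumed linear relationships, not the model proposed in this paper: a single unobserved factor `z` drives two outcomes, and the second outcome also depends on the first. All names and values (`z`, `y1`, `y2`, `true_effect`, `loading`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_self_selection(n=200_000, true_effect=0.5, loading=1.0, seed=0):
    """Compare a naive and an oracle estimate of the effect of y1 on y2.

    Assumed (hypothetical) data-generating process:
        y1 = loading * z + e1
        y2 = true_effect * y1 + loading * z + e2
    where z is a latent factor that the analyst does not observe.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # unobserved common factor
    y1 = loading * z + rng.normal(size=n)       # first endogenous outcome
    y2 = true_effect * y1 + loading * z + rng.normal(size=n)

    # Naive approach: regress y2 on y1 alone, ignoring the common factor.
    naive = np.cov(y1, y2)[0, 1] / np.var(y1, ddof=1)

    # Oracle approach: also control for z (not possible with real data,
    # since z is unobserved; shown only as a benchmark).
    design = np.column_stack([np.ones(n), y1, z])
    coefficients = np.linalg.lstsq(design, y2, rcond=None)[0]
    oracle = coefficients[1]
    return naive, oracle


if __name__ == "__main__":
    naive, oracle = simulate_self_selection()
    print(f"naive estimate:  {naive:.3f}")
    print(f"oracle estimate: {oracle:.3f}")
```

Under these assumptions the naive slope converges to the true effect plus cov(z, y1)/var(y1), so it stays away from the true effect no matter how large the sample is, while the benchmark that controls for `z` recovers it. A joint model with latent factors aims to achieve the latter without observing `z` directly.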

The joint modeling of multiple outcomes has been a subject of interest for many years, dominated by the joint modeling of multiple continuous outcomes. However, in many cases, the outcomes of interest are not all continuous, and are non-commensurate (i.e., not of the same type, such as continuous, count, and nominal outcomes). The joint modeling of non-commensurate outcomes is more difficult because there is no convenient multivariate distribution to jointly (and directly) represent the relationship between discrete and continuous outcomes. This is particularly the case when one of the dependent outcomes is of a multiple discrete-continuous (MDC) nature. An outcome is said to be of the MDC type if it exists in multiple states that can be jointly consumed to different continuous amounts. Examples of MDC situations include activity time-use, household vehicle holdings and usage, brand choice and purchase quantity for frequently purchased grocery items (such as cookies, ready-to-eat cereals, soft drinks, and yoghurt), and stock selection and investment amounts (see Bhat et al., 2009 and Pinjari and Bhat, 2014 for detailed reviews of MDC contexts).
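To make the MDC definition concrete, the following minimal sketch splits one hypothetical time-use observation into its discrete part (which activities are chosen) and its continuous part (how much time each receives). The function name, the activity labels, and the 300-minute budget are assumptions made for illustration; they do not come from the paper or its data.

```python
def split_mdc_observation(amounts, budget, tol=1e-9):
    """Split one MDC observation into its discrete and continuous parts.

    amounts: mapping from alternative name to the continuous amount consumed
             (zero means the alternative was not chosen).
    budget:  total amount available (an assumption of this sketch).
    """
    if any(a < 0 for a in amounts.values()):
        raise ValueError("amounts must be non-negative")
    if abs(sum(amounts.values()) - budget) > tol:
        raise ValueError("amounts must add up to the budget")

    # Discrete part: the set of alternatives with positive consumption.
    chosen = sorted(name for name, a in amounts.items() if a > 0)
    # Continuous part: the amount consumed of each chosen alternative.
    consumed = {name: amounts[name] for name in chosen}
    return chosen, consumed


if __name__ == "__main__":
    # Hypothetical individual with 300 minutes of non-work time.
    observation = {
        "shopping": 120.0,
        "recreation": 0.0,
        "social": 45.0,
        "in_home": 135.0,
    }
    chosen, consumed = split_mdc_observation(observation, budget=300.0)
    print(chosen)
    print(consumed)
```

Unlike a standard nominal choice, more than one alternative can be chosen at once, and each chosen alternative carries its own continuous amount.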

In this paper, we introduce a joint mixed model that includes an MDC outcome and a nominal discrete outcome, in addition to count, binary/ordinal, and continuous outcomes. These outcomes are modeled together by specifying latent underlying unobserved individual lifestyle, personality, and attitudinal factors that impact the many outcomes and generate the jointness among them. Reported subjective attitudinal indicators for the latent variables provide additional information and stability to the model system. We build on the Generalized Heterogeneous Data Model (GHDM), Bhat's (2014) combination of the structural equations modeling method with the integrated choice and latent variable model, to develop a powerful and parsimonious way of jointly analyzing mixed outcomes including an MDC outcome. In addition, we formulate and implement a practical estimation approach for the resulting GHDM-MDC (GHDM including an MDC outcome) using Bhat's (2011) maximum approximate composite marginal likelihood (MACML) inference approach.

At the same time, we focus on examining residential self-selection in the context of an activity-based modeling (ABM) paradigm. As pointed out by Pinjari et al. (2008) and more recently by Chen et al. (2014), although the ABM paradigm is now increasingly accepted, even in practice, as the approach of choice for travel analysis, there has been little consideration of residential self-selection issues within it. The central basis of the ABM paradigm is that individuals' activity-travel patterns are a result of their time-use decisions: individuals have 24 hours in a day (or multiples of 24 hours for longer periods of time) and decide how to use that time among activities and travel (and with whom), subject to their sociodemographic, spatial, temporal, transportation system, and other contextual constraints (see Pinjari and Bhat, 2011). In the activity-based approach, the impact of land-use and demand management policies on time-use behavior is an important precursor step to assessing the impact of such policies on individual travel behavior. Accordingly, in this paper, we jointly model residential location-related choices along with auto ownership and activity time-use in different activities in a way that has a social-psychological underpinning through latent variables, while also explicitly considering residential self-selection issues. Residential location choice in this paper is represented as a nominal discrete choice among a multinomial set of four land-use density categories, as defined by housing unit density (housing units per square mile) within census blocks. In addition, we characterize residential location choice by a second, continuous outcome representing the average commute distance across workers in the household. This is because it has been well established in the literature that commute distance is one of the most important determinants of residential location (see, for example, Clark et al., 2003; Rashidi et al., 2012). The use of density and commute distance to represent physical location helps circumscribe the actual spatial location of residences in a downstream model. Of the other two choice dimensions considered here, household auto ownership is a count outcome, and individual activity time-use by activity type in non-work activities is an MDC outcome.

The empirical application utilizes data from the 2006 Puget Sound Household Travel Survey and expressly acknowledges that there are latent psychological constructs affecting choice, and that these constructs get manifested in observed psychological indicators as well as observed residential choice, auto ownership and activity time-use outcomes. The magnitude of the residential self-selection effect is quantified, and the implication of ignoring this effect when assessing the effect of the built environment on activity time use is examined.



Bhat, C.R. (2011). The maximum approximate composite marginal likelihood (MACML) estimation of multinomial probit-based unordered response choice models. Transportation Research Part B, 45(7), 923-939.

Bhat, C.R. (2014). A new generalized heterogeneous data model (GHDM) to jointly model mixed types of dependent variables. Technical paper, Department of Civil, Architectural and Environmental Engineering, The University of Texas at Austin. Available on:

Bhat, C.R., and Guo, J.Y. (2007). A comprehensive analysis of built environment characteristics on household residential choice and auto ownership levels. Transportation Research Part B, 41(5), 506-526.

Bhat, C.R., Sen, S., and Eluru, N. (2009). The impact of demographics, built environment attributes, vehicle characteristics, and gasoline prices on household vehicle holdings and use. Transportation Research Part B, 43(1), 1-18.

Chen, C., Mei, Y., and Liu, Y. (2014). Does distance still matter in facilitating social ties? The roles of mobility patterns and the built environment. Transportation Research Board 93rd Annual Meeting.

Clark, W. A., Huang, Y., and Withers, S. (2003). Does commuting distance matter?: Commuting tolerance and residential change. Regional Science and Urban Economics, 33(2), 199-221.

Mokhtarian, P.L., and Cao, X. (2008). Examining the impacts of residential self-selection on travel behavior: A focus on methodologies. Transportation Research Part B, 42(3), 204-228.

Pinjari, A.R., and Bhat, C.R. (2011). Activity based travel demand analysis. A Handbook of Transport Economics, Chapter 10, 213-248, edited by A. de Palma, R. Lindsey, E. Quinet, and R. Vickerman, Edward Elgar Publishing Ltd.

Pinjari, A.R., and Bhat, C.R. (2014). Computationally efficient forecasting procedures for Kuhn-Tucker consumer demand model systems: application to residential energy consumption analysis. Technical paper, Department of Civil and Environmental Engineering. University of South Florida.

Pinjari, A. R., Eluru, N., Bhat, C.R., Pendyala, R.M., and Spissu, E. (2008). Joint model of choice of residential neighborhood and bicycle ownership: accounting for self-selection and unobserved heterogeneity. Transportation Research Record: Journal of the Transportation Research Board, 2082, 17-26.

Rashidi, T.H., Auld, J., and Mohammadian, A.K. (2012). A behavioral housing search model: Two-stage hazard-based and multinomial logit approach to choice-set formation and location selection. Transportation Research Part A, 46(7), 1097-1107.

Teixeira-Pinto, A., and Harezlak, J. (2013). Factorization and latent variable models for joint analysis of binary and continuous outcomes. Analysis of Mixed Data, pp. 81-91, Chapman and Hall/CRC.
