Last modified: 28 March 2017

#### Abstract

Modeling individual choices has long been an important avenue of research in fields as diverse as marketing, transportation, political science, and environmental, health, and urban economics. In all these areas, the most widely used method for modeling choice among mutually exclusive alternatives has been the Conditional or Multinomial Logit model (MNL) (McFadden, 1974), which belongs to the family of Random Utility Maximization (RUM) models. The main advantage of the MNL model is its simplicity in terms of both estimation and interpretation of the resulting choice probabilities and elasticities. However, it has long been recognized that MNL not only imposes constant competition across alternatives ---a consequence of the independence of irrelevant alternatives (IIA) property--- but also lacks the flexibility to allow for individual-specific preferences.

In the last two decades, advances in simulation-aided inference have allowed researchers to specify and estimate econometric models with random parameters. In discrete choice models, random parameters represent unobserved heterogeneity in preferences ---marginal utilities--- or in willingness to pay, depending on the parameterization of the parameter space. Continuous mixture models, and more specifically the Mixed Logit model (MIXL; the most popular discrete choice model in current applied work), assume parametric heterogeneity distributions for the unknown preference parameters in order to model how tastes vary in the population (Hensher and Greene, 2003).

In addition to MIXL, other models can represent unobserved preference heterogeneity. Latent class (LC) discrete choice models offer an alternative to MIXL by replacing the continuous distribution assumption with a discrete distribution in which preference heterogeneity is captured by membership in distinct classes or segments (Boxall and Adamowicz, 2002; Greene and Hensher, 2003; Shen, 2009). The standard LC specification with a class-specific multinomial logit choice model (LC-MNL) is useful if the assumption of preference homogeneity holds within segments. In effect, in an LC-MNL model all individuals in a given class share the same parameters (fixed parameters within a class), while the parameters vary across classes (heterogeneity across classes).
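An individual's likely class in an LC-MNL follows from Bayes' rule: the posterior probability of class q is the class share times the likelihood of the individual's observed choice sequence under that class's parameters, normalized over classes. A minimal sketch of this calculation (the function names and the synthetic two-class setup below are our own illustration, not code from any of the cited papers):

```python
import numpy as np

def mnl_seq_prob(beta, X, y):
    """Likelihood of the observed choice sequence y under tastes beta.
    X: (T, J, K) attributes for T choice situations and J alternatives;
    y: (T,) indices of the chosen alternatives."""
    u = X @ beta
    p = np.exp(u - u.max(axis=1, keepdims=True))   # numerically stable softmax
    p /= p.sum(axis=1, keepdims=True)
    return np.prod(p[np.arange(len(y)), y])

def posterior_class_shares(X, y, class_betas, class_shares):
    """Posterior membership probabilities:
    pi_q * P(y | beta_q) / sum_s pi_s * P(y | beta_s)."""
    lik = np.array([mnl_seq_prob(b, X, y) for b in class_betas])
    w = np.asarray(class_shares) * lik
    return w / w.sum()
```

With enough choice situations per individual, the posterior concentrates on the class whose parameters generated the individual's choices.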

As consumer preferences and sensitivities become more diverse, it is less informative to consider the market in the aggregate, or even to describe only the whole distribution of random tastes. In fact, accounting for differing sensitivities is the basis for targeted communications programs and promotions in the marketing context (Allenby and Rossi, 1999). Similarly, characterizing individual-specific willingness to pay (WTP) for certain attributes can yield insights for consumer-oriented product design. Thus, beyond characterizing the parameters that describe the random tastes, it is equally important to know the likely location of a given individual on the (continuous or discrete) heterogeneity distribution. Revelt and Train (2000) have shown, in the context of continuous heterogeneity, how researchers can analyze individual-specific tastes by moving from the unconditional (i.e., sample population) distribution to a conditional distribution of preferences at the individual level. The derived individual-specific estimates can then be used to develop segments, identify outliers, and simulate market choices (Huber and Train, 2001). In the same vein, Hess and Rose (2007) pointed out the advantages (gains in flexibility while reducing the impact of the unconditional parametric assumptions) and disadvantages (out-of-sample forecasting) of the conditional approach relative to the unconditional approach. However, despite the increasing number of MIXL applications that examine individual-specific preferences in practice (see, for example, Greene et al., 2005; Sillano and de Dios Ortuzar, 2005; Hensher et al., 2006; Hess and Hensher, 2010), empirical applications that use the conditional approach remain scarce, partly because of its limited implementation in available software (Hess, 2006).
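In the MIXL case, the conditional (individual-level) mean is simulated as a likelihood-weighted average of draws from the population distribution, following Revelt and Train (2000): the weight on draw r is P(y_n | beta_r) normalized over draws. A minimal sketch for a MIXL with independent normal coefficients (names, dimensions, and the diagonal mixing distribution are our own illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def seq_prob(beta, X, y):
    """Probability of individual n's observed choice sequence given beta.
    X: (T, J, K) attributes; y: (T,) chosen alternative indices."""
    u = X @ beta
    p = np.exp(u - u.max(axis=1, keepdims=True))   # numerically stable softmax
    p /= p.sum(axis=1, keepdims=True)
    return np.prod(p[np.arange(len(y)), y])

def conditional_mean(X, y, mu, sigma, R=2000):
    """Simulated conditional mean: likelihood-weighted average of R draws
    from the (estimated) population distribution N(mu, diag(sigma^2))."""
    draws = rng.normal(mu, sigma, size=(R, len(mu)))
    w = np.array([seq_prob(b, X, y) for b in draws])
    w /= w.sum()
    return w @ draws
```

The same weights applied to any function of the draws (e.g., a WTP ratio) give the corresponding conditional expectation for that individual.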

Equally important is the statistical significance of the individual-specific estimates. Consider a coefficient of 0.001 for individual i for some attribute. Is this coefficient truly positive for that particular individual, or is it statistically indistinguishable from zero? This simple question shows that it is also of interest to determine whether the conditional mean for a given individual is truly positive, negative, or zero by constructing confidence intervals (cf. Craig et al., 2005; Daziano and Achtnicht, 2014; Greene et al., 2014).

The need for standard errors becomes even more relevant when computing individuals' WTP. In particular, the WTP for an attribute is derived from the distribution of the ratio of individual coefficients. This procedure has the disadvantage that the resulting distribution of WTP may not have finite moments (Daly et al., 2011). To overcome this problem, researchers have increasingly used individual-specific estimates to allow for individual heterogeneity in WTP measures. The argument is that WTP measures based on individual-specific estimates yield a more reasonable distribution than simply taking the ratio of two random coefficients. However, this procedure ignores the fact that, with a small number of choice situations, the conditional distribution can be inconsistent. In fact, the variance of the conditional distribution is smaller than the true variance in the population. In other words, a lower variance in the conditional WTP distribution is not a sign of a better specification or of more reasonable estimates; rather, it may be a sign of inconsistency of the individual-specific estimates, leading to incorrect claims.
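The moments problem is easy to see by simulation: when the (negated) price coefficient has mass near zero, the tails of the ratio explode even though the bulk of the draws looks sensible. A small illustration with made-up population parameters (the specific means and standard deviations below are hypothetical, chosen only to make the effect visible):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical population distributions: an attribute coefficient and a
# price coefficient whose mass near zero drives the tails of the ratio.
attr = rng.normal(1.0, 0.5, size=100_000)
price = rng.normal(-0.2, 0.3, size=100_000)

wtp = attr / (-price)   # ratio-of-coefficients WTP, draw by draw

# The median is tame, but the sample variance is dominated by a handful
# of draws with near-zero denominators.
print(np.median(wtp), np.abs(wtp).max())
```

The median barely moves as the sample grows, while the sample variance keeps increasing: exactly the behavior of a distribution without finite moments.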

Curiously, most applied work focuses on the estimation of the individual-specific estimates but not on their statistical significance. Yet there exist two approaches to computing the standard errors of the conditional means. The first is an estimator of the variance of the conditional distribution based on the point estimates (Hensher et al., 2003), whereas the second takes into account the sampling distribution of the parameters by using a re-sampling procedure (Revelt and Train, 2000; Greene et al., 2014).
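In pseudocode terms, the two approaches differ only in whether the point estimates of the population parameters are held fixed or re-drawn from their estimated sampling distribution. A simplified sketch of both (the function names, the independent-normal mixing distribution, and the law-of-total-variance pooling in the second method are our own simplifications, not the exact procedures of the cited papers):

```python
import numpy as np

rng = np.random.default_rng(3)

def seq_prob(beta, X, y):
    """Likelihood of the individual's observed choice sequence under beta."""
    u = X @ beta
    p = np.exp(u - u.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return np.prod(p[np.arange(len(y)), y])

def conditional_moments(X, y, mu, sigma, R=1000):
    """Method 1: mean and std. dev. of the conditional distribution
    evaluated at the point estimates (mu, sigma) only."""
    draws = rng.normal(mu, sigma, size=(R, len(mu)))
    w = np.array([seq_prob(b, X, y) for b in draws])
    w /= w.sum()
    m = w @ draws
    s = np.sqrt(w @ (draws - m) ** 2)
    return m, s

def resampled_moments(X, y, mu_hat, sigma_hat, vcov, S=50, R=1000):
    """Method 2: repeat Method 1 over S draws of (mu, sigma) from their
    estimated sampling distribution, then pool within- and between-draw
    variation (law of total variance)."""
    K = len(mu_hat)
    theta_hat = np.concatenate([mu_hat, sigma_hat])
    ms, ss = [], []
    for theta in rng.multivariate_normal(theta_hat, vcov, size=S):
        m, s = conditional_moments(X, y, theta[:K], np.abs(theta[K:]), R)
        ms.append(m)
        ss.append(s)
    ms, ss = np.asarray(ms), np.asarray(ss)
    total_sd = np.sqrt((ss ** 2).mean(axis=0) + ms.var(axis=0))
    return ms.mean(axis=0), total_sd
```

Method 2 adds the between-draw variation of the conditional means to the average within-draw variance, which is why it tends to produce wider confidence intervals than Method 1.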

However, we know very little about the statistical and asymptotic behavior of these estimators of the standard errors. For example, Revelt and Train (2000) derive and analyze the asymptotic behavior of the individual-specific estimates in the MIXL context as the number of choice situations increases without bound, but they do not provide methods for computing their standard errors. Other authors use these measures in empirical work without examining whether they are appropriate and consistent measures of the standard errors (see, for example, Craig et al., 2005; Greene et al., 2014). The problem is that an inappropriate choice of procedure for computing the standard errors of the conditional estimates can lead to biased inference and misguided policy decisions (Hensher et al., 2014). It is therefore imperative to analyze and identify potential problems in computing the standard errors of individual-specific estimates, with the aim of establishing useful guidelines for empirical work.

This paper addresses this problem by performing a full Monte Carlo study of the accuracy of the standard errors of the conditional estimates. Specifically, the main objective is to extend Revelt and Train (2000)'s simulation experiment by analyzing two methods for computing the standard errors of individual-specific estimates, and by also including the less computationally demanding LC-MNL model. This will allow us to understand the asymptotic behavior of both estimators and to investigate their properties in scenarios with and without simulation techniques.

The asymptotic properties of the individual-specific estimates depend on the number of choice situations per individual (Revelt and Train, 2000). In simple terms, the more information we have about the choices made by a given individual, the more precisely we can identify that individual's preferences. The specific hypotheses to be tested are the following: (1) The mean of the conditional distribution converges to the true individual-specific parameter as the number of choice situations rises without bound; however, the number of choice situations per individual must be sufficiently large to achieve acceptable levels of asymptotic bias. (2) Both methods analyzed in this paper specify the variance of the individual-specific estimates correctly; however, the method that incorporates the sampling variability of the estimated parameters will exhibit wider confidence intervals. (3) Since the LC-MNL model does not rely on simulation, its individual-specific estimates should be estimated more precisely than those of the MIXL model, which incorporates simulation noise.

Misspecification of preferences is not just a technical problem: biased confidence intervals for the individual-specific estimates lead to poor policy decisions. In this respect, the results of this project are expected to improve the understanding of methods for statistical inference on choices at the individual level, and thus to contribute to better decisions by diverse stakeholders, including policymakers, firms, researchers, and analysts.

As a first step, we have replicated the experimental approach of Revelt and Train (2000), but also including the LC-MNL model. Briefly, we assumed two random parameters and two fixed parameters for the continuous case. Our results show that, for a given number of individuals, the distribution of the conditional means converges to the population distribution as the number of choice situations increases. In other words, the more information we have about the choices made by the individuals, the better we can identify each individual-specific estimate. However, a relatively large number of choice situations is needed to correctly approximate the true underlying distribution. This confirms our first hypothesis.