Estimating Mixed Logit Models with Quasi-Monte-Carlo
Sequences Allowing Practical Error Estimation
F. Bastin, C. Cirillo, Ph. L. Toint
Report TR2004/11
Mixed Multinomial Logit (MMNL) models are now a popular and efficient
framework in discrete choice theory. However, it is well known that the
numerical cost associated with the evaluation of multidimensional integrals in
MMNL models remains high, even if Monte Carlo (MC) or quasi-Monte Carlo (QMC)
techniques are used instead of classical quadrature methods, and that no
analytical solution is available. Our current approach, developed in the
context of modern trust-region optimization techniques at FUNDP (Facultés
Universitaires Notre-Dame de la Paix), uses statistical inference on
Monte Carlo approximations to speed up computations. We have shown that
numerical efficiency is considerably increased by the exploitation of new
results on the accuracy and bias estimates relative to the objective
function. The crucial ingredient of our algorithm is that, at each iteration,
we are able to use only a subset of the random draws, whose size is adapted
from iteration to iteration. Convergence of the algorithm towards points
satisfying first- and second-order optimality conditions has also been
demonstrated (Bastin et al., 2004b). The methodology has been successfully
applied to both simulated and real data sets. The results, even on large-scale
model estimation, show that the proposed optimization algorithm is competitive
with existing tools, including software based on quasi-Monte Carlo techniques
using Halton sequences.
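As an illustration of the kind of Monte Carlo approximation discussed above,
the following Python sketch estimates a single mixed logit choice probability
together with its standard error. It is a minimal example under stated
assumptions, not the algorithm of this report: the model has one normally
distributed coefficient, the attribute values are made up, and the function
names (logit_prob, simulated_prob) are hypothetical.

```python
import math
import random


def logit_prob(beta, x_chosen, x_others):
    """Multinomial logit probability of the chosen alternative for a fixed beta."""
    utilities = [beta * x_chosen] + [beta * x for x in x_others]
    shift = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - shift) for u in utilities]
    return exps[0] / sum(exps)


def simulated_prob(mu, sigma, x_chosen, x_others, n_draws, rng):
    """Monte Carlo estimate of E[logit_prob(beta)] for beta ~ N(mu, sigma^2).

    Returns the sample average and its estimated standard error.
    The normal mixing distribution is an assumption made for this sketch.
    """
    values = [
        logit_prob(rng.gauss(mu, sigma), x_chosen, x_others)
        for _ in range(n_draws)
    ]
    mean = sum(values) / n_draws
    variance = sum((v - mean) ** 2 for v in values) / (n_draws - 1)
    return mean, math.sqrt(variance / n_draws)


if __name__ == "__main__":
    rng = random.Random(12345)
    # Hypothetical attribute values (e.g. costs) for three alternatives.
    for n_draws in (250, 1000, 4000):
        prob, std_err = simulated_prob(-1.0, 0.5, 1.0, [2.0, 3.0], n_draws, rng)
        print(f"draws={n_draws:5d}  probability={prob:.4f}  std.err={std_err:.5f}")
```

The standard error shrinks roughly as the inverse square root of the number of
draws, which is the information a variable sample size strategy can exploit
when deciding how many draws a given iteration needs.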
In this paper, we propose to extend our study and to compare our variable
sample size Monte Carlo algorithm with randomized quasi-Monte Carlo
sequences. We use Sobol sequences, which are expected to perform better than
Halton ones, as suggested by Garrido (2003). There are different ways to
randomize quasi-random sequences; some of them have already been explored by
the transportation community. Bhat (2003) has suggested that scrambled Halton
sequences avoid the problem of poor coverage of the integration domain in high
dimensions, and has used random shifts to evaluate the quality of the
sequences in the context of MMNL estimation. Hess et al. (2003) have proposed
the use of randomly shifted and shuffled uniform vectors and have reported
better performance. Garrido (2003) has also proposed the use of Owen's
scrambling technique for Sobol sequences.
Since the sequences used in QMC approaches are deterministic, it is not
possible to use the classical analytical tools for error estimation as we have
done in the MC variable sample size strategy by using the delta method. It is
therefore desirable to develop techniques that combine the potentially higher
accuracy of QMC approximations with the practical error estimation ability of
MC methods. By introducing some randomness in low-discrepancy sequences, one
can use statistical methods for error analysis. Our objective is to
investigate how those techniques can be applied to mixed logit model
estimation. In particular, we are interested in seeing how randomized QMC
sequences can reduce the variance in comparison with MC methods and how they
can improve the performance of the original deterministic sequences, in
combination with the variable sample size strategy.
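The idea of randomizing a low-discrepancy sequence to obtain a practical error
estimate can be sketched as follows. This is a minimal Python illustration
under stated assumptions, not the procedure of this report: it uses a
two-dimensional Halton sequence as a stand-in for the Sobol sequences
considered here, applies a uniform random shift modulo 1 as the randomization,
and integrates a toy function over the unit square. All function names are
hypothetical.

```python
import math
import random


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton(n_points, bases=(2, 3)):
    """First n_points of the Halton sequence (one prime base per dimension)."""
    return [[radical_inverse(i, b) for b in bases] for i in range(1, n_points + 1)]


def shifted_estimate(points, f, rng):
    """Average of f over the point set after one uniform random shift modulo 1."""
    shift = [rng.random() for _ in points[0]]
    total = 0.0
    for point in points:
        total += f([(u + s) % 1.0 for u, s in zip(point, shift)])
    return total / len(points)


def rqmc_estimate(f, n_points, n_replications, rng):
    """Randomized QMC estimate and its standard error.

    Each replication uses an independent random shift, so the replications are
    independent and identically distributed unbiased estimates; their sample
    variance gives a standard error exactly as in plain Monte Carlo.
    """
    points = halton(n_points)
    estimates = [shifted_estimate(points, f, rng) for _ in range(n_replications)]
    mean = sum(estimates) / n_replications
    variance = sum((e - mean) ** 2 for e in estimates) / (n_replications - 1)
    return mean, math.sqrt(variance / n_replications)


if __name__ == "__main__":
    rng = random.Random(2004)
    # Toy integrand with known integral 1/4 over the unit square.
    estimate, std_err = rqmc_estimate(lambda u: u[0] * u[1], 512, 20, rng)
    print(f"estimate={estimate:.5f}  std.err={std_err:.5f}  exact=0.25")
```

The point set itself stays deterministic; only the shift is random, so the
spread of the replicated estimates measures the integration error without
relying on the discrepancy bounds of the underlying sequence.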
We will apply the methodologies to both simulated and real data sets. In
particular, our real case study is a mode choice model based on stated
preference data, collected in 2003 in the Walloon Region (Belgium).