In statistics and econometrics, the maximum score estimator is a nonparametric estimator for discrete choice models developed by Charles Manski in 1975.
Unlike the multinomial probit and multinomial logit estimators, it makes no assumptions about the distribution of the unobservable part of utility.
However, its statistical properties (particularly its asymptotic distribution) are more complicated than those of the multinomial probit and logit estimators, making statistical inference difficult.
To address these issues, Joel Horowitz proposed a variant, called the smoothed maximum score estimator.
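Assume that latent utility is linear in the explanatory variables and that there is an additive response error.
Agent $t$ then chooses alternative $i$ from the choice set $C$ if and only if

$$x_{t,i}\beta + \varepsilon_{t,i} > x_{t,j}\beta + \varepsilon_{t,j} \quad \text{for all } j \in C,\ j \neq i,$$

where $x_{t,i}$ and $x_{t,j}$ are the q-dimensional observable covariates about the agent and the choice, and $\varepsilon_{t,i}$ and $\varepsilon_{t,j}$ are the factors entering the agent's decision that are not observed by the econometrician.
For instance, if $C$ is a set of coffee brands, then $x_{t,i}$ includes the characteristics both of the agent $t$, such as age, gender, income and ethnicity, and of the coffee $i$, such as price, taste and whether it is local or imported.
The object of interest is the parameter vector $\beta$, which characterizes the effect of different factors on the agent's choice.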
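As a rough illustration of the choice rule described above, the following Python sketch computes the latent utility of each alternative as a linear index plus a response error and returns the alternative with the largest utility. The function name `simulate_choice`, the data layout and all numbers are hypothetical and serve only as an example.

```python
import random


def simulate_choice(x_rows, beta, noise, rng):
    """Return the index of the alternative with the highest latent utility.

    x_rows : list of covariate vectors, one per alternative (assumed layout)
    beta   : coefficient vector with the same length as each covariate vector
    noise  : function taking the random generator and returning one response error
    """
    utilities = [
        sum(x_k * b_k for x_k, b_k in zip(x, beta)) + noise(rng)
        for x in x_rows
    ]
    # The agent picks the alternative whose latent utility is largest.
    return max(range(len(utilities)), key=utilities.__getitem__)


rng = random.Random(0)
beta = [1.0, -0.5]                               # hypothetical coefficients
x_rows = [[2.0, 1.0], [0.5, 3.0], [1.0, 1.0]]    # three hypothetical alternatives

# Without response errors the utilities are 1.5, -1.0 and 0.5,
# so the first alternative (index 0) is chosen.
choice = simulate_choice(x_rows, beta, lambda r: 0.0, rng)

# With standard normal response errors the choice becomes random.
noisy_choice = simulate_choice(x_rows, beta, lambda r: r.gauss(0.0, 1.0), rng)
```

The error distribution is passed in as a function because the choice rule itself does not depend on any particular distribution; only the estimation method does.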
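Usually some specific distributional assumption on the error term is imposed, so that the parameter vector $\beta$ of the utility function can be estimated parametrically, for example by maximum likelihood.
Normally distributed errors give the multinomial probit model, and Gumbel (type I extreme value) errors give the multinomial logit model.
The parametric model[3] is convenient for computation, but the resulting estimator may be inconsistent if the distribution of the error term is misspecified.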
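Suppose the choice set contains only two alternatives, labelled 1 and 2, and let $y_t = 1$ if agent $t$ chooses alternative 1 and $y_t = 0$ otherwise.
This is the latent utility representation[5] of a binary choice model:

$$y_t = 1\left[x_{t,1}\beta + \varepsilon_{t,1} > x_{t,2}\beta + \varepsilon_{t,2}\right],$$

where $x_{t,1}$ and $x_{t,2}$ are vectors of explanatory covariates, $\varepsilon_{t,1}$ and $\varepsilon_{t,2}$ are i.i.d. response errors, and $x_{t,1}\beta + \varepsilon_{t,1}$ and $x_{t,2}\beta + \varepsilon_{t,2}$ are the latent utilities of choosing alternatives 1 and 2.
Then, for a sample of $N$ agents, the log-likelihood function can be given as:

$$Q(\beta) = \sum_{t=1}^{N} \left[ y_t \log P_t(\beta) + (1 - y_t) \log\left(1 - P_t(\beta)\right) \right], \qquad P_t(\beta) = P\left(x_{t,1}\beta + \varepsilon_{t,1} > x_{t,2}\beta + \varepsilon_{t,2}\right).$$

If some distributional assumption about the response error is imposed, then the log-likelihood function will have a closed-form representation.[2]
For instance, if the response errors are assumed to be distributed as $N(0, \sigma^2)$, then $P_t(\beta) = \Phi\left(\frac{(x_{t,1} - x_{t,2})\beta}{\sqrt{2}\,\sigma}\right)$, where $\Phi$ is the standard normal CDF, and the model is the binary probit model.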
This model is based on a distributional assumption about the response error term.
Imposing a specific distribution makes the model computationally tractable, because the log-likelihood then has a closed-form representation.
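To make the tractability of the parametric approach concrete, the sketch below evaluates a binary-choice log-likelihood under the assumption that the two response errors are i.i.d. normal with standard deviation `sigma`, so that their difference has standard deviation sqrt(2) * sigma. The function names and the `(y, x1, x2)` data layout are illustrative assumptions, with `y` equal to 1 when the first alternative is chosen.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_loglik(beta, data, sigma=1.0):
    """Binary-choice log-likelihood under i.i.d. N(0, sigma^2) response errors.

    data : list of (y, x1, x2) tuples (assumed layout), where x1 and x2 are
           the covariate vectors of alternatives 1 and 2.
    """
    total = 0.0
    for y, x1, x2 in data:
        # Difference of the deterministic parts of the two utilities.
        index = sum((a - b) * c for a, b, c in zip(x1, x2, beta))
        p = std_normal_cdf(index / (math.sqrt(2.0) * sigma))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


# Hypothetical two-observation sample with one covariate.
sample = [(1, [1.0], [0.0]), (0, [2.0], [3.0])]
ll_positive = probit_loglik([1.0], sample)
ll_negative = probit_loglik([-1.0], sample)
```

In this toy sample both agents choose the alternative with the larger covariate value, so a positive coefficient yields the higher log-likelihood. The value of the function depends on the normality assumption, which is exactly what the distribution-free approach avoids.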
The basic idea of the distribution-free model is to replace the two probability terms in the log-likelihood function with other weights.
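The general form of the objective function can be written as:

$$Q(\beta) = \sum_{t=1}^{N} \left[ y_t W_1(x_{t,1}\beta, x_{t,2}\beta) + (1 - y_t) W_0(x_{t,1}\beta, x_{t,2}\beta) \right],$$

where $y_t$ equals 1 if agent $t$ chooses alternative 1 and 0 otherwise, $x_{t,1}\beta$ and $x_{t,2}\beta$ are the deterministic parts of the two utilities, and $W_1$ and $W_0$ are weight functions.
To make the estimator robust to misspecification of the error distribution, Manski (1975) proposed a non-parametric model to estimate the parameters.
Let $J$ be the number of alternatives in the choice set, $N$ the number of agents, and $W(J-1) > W(J-2) > \dots > W(1) > W(0)$ a sequence of real numbers.
With $y_{t,i} = 1$ if agent $t$ chooses alternative $i$ and 0 otherwise, the maximum score estimator is defined as:

$$\hat{\beta} = \arg\max_{\beta} \frac{1}{N} \sum_{t=1}^{N} \sum_{i=1}^{J} y_{t,i}\, W\!\left( \sum_{j \neq i} 1\left(x_{t,i}\beta > x_{t,j}\beta\right) \right).$$

Here, $\sum_{j \neq i} 1(x_{t,i}\beta > x_{t,j}\beta)$ is the ranking of the deterministic part of the underlying utility of choosing $i$.
The intuition in this model is that the higher the ranking, the more weight is assigned to the choice.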
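The ranking-based weighting can be sketched in a few lines of Python. The code below assumes that each observation records the index of the chosen alternative together with the covariate vectors of all alternatives, and that `weights[r]` holds the weight given to a chosen alternative whose deterministic utility exceeds that of `r` other alternatives. The names `rank_of` and `max_score` and the toy data are hypothetical.

```python
def rank_of(i, index_values):
    """Count the alternatives whose deterministic utility is strictly below that of i."""
    return sum(
        1 for j, v in enumerate(index_values)
        if j != i and index_values[i] > v
    )


def max_score(beta, data, weights):
    """Average weight earned by the observed choices at parameter value beta.

    data    : list of (chosen_index, x_rows) pairs (assumed layout)
    weights : increasing sequence, weights[r] is the weight for rank r
    """
    total = 0.0
    for chosen, x_rows in data:
        index_values = [sum(a * b for a, b in zip(x, beta)) for x in x_rows]
        # Only the chosen alternative contributes, weighted by its rank.
        total += weights[rank_of(chosen, index_values)]
    return total / len(data)


# Hypothetical data: three alternatives, one covariate, two agents.
x_rows = [[2.0], [1.0], [0.0]]
toy_choices = [(0, x_rows), (1, x_rows)]
rank_weights = [0.0, 1.0, 2.0]

score_pos = max_score([1.0], toy_choices, rank_weights)
score_neg = max_score([-1.0], toy_choices, rank_weights)
```

With a positive coefficient the chosen alternatives have ranks 2 and 1, giving an average score of 1.5; with a negative coefficient the ranks are 0 and 1, giving 0.5. The estimator is the parameter value that maximizes this score, and no error distribution appears anywhere in the computation.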
Under certain conditions, the maximum score estimator is weakly consistent, but its asymptotic distribution is non-standard, largely because the objective function is a non-smooth step function of the parameters.
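In the binary context, with $y_t = 1$ if agent $t$ chooses alternative 1 and $y_t = 0$ otherwise, the maximum score estimator can be represented as:

$$\hat{\beta} = \arg\max_{\beta} \sum_{t=1}^{N} \left[ y_t W_1(x_{t,1}\beta, x_{t,2}\beta) + (1 - y_t) W_0(x_{t,1}\beta, x_{t,2}\beta) \right],$$

where $W_1(x_{t,1}\beta, x_{t,2}\beta) = 1\left[x_{t,1}\beta > x_{t,2}\beta\right]$ and $W_0(x_{t,1}\beta, x_{t,2}\beta) = 1\left[x_{t,1}\beta < x_{t,2}\beta\right]$, and $x_{t,1}\beta$ and $x_{t,2}\beta$ are the deterministic parts of the utilities of the two alternatives.
The intuition of this weighting scheme is that the probability of the choice depends on the relative order of the deterministic parts of the utilities.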
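In the binary case the score simply counts the observations whose choice agrees with the ordering of the deterministic utilities. The sketch below illustrates this for two covariates. Because the count does not change when the parameter vector is multiplied by a positive constant, the search is restricted to the unit circle; this normalization, the grid search, and the names `binary_score` and `grid_search_unit_circle` are assumptions made for illustration, not a prescribed algorithm.

```python
import math


def binary_score(beta, data):
    """Number of observations whose choice matches the sign of the utility difference.

    data : list of (y, x1, x2) tuples (assumed layout), y = 1 if alternative 1 is chosen.
    """
    hits = 0
    for y, x1, x2 in data:
        index = sum((a - b) * c for a, b, c in zip(x1, x2, beta))
        if (y == 1 and index > 0) or (y == 0 and index < 0):
            hits += 1
    return hits


def grid_search_unit_circle(data, n_angles=360):
    """Return the first grid point on the unit circle that attains the best score."""
    best_beta, best_score = None, -1
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        beta = (math.cos(theta), math.sin(theta))
        score = binary_score(beta, data)
        if score > best_score:
            best_beta, best_score = beta, score
    return best_beta, best_score


# Hypothetical sample with two covariates.
toy_data = [
    (1, [1.0, 0.0], [0.0, 0.0]),
    (0, [0.0, 0.0], [1.0, 0.0]),
    (1, [0.0, 1.0], [0.0, 0.0]),
]
best_beta, best_score = grid_search_unit_circle(toy_data)
```

The objective is a step function, so it usually has a whole set of maximizers rather than a unique one, and gradient-based optimizers are of no use; the grid search above returns only the first maximizer it encounters.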
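Horowitz (1992) proposed a smoothed maximum score (SMS) estimator which has much better asymptotic properties.[8]
The basic idea is to replace the non-smooth indicator weight function with a smooth one.
Define a smooth kernel function $K$ satisfying the following conditions: (i) $|K(u)|$ is bounded over $\mathbb{R}$; (ii) $\lim_{u \to -\infty} K(u) = 0$ and $\lim_{u \to +\infty} K(u) = 1$; (iii) $K'(u) = K'(-u)$.
Here, the kernel function is analogous to a CDF whose PDF is symmetric around 0.
In the binary case, the SMS estimator is then defined as:

$$\hat{\beta}_{SMS} = \arg\max_{\beta} \sum_{t=1}^{N} \left[ y_t K\!\left(\frac{(x_{t,1} - x_{t,2})\beta}{h_N}\right) + (1 - y_t)\left(1 - K\!\left(\frac{(x_{t,1} - x_{t,2})\beta}{h_N}\right)\right) \right],$$

where $y_t = 1$ if agent $t$ chooses alternative 1 and 0 otherwise, $x_{t,1}$ and $x_{t,2}$ are the covariate vectors of the two alternatives, and $\{h_N\}$ is a sequence of strictly positive bandwidths with $h_N \to 0$ as $N \to \infty$.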
Here, the intuition is the same as in the construction of the traditional maximum score estimator: the agent is more likely to choose the alternative with the higher observed part of latent utility.
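A smoothed score of this kind can be sketched as follows. The example uses the standard normal CDF as the kernel, which is bounded, tends to 0 and 1 in the tails and has a derivative symmetric around 0; this kernel choice, the fixed bandwidth `h` and the function names are assumptions for illustration, and the formal regularity conditions in Horowitz (1992) are stricter than anything checked here.

```python
import math


def normal_cdf_kernel(u):
    """Smooth replacement for the indicator 1[u > 0] (assumed kernel choice)."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def smoothed_score(beta, data, h, kernel=normal_cdf_kernel):
    """Average smoothed score at parameter value beta with bandwidth h > 0.

    data : list of (y, x1, x2) tuples (assumed layout), y = 1 if alternative 1 is chosen.
    """
    total = 0.0
    for y, x1, x2 in data:
        index = sum((a - b) * c for a, b, c in zip(x1, x2, beta))
        k = kernel(index / h)
        # y weights the smoothed "alternative 1 ranks higher" event,
        # (1 - y) weights its complement.
        total += y * k + (1 - y) * (1.0 - k)
    return total / len(data)


# Hypothetical sample with two covariates.
toy_data = [
    (1, [1.0, 0.0], [0.0, 0.0]),
    (0, [0.0, 0.0], [1.0, 0.0]),
    (1, [0.0, 1.0], [0.0, 0.0]),
]
sharp = smoothed_score((1.0, 1.0), toy_data, h=1e-3)   # close to the unsmoothed share of matches
smooth = smoothed_score((1.0, 1.0), toy_data, h=1.0)   # visibly smoothed
```

As the bandwidth shrinks, the kernel approaches a step function and the smoothed score approaches the share of observations matched by the unsmoothed rule. For a fixed positive bandwidth the objective is differentiable in the parameters, which is what makes standard asymptotic arguments applicable.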
Under certain conditions, the smoothed maximum score estimator is consistent and, more importantly, asymptotically normal.
Therefore, all the usual statistical testing and inference based on asymptotic normality can be implemented.