---
title: "Discrete choice from data to policy, in a dozen lines"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Discrete choice from data to policy, in a dozen lines}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6.5,
  fig.height = 4
)
options(digits = 4)
```

**choicer** estimates discrete-choice models for applied economics. The
likelihoods, analytical gradients and Hessians are written in C++ with OpenMP parallelization,
so estimation is fast and scales to specifications with hundreds of
alternative-specific constants. Just as important, *one consistent
post-estimation interface* takes you from a fitted model to the quantities you
actually report: market shares, elasticities, diversion ratios,
willingness-to-pay with standard errors, and the welfare effects of a policy
counterfactual.

This vignette runs the whole workflow on a real data set in a few lines.

```{r setup}
library(choicer)
set_num_threads(2) # set for CRAN compilation; raise this on your own machine
```

## The data

`mode_choice` is the classic Greene & Hensher intercity travel-mode data: 210
travellers, each choosing among **air, train, bus and car**. It ships with the
package in choicer's long layout — one row per traveller and alternative.

```{r data}
data(mode_choice)
head(mode_choice, 4)
```

`wait` (terminal waiting time) and `travel` (in-vehicle time) are in minutes;
`vcost` is the travel cost. These vary across modes within a traveller, which
is exactly what a choice model needs.

## Fit a multinomial logit

Point at the identifier, alternative and choice columns, list the covariates,
and fit. Alternative-specific constants are included by default.

```{r fit}
fit <- run_mnlogit(
  data           = mode_choice,
  id_col         = "id",
  alt_col        = "mode",
  choice_col     = "choice",
  covariate_cols = c("wait", "travel", "vcost")
)
summary(fit)
```

Estimation takes a fraction of a second. Travellers dislike waiting time,
in-vehicle time and cost — all three coefficients are negative — and the
`summary()` footer reports McFadden's R² and the in-sample hit rate alongside
the usual fit statistics.

## Predicted shares

```{r shares}
shares <- drop(predict(fit, type = "shares"))
names(shares) <- as.character(fit$alt_mapping[[2]])
round(shares, 3)

barplot(shares, col = "steelblue", ylab = "Predicted share",
        main = "Predicted mode shares")
```

## Elasticities and diversion

How does demand respond to cost, and where does it go? The same fitted object `fit`
answers both, with no extra bookkeeping.

To recover elasticities with respect to the `vcost` variable, simply run
```{r elast}
elasticities(fit, elast_var = "vcost")
```

The own-cost elasticities sit on the diagonal; off-diagonal entries are the
cross-elasticities. For example, the own-elasticity of train mode is -0.546, and
the cross-elasticity from train to bus is ~0.17.

The corresponding diversion matrix answers a slightly different question: among
the travellers who leave one mode after a marginal cost increase, where do they
go?

```{r diversion}
diversion_ratios(fit)
```

Columns indicate the origin and rows the destination. In this fit, a marginal
increase in train's cost diverts the displaced travellers roughly 28% to air,
22% to bus and 50% to car.

The MNL makes this calculation especially transparent. Conditional on the
included covariates, diversion is governed by fitted choice probabilities rather
than by an unobserved notion of closeness among alternatives. That is a useful
baseline, and also a restriction. If the empirical object turns on grouped
substitution, unobserved taste heterogeneity, or counterfactuals in which the
composition of the choice set changes, the [nested logit](nl.html),
[mixed logit](mxl.html), and [multinomial probit](mnp.html) move that
substitution structure in different directions. The comparison is summarized in
[Choosing among choice models](#choosing-among-choice-models) below.

## Willingness to pay

With `vcost` playing the role of price, `wtp()` returns the marginal value of
each attribute in money units, with delta-method standard errors.

```{r wtp}
wtp(fit, price_var = "vcost")
```

These are the travellers' implied **value of time**: how much they would pay to
shave a minute of in-vehicle or terminal time.

## A policy counterfactual

Suppose a subsidy cuts the cost of **train** travel by 25%. Perturb the data and
predict — no refitting required.

```{r counterfactual}
mc_cf <- mode_choice
mc_cf$vcost[mc_cf$mode == "train"] <- mc_cf$vcost[mc_cf$mode == "train"] * 0.75

shares_cf <- drop(predict(fit, type = "shares", newdata = mc_cf))
names(shares_cf) <- names(shares)

rbind(baseline = shares, counterfactual = shares_cf) |> round(3)
```

Train's share rises, drawing travellers away from the other modes. We can put a
money value on the change in welfare with the expected consumer surplus:

```{r welfare}
cs0 <- consumer_surplus(fit, price_var = "vcost")
cs1 <- consumer_surplus(fit, price_var = "vcost", newdata = mc_cf)

cs1$mean_cs - cs0$mean_cs # change in mean consumer surplus
```

The cheaper train fare raises expected consumer surplus, as it should.
The number is a model-based demand calculation, not a causal estimate by itself:
it inherits the maintained utility specification, the price coefficient, and the
assumption that the counterfactual changes only the variables you changed in
`newdata`.

## One interface, every model

Everything above — `predict()`, `elasticities()`, `diversion_ratios()`,
`wtp()`, `consumer_surplus()`, plus `blp()` for share inversion — works
identically on **mixed logit** (`run_mxlogit()`) and **nested logit**
(`run_nestlogit()`) fits, and choicer also offers a Bayesian **multinomial
probit** sampler (`run_mnprobit()`). Learn each model in its own vignette:

- [Multinomial logit](mnl.html) — the workhorse, and a parameter-recovery check
- [Mixed logit](mxl.html) — random coefficients and preference heterogeneity
- [Nested logit](nl.html) — grouped substitution through nests
- [Bayesian multinomial probit](mnp.html) — correlated utility shocks via MCMC

## Choosing among choice models

The useful starting point is not the name of the model. It is the object you
intend to report. Own-price elasticities, diversion ratios, WTP, logsum welfare,
entry effects and merger simulations put different demands on the substitution
structure. A model that reproduces observed shares can still give poor answers to
a counterfactual if the relevant substitution margin is supplied mainly by
functional form.

For that reason, the model choice question is:

> What variation identifies the substitution pattern needed for the object I want
> to report?

IIA is one implication of the MNL at the individual level: conditional on
covariates, the odds between two alternatives do not depend on the rest of the
choice set. That fact is worth knowing, but it should not organize the empirical
discussion. The important question is which restrictions generate substitution
beyond observed covariates, and whether the data contain variation that can
discipline the corresponding parameters.

| Model | What supplies substitution structure? | What has to be defended? |
|---|---|---|
| **Multinomial logit** | Included covariates and observed heterogeneity across choice situations. | The maintained utility index and the absence of unobserved closeness among alternatives. Individual-level diversion is proportional to fitted choice probabilities. |
| **Nested logit** | Shared unobservables within pre-specified nests, summarized by one dissimilarity parameter per nest. | The nesting tree and the random-utility interpretation of the estimated dissimilarity parameters. Correlation exists only where the tree permits it. |
| **Mixed logit** | A mixing distribution $f(\beta)$ for tastes, possibly with correlated random coefficients. | The distributional family, its tails, simulation accuracy, and the variation that identifies heterogeneity: repeated choices, rich market or choice-set variation, or experimental variation. |
| **Multinomial probit** | A covariance matrix for utility-difference shocks. | Scale normalization, a rapidly growing covariance parameterization, MCMC diagnostics and a clear rule for extending the covariance structure in counterfactual choice sets. |

A few principles follow.

- **Start from the estimand.** If the object is an own-price elasticity on the
  observed menu, a transparent MNL or nested logit may be the right empirical
  discipline. If the object is merger diversion, entry, exit, or individual-level
  welfare under a changed menu, the model must say who is close to whom.

- **Separate variation from functional form.** Flexible models are valuable when
  the data contain the variation needed to estimate the relevant heterogeneity or
  covariance. Without that variation, flexibility is still present, but it is
  coming from the maintained distribution, tree or covariance structure.

- **Use the simplest model that carries the relevant margin.** A thin
  cross-section with one choice per person and a fixed menu rarely identifies a
  rich distribution of tastes. A well-motivated nest may be more credible than a
  weakly identified random-coefficients specification; a panel or BLP-style
  setting may justify the richer model.

- **Report substitution, not just fit.** For policy work, the relevant output is
  often the diversion matrix, WTP distribution or welfare change, not only the
  likelihood. Compare those objects across plausible specifications. If the
  policy conclusion moves with the nesting tree or the tail of $f(\beta)$, that
  sensitivity is part of the result.

Each model's own vignette develops these tradeoffs in detail:
[MNL](mnl.html#substitution-restrictions), [nested logit](nl.html),
[mixed logit](mxl.html#identification-and-tails).

## How choicer compares

Several excellent R packages estimate discrete-choice models — among them
[mlogit](https://CRAN.R-project.org/package=mlogit),
[logitr](https://CRAN.R-project.org/package=logitr),
[gmnl](https://CRAN.R-project.org/package=gmnl),
[apollo](https://CRAN.R-project.org/package=apollo) and
[mixl](https://CRAN.R-project.org/package=mixl). choicer's focus is a fast C++
core with analytical gradients and Hessians, and a single, consistent
post-estimation toolkit aimed at the demand and welfare quantities applied
economists report.