MaxDiffMixedLogit.predict_choices

MaxDiffMixedLogit.predict_choices(task_df, random_seed=None, new_respondents='error', draw_batch_size=None)

Fully generative (best, worst) simulation under a new task design.

Unlike sample_posterior_predictive(), this method does not condition the worst softmax on the observed best pick. For each posterior draw, it:

  1. Computes per-task utilities from beta_item_r (or beta_item).

  2. Samples a best position from softmax(U).

  3. Samples a worst position from softmax(-U) with the sampled best position excluded.

The resulting worst_pick is conditional on the freshly sampled best_pick, which gives the correct generative joint distribution over (best, worst).
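The three steps above can be sketched for a single task and a single posterior draw as follows. This is a minimal NumPy illustration of the sampling logic, not the library's implementation; the helper names softmax and sample_best_worst and the utility values are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sample_best_worst(utilities, rng):
    """Sample (best, worst) positions for one task (hypothetical helper).

    utilities: 1-D array of per-position utilities U for one posterior draw.
    """
    # Step 2: best position ~ softmax(U).
    p_best = softmax(utilities)
    best = rng.choice(len(utilities), p=p_best)

    # Step 3: worst position ~ softmax(-U) with the sampled best excluded.
    neg_u = -utilities.astype(float)
    neg_u[best] = -np.inf  # exp(-inf) == 0, so best gets zero probability
    p_worst = softmax(neg_u)
    worst = rng.choice(len(utilities), p=p_worst)
    return int(best), int(worst), p_best, p_worst

rng = np.random.default_rng(0)
U = np.array([1.2, -0.3, 0.5, 0.0])  # illustrative utilities, not fitted values
best, worst, p_best, p_worst = sample_best_worst(U, rng)
```

Because the best position is masked before the second softmax, the sampled worst position can never coincide with the sampled best position.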

Parameters:
task_df : pd.DataFrame

New long-format task data. is_best / is_worst columns may be dummy values; they are ignored for prediction.

random_seed : RandomState, optional

Seed for the numpy Generator used for new-respondent population draws and for sampling best / worst. The full output is deterministic given this seed, regardless of draw_batch_size.

new_respondents : {"error", "population"}, default "error"

How to handle respondents in task_df that were not in the training data, when random_intercepts=True:

  • "error" (default): raise ValueError.

  • "population": for each unknown respondent, draw a fresh respondent-level utility vector from the fitted population distribution Normal(beta_item, sigma_item) per posterior sample. This is the standard mixed-logit extrapolation to a brand-new customer.

Ignored when random_intercepts=False (no respondent-level parameters exist).
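The "population" option can be pictured as one Normal draw per posterior sample and per unknown respondent. The sketch below uses synthetic stand-ins for the posterior samples of beta_item and sigma_item; the array shapes and the names n_new and beta_new are assumptions for illustration, not the library's internals.

```python
import numpy as np

rng = np.random.default_rng(42)

n_chains, n_draws, n_items = 2, 100, 5
# Synthetic stand-ins for posterior samples of the population mean and scale.
beta_item = rng.normal(size=(n_chains, n_draws, n_items))
sigma_item = np.abs(rng.normal(size=(n_chains, n_draws, n_items))) + 0.1

n_new = 3  # number of respondents not seen during training (hypothetical)

# One fresh utility vector per (chain, draw, new respondent), drawn from
# Normal(beta_item, sigma_item) evaluated at that posterior sample.
beta_new = rng.normal(
    loc=beta_item[:, :, None, :],
    scale=sigma_item[:, :, None, :],
    size=(n_chains, n_draws, n_new, n_items),
)
```

Each new respondent gets a different utility vector in every posterior sample, so predictions for unknown respondents carry both posterior uncertainty and population heterogeneity.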

draw_batch_size : int, optional

If provided, compute the per-task utilities in chunks of this many posterior draws rather than materializing the full (chain, draw, tasks, positions) tensor at once.
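One way chunked evaluation can stay deterministic under a fixed seed is to draw all random numbers up front and only batch the utility computation. The sketch below shows that idea for the best pick only. It is an assumption about how such batching could work, not a description of this method's internals, and the names sample_categorical and simulate_best are hypothetical.

```python
import numpy as np

def sample_categorical(p, u):
    """Inverse-CDF sampling along the last axis using pre-drawn uniforms u."""
    cdf = np.cumsum(p, axis=-1)
    # Index of the first cdf entry that exceeds u.
    idx = (u[..., None] >= cdf).sum(axis=-1)
    # Guard against round-off leaving cdf[-1] slightly below 1.
    return np.minimum(idx, p.shape[-1] - 1)

def simulate_best(beta, design, seed, draw_batch_size=None):
    """beta: (draws, items); design: (tasks, positions) array of item indices."""
    rng = np.random.default_rng(seed)
    n_draws, n_tasks = beta.shape[0], design.shape[0]
    # Draw every uniform first so batching cannot change the random stream.
    u = rng.random((n_draws, n_tasks))
    step = draw_batch_size or n_draws
    out = np.empty((n_draws, n_tasks), dtype=np.int64)
    for start in range(0, n_draws, step):
        sl = slice(start, start + step)
        util = beta[sl][:, design]  # (batch, tasks, positions)
        util = util - util.max(axis=-1, keepdims=True)
        p = np.exp(util)
        p /= p.sum(axis=-1, keepdims=True)
        out[sl] = sample_categorical(p, u[sl])
    return out

beta = np.random.default_rng(1).normal(size=(10, 6))  # synthetic draws
design = np.array([[0, 1, 2, 3], [2, 3, 4, 5], [0, 2, 4, 5]])
full = simulate_best(beta, design, seed=7)
batched = simulate_best(beta, design, seed=7, draw_batch_size=3)
```

Only the (batch, tasks, positions) utility block is held in memory per iteration, while the sampled picks match the unbatched run.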

Note

The per-respondent utility tensor pred_beta_r of shape (chains, draws, n_respondents, items) is always fully allocated before batching. draw_batch_size reduces only the per-task softmax tensor (chains, draw_batch, tasks, k_max). For large studies, peak memory is dominated by the per-respondent tensor, not by the per-task tensor.

Output is bit-identical to the unbatched path for a given random_seed.
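To make the memory note concrete, the following back-of-the-envelope calculation compares the tensor shapes it mentions. The study dimensions and the 8-byte float64 element size are assumptions chosen for illustration.

```python
def tensor_bytes(shape, itemsize=8):
    """Bytes needed for a dense array of the given shape (float64 assumed)."""
    n = 1
    for d in shape:
        n *= d
    return n * itemsize

# Hypothetical study dimensions.
chains, draws = 4, 1000
n_respondents, items = 5000, 20
tasks, k_max = 60000, 5  # e.g. 12 tasks per respondent
draw_batch = 50

per_respondent = tensor_bytes((chains, draws, n_respondents, items))
per_task_full = tensor_bytes((chains, draws, tasks, k_max))
per_task_batched = tensor_bytes((chains, draw_batch, tasks, k_max))

print(f"per-respondent tensor:    {per_respondent / 1e9:.2f} GB")
print(f"per-task tensor (full):   {per_task_full / 1e9:.2f} GB")
print(f"per-task tensor (batched): {per_task_batched / 1e9:.2f} GB")
```

With these numbers, batching shrinks the per-task tensor from 9.6 GB to 0.48 GB, but the 3.2 GB per-respondent tensor is unaffected and becomes the dominant allocation.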

Returns:
xr.Dataset

posterior_predictive-shaped dataset with best_pick and worst_pick variables of shape (chain, draw, tasks) and p_best / p_worst of shape (chain, draw, tasks, positions).
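A common next step is to collapse the sampled picks over chain and draw into per-task choice shares. The sketch below works on a plain NumPy array standing in for the best_pick variable and assumes that it holds integer position indices; the synthetic data and the name best_share are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_chains, n_draws, n_tasks, n_positions = 2, 50, 4, 3

# Synthetic stand-in for the best_pick values, shape (chain, draw, tasks).
best_pick = rng.integers(0, n_positions, size=(n_chains, n_draws, n_tasks))

# Fraction of posterior samples in which each position is picked as best.
one_hot = best_pick[..., None] == np.arange(n_positions)
best_share = one_hot.mean(axis=(0, 1))  # shape (tasks, positions)
```

Each row of best_share sums to 1 and can be compared against the posterior mean of p_best for the same task.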