Add chapter on multiple imputation. by abner-hb · Pull Request #935 · stan-dev/docs

abner-hb · 2026-03-12T14:58:43Z

Submission Checklist

Builds locally YES
New functions marked with <<{ since VERSION }>> YES (no new functions)
Declare copyright holder and open-source license: see below

Summary

Add a chapter on multiple imputation to the Stan User's Guide.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Abner Heredia Bustos

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC BY-ND 4.0 (https://creativecommons.org/licenses/by-nd/4.0/)

WardBrian · 2026-03-12T15:35:05Z

Hi @abner-hb, thanks for this! I just made a couple commits to remove the changes to the built docs -- we let our Jenkins jobs build those for us during releases.

I'll ask @bob-carpenter to take a look at the contents when he gets a chance.

bob-carpenter · 2026-03-12T17:14:27Z

I can review this.

bob-carpenter

Thanks so much for contributing this.

I am really sorry that I left 73 comments on such a short chapter. It was meant pedagogically and I hope it helps other things you write. It took a couple years of this kind of back-and-forth with Gelman and Vehtari and Goodrich before they stopped marking up everything I wrote this way. Gelman and Vehtari are excellent role models for writing clarity.

If you'd rather not do this, I'm happy to make all the changes I suggested myself.

src/stan-users-guide/multiple-imputation.qmd

bob-carpenter · 2026-03-12T18:12:48Z

src/stan-users-guide/multiple-imputation.qmd

+their precision. So, it is often necessary to account explicitly for
+the missing data when fitting a model of interest[^bda].
+
+[^bda]: Chapter 18 in @GelmanEtAl:2013 offers a Bayesian perspective 


I just skimmed Chapter 18. Gelman et al. do not provide a fully Bayesian perspective; they use multiple imputation instead. The fully Bayesian perspective is given in the User's Guide chapter on missing data.

I would also put this into the main text. Only use footnotes in doc as a last resort.

It's not fully Bayesian but isn't it partially Bayesian?

I think it's fair to call it "approximately Bayesian," as that's how Gelman talks about anything from maximum likelihood point estimates to VI.

src/stan-users-guide/multiple-imputation.qmd

bob-carpenter · 2026-03-12T19:34:52Z

src/stan-users-guide/multiple-imputation.qmd

+<!-- there is no need to use any special pooling 
+rules[^rubin] to account for the uncertainty in the imputation.  -->
+
+[^rubin]: Chapter 3, page 45 of @carpenter-etal:2023 summarizes one


Remove---this seems to be an unmoored comment. At least it doesn't show up when I render the html.

bob-carpenter · 2026-03-12T19:35:08Z

src/stan-users-guide/multiple-imputation.qmd

+
+## Cut models
+
+[**NOTE**: I would greatly appreciate any comments or changes to


Remove this note.

If you're not comfortable writing it, you can reduce this section to something really simple.

The point is just that using ad hoc multiple imputation like this is equivalent to doing cut in BUGS as described by Plummer, because there's no information flow from the second-stage inference back to the multiple imputation as you would get in the fully Bayesian model described in the chapter on missing data earlier in the user's guide.
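A rough way to write down that contrast, using the chapter's $x^{\text{obs}}$, $x^{\text{mis}}$, and $\theta$. The outcome $y$ and the prior $p(\theta)$ are assumptions added here for illustration and are not taken from the chapter text; any parameters of the imputation model are taken as already integrated out.

```latex
\begin{align*}
% Full Bayes: the outcome y informs x^mis through the joint posterior.
p(\theta, x^{\text{mis}} \mid x^{\text{obs}}, y)
  &\propto p(y \mid x^{\text{obs}}, x^{\text{mis}}, \theta)
     \, p(\theta)
     \, p(x^{\text{mis}} \mid x^{\text{obs}}) \\
% Cut / multiple imputation: x^mis is drawn without looking at y.
p_{\text{cut}}(\theta, x^{\text{mis}} \mid x^{\text{obs}}, y)
  &= p(x^{\text{mis}} \mid x^{\text{obs}})
     \, p(\theta \mid x^{\text{obs}}, x^{\text{mis}}, y)
\end{align*}
```

In the first line the marginal posterior of $x^{\text{mis}}$ depends on $y$; in the second it is $p(x^{\text{mis}} \mid x^{\text{obs}})$ by construction, which is the missing information flow.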

bob-carpenter · 2026-03-12T19:35:40Z

src/stan-users-guide/multiple-imputation.qmd

+[**NOTE**: I would greatly appreciate any comments or changes to
+improve this subsection.]
+
+A full bayesian probability model includes a feedback flow of


bayesian -> Bayesian

The model doesn't have any feedback or flow per se---it's just how joint distributions work.

bob-carpenter · 2026-03-12T19:37:20Z

src/stan-users-guide/multiple-imputation.qmd

+influence only some parameters in the model. From @plummer:2015,
+p. 37:
+
+> Cut models arise in applications with multiple data sources that


Remove comments.

bob-carpenter · 2026-03-12T19:37:41Z

src/stan-users-guide/multiple-imputation.qmd

+p. 37:
+
+> Cut models arise in applications with multiple data sources that
+provide information about different parameters in the model [...] 


OK, rather than filling this all in, I'd just write a one-line summary and point to Plummer's article.

bob-carpenter

I went over this again and marked many of the grammatical nitpicky comments as resolved. I think they'd be better the way I was suggesting, but it's more important to actually publish this.

If you don't want to make these changes, @abner-hb, just let me know and I can go and make them myself.

I really appreciate your taking the time to write this despite the huge flurry of comments I've left. I hope they've been more helpful than frustrating, as that was my intention.

bob-carpenter · 2026-04-13T19:59:37Z

src/bibtex/all.bib

+
+@article{plummer:2015,
+  author    = {Plummer, Martyn},
+  title     = {Cuts in Bayesian graphical models},


This needs {B}ayesian in the title, i.e., `title = {Cuts in {B}ayesian graphical models}`, or "Bayesian" will get lower-cased.

We generally don't need the url, doi, or publisher, but they're OK to leave in.

And thanks for citing Martyn's paper.

src/stan-users-guide/multiple-imputation.qmd

bob-carpenter · 2026-04-13T20:04:38Z

src/stan-users-guide/multiple-imputation.qmd

+
+Suppose that we have a matrix $x$ with columns
+$x_{\cdot, 1}, \ldots, x_{\cdot, K}$ that we want to use to sample 
+values from a vector of quantities of interest called $\theta$. With


want to use to sample values -> that make up covariates in a regression with quantities of interest $\theta$.

bob-carpenter · 2026-04-13T20:05:05Z

src/stan-users-guide/multiple-imputation.qmd

+posterior distribution of $\theta$ as $p(\theta \mid x^\text{comp})$.
+But with missing data, our matrix is split into $x^{\text{obs}}$ (the
+observed values of $x$) and $x^{\text{mis}}$ (the missing  values of
+$x$). This can be problematic because, in general, our knowledge may 


I don't understand the last sentence. Just remove "This can be problematic ...".

bob-carpenter · 2026-04-13T20:06:49Z

src/stan-users-guide/multiple-imputation.qmd

+\end{align*}
+where $x^{\text{imp}}$ is a data set that includes imputed values of
+$x^{\text{mis}}$.
+


You might want to note, as a continuation of line 52, that this depends on a model of $x$. You typically don't have such a model in a regression, because the inferences for the parameters are independent of the model of $x$ when $x$ is fully observed.
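To make the dependence on a model of $x$ concrete, here is a minimal Python sketch, not the chapter's code. Everything in it is an assumption for illustration: the function names are hypothetical, the imputation model is a normal distribution fit to the observed values (ignoring uncertainty in its own parameters), and the analysis model is a normal mean with known `sigma` and a flat prior. Draws from each completed data set are simply stacked, which approximates an equal-weight mixture of the per-imputation posteriors.

```python
import random
import statistics


def impute_once(x_obs, n_mis, rng):
    """Draw n_mis imputed values from a normal model fit to x_obs.

    This is the (assumed) model of x that multiple imputation requires.
    """
    mu = statistics.fmean(x_obs)
    sigma = statistics.stdev(x_obs)
    return [rng.gauss(mu, sigma) for _ in range(n_mis)]


def posterior_draws(x_comp, sigma, n_draws, rng):
    """Draw from p(theta | x_comp) for a normal mean theta.

    Assumes known sigma and a flat prior, so the posterior is
    Normal(mean(x_comp), sigma / sqrt(n)).
    """
    n = len(x_comp)
    center = statistics.fmean(x_comp)
    scale = sigma / n ** 0.5
    return [rng.gauss(center, scale) for _ in range(n_draws)]


def pooled_draws(x_obs, n_mis, sigma, n_imputations, n_draws, seed=0):
    """Stack posterior draws of theta across imputed data sets."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_imputations):
        # Complete the data with one set of imputed values.
        x_imp = x_obs + impute_once(x_obs, n_mis, rng)
        # Fit the analysis model to the completed data and keep its draws.
        draws.extend(posterior_draws(x_imp, sigma, n_draws, rng))
    return draws


# Hypothetical data: six observed values and three missing ones.
x_obs = [1.2, 0.7, 2.1, 1.5, 0.9, 1.8]
draws = pooled_draws(x_obs, n_mis=3, sigma=1.0, n_imputations=20, n_draws=200)
```

Note that `impute_once` never sees the analysis model's output, so nothing learned about `theta` feeds back into the imputations.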

abner-hb and others added 7 commits March 11, 2026 17:47

Add section on multiple imputation.

36e9442

Added references for multiple imputation.

a704b14

Added multiple imputation to yml files.

d81c629

Fixed math notation to allow pdf rendering.

a46c02a

Built scripts adding multiple imputation.

ffb823a

Avoid unnecessary changes to docs/

394e659

Avoid unnecessary changes to docs/

fe9af32

WardBrian requested a review from bob-carpenter March 12, 2026 15:35

WardBrian mentioned this pull request Mar 12, 2026

Instructions for working with PRs from forks #936

Open

bob-carpenter reviewed Mar 12, 2026

Rewrote multiple parts based on some of Bob's comments.

5446eab

bob-carpenter reviewed Mar 13, 2026

Added more changes based on pull-request comments.

ac5309e

WardBrian requested a review from bob-carpenter April 13, 2026 19:20

bob-carpenter requested changes Apr 13, 2026

3 participants
