Let’s load the necessary packages:
We will use the “broken stick” approach to simulate data from the Dirichlet - trinomial model. This model assumes that the group proportions for each observation are Dirichlet, but the observed values are either 0, the total sample size (N) or a number between 0 and N.
Our broken_stick
function can be called as follows,
The object y
is a list with 2 elements, (1) the true underlying compositions (p) and the realized data (X_obs). They can be accessed as
By default, the simulation function assumes a uniform prior for the Dirichlet, with hyperparameters = 1. We can change this by specifying our own values of hyperparameters. Using the argument p
, we can simulate new values with a slightly larger effective sample size, and pass that into broken_stick