Marginalize over discrete parameters
Design in AePPL
Marginalization is an operation that happens at the level of the probability density. First, we could add a `marginalize` function that takes a density and a set of values to marginalize out, and rewrites the density to return its marginalized counterpart:
```python
import aeppl

logprob, (y_vv, i_vv) = aeppl.joint_logprob(Y_rv, i_rv)
marginalized_logprob = aeppl.marginalize(logprob, i_vv)
```
Another solution is to add a `marginalize` keyword to `joint_logprob`:
```python
import aeppl

logprob, (y_vv,) = aeppl.joint_logprob(Y_rv, i_rv, marginalize=(i_rv,))
```
What this keyword hides is a function that acts at the measure level in AePPL's intermediate representation. The availability of this intermediate representation inside `joint_logprob` makes it easier to perform marginalization at this level. In pseudo-code, the internals of `joint_logprob` would then look like:
```python
def joint_logprob(*rvs, to_marginalize):
    rvs_to_values = aeppl.internals.create_value_variables(rvs)
    measures = aeppl.internals.to_ir(rvs_to_values)
    marginalized_measures = aeppl.internals.marginalize(measures, to_marginalize)
    logdensity = aeppl.internals.disintegrate(marginalized_measures)
    return logdensity
```
This makes me think that AePPL's intermediate representation should be a first-class citizen.
Related issues
Different examples of marginalization
TODO Discrete mixtures
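The usual construction sums the component indicator out of the likelihood: for a K-component Gaussian mixture, log p(y) = logsumexp_k [log w_k + log N(y | mu_k, sigma_k)]. A minimal hand-written sketch in Aesara (illustrative only, none of this is AePPL API):

```python
import numpy as np

import aesara
import aesara.tensor as at

y = at.scalar("y")
w = at.vector("w")          # mixture weights, shape (K,), summing to 1
mu = at.vector("mu")        # component means
sigma = at.vector("sigma")  # component standard deviations

# Per-component Gaussian log-densities
logp_k = -0.5 * at.log(2 * np.pi) - at.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

# Sum the component indicator out with a numerically stable log-sum-exp
a = at.log(w) + logp_k
m = at.max(a)
marginal_logp = m + at.log(at.sum(at.exp(a - m)))

fn = aesara.function([y, w, mu, sigma], marginal_logp)
```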
Switchpoint model
Consider the following example from the Stan documentation:
```python
import aesara
import aesara.tensor as at

srng = at.random.RandomStream(0)

r_e = at.scalar('r_e')
r_l = at.scalar('r_l')
T = at.iscalar('T')

# Early and late Poisson rates, and a uniform prior on the switchpoint
e_rv = srng.exponential(r_e)
l_rv = srng.exponential(r_l)
s_rv = srng.integers(1, T)

t = at.arange(1, T)
rate = at.where(at.ge(s_rv, t), e_rv, l_rv)
D_rv = srng.poisson(rate)

# Draw from the prior predictive distribution
fn = aesara.function([r_e, r_l, T], D_rv)
print(fn(1., 3., 10))
```
Here we can marginalize over the `integers`-distributed switchpoint `s_rv` to ease sampling, and recover its posterior distribution afterwards using posterior predictive sampling.
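For reference, here is what that marginalization computes when written by hand (illustrative only, not part of the AePPL API). With the uniform prior over the T - 1 candidate switchpoints used above, the marginal log-likelihood is log p(D | e, l) = logsumexp_s [log p(s) + sum_t log Poisson(D_t | rate(s, t))], the same log_sum_exp construction the Stan documentation uses:

```python
import aesara
import aesara.tensor as at

# Observed counts and the values of e_rv and l_rv
D = at.vector("D")
e = at.scalar("e")
l = at.scalar("l")
T = at.iscalar("T")

t = at.arange(1, T)  # time indices, shape (T - 1,)
s = at.arange(1, T)  # candidate switchpoints, shape (T - 1,)

# rate[k, j] is the Poisson rate at time t[j] given switchpoint s[k]
rate = at.where(at.ge(s[:, None], t[None, :]), e, l)

# Poisson log-pmf summed over time, one entry per candidate switchpoint
logp_D_given_s = at.sum(
    D[None, :] * at.log(rate) - rate - at.gammaln(D[None, :] + 1.0),
    axis=1,
)

# Uniform prior over switchpoints, then a numerically stable log-sum-exp
logp_s = -at.log(T - 1)
a = logp_s + logp_D_given_s
m = at.max(a)
marginal_logp = m + at.log(at.sum(at.exp(a - m)))

marginal_fn = aesara.function([D, e, l, T], marginal_logp)
```

A sampler then only sees the continuous parameters `e` and `l`; the posterior over `s_rv` can be recovered afterwards by normalizing `logp_s + logp_D_given_s` at each posterior draw.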