macroeco.compare.lrt

macroeco.compare.lrt(data, model_full, model_reduced, df=None)

Compare two nested models using a likelihood ratio test

Parameters:

data : iterable

Data for analysis

model_full : scipy distribution object

A frozen scipy distribution object representing the full model (more complex model).

model_reduced : scipy distribution object

A frozen scipy distribution object representing the reduced model (simpler model).

df : int

The degrees of freedom for the LRT (optional). If None, df is calculated as the difference between the number of parameters in the full and reduced models.

Returns:

tuple :

G^2 statistic, p-value
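
As a rough illustration of what the two returned values mean, the sketch below computes a G^2 statistic as twice the difference in log-likelihood between the full and reduced models, then reads the p-value from the upper tail of a chi-squared distribution. It is not taken from the macroeco source: `lrt_sketch` is a hypothetical helper, it assumes continuous frozen scipy distributions (it calls `logpdf`; a discrete model would need `logpmf`), and it assumes the number of parameters can be counted from the keyword arguments used to freeze each model.

```python
import numpy as np
from scipy import stats


def lrt_sketch(data, model_full, model_reduced, df=None):
    """Illustrative likelihood ratio test for continuous frozen scipy models."""
    data = np.asarray(data, dtype=float)

    # Log-likelihood of the data under each frozen model
    ll_full = np.sum(model_full.logpdf(data))
    ll_reduced = np.sum(model_reduced.logpdf(data))

    # G^2 = 2 * (log-likelihood of full model - log-likelihood of reduced model)
    g2 = 2.0 * (ll_full - ll_reduced)

    if df is None:
        # Assumption: each keyword argument used to freeze a model is one parameter
        df = len(model_full.kwds) - len(model_reduced.kwds)

    # Upper-tail probability of a chi-squared variable with df degrees of freedom
    p_value = stats.chi2.sf(g2, df)
    return g2, p_value


data = [0.8, 1.9, 1.1, 2.4, 1.6, 0.9, 2.1, 1.4]

# Full model: mean set to the sample mean (its MLE), variance known.
# Reduced model: mean left at scipy's default loc=0, same known variance.
full = stats.norm(loc=float(np.mean(data)), scale=1.0)
reduced = stats.norm(scale=1.0)

g2, p = lrt_sketch(data, full, reduced)
print(g2, p)
```

Because the reduced model is the full model with its mean fixed at zero, the two are nested and differ by one parameter, so df works out to 1 here.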

Notes

Parameters of distribution objects must be given as keyword arguments, e.g. norm = stats.norm(loc=0, scale=1)

A p-value < alpha suggests significant evidence for the full (more complex) model. In other words, the null hypothesis is that the reduced model is correct.

The LRT only applies to nested models. The G^2 statistic and G-test rely on the assumption that the test statistic is approximately chi-squared distributed. This assumption breaks down for small sample sizes.
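
To make the chi-squared assumption concrete, the following sketch converts a G^2 statistic into a p-value for the special case of one degree of freedom, where the chi-squared upper tail has the closed form erfc(sqrt(G^2 / 2)). The helper name `chi2_sf_df1` is hypothetical, and df = 1 is an assumption that matches the Examples section, where the zero-truncated NBD has two parameters (mu, k_agg) and the logseries has one (p).

```python
import math


def chi2_sf_df1(g2):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(g2 / 2.0))


# G^2 statistic reported in the Examples section of this page
g2 = 15.33429080890221
p = chi2_sf_df1(g2)
print(p)
```

The result should agree closely with the p-value printed in the Examples section. For other degrees of freedom a general chi-squared survival function, such as `scipy.stats.chi2.sf`, would be needed.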

Examples

>>> import macroeco.models as md
>>> import macroeco.compare as comp

>>> # Generate random data
>>> rand_samp = md.nbinom_ztrunc.rvs(20, 0.5, size=100)

>>> # Fit Zero-truncated NBD (Full model)
>>> mle_nbd = md.nbinom_ztrunc.fit_mle(rand_samp)

>>> # Fit a logseries (limiting case of Zero-truncated NBD, reduced model)
>>> mle_logser = md.logser.fit_mle(rand_samp)

>>> # Compare models with LRT
>>> comp.lrt(rand_samp, md.nbinom_ztrunc(mu=mle_nbd[0], k_agg=mle_nbd[1]), md.logser(p=mle_logser[0]))
(15.33429080890221, 9.0066719644695982e-05)

>>> # The p-value is small, so reject the null hypothesis that the logseries
>>> # (reduced model) is correct and choose the Zero-truncated NBD.