Fit of univariate distributions to non-censored data

[under review]

If you wish to figure out the distribution (and so its corresponding parameters, e.g normal distribution has mean and variance as parameters) your univariate data may best fit into, “fitdistrplus” helps to generate estimate of these parameters corresponding to a specified distribution. It fits univariate distributions to non-censored data by maximum likelihood (mle), moment matching (mme), quantile matching (qme) or maximizing goodness-of-fit estimation (mge). The latter is also known as minimizing distance estimation. Generic methods are print, plot, summary, quantile, logLik, vcov and coef.

1. fit a gamma distribution by maximum likelihood estimation

Install the package “fitdistrplus”. Use data groundbeef in the package “fitdistrplus”. groundbeef is a data of one variable “serving” (univariate data).

data(groundbeef)
serving <- groundbeef$serving
fitg <- fitdist(serving, “gamma”)

view(fitg)

summary(fitg) #this produces estimates of the parameters of the gamma distribution which are the shape and the rate.

plot(fitg)

plot(fitg, demp = TRUE)

plot(fitg, histo = FALSE, demp = TRUE)

cdfcomp(fitg, addlegend=FALSE)

denscomp(fitg, addlegend=FALSE)

ppcomp(fitg, addlegend=FALSE)

qqcomp(fitg, addlegend=FALSE)

2. use the moment matching estimation (using a closed formula)

fitgmme <- fitdist(serving, “gamma”, method=”mme”)
summary(fitgmme)

plot(fitgmme, demp=T)

3. Comparison of various fits

fitW <- fitdist(serving, “weibull”)
fitg <- fitdist(serving, “gamma”)
fitln <- fitdist(serving, “lnorm”)
summary(fitW)
summary(fitg)
summary(fitln)

cdfcomp(list(fitW, fitg, fitln), legendtext=c(“Weibull”, “gamma”, “lognormal”))

denscomp(list(fitW, fitg, fitln), legendtext=c(“Weibull”, “gamma”, “lognormal”))

qqcomp(list(fitW, fitg, fitln), legendtext=c(“Weibull”, “gamma”, “lognormal”))

ppcomp(list(fitW, fitg, fitln), legendtext=c(“Weibull”, “gamma”, “lognormal”))

gofstat(list(fitW, fitg, fitln), fitnames=c(“Weibull”, “gamma”, “lognormal”))

4. defining your own distribution functions, here for the Gumbel distribution
for other distributions, see the CRAN task view dedicated to probability distributions

dgumbel <- function(x, a, b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q, a, b) exp(-exp((a-q)/b))
qgumbel <- function(p, a, b) a-b*log(-log(p))

fitgumbel <- fitdist(serving, “gumbel”, start=list(a=10, b=10))
summary(fitgumbel)

plot(fitgumbel, demp=T)

5. fit discrete distributions (Poisson and negative binomial)

data(toxocara)
number <- toxocara$number
fitp <- fitdist(number,”pois”)
summary(fitp)

plot(fitp)

fitnb <- fitdist(number,”nbinom”)
summary(fitnb)

plot(fitnb, demp=T)

cdfcomp(list(fitp,fitnb))

gofstat(list(fitp,fitnb))

6. how to change the optimisation method?

data(groundbeef)
serving <- groundbeef$serving
fitdist(serving, “gamma”, optim.method=”Nelder-Mead”)
fitdist(serving, “gamma”, optim.method=”BFGS”)
fitdist(serving, “gamma”, optim.method=”SANN”)

Find out more here:

https://rdrr.io/cran/fitdistrplus/man/fitdist.html

Teaching/Research Assistant in Mathematics/(Bayesian) Statistics, Writer

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store