Ecological Archives E095-272-A2

Heather J. Lynch, James T. Thorson, Andrew Olaf Shelton. 2014. Dealing with under- and over-dispersed count data in life history, spatial, and community ecology. Ecology 95:3173–3180. http://dx.doi.org/10.1890/13-1912.1

Appendix B. Probability mass functions and R code to fit the Poisson, Negative Binomial, Generalized Poisson, and CMP distributions (and their zero-truncated versions).

Poisson distribution:

The probability mass function of the Poisson distribution has one parameter, \( \lambda>0 \), and is given by

\[ P(X=x|\lambda) = \frac{\lambda^{x}e^{-\lambda}}{x!}, x=0,1,2,3,... \]

The negative log-likelihood is fit with the following function

Pois.NegLogLikeFn = function(par,Y){
  lambda<-par[1]
  LogLike = rep(NA, length(Y))

  # Likelihood
  for(ObsI in 1:length(Y)){
    LogLike[ObsI] = Y[ObsI]*log(lambda)-lambda-lfactorial(Y[ObsI])
  }
  Return = sum(LogLike)
  return(-Return)
}

We find the maximum likelihood parameter estimates using

Opt = nlminb(start=c(0.3), objective=Pois.NegLogLikeFn, control=list(trace=1), Y=Y)
Pois.est.lambda<-Opt$par[1]

Confidence intervals on the parameter estimates are fit using the profile methods as described in Bolker (2008). We calculate the negative log-likelihood (NLL) over a range of values and find the bounds within which the NLL is within \( \frac{1}{2}\chi^{2}_{0.95,df=1} \) of the NLL minimum.

lambda.vector<-seq(max(0.01,Pois.est.lambda-0.2),Pois.est.lambda+0.2,by=0.005)

LL.profile.lambda<-numeric(length(lambda.vector))
for (i in 1:length(lambda.vector))
{
  LL.profile.lambda[i]<-Pois.NegLogLikeFn(lambda.vector[i],Y) 
}

We plot the NLL profile to confirm regularity

plot(lambda.vector,LL.profile.lambda,ylim=c(min(LL.profile.lambda,na.rm=T),min(LL.profile.lambda,na.rm=T)+4))
abline(h=min(LL.profile.lambda,na.rm=T)+qchisq(0.95,df=1)/2)

Here we find the exact parameter value at which the NLL crosses the threshold, interpolating as needed

profile.lower = LL.profile.lambda[1:which.min(LL.profile.lambda)]
profile.lambda = lambda.vector[1:which.min(LL.profile.lambda)]
Pois.LL.lambda<-approx(profile.lower, profile.lambda, xout = min(LL.profile.lambda)+qchisq(0.95,df=1)/2)$y
profile.upper = LL.profile.lambda[which.min(LL.profile.lambda):length(LL.profile.lambda)]
profile.lambda = lambda.vector[which.min(LL.profile.lambda):length(LL.profile.lambda)]
Pois.UL.lambda<-approx(profile.upper, profile.lambda, xout = min(LL.profile.lambda)+qchisq(0.95,df=1)/2)$y

We calculate the AIC from which the AICc can be calculated

k<-1
AIC.Pois<-2*min(LL.profile.lambda)+2*k

We modify this to calculate a zero-truncated Poisson distribution s follows:

\[ P(X=x|X>0,\lambda)=\frac{P(X=x|\lambda)}{P(X>0|\lambda)} \]

\[ P(X=x|X>0,\lambda)=\frac{P(X=x|\lambda)}{1-P(X=0|\lambda)} \]

\[ P(X=0|\lambda) = e^{-\lambda} \]

Therefore, the zero-truncated Poisson (ZTPois) log likelihood is related to the Poisson log-likelihood by

\[ LL_{ZTPois} = LL_{Pois}-Log(1-e^{-\lambda}) \]

and can be calculated similarly to above with the following modification to the NLL function

ZTPois.NegLogLikeFn = function(par,Y){
  lambda<-par[1]
  LogLike = rep(NA, length(Y))

  # Likelihood
  for(ObsI in 1:length(Y)){
    LogLike[ObsI] = Y[ObsI]*log(lambda)-lambda-lfactorial(Y[ObsI])-log(1-exp(-lambda))
  }
  Return = sum(LogLike)
  return(-Return)
}

Generalized Poisson distribution:

The probability mass function of the Generalized Poisson distribution has two parameters, \( \theta>0 \) and \( \lambda \), and is given by

\[ P(X=x|\theta, \lambda) =\left\{ \begin{array}{cl} \frac{\theta(\theta+x\lambda)^{x-1}e^{-\theta-x\lambda}}{x!} &\mbox{for x=0,1,2,...,m} \\ 0 &\mbox{for n>m when $\lambda$<0} \end{array} \right. \]

where the parameter \( \lambda \) is bounded by \( max(-1,-\theta/m)\leq\lambda<1 \) and \( m \) is the largest integer satisfying \( \theta+m\lambda>0 \). In cases where \( \lambda<0 \), the probability mass function may not sum to unity; in these cases the support for the probability distribution is truncated and the remaining probability mass function normalized explicitly. More details on the complications arising from using the Generalized Poisson for underdispersed data may be found in Scollnik (1998) and references therein. In general, the CMP provides a more robust framework for formally dealing with under-dispersion than the generalized Poisson. The most notable difference is that “the generalized Poisson distribution cannot be used with full justification when the second parameter [\( \lambda \)] is negative” (Consul and Famoye 1989). This fact arises because the required normalizing constant depends upon the parameter values. Additionally, when \( \lambda \) < 0, as is the case of under-dispersion, the theoretical moments derived for the generalized Poisson with 0<\( \lambda \)<1 no longer hold (Consul 1989). Sometime users of the generalized Poisson ignore this fact (see references provided in Scollnik 1998, Kendall and Wittmann 2010) or work with a truncated generalized Poisson distribution (Consul and Famoye 1989), though truncation introduces it's own set of biases. Errors introduced by this truncation are small in many cases but as under-dispersion becomes more extreme, so does the truncation error (see Consul 1989, Scollnik 1998). In sum, while methods for truncating generalized Poisson distributions are available and well defined, the detailed mechanics of having one distribution for an over dispersed case and different distribution for a under-dispersed data undermines the utility of the generalized Poisson. One instance in which this fact will become particularly apparent is in cases where researchers try to estimate the dispersion parameter as a continuous function of covariates and dispersion moves from \( \lambda \)<0 to \( \lambda \)>0. This point is not commonly acknowledged in the existing uses of the generalized Poisson in the ecological literature (e.g., Kendall and Wittman 2010).

The negative log-likelihood is fit with the following function

GPois.NegLogLikeFn = function(Y,theta,lambda){
  LogLike = rep(NA, length(Y))

  # Likelihood
  for(ObsI in 1:length(Y)){
    LogLike[ObsI] = log(theta)+(Y[ObsI]-1)*log(theta+Y[ObsI]*lambda)-theta-Y[ObsI]*lambda-lfactorial(Y[ObsI])
  }
  Return = sum(LogLike)
  return(-Return)
}

The AIC is calculated as

k<-2
AIC.GPois<-2*min(LL.profile.lambda,LL.profile.theta)+2*k

We modify this to calculate a zero-truncated Generalized Poisson distribution s follows:

\[ P(X=x|X>0,\theta,\lambda)=\frac{P(X=x|\theta,\lambda)}{P(X>0|\theta,\lambda)} \]

\[ P(X=x|X>0,\theta,\lambda)=\frac{P(X=x|\theta,\lambda)}{1-P(X=0|\theta,\lambda)} \]

\[ P(X=0|\theta,\lambda) = e^{-\theta} \]

Therefore, the zero-truncated Generalized Poisson (ZTGPois) log likelihood is related to the Generalized Poisson (GPois) log-likelihood by

\[ LL_{ZTGPois} = LL_{GPois}-Log(1-e^{-\theta}) \]

and can be calculated similarly to above with the following modification to the NLL function

ZTGPois.NegLogLikeFn = function(Y,theta,lambda){
  LogLike = rep(NA, length(Y))

  # Likelihood
  for(ObsI in 1:length(Y)){
    LogLike[ObsI] = log(theta)+(Y[ObsI]-1)*log(theta+Y[ObsI]*lambda)-theta-Y[ObsI]*lambda-lfactorial(Y[ObsI])-log(1-exp(-theta))
  }
  Return = sum(LogLike)
  return(-Return)
}

Negative Binomial distribution:

The probability mass function of the Negative Bimonial distribution (\( \mu \) parameterization) has two parameters, \( \mu>0 \) and \( \theta>0 \), and is given by

\[ P(X=x|\mu, \theta) = \frac{\Gamma(\theta+x)}{\Gamma(\theta)x!}\left(\frac{\theta}{\theta+\mu}\right)^{\theta}\left(\frac{\mu}{\theta+\mu}\right)^{x}, x=0,1,2,3,,, \]

The negative log-likelihood is fit with the following function

NegBin.NegLogLikeFn = function(Y,mu,theta){
  LogLike = rep(NA, length(Y))
  for(ObsI in 1:length(Y)){
    LogLike[ObsI] = lgamma(Y[ObsI]+theta)-lgamma(theta)+theta*log(theta)-theta*log(theta+mu)+Y[ObsI]*log(mu)-Y[ObsI]*log(theta+mu)-lfactorial(Y[ObsI])
  }
  Return = sum(LogLike)
  return(-Return)
}

The AIC is calculated as

k<-2
AIC.NegBin<-2*min(LL.profile.mu,LL.profile.theta)+2*k

We modify this to calculate a zero-truncated Negative Binomial distribution as follows:

\[ P(X=x|X>0,\mu,\theta)=\frac{P(X=x|\mu,\theta)}{P(X>0|\mu,\theta)} \]

\[ P(X=0|\mu, \theta) = \left(\frac{\theta}{\theta+\mu}\right)^{\theta} \]

Therefore, the zero-truncated Negative Binomial (ZTNegBin) log likelihood is related to the Negative Binomial (NegBin) log-likelihood by

\[ LL_{ZTNegBin} = LL_{NegBin}-Log(1-((\theta/(\theta+\mu))^{\theta})) \]

and can be calculated similarly to above with the following modification to the NLL function

ZTNegBin.NegLogLikeFn = function(Y,mu,theta){
  LogLike = rep(NA, length(Y))
  for(ObsI in 1:length(Y)){
    LogLike[ObsI] = lgamma(Y[ObsI]+theta)-lgamma(theta)+theta*log(theta)-theta*log(theta+mu)+Y[ObsI]*log(mu)-Y[ObsI]*log(theta+mu)-lfactorial(Y[ObsI])-log(1-((theta/(theta+mu))^theta))
  }
  Return = sum(LogLike)
  return(-Return)
}

Conway-Maxwell-Poisson distribution:

The probability mass function of the Conway-Maxwell-Poisson distribution has two parameters, \( \mu > 0 \) and \( \nu \), and is given by

\[ P(X=x|\mu,\nu) = \frac{1}{S(\mu,\nu)}(\frac{\mu^{x}}{y!})^{\nu}, x=0,1,2,3,... \]

The negative log-likelihood is fit with the following function

ComputeConst = function(Lambda, Nu, Mu, Tol, Max, Log=TRUE, Type){       # Mu = Lambda^(1/Nu)
  if( (!is.na(Lambda) & Lambda > 10^Nu) | (!is.na(Mu) & Mu^Nu > 10^Nu) ){
    if(Type=="Z"){
      #Const = exp(Nu*Lambda^(1/Nu)) / ( Lambda^((Nu-1)/(2*Nu)) * (2*pi)^((Nu-1)/2) * sqrt(Nu) )
      ln_Const = Nu*Lambda^(1/Nu) - ((Nu-1)/(2*Nu))*log(Lambda) - ((Nu-1)/2)*log(2*pi) - (1/2)*log(Nu)
    }
    if(Type=="S"){
      #Const = exp(Nu*Mu) / ( Mu^((Nu-1)/(2)) * (2*pi)^((Nu-1)/2) * sqrt(Nu) )
      ln_Const = Nu*Mu - ((Nu-1)/(2))*log(Mu) - ((Nu-1)/2)*log(2*pi) - (1/2)*log(Nu)
    }
  }else{
    Const = rep(0,Max+1)
    Index = 1
    Const[Index] = 1    # Z(0)
    while( Const[Index]/Const[1] > Tol ){
      if(Type=="Z") Const[Index+1] = Const[Index] * ( Lambda / Index^Nu )  # Z(Index)
      if(Type=="S") Const[Index+1] = Const[Index] * ( Mu / Index )^Nu  # Z(Index)
      Index = Index + 1
    }
    ln_Const = log(sum(Const))
  }
  if(Log==TRUE) Return = ln_Const
  if(Log==FALSE) Return = exp(ln_Const)
  return(Return)
}


CMP.NegLogLikeFn = function(Xmat, Zmat, Y, BetaX, BetaZ, Tol, Max, Type, Output="LogLike"){
  LogLike = rep(NA, length(Y))
  if(Type=="Lambda-parameterization"){
    Lambda = exp(Xmat %*% BetaX)
    Mu = rep(NA, length(Y))
    Nu = exp(Zmat %*% BetaZ)
  }
  if(Type=="Mu-parameterization"){
    Lambda = rep(NA, length(Y))
    Mu = exp(Xmat %*% BetaX)
    Nu = exp(Zmat %*% BetaZ)
  }

  # Likelihood
  for(ObsI in 1:length(Y)){
    if(Type=="Lambda-parameterization") LogLike[ObsI] = Y[ObsI]*log(Lambda[ObsI]) - Nu[ObsI]*lfactorial(Y[ObsI]) - ComputeConst(Lambda=Lambda[ObsI], Nu=Nu[ObsI], Mu=Mu[ObsI], Tol=Tol, Max=Max, Log=TRUE, Type="Z")
    if(Type=="Mu-parameterization") LogLike[ObsI] = Nu[ObsI]*Y[ObsI]*log(Mu[ObsI]) - Nu[ObsI]*lfactorial(Y[ObsI]) - ComputeConst(Lambda=Lambda[ObsI], Nu=Nu[ObsI], Mu=Mu[ObsI], Tol=Tol, Max=Max, Log=TRUE, Type="S")
  }

  # Return
  if(Output=="LogLike") Return = sum(LogLike)
  if(Output=="Diag") Return = list( "sum_LogLike"=sum(LogLike), "Xmat"=Xmat, "Zmat"=Zmat, "Diag"=cbind("Mu"=Mu, "Lambda"=Lambda, "Nu"=Nu, "Y"=Y, "LogLike"=LogLike) )
  return(-Return)
}

We introduce an alternate version of the NLL that differs only in that the two parameters are passed as a single object. This alternate version is not specific to the CMP but is handy for any distribution requiring the optimization of more than one parameter.

CMP.NegLogLikeFn.alt = function(par, Xmat, Zmat, Y, Tol, Max, Type, Output="LogLike"){
  BetaX=par[1]
  BetaZ=par[2]
  LogLike = rep(NA, length(Y))
  if(Type=="Lambda-parameterization"){
    Lambda = exp(Xmat %*% BetaX)
    Mu = rep(NA, length(Y))
    Nu = exp(Zmat %*% BetaZ)
  }
  if(Type=="Mu-parameterization"){
    Lambda = rep(NA, length(Y))
    Mu = exp(Xmat %*% BetaX)
    Nu = exp(Zmat %*% BetaZ)
  }

  # Likelihood
  for(ObsI in 1:length(Y)){
    if(Type=="Lambda-parameterization") LogLike[ObsI] = Y[ObsI]*log(Lambda[ObsI]) - Nu[ObsI]*lfactorial(Y[ObsI]) - ComputeConst(Lambda=Lambda[ObsI], Nu=Nu[ObsI], Mu=Mu[ObsI], Tol=Tol, Max=Max, Log=TRUE, Type="Z")
    if(Type=="Mu-parameterization") LogLike[ObsI] = Nu[ObsI]*Y[ObsI]*log(Mu[ObsI]) - Nu[ObsI]*lfactorial(Y[ObsI]) - ComputeConst(Lambda=Lambda[ObsI], Nu=Nu[ObsI], Mu=Mu[ObsI], Tol=Tol, Max=Max, Log=TRUE, Type="S")
  }

  # Return
  if(Output=="LogLike") Return = sum(LogLike)
  if(Output=="Diag") Return = list( "sum_LogLike"=sum(LogLike), "Xmat"=Xmat, "Zmat"=Zmat, "Diag"=cbind("Mu"=Mu, "Lambda"=Lambda, "Nu"=Nu, "Y"=Y, "LogLike"=LogLike) )
  return(-Return)
}

We find the minimum NLL using

Xmat = matrix( rep(1,length(Y)) )
BetaX = 5
Zmat = matrix( rep(1,length(Y)) )
BetaZ = 1
Opt = nlminb(start=c(5,1), objective=CMP.NegLogLikeFn.alt, control=list(trace=0), Xmat=Xmat, Zmat=Zmat, Y=Y, Tol=0.01, Max=200, Type="Mu-parameterization")
CMP.est.mu<-exp(Opt$par[1])
CMP.est.nu<-exp(Opt$par[2])

and calculate the AIC as

k<-2
AIC.CMP<-2*min(LL.profile.mu,LL.profile.nu)+2*k

We modify this to calculate a zero-truncated Conway-Maxwell-Poisson (CMP) distribution as follows:

\[ P(X=x|X>0,\mu,\nu)=\frac{P(X=x|\mu,\nu)}{P(X>0|\mu,\nu)} \]

\[ P(X=x|X>0,\mu,\nu)=\frac{P(X=x|\mu,\nu)}{1-P(X=0|\mu,\nu)} \]

\[ P(X=0|\mu,\nu) = \frac{1}{S(\mu,\nu)} \]

Therefore, the zero-truncated CMP (ZTCMP) log likelihood is related to the CMP log-likelihood by

\[ LL_{ZTCMP} = LL_{CMP}-Log\left(1-\frac{1}{S(\mu,\nu)}\right) \]

and can be calculated similarly to above with the following modification to the NLL function

ZTCMP.NegLogLikeFn = function(Xmat, Zmat, Y, BetaX, BetaZ, Tol, Max, Type, Output="LogLike"){
  LogLike = rep(NA, length(Y))
  if(Type=="Lambda-parameterization"){
    Lambda = exp(Xmat %*% BetaX)
    Mu = rep(NA, length(Y))
    Nu = exp(Zmat %*% BetaZ)
  }
  if(Type=="Mu-parameterization"){
    Lambda = rep(NA, length(Y))
    Mu = exp(Xmat %*% BetaX)
    Nu = exp(Zmat %*% BetaZ)
  }

  # Likelihood
  for(ObsI in 1:length(Y)){
#    if(Type=="Lambda-parameterization") LogLike[ObsI] = Y[ObsI]*log(Lambda[ObsI]) - Nu[ObsI]*lfactorial(Y[ObsI]) - ComputeConst(Lambda=Lambda[ObsI], Nu=Nu[ObsI], Mu=Mu[ObsI], Tol=Tol, Max=Max, Log=TRUE, Type="Z")
    if(Type=="Mu-parameterization") LogLike[ObsI] = Nu[ObsI]*Y[ObsI]*log(Mu[ObsI]) - Nu[ObsI]*lfactorial(Y[ObsI]) - ComputeConst(Lambda=Lambda[ObsI], Nu=Nu[ObsI], Mu=Mu[ObsI], Tol=Tol, Max=Max, Log=TRUE, Type="S")-log(1-(1/exp(ComputeConst(Lambda=Lambda[ObsI], Nu=Nu[ObsI], Mu=Mu[ObsI], Tol=Tol, Max=Max, Log=TRUE, Type="S"))))
  }

  # Return
  if(Output=="LogLike") Return = sum(LogLike)
  if(Output=="Diag") Return = list( "sum_LogLike"=sum(LogLike), "Xmat"=Xmat, "Zmat"=Zmat, "Diag"=cbind("Mu"=Mu, "Lambda"=Lambda, "Nu"=Nu, "Y"=Y, "LogLike"=LogLike) )
  return(-Return)
}

Literature Cited

Bolker, B. M. 2008. Ecological models and data in R. Princeton University Press, Princeton, New Jersey, USA.

Consul, P. C. 1989. Generalized Poisson Distributions: Properties and Applications. Marcel Dekker, New York, New York, USA.

Consul, P. C., and F. Famoye. 1989. The truncated generalized poisson distribution and its estimation. Communications in Statistics - Theory and Methods 18:3635–3648.

Kendall, B. E., and M. E. Wittmann. 2010. A stochastic model for annual reproductive success. The American Naturalist 175:461–468.

Scollnik, D. P. M. 1998. On the analysis of the truncated Generalized Poisson distribution using a Bayesian method. ASTIN Bulletin 28:135–152.


[Back to E095-272]