Here we demonstrate that recruitment (R) and survival (S) can be estimated using just counts of individuals (C), if the counts are from multiple sites at multiple times (seasons). We also show that this works without the need to have repeated counts within one season.
Motivation: Many of our colleagues find it counter-intuitive, and hard to believe, that R and S can be estimated from just the time series of counts, without any additional ecological information. Here we show that it is definitely possible. This is because the counts are provided at multiple time steps, but also across multiple sites. This then enables the demographic parameters to be statistically identifiable.
Principle: We first simulate abundance time series at multiple sites, with known recruitment and survival parameters. We then use the dynamic N-mixture model (Dail & Madsen 2011) to correctly estimate the recruitment and survival parameters.
We encourage readers to explore and modify the simulation using different parameter values.
The script is heavily inspired by code from AHM vol. 2 by Kéry and Royle (2020). If you use this script, please give THEM credits.
Estimated time to run the script: 15 minutes (8 cores, 16 threads, 16GB RAM)
simDM0 <- function(nsites = sites, # number of sites
nsurveys = surveys, # number of surveys per year
nyears = years, # number of years
lambda = lambda, # abundance at time 1
phi = phi, # probability of survival
gamma = gamma, #recruitment parameter
p = p) # detection probability
{
## No covariates (like in our analysis)
y <- array(NA, dim = c(nsites, nyears, nsurveys))
N <- matrix(NA, nsites, nyears)
S <- R <- matrix(NA, nsites, nyears-1)
N[,1] <- rpois(nsites, lambda) # Initial state
for(t in 1:(nyears-1)) { # State dynamics
S[,t] <- rbinom(nsites, N[,t], phi) # Survival process
R[,t] <- rpois(nsites, gamma) # Recruitment process
N[,t+1] <- S[,t] + R[,t]
}
for(j in 1:nsurveys){ # Observation process
y[,,j] <- rbinom(nsites*nyears, N, p)
}
# Put observed data into two dimensions
yy <- array(NA, dim = c(nsites, nsurveys*nyears))
for(t in 1:nyears){
yy[,(nsurveys * t-(nsurveys-1)):(nsurveys*t)] <- y[,t,]
}
return(list(nsites = nsites, nsurveys = nsurveys, nyears = nyears,
lambda = lambda, phi = phi, gamma = gamma, p = p, N = N, S = S, R = R,
y = y, yy = yy))
}
sites = 100 # the number of sites, or routes of the BBS data
surveys = 1 # we used the DM model without repeated count, thus this variable should be set to 1
years = 10 # length of the time series
We warn that using the same number of sites (1033) and the same length of time series (35) as in our analysis will take a very long time to fit.
# values to the parameters we want to predict
lambda = 4 # abundance at time 1
phi = 0.8 # probability of survival
gamma = 1.5 # recruitment parameter
p = 0.7 # detection probability
## Parameters examples for a subset of species for which the models converged.
## Please note that the convergence and estimations of the parameters may be wrong for the following values in the simulation,
## mainly due to restricted spatial and temporal extent (compared to our analysis), and due to a lack of covariates.
# Evening Grosbeak # Hooded Warbler # White-winged Dove
# lambda # 3.19 # 1.09 # 0.53
# phi # 0.92 # 0.99 # 0.98
# gamma # 0.04 # 0.04 # 0.03
# p # 0.21 # 0.40 # 0.47
Note that the simulated data only have the abundance counts (bdata$C), i.e. there is no information on recruitment nor survival in the data. There are multiple years, but no repeated counts per year, i.e. each year and each site only have one count.
data <- simDM0(nsites = sites,
nsurveys = surveys,
nyears = years,
lambda = lambda,
phi = phi,
gamma = gamma,
p = p)
# Bundle the data together for JAGS
bdata <- list(C = data$y,
nsites = dim(data$y)[1],
nsurveys = dim(data$y)[3],
nyears = dim(data$y)[2])
# looking at the first six lines of the counts (C) table
head(bdata$C)
## , , 1
##
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 2 3 4 5 2 3 3 3 3 5
## [2,] 6 6 2 5 3 2 6 3 6 5
## [3,] 1 3 3 4 4 3 8 4 4 3
## [4,] 2 2 3 5 5 5 6 4 7 7
## [5,] 2 0 4 6 3 8 4 5 6 5
## [6,] 4 2 4 4 4 4 9 5 5 6
cat(file = "DM1.txt","
model {
# Priors
lambda ~ dunif(0, 100) # Initial site-specific abundance
phi ~ dunif(0, 1) # Apparent survival
gamma ~ dunif(0, 5) # Recruitment rate
p ~ dunif(0, 1) # Detection probability
# Likelihood
for(i in 1:nsites){
# State process: initial condition
N[i,1] ~ dpois(lambda)
# State process: transition model
for(t in 1:(nyears-1)){
S[i,t+1] ~ dbin(phi, N[i,t]) # Survival process
R[i,t+1] ~ dpois(gamma) # 'absolute' recruitment = 'constant'
N[i,t+1] <- S[i,t+1] + R[i,t+1]
}
# Observation process
for(t in 1:nyears){
for(j in 1:nsurveys){
C[i,t,j] ~ dbin(p, N[i,t])
}
}
}
}
")
# Initial values (they are the same as we used in our main empirical analysis)
Nst <- apply(data$y, c(1,2), max) + 2
Nst[, 2:dim(data$y)[2]] <- NA # Deterministic, N <- S + R
R1 <- apply(data$y, c(1,2), max) # Observed max. counts + 1 as inits
R1[,1] <- NA
inits <- function(){list( lambda = runif(1, 6, 16), phi = runif(1),
gamma = runif(1), p = runif(1), N = Nst, R = R1 + 1 )}
# Parameters to be monitored
params <- c("lambda", "phi", "gamma", "p", "S", "R")
# MCMC settings that we used in our analysis (you can modify them to make the fitting shorter)
na <- 1000 ; ni <- 100000 ; nt <- 10 ; nb <- 75000 ; nc <- 3
# Call JAGS
out1 <- jags(data = bdata,
inits = inits,
parameters.to.save = params,
model.file = "DM1.txt",
n.adapt = na,
n.chains = nc,
n.thin = nt,
n.iter = ni,
n.burnin = nb,
parallel = TRUE)
##
## Processing function input.......
##
## Done.
##
## Beginning parallel processing using 3 cores. Console output will be suppressed.
##
## Parallel processing completed.
##
## Calculating statistics.......
##
## Done.
Estimated parameters are represented by the density plots of the three MCMC chains. True parameters are the vertical solid lines.
# get true counts of surviving and recruited individuals from simulated data
true.S <- data$S
true.R <- data$R
# get estimated counts of surviving and recruited individuals from the model
est.S <- out1$mean$S[,-1]
est.R <- out1$mean$R[,-1]
# plot the estimated vs observed counts
par(mfrow=c(1,2))
plot(true.S, est.S, xlab="True survival (S)", ylab="Estimated survival (S)")
abline(a= 0, b=1, col="red")
plot(true.R, est.R, xlab="True recruitment (R)", ylab="Estimated recruitment (R)")
abline(a= 0, b=1, col="red")
## R version 4.2.0 (2022-04-22 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22631)
##
## Matrix products: default
##
## Random number generation:
## RNG: L'Ecuyer-CMRG
## Normal: Inversion
## Sample: Rejection
##
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8
## [2] LC_CTYPE=English_United Kingdom.utf8
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United Kingdom.utf8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] jagsUI_1.6.2
##
## loaded via a namespace (and not attached):
## [1] lattice_0.20-45 digest_0.6.37 grid_4.2.0 R6_2.5.1
## [5] lifecycle_1.0.4 jsonlite_1.8.9 coda_0.19-4.1 evaluate_1.0.1
## [9] cachem_1.1.0 rlang_1.1.4 cli_3.6.3 rstudioapi_0.17.1
## [13] jquerylib_0.1.4 rjags_4-16 bslib_0.8.0 rmarkdown_2.29
## [17] tools_4.2.0 parallel_4.2.0 xfun_0.49 yaml_2.3.10
## [21] fastmap_1.2.0 compiler_4.2.0 htmltools_0.5.8.1 knitr_1.49
## [25] sass_0.4.9