Analysis using multipliers
We consider indirect methods to estimate abundance and hence include multipliers in the abundance calculations. The first problem uses data from a dung survey of deer and there are two levels of multipliers that need to be accounted for (dung production rate and dung decay rate). The songbird data set deals with instantaneous cues and so only cue rate needs to be taken into account.
Objectives
The objectives of this exercise are to
- Fit detection functions to cues
- Obtain relevant multipliers
- Use the multipliers in the `dht2` function to obtain animal abundances.
Dung survey of deer
The question is how to estimate the density of sika deer in a number of woodlands in the Scottish Borders (Marques et al., 2001). These animals are shy and will be aware of the presence of an observer before the observer detects them, making surveys of this species challenging. As a consequence, indirect estimation methods have been applied to this problem. In this approach, a density estimate is produced for some sign generated by the deer (in this case, faecal or dung pellets) and this estimate is transformed to density of deer (\(D_{\textrm{deer}}\)) by
\[ \hat D_{\textrm{deer}} = \frac{\textrm{dung deposited daily}}{\textrm{dung production rate (per animal)}} \] where the dung deposited daily is given by
\[ \textrm{dung deposited daily} = \frac{\hat D_{\textrm{pellet groups}}}{\textrm{mean time to decay}} \] Hence, we use distance sampling to produce a pellet group density estimate, then adjust it accordingly to account for the production and decay processes operating during the time the data were being acquired. We will also take uncertainty in the dung production and decay rates into account in our final estimate of deer density.
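Combining the two expressions gives the overall conversion from estimated pellet group density to deer density:
\[ \hat D_{\textrm{deer}} = \frac{\hat D_{\textrm{pellet groups}}}{\textrm{mean time to decay} \times \textrm{dung production rate (per animal)}} \]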
Data from 9 woodlands (labelled A-H and J) were collected according to the survey design (Figure 1); note that data from block D are not included in this exercise.
In addition to these data, we also require estimates of the production rate. From a literature search, we learn that sika deer produce 25 pellet groups daily but this source did not provide a measure of variability of this estimate. During the course of our surveys we also followed the fate of some marked pellet groups to estimate the decay (disappearance) rates of a pellet group. A thorough discussion of methods useful for estimating decay rates and associated measures of precision can be found in Laing et al. (2003).
There are many factors that might influence both production and decay rates, and for purposes of this exercise we will make the simplifying assumption that decay rate is homogeneous across these woodlands. A study to estimate the mean time for pellet groups to decay was conducted (see Multipliers section below), with decay rate from that study presumed to be representative of the study area. If you were to conduct a survey such as this, you would want to investigate this assumption more thoroughly.
Getting started
These data (called `sikadeer`) are available in the `Distance` package. As in previous exercises, the units conversion factor is calculated. What are the measurement units for these data?
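A sketch of this step is given below; the unit names passed to `convert_units` are an assumption to be checked against the data documentation, not part of the exercise text.

```r
library(Distance)

# Load the sika deer dung survey data
data(sikadeer)
head(sikadeer, n=3)

# Build the units conversion factor; the unit names below are an assumption --
# check ?sikadeer for how distances, effort and block areas were actually recorded
conversion.factor <- convert_units("centimeter", "kilometer", "square kilometer")
```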
Fit detection function to dung pellets
Fit the usual series of models (i.e. half normal, hazard rate, uniform) to the distances to pellet groups and decide on a detection function (don't spend too long on this). Call the fitted model object `deer.df` (an abbreviation for deer detection function). This detection function will be used to obtain \(\hat D_{\textrm{pellet groups}}\).
Have a look at the Summary statistics for this model - what do you notice about the allocation of search effort in each woodland?
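A sketch covering these two steps is given below; the candidate models, the default truncation and the final choice assigned to `deer.df` are placeholders for your own decisions rather than part of the exercise text.

```r
# Candidate detection functions for the pellet group distances
deer.hn <- ds(sikadeer, key="hn",   convert_units=conversion.factor)
deer.hr <- ds(sikadeer, key="hr",   convert_units=conversion.factor)
deer.un <- ds(sikadeer, key="unif", convert_units=conversion.factor)

# Compare the fits (e.g. by AIC) and keep your preferred model as deer.df
AIC(deer.hn, deer.hr, deer.un)
deer.df <- deer.hn   # placeholder -- substitute your chosen model

# Summary statistics, including the effort allocated to each woodland
summary(deer.df)
```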
Multipliers
The next step is to create an object which contains the multipliers we wish to use. We already have estimates of dung production rates but need similar information on dung decay (or persistence) rate.
Data to calculate this have been collected in the file `IntroDS_9.1.csv`, which can be read from a GitHub repository. The following code comes from Meredith (2017).
```r
MIKE.persistence <- function(DATA) {
  # Purpose: calculate mean persistence time (mean time to decay) for dung/nest data
  # Input: data frame with at least two columns:
  #          DAYS  - calendar day on which dung status was observed
  #          STATE - dung status: 1 = intact, 0 = decayed
  # Output: point estimate, standard error and CV of mean persistence time
  #
  # Attribution: code from Mike Meredith's website:
  #   http://www.mikemeredith.net/blog/2017/Sign_persistence.htm
  #   (site no longer maintained)
  # Citing: CITES elephant protocol
  #   https://cites.org/sites/default/files/common/prog/mike/survey/dung_standards.pdf

  ## Fit logistic regression of STATE on DAYS, extract coefficients
  dung.glm <- glm(STATE ~ DAYS, data=DATA, family=binomial(link="logit"))
  betas <- coefficients(dung.glm)

  ## Calculate mean persistence time
  mean.decay <- -(1 + exp(-betas[1])) * log(1 + exp(betas[1])) / betas[2]

  ## Calculate the variance of the estimate (delta method)
  vcovar <- vcov(dung.glm)
  var0   <- vcovar[1, 1]   # variance of beta0
  var1   <- vcovar[2, 2]   # variance of beta1
  covar  <- vcovar[2, 1]   # covariance
  deriv0 <- -(1 - exp(-betas[1]) * log(1 + exp(betas[1]))) / betas[2]
  deriv1 <- -mean.decay / betas[2]
  var.mean <- var0*deriv0^2 + 2*covar*deriv0*deriv1 + var1*deriv1^2

  ## Calculate the SE and CV and return
  se.mean <- sqrt(var.mean)
  cv.mean <- se.mean / mean.decay
  out <- c(mean.decay, se.mean, 100*cv.mean)
  names(out) <- c("Mean persistence time", "SE", "%CV")

  ## Plot the (jittered) observations and the fitted decay curve
  plot(DATA$DAYS, jitter(DATA$STATE, amount=0.10),
       xlab="Days since initiation", ylab="Dung persists (yes=1)",
       main="Eight dung piles revisited over time")
  curve(predict(dung.glm, data.frame(DAYS=x), type="resp"), add=TRUE)
  abline(v=mean.decay, lwd=2, lty=3)

  return(out)
}

github   <- "https://raw.githubusercontent.com/"
repo     <- "distanceworkshops/async2024-2/main/"
filename <- "10-multipliers/R-prac/IntroDS_9.1.csv"
decay    <- read.csv(paste0(github, repo, filename))

persistence.time <- MIKE.persistence(decay)
print(persistence.time)
```
Running the above code should have produced a plot of dung persistence against days since the pellet group was marked, with a fitted logistic regression curve overlaid (logistic regression is like simple linear regression, but constrains the predicted response, here the probability that a pellet group remains intact, to lie between 0 and 1). Note that the observed states can in reality only take the values 0 or 1, but for the purposes of plotting they have been ‘jittered’ to avoid over-plotting.
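For reference, the quantity returned by `MIKE.persistence` is the mean time a pellet group persists, given that it was intact when first marked. Writing \(p(t) = \left[1+\exp\{-(\beta_0+\beta_1 t)\}\right]^{-1}\) for the fitted probability that a pellet group is still intact \(t\) days after marking, the estimator implemented in the code above is
\[ \hat{T} = \frac{\int_0^\infty p(t)\,dt}{p(0)} = -\frac{(1+e^{-\hat\beta_0})\,\log(1+e^{\hat\beta_0})}{\hat\beta_1}, \]
with its standard error obtained by the delta method.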
An estimate of mean persistence time and a measure of its variability are also provided - make a note of these as they will be required below.
As stated above, we want an object which contains information on the dung production rate (and its standard error) and the dung decay rate (and its standard error). The following command (sketched after this list) creates a list containing two data frames:

- `creation` contains the estimate of the dung production rate and its associated standard error,
- `decay` contains the dung decay rate and its associated standard error, where `XX` and `YY` are the estimates you obtained from the dung decay rate analysis.
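A sketch of such a command is shown below; it assumes the `dht2` multiplier interface accepts named data frames with columns `rate` and `SE`, uses the literature value of 25 pellet groups per day with `SE` set to 0 (no variability was reported for that value), and leaves `XX` and `YY` as placeholders for your decay-rate results.

```r
# Sketch only: replace XX and YY with the mean persistence time and its SE
# reported by MIKE.persistence() above
mult <- list(creation = data.frame(rate = 25, SE = 0),   # pellet groups per animal per day
             decay    = data.frame(rate = XX, SE = YY))  # mean time to decay (days) and its SE
```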
The final step is to use these multipliers to convert \(\hat D_{\textrm{pellet groups}}\) to \(\hat D_{\textrm{deer}}\) (as in the equations above); for this we need to employ the `dht2` function. In the command below (sketched after this list) the `multipliers=` argument allows us to specify the rates and their standard errors. There are a couple of other function arguments that need some explanation:

- `strat_formula=~Region.Label` is specified to take into account the design (i.e. the different woodlands, or blocks),
- `stratification="effort_sum"` is specified because we want to produce an overall density estimate that is the mean of the woodland-specific densities, weighted by the effort allocated within each block,
- `deer.df` is the detection function you have fitted.
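A sketch of the call is given below; the object names `mult` and `conversion.factor` carry over from the earlier sketches and are assumptions rather than part of the original exercise text.

```r
# Convert pellet group density to deer density, propagating multiplier uncertainty
deer.ests <- dht2(deer.df, flatfile=sikadeer,
                  strat_formula=~Region.Label,
                  convert_units=conversion.factor,
                  multipliers=mult,
                  stratification="effort_sum")
print(deer.ests)
```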
The `dht2` function also provides information on the components of variance. Make a note of these (the contributions of the detection function, encounter rate and decay rate; what happened to the production rate component?) for each stratum.
Cue counting of songbirds
Remember the wren data introduced in the point transect exercises (Module 5, point transects): another data collection method used in that study was cue counting, where the cue in this case was a song burst. These data are stored in `wren_cuecount` in the `Distance` package; they arose from the point transect survey of songbirds described in Buckland (2006).
Following a similar approach to that used above for the deer dung data, estimate a detection function for the songs, create a multipliers object and include it in the `dht2` function to estimate wren density; call the result `wren3.est` (a sketch is given after the list below). There are a few things that will be useful to know:

- search effort (measured in time) was 2 visits, each lasting 5 minutes,
- effort will need to be properly specified before fitting the detection function with `ds`,
- the multiplier in this case is the cue rate and its measure of precision,
- the multiplier object needs to be created before making the call to `dht2`.
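One possible sketch is shown below. The column names (`Search.time`, `Cuerate`, `Cuerate.SE`), the choice of a hazard-rate key and the unit names are all assumptions made for illustration; check `?wren_cuecount` and `names(wren_cuecount)` and adjust accordingly.

```r
data(wren_cuecount)

# Effort must reflect listening time (2 visits x 5 minutes), not a count of visits;
# the column name Search.time is an assumption -- check the data
wren_cuecount$Effort <- wren_cuecount$Search.time

# Unit names are assumptions; adjust to match how the data were recorded
wren.conv <- convert_units("meter", NULL, "hectare")

# Detection function for song bursts (key function chosen only for illustration;
# consider truncation after inspecting the distances)
wren.df <- ds(wren_cuecount, transect="point", key="hr", adjustment=NULL,
              convert_units=wren.conv)

# Cue rate multiplier (rate and SE assumed to be stored in the data set)
cue <- unique(wren_cuecount[, c("Cuerate", "Cuerate.SE")])
wren.mult <- list(creation = data.frame(rate = cue$Cuerate, SE = cue$Cuerate.SE))

wren3.est <- dht2(wren.df, flatfile=wren_cuecount, strat_formula=~1,
                  convert_units=wren.conv, multipliers=wren.mult,
                  sample_fraction=1)   # what should the sampling fraction be here?
```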
What do you think the sampling fraction will be for these point transects?
To obtain density (rather than abundance) a command along the following lines can be used (a sketch, assuming the print method for `dht2` results accepts a `report=` argument):
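```r
# A sketch, assuming the print method for dht2 results accepts a report= argument
print(wren3.est, report="density")
```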