In this example we use a simulated data set of minke whale sightings in the Southern Ocean to show how data collected in two strata can be analysed.
library(Distance)
whales <- read.csv("minke.csv")
head(whales)
Region.Label Area Sample.Label Effort distance
1 South 84734 1 86.75 0.10
2 South 84734 1 86.75 0.22
3 South 84734 1 86.75 0.16
4 South 84734 1 86.75 0.78
5 South 84734 1 86.75 0.21
6 South 84734 1 86.75 0.95
In this dataset, Region.Label takes on two values, South and North, corresponding to the two strata in the data set.
For full geographic stratification we make two separate calls to ds(), treating the two strata as if they have nothing in common, which is what the full stratification analysis presumes.
whale.trunc <- 1.5
whale.full.strat1 <- ds(whales[whales$Region.Label=="South",], truncation=whale.trunc,
key="hr", adjustment=NULL)
whale.full.strat2 <- ds(whales[whales$Region.Label=="North",], truncation=whale.trunc,
key="hr", adjustment=NULL)
The model selection metric for the full stratification analysis is the sum of the AIC values from the two separate analyses.
full.aic <- summary(whale.full.strat1)$ds$aic + summary(whale.full.strat2)$ds$aic
AIC scores for the two strata analysed separately: 8.6176 + 37.2772 = 45.8948
Compare the estimates 3.4619, 0.6002 with (3.182, 0.5547), and 2.7847, 0.9902 with (2.770, 0.9706). In each pair, the first set of estimates was produced by the present R analysis, whereas the second set (in parentheses) was produced by Program Distance 6.2. Numerical results may be slightly different on your computer.
Here we create a new variable in our dataset, stratum, based upon Region.Label. The new variable is then used in a formula as a discrete covariate.
whales$stratum <- ifelse(whales$Region.Label=="North", "N", "S")
whale.strat.covariate <- ds(whales, truncation=whale.trunc, quiet=TRUE,
formula = ~as.factor(stratum),
key="hr", adjustment=NULL)
AIC score for this model with stratum as a covariate is 43.9582.
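To see why the covariate model needs only one parameter more than a single pooled detection function, it can help to look at the design matrix that a two-level factor produces. The sketch below uses base R's model.matrix() on a small made-up vector of stratum codes (not the minke data). It assumes the scale parameter of the detection function is modelled through a design matrix of this kind and that the hazard-rate key has one shape parameter; it is an illustration of the factor coding, not a view of the package internals.

```r
# Hypothetical stratum codes for six sightings (illustration only, not the minke data)
toy <- data.frame(stratum = c("N", "N", "S", "S", "S", "N"))

# A two-level factor expands to an intercept plus one indicator column
X <- model.matrix(~ as.factor(stratum), data = toy)
print(colnames(X))

# Intercept + stratum effect = two scale coefficients; adding the single
# hazard-rate shape parameter gives three parameters in total
n.scale.coef <- ncol(X)
n.params <- n.scale.coef + 1
print(n.params)
```

The indicator column is 0 for the reference level (here N) and 1 for the other level, so the stratum effect shifts the scale for one stratum relative to the other.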
whale.pooledf0 <- ds(whales, truncation=whale.trunc,
key="hr", adjustment=NULL)
AIC score for this model pooling sightings from both strata into a single detection function is 48.6384.
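The three AIC scores reported so far can be collected and ranked in a few lines of base R. This is a minimal sketch that types in the values quoted in the text rather than extracting them from the fitted objects; the parameter counts assume two parameters per hazard-rate fit with no adjustment terms, plus one for the stratum effect, and the name delta.AIC is introduced here for illustration.

```r
# AIC values as reported in the text
aic.separate <- c(8.6176, 37.2772)   # the two stratum-specific fits

models <- data.frame(
  model = c("Full geographic stratification",
            "Detection function shared between strata",
            "Stratum as covariate"),
  n.par = c(4, 2, 3),
  AIC   = c(sum(aic.separate), 48.6384, 43.9582)
)

# Difference from the best (smallest) AIC, then sort best first
models$delta.AIC <- models$AIC - min(models$AIC)
models <- models[order(models$AIC), ]
print(models, row.names = FALSE)
```

Sorting by AIC puts the preferred model in the first row and shows how far behind the other two are.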
Our model selection table for this stratified survey design
Model | Num. parameters | AIC |
---|---|---|
Full geographic stratification | 4 | 45.8948 |
Detection function shared between strata | 2 | 48.6384 |
Stratum as covariate | 3 | 43.9582 |
The table shows that the pooled analysis (AIC=48.6384) is not preferred to the full geographic stratification analysis (AIC=45.8948), because the model with the smaller AIC is preferable. However, introducing stratum as a covariate provides a halfway house between these extremes: one added parameter lets the two detection functions share the same basic shape while detectability falls off more slowly in one stratum than in the other (see following figure). This model has the lowest of the three AIC scores, 43.9582.
The detection function that falls off most rapidly is the one for the southern stratum, nearer the Antarctic coast, where observation conditions were understandably poorer.
par(mfrow=c(1,2))
plot(whale.strat.covariate, main="Minke whales, \ndetection function uses stratum as covariate")
covar.fit <- ddf.gof(whale.strat.covariate$ddf)
# Label for the goodness-of-fit result (named gof.label to avoid masking base::message)
gof.label <- paste("Cramer-von Mises W=", round(covar.fit$dsgof$CvM$W,3),
"\nP=", round(covar.fit$dsgof$CvM$p,3))
text(0.6, 0.1, gof.label, cex=0.8)
par(mfrow=c(1,1))
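The idea of two detection functions that share a shape but differ in scale can be sketched directly from the hazard-rate formula, g(x) = 1 - exp(-(x/sigma)^(-b)), with the scale written as sigma = exp(beta0 + beta1 * z) for a stratum indicator z. The coefficient values below are hypothetical, chosen only to make the curves easy to tell apart; they are not the fitted minke estimates, and the function name hazard.rate is introduced here for illustration.

```r
# Hazard-rate key function: g(x) = 1 - exp(-(x / sigma)^(-b))
hazard.rate <- function(x, sigma, b) {
  1 - exp(-(x / sigma)^(-b))
}

# Hypothetical coefficients (NOT the fitted minke values)
beta0 <- log(0.4)   # log scale for the reference stratum
beta1 <- log(1.8)   # stratum effect on the log scale
b     <- 2.5        # shape parameter, shared by both strata

x <- seq(0.01, 1.5, length.out = 200)
g.ref   <- hazard.rate(x, sigma = exp(beta0), b = b)
g.other <- hazard.rate(x, sigma = exp(beta0 + beta1), b = b)

# Same shape, different scale: the dashed curve falls off more slowly
plot(x, g.ref, type = "l", ylim = c(0, 1),
     xlab = "Distance", ylab = "Detection probability")
lines(x, g.other, lty = 2)
legend("topright", legend = c("reference stratum", "other stratum"), lty = 1:2)
```

Because only the scale changes, a positive stratum effect stretches the curve along the distance axis without altering how sharply the shoulder turns into the tail.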
What remains is to examine the estimated abundance produced by the three models.
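Estimates for the two strata analysed individually. Each analysis contains a single stratum, so its row is labelled Total; judging by the stratum-specific estimates from the other two models, the first row corresponds to the northern stratum and the second to the southern stratum.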
Label | Estimate | se | cv | lcl | ucl | df |
---|---|---|---|---|---|---|
Total | 9981 | 3875 | 0.3882 | 4468 | 22298 | 13.97 |
Total | 4588 | 1200 | 0.2616 | 2688 | 7833 | 21.14 |
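Because the two strata were analysed separately, this table has no combined total. A rough combined figure can be obtained by adding the two estimates and, on the assumption that the two analyses are independent, adding their variances. The sketch below does this arithmetic with the values from the table above; it does not attempt a confidence interval, and the variable names are introduced here for illustration.

```r
# Stratum-level estimates and standard errors from the table above
est <- c(9981, 4588)
se  <- c(3875, 1200)

total.est <- sum(est)
# Assumes the two estimates are independent, so their variances add
total.se  <- sqrt(sum(se^2))
total.cv  <- total.se / total.est

print(round(c(estimate = total.est, se = total.se, cv = total.cv), 4))
```

The resulting total can then be set beside the Total rows of the other two models.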
Estimates of group abundance when a single detection function is fitted to data pooled across strata. These results are incorrect because effort was not allocated equally between the strata. The southern stratum is much smaller (about 84,000 km²) than the northern (about 630,000 km²), but it is more desirable habitat for minke whales because it is closer to the ice edge in Antarctica. The southern stratum also received much greater survey effort per unit area than the northern stratum, and the pooled analysis does not account for this.
Label | Estimate | se | cv | lcl | ucl | df |
---|---|---|---|---|---|---|
North | 12182 | 4638.8 | 0.3808 | 5500 | 26980 | 12.97 |
South | 3653 | 910.1 | 0.2491 | 2182 | 6118 | 17.96 |
Total | 15835 | 4834.4 | 0.3053 | 8389 | 29892 | 15.26 |
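The point about unequal effort can be checked by computing survey effort per unit area in each stratum. The sketch below shows one way to do so in base R for data laid out like whales, with one row per sighting and the transect's effort repeated on each row. The toy data frame and its values are made up purely for illustration and are not the minke data.

```r
# Made-up survey layout (illustration only, not the minke data):
# one row per sighting, with transect effort repeated on each row
toy <- data.frame(
  Region.Label = c("A", "A", "A", "B", "B"),
  Area         = c(100, 100, 100, 1000, 1000),
  Sample.Label = c(1, 1, 2, 3, 4),
  Effort       = c(10, 10, 20, 15, 25)
)

# Keep one row per transect so effort is not counted once per sighting
transects <- unique(toy[, c("Region.Label", "Area", "Sample.Label", "Effort")])

# Total effort per stratum, then effort per unit area
coverage <- aggregate(Effort ~ Region.Label + Area, data = transects, FUN = sum)
coverage$effort.per.area <- coverage$Effort / coverage$Area
print(coverage)
```

Applied to the real data, markedly different values of effort.per.area between strata would indicate the kind of unequal allocation described above. Transects without sightings need to be present in the data for their effort to be counted.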
Group abundance estimates when stratum is a covariate in the detection function (this model was preferred in the model selection exercise).
Label | Estimate | se | cv | lcl | ucl | df |
---|---|---|---|---|---|---|
North | 9863 | 3760 | 0.3813 | 4451 | 21856 | 13.03 |
South | 4651 | 1225 | 0.2633 | 2719 | 7956 | 22.16 |
Total | 14514 | 3970 | 0.2735 | 8215 | 25645 | 16.06 |