Fit a density surface model to segment-specific estimates of abundance or density.
Source:R/dsm.R
dsm.Rd
Fits a density surface model (DSM) to detection-adjusted counts from a
spatially-referenced distance sampling analysis. dsm takes observations of
animals, allocates them to segments of line (or strip) transects and
optionally adjusts the counts based on detectability using a supplied
detection function model. A generalized additive model, generalized additive
mixed model or generalized linear model is then used to model these adjusted
counts based on a formula involving environmental covariates.
Usage
dsm(
  formula,
  ddf.obj,
  segment.data,
  observation.data,
  engine = "gam",
  convert.units = 1,
  family = quasipoisson(link = "log"),
  group = FALSE,
  control = list(keepData = TRUE),
  availability = 1,
  segment.area = NULL,
  weights = NULL,
  method = "REML",
  ...
)
Arguments
- formula
formula for the surface. This should be a valid formula. See "Details", below, for how to define the response.
- ddf.obj
result from a call to ddf or ds. If multiple detection functions are required, a list can be provided. For strip/circle transects where it is assumed all objects are observed, see dummy_ddf. Mark-recapture distance sampling (mrds) models of type io (independent observers) and trial are allowed.
- segment.data
segment data, see dsm-data.
- observation.data
observation data, see dsm-data.
- engine
which fitting engine should be used for the DSM ("glm"/"gam"/"gamm"/"bam").
- convert.units
conversion factor to multiply the area of the segments by. See 'Units' below.
- family
response distribution (popular choices include quasipoisson, Tweedie/tw and negbin/nb). Defaults to quasipoisson.
- group
if TRUE the abundance of groups will be calculated rather than the abundance of individuals. Setting this option to TRUE is equivalent to setting the size of each group to be 1.
- control
the usual control argument for a gam; keepData must be TRUE for variance estimation to work (though this option cannot be set for GLMs or GAMMs).
- availability
an estimate of availability bias. For count models, this is used to multiply the effective strip width (must be a vector of length 1 or of length equal to the number of rows in segment.data); for estimated abundance/estimated density models, it is used to scale the response (must be a vector of length 1 or of length equal to the number of rows in observation.data). Uncertainty in the availability is not handled at present.
- segment.area
if NULL (default) segment areas will be calculated by multiplying the Effort column in segment.data by the (right minus left) truncation distance for the ddf.obj or by strip.width. Alternatively a vector of segment areas can be provided (which must be the same length as the number of rows in segment.data) or a character string giving the name of a column in segment.data which contains the areas. If segment.area is specified it takes precedence.
- weights
weights for each observation used in model fitting. The default, weights=NULL, weights each observation by its area (see Details). Setting a scalar value (e.g., weights=1) weights all observations equally.
- method
the smoothing parameter estimation method. Default is "REML", using Restricted Maximum Likelihood. See gam for other options. Ignored for engine="glm".
- ...
anything else to be passed straight to glm, gam, gamm or bam.
Value
a glm, gam, gamm or bam object, with an additional element, $ddf, which holds the
detection function object.
Details
The response (LHS of formula) can be one of the following (with
restrictions outlined below):
- count
count in each segment
- abundance.est
estimated abundance per segment, estimation is via a Horvitz-Thompson estimator
- density.est
density per segment
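As a rough illustration of the abundance.est and density.est responses, the base R sketch below applies a Horvitz-Thompson-style estimator by hand: each observed group size is divided by its detection probability and the results are summed within segments. The data frame, the p column and the area values are made up for illustration; dsm obtains detection probabilities from ddf.obj, and its internal calculation may differ in detail.

```r
# Hypothetical observations: segment label, group size and an assumed
# detection probability per observation (not taken from a fitted model)
obs <- data.frame(
  Sample.Label = factor(c("s1", "s1", "s2"), levels = c("s1", "s2", "s3")),
  size = c(3, 1, 2),
  p = c(0.5, 0.25, 0.4)
)

# Hypothetical segment areas, in the same order as the factor levels
seg_area <- c(s1 = 5, s2 = 2.5, s3 = 4)

# Horvitz-Thompson-style estimate: sum of size / p within each segment
abundance_est <- tapply(obs$size / obs$p, obs$Sample.Label, sum)

# Segments without observations get an estimated abundance of zero
abundance_est[is.na(abundance_est)] <- 0

# Density per segment is estimated abundance divided by segment area
density_est <- abundance_est / seg_area

print(abundance_est)
print(density_est)
```

Here segment s3 has no observations, so it contributes a zero rather than being dropped.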
The offset used in the model is dependent on the response:
- count
area of segment multiplied by average probability of detection in the segment
- abundance.est
area of the segment
- density.est
zero
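To see how an offset of this kind acts in a log-link count model, the sketch below fits a plain glm (not dsm) to made-up counts, using the log of area times detection probability as the offset. Placing the offset on the log scale is an assumption that follows from the log link; the data and column names are hypothetical.

```r
# Hypothetical segments: observed count, segment area and average
# detection probability (counts constructed as 2 * area * p)
seg <- data.frame(
  count = c(10, 16, 18),
  area  = c(10, 20, 15),
  p     = c(0.5, 0.4, 0.6)
)

# Intercept-only model: with the offset log(area * p), the exponentiated
# intercept is a density (animals per unit area)
fit <- glm(count ~ 1,
           offset = log(area * p),
           family = quasipoisson(link = "log"),
           data = seg)

# For this model the estimate equals sum(count) / sum(area * p) = 44 / 22
density_hat <- unname(exp(coef(fit)))
print(density_hat)
```

Because the offset absorbs both the surveyed area and detectability, the remaining model terms describe density rather than raw counts.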
The count response can only be used when detection function covariates
only vary between segments/points (not within). For example, weather
conditions (like visibility or sea state) or foliage cover are usually
acceptable as they do not change within the segment, but animal sex or
behaviour will not work. The abundance.est response can be used with any
covariates in the detection function.
In the density case, observations can be weighted by segment areas via the
weights= argument. By default (weights=NULL), when density is estimated
the weights are set to the segment areas (using segment.area or
calculated from detection function object metadata and Effort data).
Alternatively weights=1 will set the weights to all be equal. A third
alternative is to pass in a vector of length equal to the number of
segments, containing appropriate weights.
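One way to see why segment areas are a natural default weight is sketched below in base R with made-up numbers: the area-weighted mean of per-segment densities equals total abundance divided by total area, whereas the equally weighted mean does not in general. This is only an arithmetic illustration, not a description of how dsm uses the weights internally.

```r
# Hypothetical per-segment abundances and areas
seg <- data.frame(
  abundance = c(10, 5, 0),
  area      = c(5, 2.5, 4)
)
seg$density <- seg$abundance / seg$area

# Equal weights (analogous to weights=1): simple mean of the densities
equal_mean <- mean(seg$density)

# Area weights (analogous to the default): larger segments count for more
area_mean <- weighted.mean(seg$density, w = seg$area)

# Overall density across all segments
overall <- sum(seg$abundance) / sum(seg$area)

print(c(equal = equal_mean, area = area_mean, overall = overall))
```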
Example analyses are available at http://examples.distancesampling.org.
Units
It is often the case that distances are collected in metres and segment
lengths are recorded in kilometres. dsm allows you to provide a conversion
factor (convert.units) to multiply the areas by. For example, if distances
are in metres and segment lengths are in kilometres, setting
convert.units=1000 will lead to the analysis being in metres. Setting
convert.units=1/1000 will lead to the analysis being in kilometres. The
conversion factor will be applied to segment.area if that is specified.
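The arithmetic behind the two conversion factors can be sketched in base R. The effort and width values are hypothetical, and the product below is only meant to show the unit handling, not the exact area formula dsm uses.

```r
# Hypothetical segment lengths (km) and truncation distance (m)
effort_km <- c(2.0, 1.5, 3.2)
width_m   <- 6000

# Multiplying the two gives areas in mixed units (km * m)
raw_area <- effort_km * width_m

# A factor of 1000 turns the km into m, giving square metres
area_m2 <- raw_area * 1000

# A factor of 1/1000 turns the m into km, giving square kilometres
area_km2 <- raw_area * (1 / 1000)

print(area_m2)
print(area_km2)
```

Both results describe the same areas; they differ by the factor of 1e6 between square metres and square kilometres.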
Large models
For large models, engine="bam" with method="fREML" may be useful. Models
for bam are specified in the same way as for gam. Read the bam
documentation before using this option, which is considered EXPERIMENTAL
at the moment. In particular, note that the default basis choice (thin plate
regression splines) will be slow and that, in general, fitting is less stable
than when using gam. For a negative binomial response, theta must be
specified when using bam.
References
Hedley, S. and S. T. Buckland. 2004. Spatial models for line transect sampling. JABES 9:181-199.
Miller, D. L., Burt, M. L., Rexstad, E. A., Thomas, L. (2013), Spatial models for distance sampling data: recent developments and future directions. Methods in Ecology and Evolution, 4: 1001-1010. doi: 10.1111/2041-210X.12105 (Open Access)
Wood, S.N. 2006. Generalized Additive Models: An Introduction with R. CRC/Chapman & Hall.
Examples
if (FALSE) { # \dontrun{
  library(Distance)
  library(dsm)

  # load the Gulf of Mexico dolphin data (see ?mexdolphins)
  data(mexdolphins)

  # fit a detection function and look at the summary
  hr.model <- ds(distdata, truncation = 6000,
                 key = "hr", adjustment = NULL)
  summary(hr.model)

  # fit a simple smooth of x and y to counts
  mod1 <- dsm(count ~ s(x, y), hr.model, segdata, obsdata)
  summary(mod1)

  # predict over a grid
  mod1.pred <- predict(mod1, preddata, preddata$area)

  # calculate the predicted abundance over the grid
  sum(mod1.pred)

  # plot the smooth
  plot(mod1)
} # }
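The family argument and the abundance.est response described earlier can be tried on the same data. The sketch below is a variation on the example above, not part of the package's own examples; it assumes the Distance and dsm packages are installed and is skipped otherwise. The names mod2 and mod3 are arbitrary.

```r
# Only run if both packages are available
if (requireNamespace("Distance", quietly = TRUE) &&
    requireNamespace("dsm", quietly = TRUE)) {
  library(Distance)
  library(dsm)

  # same data and detection function as in the example above
  data(mexdolphins)
  hr.model <- ds(distdata, truncation = 6000,
                 key = "hr", adjustment = NULL)

  # counts with a Tweedie response instead of the default quasipoisson
  mod2 <- dsm(count ~ s(x, y), hr.model, segdata, obsdata,
              family = tw())
  summary(mod2)

  # Horvitz-Thompson estimated abundance per segment as the response
  mod3 <- dsm(abundance.est ~ s(x, y), hr.model, segdata, obsdata)
  summary(mod3)
}
```

Comparing the summaries of mod2 and mod3 with mod1 is one way to check how sensitive the fitted surface is to these choices.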