Data format for DSM

Two data.frames must be provided to dsm. They are referred to as observation.data and segment.data.

Details

The segment.data table has the sample identifiers which define the segments, the corresponding effort (line length) expended and the environmental covariates that will be used to model abundance/density. observation.data provides a link table between the observations used in the detection function and the samples (segments), so that we can aggregate the observations to the segments (i.e., observation.data is a "look-up table" between the observations and the segments).

observation.data: the observation data.frame must have (at least) the following columns:

object        unique object identifier
Sample.Label  the identifier for the segment where the observation occurred
size          the size of each observed group (e.g., 1 if all animals occurred individually)
distance      distance to the observation

One can often also use observation.data to fit a detection function (so additional columns for detection function covariates are allowed in this table).
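As a minimal sketch of the layout described above, the following base R code builds a small observation table and checks that the required columns are present. The values and segment labels are invented for illustration, and the checks are only a suggestion; they are not part of dsm itself.

```r
# Hypothetical observation table: four detections on three segments.
# All values below are made up for illustration.
observation.data <- data.frame(
  object       = 1:4,                                   # unique object identifier
  Sample.Label = c("seg-1", "seg-1", "seg-2", "seg-3"), # segment of each detection
  size         = c(1, 3, 2, 1),                         # observed group size
  distance     = c(12.5, 40.0, 7.3, 88.1),              # distance to the observation
  stringsAsFactors = FALSE
)

# Check that every required column is present.
required <- c("object", "Sample.Label", "size", "distance")
missing_cols <- setdiff(required, names(observation.data))
stopifnot(length(missing_cols) == 0)

# Object identifiers must be unique (anyDuplicated() returns 0 if none repeat).
stopifnot(anyDuplicated(observation.data$object) == 0)
```

Extra columns for detection function covariates can be appended to the same data.frame without affecting these checks.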

segment.data: the segment data.frame must have (at least) the following columns:

Effort        the effort (in terms of length of the segment)
Sample.Label  identifier for the segment (unique!)
???           environmental covariates (columns with any names), for example location (projected latitude and longitude), and other relevant covariates (sea surface temperature, foliage type, altitude, bathymetry, etc.).
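The link between the two tables can be sketched in base R as follows. The data, the covariate names (x, y, depth) and the n_individuals column are hypothetical. The aggregation step only illustrates the "look-up table" idea; dsm performs its own aggregation when the model is fitted.

```r
# Hypothetical segment table: four segments with effort and covariates.
segment.data <- data.frame(
  Sample.Label = c("seg-1", "seg-2", "seg-3", "seg-4"),
  Effort       = c(1000, 1000, 950, 1020),  # segment length
  x            = c(0, 1000, 2000, 3000),    # projected coordinates (made up)
  y            = c(0, 0, 0, 0),
  depth        = c(35, 42, 58, 61),         # example environmental covariate
  stringsAsFactors = FALSE
)

# Hypothetical observation table that refers to those segments.
observation.data <- data.frame(
  object       = 1:4,
  Sample.Label = c("seg-1", "seg-1", "seg-2", "seg-3"),
  size         = c(1, 3, 2, 1),
  distance     = c(12.5, 40.0, 7.3, 88.1),
  stringsAsFactors = FALSE
)

# Segment identifiers must be unique, and every observation must
# refer to a segment that exists in segment.data.
stopifnot(anyDuplicated(segment.data$Sample.Label) == 0)
stopifnot(all(observation.data$Sample.Label %in% segment.data$Sample.Label))

# Illustration only: sum the group sizes per segment and attach them
# to the segments, giving 0 for segments without detections.
counts <- aggregate(size ~ Sample.Label, data = observation.data, FUN = sum)
idx <- match(segment.data$Sample.Label, counts$Sample.Label)
segment.data$n_individuals <- ifelse(is.na(idx), 0, counts$size[idx])
```

Segments with no detections (seg-4 here) still belong in segment.data, since the effort expended on them carries information.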

Multiple detection functions

If multiple detection functions are to be used, then a column named ddfobj must be included in both observation.data and segment.data. This lets the model know which detection function each observation is from. The values are integers giving the position of the corresponding detection function in the list passed as the ddf.obj argument to dsm; e.g., ddf.obj=list(ship_ddf, aerial_ddf) means ship detections have ddfobj=1 and aerial detections have ddfobj=2 in the observation data.
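One way to build the ddfobj column is sketched below in base R. The platform column and all data values are hypothetical; the only thing taken from the text above is that the integer codes must follow the order of the list given as ddf.obj (ship first, aerial second).

```r
# Hypothetical detections from two survey platforms.
observation.data <- data.frame(
  object       = 1:4,
  Sample.Label = c("ship-1", "ship-2", "air-1", "air-1"),
  size         = c(1, 2, 1, 4),
  distance     = c(15, 60, 120, 80),
  platform     = c("ship", "ship", "aerial", "aerial"),  # assumed helper column
  stringsAsFactors = FALSE
)

segment.data <- data.frame(
  Sample.Label = c("ship-1", "ship-2", "air-1"),
  Effort       = c(1000, 1000, 5000),
  platform     = c("ship", "ship", "aerial"),            # assumed helper column
  stringsAsFactors = FALSE
)

# Order matching ddf.obj = list(ship_ddf, aerial_ddf):
# "ship" maps to 1 and "aerial" maps to 2.
ddf_order <- c("ship", "aerial")
observation.data$ddfobj <- match(observation.data$platform, ddf_order)
segment.data$ddfobj     <- match(segment.data$platform, ddf_order)

# No row should be left without a detection function index.
stopifnot(!anyNA(observation.data$ddfobj), !anyNA(segment.data$ddfobj))
```

Deriving both columns from the same ddf_order vector keeps the two tables consistent with each other and with the order of the list.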

Mark-recapture distance sampling models

When using mrds models that include mark-recapture components (currently independent observer and trial modes are supported), the format of the observation data must be checked to ensure that observations are not duplicated. An observer column is also required in observation.data.

Independent observer mode: only unique observations (unique object IDs) are required.
Trial mode: only observations made by observer 1 are required.
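The two rules above might be applied as in the following base R sketch. It assumes a double-observer table with one row per object per observer and a 0/1 detected column; that layout and every value in it are assumptions made for illustration, so adapt the filtering to the actual structure of your data.

```r
# Hypothetical double-observer records: one row per object per observer.
mr_data <- data.frame(
  object       = c(1, 1, 2, 2, 3, 3),
  observer     = c(1, 2, 1, 2, 1, 2),
  detected     = c(1, 1, 1, 0, 0, 1),  # assumed 0/1 detection indicator
  Sample.Label = c("seg-1", "seg-1", "seg-1", "seg-1", "seg-2", "seg-2"),
  size         = c(2, 2, 1, 1, 3, 3),
  distance     = c(10, 10, 25, 25, 40, 40),
  stringsAsFactors = FALSE
)

# Keep only the rows where the observer actually detected the object.
seen <- mr_data[mr_data$detected == 1, ]

# Independent observer mode: one row per unique object ID.
obs_io <- seen[!duplicated(seen$object), ]

# Trial mode: only the detections made by observer 1.
obs_trial <- seen[seen$observer == 1, ]
```

In this example object 3 was seen only by observer 2, so it appears in obs_io but not in obs_trial.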