Fit detection functions and calculate abundance from line or point transect data

This function fits detection functions to line or point transect data and then (provided that survey information is supplied) calculates abundance and density estimates. The examples below illustrate some basic types of analysis using ds().

Usage

ds(
  data,
  truncation = ifelse(is.null(data$distend), ifelse(is.null(cutpoints),
    max(data$distance), max(cutpoints)), max(data$distend)),
  transect = "line",
  formula = ~1,
  key = c("hn", "hr", "unif"),
  adjustment = c("cos", "herm", "poly"),
  nadj = NULL,
  order = NULL,
  scale = c("width", "scale"),
  cutpoints = NULL,
  dht_group = FALSE,
  monotonicity = ifelse(formula == ~1, "strict", "none"),
  region_table = NULL,
  sample_table = NULL,
  obs_table = NULL,
  convert_units = 1,
  er_var = ifelse(transect == "line", "R2", "P2"),
  method = "nlminb",
  mono_method = "slsqp",
  quiet = FALSE,
  debug_level = 0,
  initial_values = NULL,
  max_adjustments = 5,
  er_method = 2,
  dht_se = TRUE,
  optimizer = "both",
  winebin = NULL,
  dht.group,
  region.table,
  sample.table,
  obs.table,
  convert.units,
  er.var,
  debug.level,
  initial.values,
  max.adjustments
)

Arguments

data

a data.frame containing at least a column called distance or a numeric vector containing the distances. NOTE! If there is a column called size in the data then it will be interpreted as group/cluster size, see the section "Clusters/groups", below. One can supply data as a "flat file" and not supply region_table, sample_table and obs_table, see "Data format", below and flatfile.

truncation

either truncation distance (numeric, e.g. 5) or percentage (as a string, e.g. "15%", "15"). Can be supplied as a list with elements left and right if left truncation is required (e.g. list(left=1,right=20) or list(left="1%",right="15%") or even list(left="1",right="15%")). By default for exact distances the maximum observed distance is used as the right truncation. When the data are binned, the right truncation is the largest bin end point. Default left truncation is set to zero.
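The forms of truncation described above can be sketched as follows. This is an illustrative sketch only, assuming the golf tee data (book.tee.data, from mrds) is available after loading Distance; the object names (tee_data, fit_numeric, fit_percent, fit_left) and the particular truncation values are chosen purely for illustration.

```r
library(Distance)

# Golf tee data from mrds; keep detections made by observer 1 only
data(book.tee.data)
tee_data <- subset(book.tee.data$book.tee.dataframe, observer == 1)

# Numeric right truncation at 4 distance units
fit_numeric <- ds(tee_data, truncation = 4, adjustment = NULL)

# Percentage truncation: discard the largest 10% of distances
fit_percent <- ds(tee_data, truncation = "10%", key = "hr", adjustment = NULL)

# Left and right truncation supplied together as a list
fit_left <- ds(tee_data, truncation = list(left = 0.5, right = 4),
               adjustment = NULL)
```

Adjustments are switched off here (adjustment = NULL) only to keep the fits quick; they are not required for any of the truncation forms.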

transect

indicates transect type "line" (default) or "point".

formula

formula for the scale parameter. For a CDS analysis leave this as its default ~1.
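As a rough illustration of the difference between the default ~1 and a covariate formula, the sketch below fits both to the golf tee data from mrds. The object names are illustrative, and treating sex as a factor covariate is simply one possible choice for this dataset.

```r
library(Distance)

data(book.tee.data)
tee_data <- subset(book.tee.data$book.tee.dataframe, observer == 1)

# CDS analysis: formula left at its default (~1), no adjustments
fit_cds <- ds(tee_data, truncation = 4, key = "hn", adjustment = NULL)

# Covariate model: the scale parameter depends on sex
fit_sex <- ds(tee_data, truncation = 4, key = "hn",
              formula = ~as.factor(sex))

# Compare the two fits by AIC
AIC(fit_cds)
AIC(fit_sex)
```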

key

key function to use; "hn" gives half-normal (default), "hr" gives hazard-rate and "unif" gives uniform. Note that if uniform key is used, covariates cannot be included in the model.

adjustment

adjustment terms to use; "cos" gives cosine (default), "herm" gives Hermite polynomial and "poly" gives simple polynomial. A value of NULL indicates that no adjustments are to be fitted.

nadj

the number of adjustment terms to fit. In the absence of covariates in the formula, the default value (NULL) will select via AIC (using a sequential forward selection algorithm) up to max_adjustments adjustments (unless order is specified). When covariates are present in the model formula, the default value of NULL results in no adjustment terms being fitted in the model. A non-negative integer value will cause the specified number of adjustments to be fitted. Supplying an integer value will allow the use of adjustment terms in addition to specifying covariates in the model. The order of adjustment terms used will depend on the key and adjustment. For key="unif", adjustments of order 1, 2, 3, ... are fitted when adjustment = "cos" and order 2, 4, 6, ... otherwise. For key="hn" or "hr" adjustments of order 2, 3, 4, ... are fitted when adjustment = "cos" and order 4, 6, 8, ... otherwise. See Buckland et al. (2001) p. 47 for details.

order

order of adjustment terms to fit. The default value (NULL) results in ds choosing the orders to use - see nadj. Otherwise a scalar positive integer value can be used to fit a single adjustment term of the specified order, and a vector of positive integers to fit multiple adjustment terms of the specified orders. For simple and Hermite polynomial adjustments, only even orders are allowed. The number of adjustment terms specified here must match nadj (or nadj can be the default NULL value).
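The interaction of key, adjustment, nadj and order can be sketched with a few fits to the golf tee data from mrds. This is only an illustration: the object names are arbitrary and the specific orders chosen are examples, not recommendations.

```r
library(Distance)

data(book.tee.data)
tee_data <- subset(book.tee.data$book.tee.dataframe, observer == 1)

# Defaults: half-normal key, cosine adjustments selected by AIC
fit_auto <- ds(tee_data, truncation = 4)

# Exactly one cosine adjustment term; ds chooses its order
fit_nadj1 <- ds(tee_data, truncation = 4, adjustment = "cos", nadj = 1)

# A single cosine adjustment of order 2, specified directly
fit_cos2 <- ds(tee_data, truncation = 4, adjustment = "cos", order = 2)

# Cosine adjustments of orders 2 and 3, monotonicity constraints off
fit_cos23 <- ds(tee_data, truncation = 4, adjustment = "cos",
                order = c(2, 3), monotonicity = FALSE)

# Hazard-rate key with no adjustment terms at all
fit_hr <- ds(tee_data, truncation = 4, key = "hr", adjustment = NULL)
```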

scale

the scale by which the distances in the adjustment terms are divided. Defaults to "width", scaling by the truncation distance. If the key is uniform only "width" will be used. The other option is "scale": the scale parameter of the detection function.

cutpoints

if the data are binned, this vector gives the cutpoints of the bins. Supplying a distance column in your data and specifying cutpoints is the recommended approach for all standard binned analyses. Ensure that the first element is 0 (or the left truncation distance) and the last is the distance to the end of the furthest bin. (Default NULL, no binning.) Provide distbegin and distend columns in your data only when your cutpoints are not constant across all your data (e.g. planes flying at differing altitudes); in that case, do not specify the cutpoints argument.
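A binned analysis using a distance column plus cutpoints might look like the sketch below, again assuming the golf tee data from mrds. The unit-width bins are an arbitrary choice made for illustration.

```r
library(Distance)

data(book.tee.data)
tee_data <- subset(book.tee.data$book.tee.dataframe, observer == 1)

# Bins of width 1: first cutpoint is 0, last is the end of the furthest bin
bin_edges <- c(0, 1, 2, 3, 4)

# Exact distances in tee_data are grouped into the bins defined above
fit_binned <- ds(tee_data, truncation = 4, cutpoints = bin_edges,
                 adjustment = NULL)
```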

dht_group

should density/abundance estimates consider all groups to be size 1 (abundance of groups, dht_group=TRUE) or should group size be taken into account (abundance of individuals, dht_group=FALSE)? Default is FALSE (abundance of individuals is calculated).

monotonicity

should the detection function be constrained for monotonicity weakly ("weak"), strictly ("strict") or not at all ("none" or FALSE)? See Monotonicity, below. By default the constraint is "strict" for models without covariates in the detection function and "none" when covariates are present.

region_table

data.frame with two columns:

Region.Label label for the region
Area area of the region

region_table has one row for each stratum. If there is no stratification then region_table has one entry with Area corresponding to the total survey area. If Area is omitted density estimates only are produced.

sample_table

data.frame mapping the regions to the samples (i.e. transects). There are three columns:

Sample.Label label for the sample
Region.Label label for the region that the sample belongs to.
Effort the effort expended in that sample (e.g. transect length).

obs_table

data.frame mapping the individual observations (objects) to regions and samples. There should be three columns:

object unique numeric identifier for the observation
Region.Label label for the region that the sample belongs to
Sample.Label label for the sample
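The three tables described above can be sketched in base R as follows. All labels, areas and efforts here are hypothetical values made up for illustration. The sketch assumes that the object values correspond to an object column in the distance data.

```r
# Hypothetical survey: two strata, three transects, four detections
region_table <- data.frame(
  Region.Label = c("north", "south"),
  Area = c(120, 80)             # area of each stratum
)

sample_table <- data.frame(
  Sample.Label = c("T1", "T2", "T3"),
  Region.Label = c("north", "north", "south"),
  Effort = c(5.0, 4.5, 6.2)     # e.g. transect length
)

obs_table <- data.frame(
  object = 1:4,                 # unique identifier for each observation
  Region.Label = c("north", "north", "north", "south"),
  Sample.Label = c("T1", "T1", "T2", "T3")
)

# Simple consistency checks before handing the tables to ds()
samples_known <- all(obs_table$Sample.Label %in% sample_table$Sample.Label)
regions_known <- all(sample_table$Region.Label %in% region_table$Region.Label)
objects_unique <- anyDuplicated(obs_table$object) == 0
```

Tables built this way would then be passed to ds() through the region_table, sample_table and obs_table arguments.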

convert_units

conversion between units for abundance estimation, see "Units", below. (Defaults to 1, implying all of the units are "correct" already.)

er_var

specifies which encounter rate estimator to use in the case that dht_se is TRUE, er_method is either 1 or 2 and there are two or more samplers. Defaults to "R2" for line transects and "P2" for point transects (from version 1.0.9 onwards; versions up to and including 1.0.8 used the "P3" estimator by default for points), both of which assume random placement of transects. For systematic designs, alternative estimators may be more appropriate, see dht2 for more information.

method

optimization method to use (any method usable by optim or optimx). Defaults to "nlminb".

mono_method

optimization method to use when monotonicity is enforced. Can be either slsqp or solnp. Defaults to slsqp.

quiet

suppress non-essential messages (useful for bootstraps etc). Default value FALSE.

debug_level

print debugging output. 0=none, 1-3 increasing levels of debugging output.

initial_values

a list of named starting values, see mrds_opt. Only allowed when AIC term selection is not used.

max_adjustments

maximum number of adjustments to try (default 5); only used when order=NULL.

er_method

encounter rate variance calculation: default = 2 gives the method of Innes et al. (2002), using expected counts in the encounter rate. Setting to 1 gives observed counts (which matches Distance for Windows) and 0 uses negative binomial variance (only useful in the rare situation where study area = surveyed area). See dht.se for more details, noting this er_method argument corresponds to the varflag element of the options argument in dht.se.

dht_se

should uncertainty be calculated when using dht? Safe to leave as TRUE, used in bootdht.

optimizer

By default this is set to 'both'. In this case the R optimizer will be used and if present the MCDS optimizer will also be used. The result with the best likelihood value will be selected. To run only a specified optimizer set this value to either 'R' or 'MCDS'. See mcds_dot_exe for setup instructions.

winebin

If you are trying to use our MCDS.exe optimizer on a non-Windows system then you may need to specify winebin. Please see mcds_dot_exe for more details.

dht.group