Assessing line transect detection functions
In Exercise 2, we fitted different detection functions and compared them in terms of goodness of fit and AIC. Here, we continue to fit and assess different models and look at additional arguments in the ds
package.
Objectives
The aim of this exercise is to practise fitting and assessing different line transect detection functions and in particular to:
- Understand the data format when there are no detections on a line,
- Explore different truncation options,
- Determine whether adjustment terms are required,
- Practice model selection,
Fitting models to simulated data
The data used for this practical were generated from a half-normal distribution and therefore the true density is known. There are 12 transects.
The data are stored in a data set LTExercise3
which contains the following columns:
- Study.Area - Name of study called (not very imaginatively) ‘LTExercise3’
- Region.Label - identifier of regions (in this case there is only one region and it is set to ‘Default’)
- Area - size of the study region (km\(^2\))
- Sample.Label - line transect identifier (Line 1 - Line 12)
- Effort - length of the line transects (km)
- object - unique identifier to each detected object
- distance - perpendicular distances (metres).
Accessing the data
Load the Distance
package to access both the data and analytical functions.
Data format: no detections on a transect
Before fitting models, it is worth investigating the data a bit further: let’s start by summarising the perpendicular distances:
The summary indicates that the minimum distance is min(hndat$distance)
and the maximum is max(hndat$distance)
metres and there is one missing value (indicated by the NA
). If we print a few rows of the data, we can see that this missing value occurred on transect ‘Line 11’.
The NA
indicates that there were no detections on this transect, but the transect information needs to be included otherwise the number of transects and the total line length will be incorrect.
Truncation
Let’s start by fitting a basic model, i.e. no adjustment terms (by default a half normal model is fitted).
For this project, perpendicular distances are in metres and the transect lines are recorded in kilometres.
Looking at a summary of the model object, how many objects are there in total? What is the maximum observed perpendicular distance?
Plot the detection function and specify many histogram bins:
The histogram indicates that there is a large gap in detections therefore, to avoid a long right hand tail in the detection function, truncation is necessary. There are several ways to truncate: excluding distances beyond some specified distance or excluding a specified percentage of the largest distances. Note that here we only consider excluding large perpendicular distances, which is frequently referred to as right truncation.
Truncation at a fixed distance
The following command truncates the perpendicular distances at 20 metres i.e. objects detected beyond 20m are excluded.
Generate a summary and plot of the detection function to see what effect this truncation has had on the number of objects.
Truncating a percentage of distances
An alternative way to truncate distances is to specify a percentage of detected objects that should be excluded. In the command below, 10% of the largest distances are excluded.
Generate a summary and plot to see what effect truncation has had.
Exploring different models
Decide on a suitable truncation distance (but don’t spend too long on this) and then fit different key detection functions and adjustment terms to assess whether these data can be satisfactorily analysed with the ‘wrong’ model. By default, the ds
function fits a half normal function and cosine adjustment terms (adjustment="cos"
) of up to order 5: AIC is used to determine how many, if any, adjustment terms are required. This model is specified in the command below:
Change the key and adjustment terms: possible options are listed in Exercise 2 or use the help(ds)
for options. See how the bias and precision compare between the models: the true density was 79.8 animals per km\(^2\).
Fitting models to real data (optional question)
- The objective of this portion of the exercise is to give you more practice with a line transect data set and also familiarise you with converting from exact distances to binned distances.
The data stored in a data set capercaillie
were collected during a line transect survey of capercaillie (a species of large grouse) in Scotland. For a paper demonstrating a distance sampling survey of this species, consult (Catt et al., 1998). The data consist of:
- Region.Label - name of the region
- Area - size of region in hectares (1 hectare = 100m \(\times\) 100m)
- Sample.Label - line transect identifier (one transect)
- Effort - length of line (km)
- object - unique identifier for each group detected
- distance - perpendicular distance to each group (metres)
- size - group size (in this case only single birds were detected)
Access these data and:
- Decide on a suitable truncation distance, if any,
- Fit a few different key detection functions and adjustment terms
- Compare the AIC values and qq plots and choose a model
- Calculate density for your chosen model. To obtain density in birds per hectare, the conversion units should be specified as
conversion.factor<-convert_units("meter", "kilometer", "hectare")
.
Converting exact distances to binned distances
Sometimes we wish to convert exact distances to binned distances, if for example, there is evidence of rounding to favoured values. To do this in ds
we need to specify the cutpoints of the bins, including zero and the maximum distance. In the example below, cutpoints at 0, 10, 20, …, 80 are specified.