Distance sampling survey design solution

Author

Centre for Research into Ecological and Environmental Modelling
University of St Andrews

Modified

November 2024

Solution

Distance sampling survey design

library(dssd)
library(sf)
Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE

Aerial survey of marine mammals in St Andrews Bay

Systematic Parallel Line Design

Answers

What spacing would you select for this design? What is the maximum trackline length for the design you have selected? What on-effort line length are we likely to achieve?

The spacing of 4937.5m chosen by dssd to generate a line length of 200km resulted in a maximum trackline length of around 261km (exact answers will vary because surveys are generated randomly). If we choose this design, we may not be able to complete a randomly generated survey with the effort we have available.

We should therefore increase the spacing between the transects and re-run the coverage simulations. A spacing of 5000m gave a maximum trackline length of around 249km (see the Trackline length summary table in the output below), so we can be fairly confident that we will be able to complete any survey randomly generated from this design. This spacing should allow us to achieve an on-effort line length of around 199km (see the Line length section of the design summary below). The minimum line length we would expect to achieve is 184km and the maximum is 206km. [Note your values might differ from those below]
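The link between spacing and line length can be checked with a little arithmetic. The sketch below assumes that the expected on-effort length of a systematic parallel line design is roughly the region area divided by the spacing, ignoring edge effects. The region area is taken from the design summary; the variable names are illustrative and are not part of dssd.

```r
# Region area in m^2, taken from "Strata areas" in the design summary
region_area <- 987500079

# Spacing (m) needed for a target on-effort line length of 200 km
target_length <- 200000
spacing_for_target <- region_area / target_length

# Approximate expected on-effort line length (m) at a spacing of 5000 m
expected_length_5km <- region_area / 5000

round(spacing_for_target, 1)          # 4937.5
round(expected_length_5km / 1000, 1)  # 197.5
```

Both values are close to the dssd results: the spacing of 4937.5m quoted above and a simulated mean line length of about 197.8km at 5000m spacing.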

shapefile.name <- system.file("extdata", "StAndrew.shp", package = "dssd")
region.sab <- make.region(region.name = "St Andrews Bay",
                      units = "m",
                      shape = shapefile.name)
cover.sabay <- make.coverage(region.sab, n.grid.points = 5000)
design.spacing5km <- make.design(region = region.sab,
                      transect.type = "line",
                      design = "systematic",
                      spacing = 5000,
                      design.angle = 90,
                      edge.protocol = "minus",
                      truncation = 2000,
                      coverage.grid = cover.sabay)
design.spacing5km <- run.coverage(design.spacing5km, reps = 250, quiet=TRUE)
plot(design.spacing5km)

Coverage grid plot for parallel design of St Andrews Bay.

print(design.spacing5km)

   Strata St Andrews Bay:
   _______________________
Design:  systematically spaced transects
Spacing:  5000
Number of samplers:  NA
Line length: NA
Design angle:  90
Edge protocol:  minus

Strata areas:  987500079
Region and effort units:  m
Coverage Simulation repetitions:  250

    Number of samplers:
    
        St Andrews Bay Total
Minimum            7.0   7.0
Mean               7.9   7.9
Median             8.0   8.0
Maximum            8.0   8.0
sd                 0.2   0.2

    Covered area:
    
        St Andrews Bay     Total
Minimum      726475938 726475938
Mean         761495164 761495164
Median       766397303 766397303
Maximum      778636421 778636421
sd            14951337  14951337

    % of region covered:
    
        St Andrews Bay Total
Minimum          73.57 73.57
Mean             77.11 77.11
Median           77.61 77.61
Maximum          78.85 78.85
sd                1.51  1.51

    Line length:
    
        St Andrews Bay     Total
Minimum      184440.04 184440.04
Mean         197769.33 197769.33
Median       198792.64 198792.64
Maximum      205770.46 205770.46
sd             5662.35   5662.35

    Trackline length:
    
        St Andrews Bay     Total
Minimum      220531.41 220531.41
Mean         242772.12 242772.12
Median       246251.49 246251.49
Maximum      248799.33 248799.33
sd             7485.98   7485.98

    Cyclic trackline length:
    
        St Andrews Bay     Total
Minimum      251944.76 251944.76
Mean         279722.19 279722.19
Median       283646.96 283646.96
Maximum      285985.72 285985.72
sd             8556.03   8556.03

    Coverage Score Summary:
    
        St Andrews Bay      Total
Minimum     0.37200000 0.37200000
Mean        0.77120368 0.77120368
Median      0.79600000 0.79600000
Maximum     0.84400000 0.84400000
sd          0.08376865 0.08376865
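
The summary tables above can be cross-checked against each other and against the 250km effort constraint used in this exercise. This is only a sketch using numbers copied by hand from the output (yours will differ), with illustrative variable names.

```r
# Values (m and m^2) copied from the 5000 m design summary above
mean_line      <- 197769.33   # mean on-effort line length
mean_trackline <- 242772.12   # mean trackline length (on + off effort)
max_trackline  <- 248799.33   # maximum trackline length
mean_covered   <- 761495164   # mean covered area
effort_limit   <- 250000      # 250 km effort constraint
truncation     <- 2000        # strip half-width

# Does even the longest simulated survey fit within the available effort?
within_limit <- max_trackline <= effort_limit

# Share of the flown trackline that is on effort
on_effort_fraction <- mean_line / mean_trackline

# Total strip area implied by the line length, and the share of it
# that is reported as covered area inside the region
strip_area      <- 2 * truncation * mean_line
inside_fraction <- mean_covered / strip_area

within_limit
round(on_effort_fraction, 2)
round(inside_fraction, 2)
```

Roughly four fifths of the trackline is on effort, and the covered area is slightly smaller than line length times strip width, presumably because parts of the strips near the boundary fall outside the region.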

Equal Spaced Zigzag Design

Answers

Does this design meet our survey effort constraint? What is the maximum total trackline length for this design? What line length are we likely to achieve with this design? Is this higher or lower than the systematic parallel design?

You were then asked to run a coverage simulation and check whether the trackline length was within our effort constraints. I found the maximum trackline length to be 242km (see the Trackline length summary table printed for the design), which is within our constraint of 250km. I then got a mean line length of 221km and minimum and maximum line lengths of 212km and 227km, respectively (see the Line length summary table printed for the design). We can therefore expect to achieve just over 20km more on-effort survey line length with the zigzag design than with the systematic parallel line design, a gain of roughly 10%. [Note your values may differ]
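
The comparison between the two designs is simple arithmetic. As a sketch, using the zigzag values quoted in the answer and the mean line length from the parallel design summary (all of which will differ between runs):

```r
# Mean on-effort line lengths in km (your values may differ)
zigzag_mean   <- 221     # quoted in the answer above
parallel_mean <- 197.8   # mean line length from the 5000 m parallel design

# Absolute and relative gain from using the zigzag design
gain_km  <- zigzag_mean - parallel_mean
gain_pct <- 100 * gain_km / parallel_mean

# Effort check: maximum zigzag trackline (242 km) against the 250 km limit
zigzag_ok <- 242 <= 250

round(gain_km, 1)
round(gain_pct, 1)
zigzag_ok
```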

design.zz.4500 <- make.design(region = region.sab,
                      transect.type = "line",
                      design = "eszigzag",
                      spacing = 4500,
                      design.angle = 0,
                      edge.protocol = "minus",
                      bounding.shape = "convex.hull",
                      truncation = 2000,
                      coverage.grid = cover.sabay)
design.zz.4500 <- run.coverage(design.zz.4500, reps = 250, quiet=TRUE)
plot(design.zz.4500)

Coverage grid plot for zigzag design of St Andrews Bay.

Answers

Do you think the coverage scores look uniform across the study region? Where are they higher/lower? Why do you think this is?

You were finally asked to look at the coverage scores across the survey region to see whether this design has even coverage. There are some points with lower coverage around the survey region boundary. This is because we are using a minus sampling strategy. If we plotted coverage scores from a systematic parallel design, we would see a similar pattern.

Usually edge effects from minus sampling are minor unless we have a very long survey region boundary containing a small study area. If the zigzag design were causing coverage problems, we would expect to see higher coverage at the very top or very bottom of the survey region (as our design angle is 0). We do not see this. The survey region boundaries at the top and bottom are both quite wide and perpendicular to the design angle; in this situation zigzag designs perform well with regard to even coverage.
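
The edge effect from minus sampling can be illustrated with a deliberately simplified one-dimensional simulation. This is a sketch, not how dssd computes coverage: it assumes a hypothetical region 30km wide, parallel lines 5000m apart with a random start, a 2000m truncation distance, and no lines placed outside the region.

```r
set.seed(1)
region_width <- 30000   # hypothetical region width (m)
spacing      <- 5000    # line spacing (m)
half_width   <- 2000    # truncation distance (m)
reps         <- 2000    # number of simulated surveys

# Two locations to monitor: one on the boundary, one in the interior
x <- c(edge = 0, interior = 15000)

covered <- replicate(reps, {
  # Random start, then lines only inside the region (minus sampling)
  start <- runif(1, 0, spacing)
  lines <- seq(start, region_width, by = spacing)
  # A location is covered if any line lies within the truncation distance
  sapply(x, function(p) any(abs(lines - p) <= half_width))
})

# Proportion of simulated surveys in which each location was covered
coverage <- rowMeans(covered)
round(coverage, 2)
```

In this toy setting the interior location is covered in about 80% of surveys (strip width divided by spacing), while the boundary location is covered in about 40%, because it can only be reached by lines on one side.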

Point Transect Bird Survey in Tentsmuir Forest

Answers

What are the analysis implications of a design with unequal coverage?

As our two strata have different coverage, we should analyse them separately. We therefore need to make sure that we have sufficient transects in each stratum to perform an analysis, ideally 20. There are two reasons to analyse them separately. First, our covered area will not be representative of the study area as a whole: if density is higher or lower in one stratum than the other, the standard distance sampling estimators will give a biased estimate of abundance for the whole area. Second, pooling robustness across the two strata will no longer apply, because we no longer have a representative sample of observations across the entire study region, and detection functions may differ between the two strata.
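
The bias from pooling strata with unequal coverage can be seen in a small worked example. This is a sketch under strong assumptions: the stratum areas and coverage values are similar to those of the Tentsmuir design in this solution, but the densities are hypothetical, and every animal in the covered area is assumed to be detected so that only the effect of coverage is shown.

```r
# Stratum areas (m^2) and approximate mean coverage for the Tentsmuir design;
# the densities are hypothetical, chosen only for illustration
area     <- c(main = 14108643, morton = 715265)
coverage <- c(main = 0.054,    morton = 0.583)
density  <- c(main = 1e-5,     morton = 5e-5)   # animals per m^2 (assumed)

true_N <- sum(density * area)

# Expected counts in the covered areas, assuming perfect detection there
covered_area <- coverage * area
n <- density * covered_area

# Pooled estimator: one density applied to the whole region
N_pooled <- sum(n) / sum(covered_area) * sum(area)

# Stratified estimator: density estimated and scaled up within each stratum
N_stratified <- sum(n / covered_area * area)

round(c(true = true_N, pooled = N_pooled, stratified = N_stratified))
```

With these assumed densities the pooled estimate is roughly double the true abundance, because the heavily sampled, high-density stratum dominates the pooled density. The stratified estimate recovers the true value.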

Coverage

Organise the study area shapefile.

shapefile.name <- system.file("extdata", "TentsmuirUnproj.shp", 
                              package = "dssd")
sf.shape <- read_sf(shapefile.name)
st_crs(sf.shape)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
proj4string <- "+proj=aea +lat_1=56 +lat_2=62 +lat_0=50 +lon_0=-3 +x_0=0 
                +y_0=0 +ellps=intl +units=m"
projected.shape <- st_transform(sf.shape, crs = proj4string)
region.tm <- make.region(region.name = "Tentsmuir",
                         strata.name = c("Main Area", "Morton Lochs"),
                         shape = projected.shape)

Create the coverage grid.

cover.tm <- make.coverage(region.tm, n.grid.points = 5000)
design.tm <- make.design(region = region.tm,
                         transect.type = "point",
                         design = "systematic",
                         samplers = c(25,15),
                         design.angle = 0,
                         edge.protocol = "minus",
                         truncation = 100,
                         coverage.grid = cover.tm)
survey.tentsmuir <- generate.transects(design.tm)
print(survey.tentsmuir)

   Strata Main Area:
   __________________
Design:  systematically spaced transects
Spacing:  751.2295
Number of samplers:  25
Design angle:  0
Edge protocol:  minus
Covered area:  758758.5
Strata coverage: 5.38%
Strata area:  14108643

   Strata Morton Lochs:
   _____________________
Design:  systematically spaced transects
Spacing:  218.3674
Number of samplers:  15
Design angle:  0
Edge protocol:  minus
Covered area:  412791.9
Strata coverage: 57.71%
Strata area:  715264.9

   Study Area Totals:
   _________________
Number of samplers:  40
Covered area:  1171550
Average coverage: 7.9%

Answers

What spacing was used in each stratum to try and achieve the desired number of samplers? Did your survey achieve exactly the number of samplers you requested? How much does coverage differ between the two strata for this realisation?

A spacing of 751m was used in the main stratum and 218m in the Morton Lochs stratum. These values are calculated from the stratum areas and should not vary between surveys generated from the same design. You may or may not have achieved the number of transects you requested; this depends on the random start point generated for your particular survey. There will also be some variability in coverage: the survey printed above achieved coverage of 5.4% in the main stratum and 57.7% in the Morton Lochs stratum.
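
The reported spacings are consistent with a square grid in which each point is allotted an equal share of the stratum area. The sketch below assumes that relationship (spacing equal to the square root of area divided by number of samplers); it reproduces the printed values but is not taken from the dssd source. It also gives an upper bound on coverage, reached only if no sampling circle were clipped by the stratum boundary.

```r
# Stratum areas (m^2) from the survey summary and the requested samplers
stratum_area <- c("Main Area" = 14108643, "Morton Lochs" = 715264.9)
samplers     <- c("Main Area" = 25, "Morton Lochs" = 15)
truncation   <- 100   # point transect truncation distance (m)

# Grid spacing that gives each point a square cell of area / samplers
spacing <- sqrt(stratum_area / samplers)

# Upper bound on coverage (%) if no circle were clipped by the boundary
max_coverage <- 100 * samplers * pi * truncation^2 / stratum_area

round(spacing, 1)
round(max_coverage, 1)
```

The realised coverage in the printed survey (5.38% and 57.71%) sits below these bounds of about 5.6% and 65.9%, as expected under minus sampling.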

coverage.tentsmuir <- run.coverage(design.tm, reps=250, quiet=TRUE)
print(coverage.tentsmuir)

   Strata Main Area:
   __________________
Design:  systematically spaced transects
Spacing:  NA
Number of samplers:  25
Design angle:  0
Edge protocol:  minus

   Strata Morton Lochs:
   _____________________
Design:  systematically spaced transects
Spacing:  NA
Number of samplers:  15
Design angle:  0
Edge protocol:  minus

Strata areas:  14108643, 715265
Region units:  m
Coverage Simulation repetitions:  250

    Number of samplers:
    
        Main Area Morton Lochs Total
Minimum      22.0         12.0  36.0
Mean         24.9         15.0  39.9
Median       25.0         15.0  40.0
Maximum      27.0         18.0  44.0
sd            1.0          1.2   1.5

    Covered area:
    
        Main Area Morton Lochs      Total
Minimum 673945.64    347810.04 1067057.29
Mean    761859.83    416894.79 1178754.63
Median  763209.08    415642.26 1180365.83
Maximum 819995.66    469139.40 1282511.60
sd       30136.67     27749.45   40029.67

    % of region covered:
    
        Main Area Morton Lochs Total
Minimum      4.78        48.63  7.20
Mean         5.40        58.29  7.95
Median       5.41        58.11  7.96
Maximum      5.81        65.59  8.65
sd           0.21         3.88  0.27

    Coverage Score Summary:
    
         Main Area Morton Lochs      Total
Minimum 0.00800000    0.2640000 0.00800000
Mean    0.05403951    0.5865167 0.07960864
Median  0.05200000    0.6200000 0.05200000
Maximum 0.11200000    0.7360000 0.73600000
sd      0.01680356    0.1122782 0.11762456

Answers

View the design statistics. What is the minimum number of samplers you will achieve in each stratum? Is this sufficient to complete separate analyses in each stratum?

My design statistics indicated I should achieve between 22 and 27 transects in the main stratum and between 12 and 18 in the Morton Lochs stratum. I might be a bit concerned about the possibility of only achieving 12 transects in the Morton Lochs stratum (remember I cannot just discard a survey due to the number of transects and generate another as it will affect my coverage properties).

Whether this is sufficient will depend on a number of things such as a) objectives of the study, b) number of detections per transect etc. Information from a pilot study would be useful to help decide how many transects are required as a minimum.

plot(coverage.tentsmuir, strata=1)
plot(coverage.tentsmuir, strata=2)

Coverage scores main stratum Tentsmuir Forest.

Coverage scores Morton Lochs stratum Tentsmuir Forest.

Answers

Does it appear to you that there is even coverage within strata?

The main stratum appears to have fairly uniform coverage. The variability in the scores is small and is spread across the entire stratum, so it is most likely down to stochasticity. In the Morton Lochs stratum there are areas of lower coverage around the edge of the study region. The grid is a bit too coarse to judge properly how much of an issue edge effects will be in this stratum, so it may be wise to re-run the coverage simulation with a finer coverage grid and more repetitions. Edge effects could potentially be problematic in such small areas.
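
One rough way to separate simulation noise from genuine unevenness is to compare the observed standard deviation of the coverage scores with the value expected from Monte Carlo error alone. The sketch below assumes each score is a binomial proportion over 250 independent repetitions and that, under even coverage, every grid point shares the stratum mean. The numbers are copied from the Coverage Score Summary above and will differ between runs.

```r
# Coverage score summaries copied from the simulation output above
reps    <- 250
mean_cs <- c(main = 0.05403951, morton = 0.5865167)
sd_cs   <- c(main = 0.01680356, morton = 0.1122782)

# Standard deviation expected from Monte Carlo noise alone if every grid
# point shared the same coverage probability (binomial assumption)
noise_sd <- sqrt(mean_cs * (1 - mean_cs) / reps)

# Ratios near 1 suggest mostly noise; larger ratios suggest real unevenness
ratio <- sd_cs / noise_sd
round(ratio, 1)
```

For the main stratum the ratio is close to 1, consistent with variability that is mostly stochastic. For Morton Lochs it is well above 3, consistent with genuine spatial variation such as edge effects.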