Goodness of fit testing for detection function models. For continuous distances, Kolmogorov-Smirnov and Cramer-von Mises tests can be used; when binned or continuous distances are used, a \(\chi^2\) test can be used.
Usage
gof_ds(
model,
plot = TRUE,
chisq = FALSE,
nboot = 100,
ks = FALSE,
nc = NULL,
breaks = NULL,
...
)
Arguments
- model
a fitted detection function.
- plot
if TRUE the Q-Q plot is plotted
- chisq
if TRUE then chi-squared statistic is calculated even for models that use exact distances. Ignored for models that use binned distances
- nboot
number of replicates to use to calculate p-values for the Kolmogorov-Smirnov goodness of fit test statistics
- ks
perform the Kolmogorov-Smirnov test (this involves many bootstraps so can take a while)
- nc
number of evenly-spaced distance classes for chi-squared test, if chisq=TRUE
- breaks
vector of cutpoints to use for binning, if chisq=TRUE
- ...
other arguments to be passed to ddf.gof
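As a rough illustration of how the arguments above combine, the sketch below fits a simple model to the golf tee data and runs the tests. It assumes the Distance and mrds packages are installed and that the `book.tee.data` dataset is available from mrds; the truncation distance of 4 and the cutpoints are illustrative choices, not recommendations.

```r
# Sketch only: runs the model fit if Distance and mrds are available.
have_pkgs <- requireNamespace("Distance", quietly = TRUE) &&
  requireNamespace("mrds", quietly = TRUE)

if (have_pkgs) {
  library(Distance)
  data(book.tee.data, package = "mrds")

  # keep the detections made by the first observer
  tee.data <- subset(book.tee.data$book.tee.dataframe, observer == 1)

  # fit a detection function, truncating distances at 4 (illustrative value)
  ds.model <- ds(tee.data, truncation = 4)

  # default tests, without drawing the Q-Q plot
  print(gof_ds(ds.model, plot = FALSE))

  # also request the chi-squared test, binning at illustrative cutpoints
  print(gof_ds(ds.model, plot = FALSE, chisq = TRUE,
               breaks = c(0, 1, 2, 3, 4)))
}
```

Leaving `ks = FALSE` keeps the call fast, since the bootstrap is skipped.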
Details
Kolmogorov-Smirnov and Cramer-von Mises tests are based on looking at the quantile-quantile plot produced by qqplot.ddf and deviations from the line \(x=y\).
The Kolmogorov-Smirnov test asks the question "what's the largest vertical distance between a point and the \(y=x\) line?" It uses this distance as a statistic to test the null hypothesis that the samples (EDF and CDF in our case) are from the same distribution (and hence our model fits well). If the deviation between the \(y=x\) line and the points is too large we reject the null hypothesis and say the model doesn't have a good fit.
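The "largest vertical distance" idea can be sketched in base R. This is not the code that the package runs: it assumes a half-normal detection function with a known scale `sigma` and truncation `w` (both hypothetical values), whereas in practice the parameters are estimated from the data.

```r
set.seed(1)
sigma <- 1.5  # hypothetical half-normal scale parameter
w <- 4        # hypothetical truncation distance

# simulate half-normal distances and discard those beyond the truncation
x <- abs(rnorm(500, sd = sigma))
x <- x[x <= w]
n <- length(x)

# CDF of observed distances under a half-normal truncated at w
cdf_hn <- function(d, sigma, w) {
  (pnorm(d / sigma) - 0.5) / (pnorm(w / sigma) - 0.5)
}

# model CDF evaluated at the sorted distances
u <- sort(cdf_hn(x, sigma, w))
i <- seq_len(n)

# the EDF steps from (i - 1)/n to i/n at each point, so check both sides
ks_stat <- max(pmax(i / n - u, u - (i - 1) / n))
ks_stat
```

The same number is returned by `ks.test(u, "punif")$statistic`, which is a convenient cross-check.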
The Cramer-von Mises test takes a different approach. Rather than looking at the single biggest difference between the \(y=x\) line and the points in the Q-Q plot, we might prefer to think about all the differences between line and points, since there may be many smaller differences that we want to take into account rather than looking for one large deviation. Its null hypothesis is the same, but the statistic it uses is based on the sum of the squared deviations from each point to the line.
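A minimal base R sketch of the Cramer-von Mises statistic, using its standard computing formula \(W^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left(u_{(i)} - \frac{2i-1}{2n}\right)^2\), where \(u_{(i)}\) are the sorted fitted CDF values. The helper name `cvm_stat` is hypothetical and the code is not taken from the package.

```r
# Cramer-von Mises statistic from fitted CDF values u (each in [0, 1])
cvm_stat <- function(u) {
  u <- sort(u)
  n <- length(u)
  i <- seq_len(n)
  # squared deviation of every point from the midpoint of its EDF step
  1 / (12 * n) + sum((u - (2 * i - 1) / (2 * n))^2)
}

# values sitting exactly on the step midpoints give the minimum, 1 / (12 * n)
cvm_even <- cvm_stat(c(0.25, 0.75))

# values bunched near zero deviate at every point, so the statistic grows
cvm_bunched <- cvm_stat(c(0.05, 0.10))

c(even = cvm_even, bunched = cvm_bunched)
```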
A chi-squared test is also run if chisq=TRUE. In this case binning of distances is required if distance data are continuous. This can be specified as a number of equally-spaced bins (using the argument nc=) or the cutpoints of bins (using breaks=). The test compares the number of observations in a given bin to the number predicted under the fitted detection function.
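The observed-versus-expected comparison can be sketched in base R as follows. The half-normal form, the scale `sigma`, the truncation `w` and the bin count are all hypothetical, and `sigma` is treated as known rather than estimated, which affects the degrees of freedom.

```r
set.seed(2)
sigma <- 1.5  # hypothetical half-normal scale parameter
w <- 4        # hypothetical truncation distance
nc <- 4       # number of equally-spaced bins

# simulate half-normal distances and discard those beyond the truncation
x <- abs(rnorm(400, sd = sigma))
x <- x[x <= w]
n <- length(x)

# CDF of observed distances under a half-normal truncated at w
cdf_hn <- function(d, sigma, w) {
  (pnorm(d / sigma) - 0.5) / (pnorm(w / sigma) - 0.5)
}

# equally-spaced cutpoints from 0 to w
breaks <- seq(0, w, length.out = nc + 1)

# observed counts per bin and counts expected under the model
observed <- as.vector(table(cut(x, breaks = breaks, include.lowest = TRUE)))
expected <- n * diff(cdf_hn(breaks, sigma, w))

chisq_stat <- sum((observed - expected)^2 / expected)

# degrees of freedom are bins - 1 - (number of estimated parameters);
# sigma is fixed here, so nothing further is subtracted
p_value <- pchisq(chisq_stat, df = nc - 1, lower.tail = FALSE)
c(statistic = chisq_stat, p = p_value)
```

Supplying unequal cutpoints instead would only change the `breaks` vector.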
Note that a bootstrap procedure is required for the Kolmogorov-Smirnov test to ensure that the p-values from the procedure are correct, as we are comparing the cumulative distribution function (CDF) and empirical distribution function (EDF) and we have estimated the parameters of the detection function. The nboot parameter controls the number of bootstraps to use. Set it to 0 to avoid computing bootstraps (much faster but with no Kolmogorov-Smirnov results, of course).
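The general idea of bootstrapping a p-value when parameters have been estimated can be sketched in base R. This is a generic parametric bootstrap under an untruncated half-normal model, chosen because its maximum likelihood estimate has a closed form; whether the package's internal procedure matches it step for step is an assumption, and the helper names are hypothetical.

```r
set.seed(3)

# Kolmogorov-Smirnov statistic from fitted CDF values u
ks_stat <- function(u) {
  u <- sort(u)
  n <- length(u)
  i <- seq_len(n)
  max(pmax(i / n - u, u - (i - 1) / n))
}

# maximum likelihood estimate of the half-normal scale (no truncation)
fit_sigma <- function(x) sqrt(mean(x^2))

# CDF of an untruncated half-normal
cdf_hn <- function(d, sigma) 2 * pnorm(d / sigma) - 1

# "observed" distances (simulated here), fitted model, observed statistic
x <- abs(rnorm(100, sd = 1.5))
sigma_hat <- fit_sigma(x)
d_obs <- ks_stat(cdf_hn(x, sigma_hat))

# simulate from the fitted model, refit, and recompute the statistic
nboot <- 100
d_boot <- replicate(nboot, {
  x_star <- abs(rnorm(length(x), sd = sigma_hat))
  ks_stat(cdf_hn(x_star, fit_sigma(x_star)))
})

# p-value: share of bootstrap statistics at least as large as the observed one
p_value <- mean(d_boot >= d_obs)
p_value
```

Refitting inside each replicate is what accounts for the parameters having been estimated; comparing against the standard Kolmogorov-Smirnov reference distribution would ignore that step.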