mrds 2.2.5 and Distance 1.0.3
Posted 08 July 2021 by Dave Miller
mrds 2.2.5 and Distance 1.0.3 are now on CRAN.
New versions of mrds and Distance are now available on CRAN. There are a couple of big new features and improvements, so I’ve added a bit more description of them here. The full changelog is listed below too, for those who want all the details.
You can update using install.packages(c("mrds", "Distance")) in R.
Better optimisation in mrds
Thanks to funding from the University of St Andrews’ Knowledge Exchange and Impact Fund we were able to implement some big improvements to the optimisation procedure used to fit detection functions in mrds (which Distance uses under-the-hood too). This has brought results in line with those from Distance for Windows and made the detection function fitting process both more reliable and faster.
Improvements focussed on the cases where monotonicity was not constrained, i.e., when we don’t enforce the constraint that the detection function is always flat or decreasing with increasing distance. This constraint is the default for CDS analyses with adjustment terms included in the model, but not for MCDS (i.e., when covariates are included in the detection function model). That being said, there were other improvements to how integration of the detection function is performed that should benefit all models fitted by Distance and most fitted by mrds.
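To make the monotonicity constraint concrete, here is a minimal R sketch (not the mrds implementation) that evaluates a half-normal key with a single cosine adjustment term on a grid and checks whether it ever increases with distance. The function names detfn and is_monotone, the scale sigma, the adjustment coefficient a and the truncation distance w are all assumptions made for illustration.

```r
# Half-normal key with one cosine adjustment term of order 2,
# standardised so that g(0) = 1. Illustrative only, not mrds code.
detfn <- function(x, sigma, a, w) {
  key <- exp(-x^2 / (2 * sigma^2))
  adj <- 1 + a * cos(2 * pi * x / w)
  key * adj / (1 + a)
}

# TRUE if g(x) never increases over a grid on [0, w]
is_monotone <- function(sigma, a, w, n = 200) {
  x <- seq(0, w, length.out = n)
  all(diff(detfn(x, sigma, a, w)) <= 1e-12)
}

is_monotone(sigma = 1, a = 0, w = 2)     # key only, no adjustment
is_monotone(sigma = 10, a = 0.9, w = 1)  # large adjustment on a flat key
```

With no adjustment the half-normal key always decreases. A large adjustment coefficient on a nearly flat key gives a function that rises again towards the truncation distance, which is the shape the constraint rules out.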
More information in abundance and density results
For mrds::dht and Distance::dht2, more information is now reported about the options which are being used. For mrds::dht the encounter rate variance estimator is now reported. Additionally, for Distance::dht2 we also report the stratification scheme used, along with whether multipliers and sample fraction have been specified.
Bootstraps are faster
Distance::bootdht() can now run in parallel. When you set the argument cores to a value greater than 1, models will be fitted across that many cores of your computer. Set cores to at most one less than the number of physical cores on your computer, otherwise you might not be able to do anything else while your bootstrap runs. Unfortunately, progress bars can’t be shown when running in parallel (we’re working on it!), so it’s best to check that your bootstrap works with a small number of replicates run serially, then with a small number in parallel, before finally running the full number of replicates.
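A rough sketch of that workflow in R follows. The helper names pick_cores and staged_bootstrap are hypothetical, and the replicate counts are arbitrary; the only Distance call used is bootdht() with its flatfile, nboot and cores arguments. Physical cores are counted with parallel::detectCores(), which can return NA on some platforms, so the helper falls back to a single core in that case.

```r
# Choose a value for cores that leaves one physical core free.
pick_cores <- function(n_physical = parallel::detectCores(logical = FALSE)) {
  if (is.na(n_physical)) return(1L)
  max(1L, as.integer(n_physical) - 1L)
}

# Staged bootstrap: a few replicates serially, a few in parallel,
# then the full run. `model` is a fitted ds() object and `flatfile`
# the data it was fitted to (both supplied by the user).
staged_bootstrap <- function(model, flatfile, nboot_full = 999) {
  n_cores <- pick_cores()
  Distance::bootdht(model, flatfile = flatfile, nboot = 5, cores = 1)
  Distance::bootdht(model, flatfile = flatfile, nboot = 5, cores = n_cores)
  Distance::bootdht(model, flatfile = flatfile, nboot = nboot_full,
                    cores = n_cores)
}
```

Running the two small bootstraps first means any error surfaces quickly, and with a progress bar in the serial case, before the long parallel run starts.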
Multiple multipliers and sample fractions
In Distance::dht2 you can now specify multiple multipliers (varying according to stratum, for example) and sample fractions (for each transect), making abundance and density estimation more flexible.
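As an illustration, the inputs might be built like this. This is a sketch based on my reading of the dht2 documentation, so check ?dht2 before relying on it: the column names rate, SE, Sample.Label and fraction are assumed from that documentation, while the stratum names, transect labels, all numeric values and the wrapper estimate_abundance are invented for the example.

```r
# Per-stratum cue creation rates (values invented for illustration).
# A column named rate is required; SE is optional. The extra column
# (here Region.Label) is what matches each rate to a stratum.
multipliers <- list(
  creation = data.frame(
    Region.Label = c("North", "South"),
    rate = c(25, 30),
    SE = c(2.5, 3.1)
  )
)

# Per-transect sample fractions, one row per transect.
sample_fraction <- data.frame(
  Sample.Label = c("T1", "T2", "T3"),
  fraction = c(1, 0.5, 0.5)
)

# Hypothetical wrapper: `fit` is a fitted ds() model and `flatfile` the
# data it was fitted to, with matching Region.Label and Sample.Label values.
estimate_abundance <- function(fit, flatfile) {
  Distance::dht2(fit, flatfile = flatfile, strat_formula = ~Region.Label,
                 multipliers = multipliers,
                 sample_fraction = sample_fraction)
}
```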
Distance is more talkative
When fitting detection functions with Distance::ds(), more information about the fitting procedure is printed out, making it easier to work out what went wrong (and potentially to send debug information to the distance sampling mailing list or report bugs on GitHub). You can make ds() less talkative using the quiet=TRUE argument.
“Should I re-run previous analyses?”
It is worth re-running previous analyses (refitting detection functions) with these new versions as the optimisation improvements may give better results (detection function parameters with better likelihoods). If you do this and the likelihood (or equivalently AIC) gets worse, then please let us know!
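One way to make that check routine is a small helper that compares the AIC of the old and new fits. The sketch below is an assumption-laden example: refit_got_worse is a hypothetical name, it assumes AIC() has a method for whatever fitted objects you pass in, and it uses linear models from base R as stand-ins so that it runs without survey data.

```r
# TRUE if the refitted model has a higher (worse) AIC than the old one.
# tol guards against differences that are only numerical noise.
refit_got_worse <- function(old_fit, new_fit, tol = 1e-6) {
  AIC(new_fit) > AIC(old_fit) + tol
}

# Stand-in example using linear models on a built-in dataset
better <- lm(dist ~ speed, data = cars)
worse  <- lm(dist ~ 1, data = cars)
refit_got_worse(old_fit = better, new_fit = worse)
refit_got_worse(old_fit = better, new_fit = better)
```

For the same model refitted to the same data the number of parameters is unchanged, so a worse AIC and a worse likelihood are the same thing.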
Bugs have been fixed in confidence interval calculations for abundance estimates in clustered populations for Distance::dht2 (when using stratification="geographical"), so these should be re-run. Finally, the stratification="effort_sum" and stratification="replicate" modes of Distance::dht2 have had several bugs fixed, so any analyses using these stratification types should also be re-run.
Most of the bug fixes are thanks to users reporting issues on the distance sampling mailing list. We want to thank you all for bearing with us, helping us get to the bottom of your problems and making the software better for everyone. Thank you!
- use “probabilists’” definition of Hermite polynomials, as from Distance. More numerically stable
- remove setting of Hermite parameter to 1 (unclear why this was the case!)
- refinement of adjustment-key-all outer optimisation, optimization is now only over the subset of parameters, rather than holding one parameter constant
- refine outer optimization, using best previous values (by likelihood) rather than last values. Use optimizer’s convergence diagnostic to assess outer convergence.
- Refinement of “inner” optimization (detfct.fit.opt): (1) simplification of stopping rules (one while() loop rather than two), (2) parameters are nudged only when bounds have not been hit; if bounds have been hit then they are expanded
- Rescaling of covariate models’ parameters (when scaling difference was large) was inverted, causing all kinds of issues.
- Made the scaling kick-in at smaller scales.
- Removed inner while() loop dependence on bounded status, since that didn’t seem to make sense
- Stop “correcting” infinite/NaN integrals to small numbers as this was misleading the optimizer to think these were “good” values
- Refine constrained optimisation to use actual starting values once, then use random start points and compare the two.
- handle the case where a model failed in AIC adjustment term selection, where the monotonicity check would throw an error
- assign $g(x)=0$ for $g(x)<0$ when integrating the detection function (but check post-optimisation that this is not a problem!)
- fix bug in predict.ds when uniform key was used with binned data (Thanks to Noémie Cappelle for reporting this issue!)
- dht now prints additional information about the variance estimators used
- errors now thrown when more parameters than data (either unique distance values or bins)
- fix bug in dht2 where warnings were thrown if object column was not in the flatfile (https://github.com/DistanceDevelopment/Distance/issues/83)
- try() around model fitting to enable users to get error messages from mrds during fitting. Old behaviour can be recovered using
- better handling of when models fail to converge during AIC adjustment term selection
- documentation now in rmarkdown format
- fix issue #85 when species was used in the detection function and for post-stratification. Thanks to jason-airst for reporting the bug.
- fix bug where stratification="replicate" variance estimation was 0 due to order of operations
- fix bug in stratification="effort_sum" encounter rate variance estimation, due to incorrect grouping of transects into strata. Thanks to Samantha Ball and Jamie McKaughan for reporting this issue.
- bootdht can now run in parallel via the doParallel packages, see the cores argument.
- multiple multipliers can now be specified, for example to have different creation/decay rates for each stratum
- new argument for ds(), allows further refinement of encounter rate variance calculation. Default 2 is as before, use er.method=1 to get results which match Distance for Windows.
- fix issues with Satterthwaite degrees of freedom calculations when geographical stratification was used with clustered observations
- Sample fraction may now be specified as a data.frame if fractions are different for each transect
- Fix various bugs in stratification="replicate", thanks to Sam Ball and Jamie McKaughan for reporting issues and testing.