mrds 2.2.5 and Distance 1.0.3

Posted 08 July 2021 by Dave Miller

R packages mrds 2.2.5 and Distance 1.0.3 are now on CRAN.

New versions of mrds and Distance are now available on CRAN. There are a couple of big new features and improvements, so I’ve added a bit more description of them here. The full changelog is listed below for those who want all the details.

You can update using:

update.packages(oldPkgs=c("mrds", "Distance"))

Better optimisation in mrds (and Distance)

Thanks to funding from the University of St Andrews’ Knowledge Exchange and Impact Fund we were able to implement some big improvements to the optimisation procedure used to fit detection functions in mrds (which Distance uses under the hood too). This has brought results in line with those from Distance for Windows and made the detection function fitting process both more reliable and faster.

Improvements focussed on the cases where monotonicity was not constrained - i.e., when we don’t enforce the constraint that the detection function is always flat or decreasing with increasing distance. This constraint is the default for CDS analyses with adjustment terms included in the model, but not for MCDS (i.e., when covariates are included in the detection function model). That being said, there were other improvements to how integration of the detection function is performed that should benefit all models fitted by Distance and most fitted by mrds.

More information in abundance and density results

When using mrds::dht and Distance::dht2, more information is now reported about the options which are being used. For mrds::dht the encounter rate variance estimator is now reported. For Distance::dht2 we additionally report the stratification scheme used, along with whether multipliers and sample fraction have been specified.

Bootstraps are faster

Distance::bootdht() can now run in parallel. When you set the argument cores to a value greater than 1, models will be fitted across that many cores of your computer. Set cores to at most one less than the number of physical cores on your computer, else you might not be able to do anything else while your bootstrap runs. Also, unfortunately progress bars can’t be shown when running in parallel (we’re working on it!), so it’s best to check that your bootstrap works with a small number of replicates first without parallelisation, then with a small number of replicates in parallel, and only then run the full number of replicates.
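
That staged workflow could be sketched as below. This is only an illustration: pick_cores() and staged_bootstrap() are hypothetical helpers (not part of Distance), and fit and survey_data stand in for your own fitted ds() model and flatfile. Only bootdht() and its flatfile, nboot and cores arguments come from the package.

```r
library(parallel)

# Hypothetical helper: leave one physical core free for everything else.
pick_cores <- function(physical = detectCores(logical = FALSE)) {
  # detectCores() can return NA on some platforms; fall back to a single core
  if (is.na(physical)) physical <- 1
  max(1, physical - 1)
}

# Hypothetical helper: small serial run, small parallel run, then the full run.
# The function is only defined here; nothing is fitted until you call it.
staged_bootstrap <- function(fit, flatfile, nboot_full = 999,
                             nboot_test = 5, cores = pick_cores()) {
  # 1. a few replicates, not in parallel (errors and progress are visible)
  Distance::bootdht(fit, flatfile = flatfile, nboot = nboot_test, cores = 1)
  # 2. the same small run in parallel, to check the parallel setup works
  Distance::bootdht(fit, flatfile = flatfile, nboot = nboot_test, cores = cores)
  # 3. the full number of replicates, in parallel
  Distance::bootdht(fit, flatfile = flatfile, nboot = nboot_full, cores = cores)
}

# Usage (fit and survey_data are placeholders for your own objects):
# boot <- staged_bootstrap(fit, survey_data)
```

The number of replicates used here (5 for testing, 999 for the full run) is an arbitrary choice for the sketch, not a recommendation from the package.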

Multiple multipliers and sample fractions

When using Distance::dht2 you can now specify multiple multipliers (varying according to stratum, for example) and sample fractions (for each transect), making abundance and density estimation more flexible.
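
As a rough sketch of what these inputs might look like: the column names below (rate, SE, Region.Label, Sample.Label, fraction) follow my reading of the dht2 help page and should be checked against ?dht2, and all stratum names, transect names and numbers are made up for illustration.

```r
# Made-up values: one creation rate overall, a different decay rate per stratum.
# Assumption: each multiplier is a data.frame with a "rate" column (optionally
# "SE"), plus a column such as Region.Label to match rows to strata.
multipliers <- list(
  creation = data.frame(rate = 25, SE = 0),
  decay    = data.frame(Region.Label = c("North", "South"),
                        rate = c(160, 145),
                        SE   = c(12, 15))
)

# Made-up values: transect T1 surveyed on both sides, T2 and T3 on one side.
# Assumption: per-transect fractions are given in columns Sample.Label
# and fraction, with a row for every transect.
sample_fraction <- data.frame(Sample.Label = c("T1", "T2", "T3"),
                              fraction     = c(1, 0.5, 0.5))

# Usage (fit and survey_data are placeholders for your own objects):
# est <- Distance::dht2(fit, flatfile = survey_data,
#                       strat_formula = ~Region.Label,
#                       multipliers = multipliers,
#                       sample_fraction = sample_fraction)
```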

Distance is more talkative

When fitting detection functions with Distance::ds(), more information about the fitting procedure is printed out, making it easier to work out what went wrong (and to send debug information to the distance sampling mailing list or report bugs on GitHub). You can make ds() less talkative using the quiet=TRUE option.

“Should I re-run previous analyses?”

It is worth re-running previous analyses (refitting detection functions) with these new versions as the optimisation improvements may give better results (detection function parameters with better likelihoods). If you do this and the likelihood (or equivalently AIC) gets worse, then please let us know!
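
One way to make that comparison explicit is sketched below. compare_aic() is a hypothetical helper, not part of mrds or Distance, and the tolerance is an arbitrary choice to ignore differences that are just numerical noise. It takes the AIC you recorded from the old fit and the AIC of the refitted model as plain numbers.

```r
# Hypothetical helper: classify the change in AIC after refitting.
# Lower AIC is better, so a positive difference means the refit got worse.
compare_aic <- function(old_aic, new_aic, tol = 1e-4) {
  delta <- new_aic - old_aic
  if (delta > tol) {
    "worse"
  } else if (delta < -tol) {
    "better"
  } else {
    "unchanged"
  }
}

# Usage with made-up numbers:
# compare_aic(old_aic = 312.40, new_aic = 311.95)
```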

Bugs have been fixed in confidence interval calculations for abundance estimates in clustered populations for mrds::dht and Distance::dht2 (when stratification="geographical"), so these should be re-run. Finally, the stratification="effort_sum" and stratification="replicate" modes of Distance::dht2 have had several bugs fixed, so any analyses using these stratification types should also be re-run.

Thank you!

Most of the bug fixes are thanks to users reporting issues on the distance sampling mailing list. We wanted to thank you all for bearing with us and helping us get to the bottom of your problems and make the software better for everyone. Thank you!

Full changelog

mrds 2.2.5

  • use “probabilists’” definition of Hermite polynomials, as in Distance; this is more numerically stable
  • remove setting of Hermite parameter to 1 (unclear why this was the case!)
  • refinement of adjustment-key-all outer optimisation: optimisation is now only over the subset of parameters, rather than holding one parameter constant
  • refine outer optimization, using best previous values (by likelihood) rather than last values. Use optimizer’s convergence diagnostic to assess outer convergence.
  • Refinement of “inner” optimization: (1) simplification of stopping rules (one while() loop rather than two), (2) parameters are nudged only when bounds have not been hit; if bounds have been hit then they are expanded
  • Rescaling of covariate models’ parameters (when scaling difference was large) was inverted, causing all kinds of issues.
  • Made the scaling kick-in at smaller scales.
  • Removed inner while() loop dependence on bounded status, since that didn’t seem to make sense
  • Stop “correcting” infinite/NaN integrals to small numbers as this was misleading the optimizer to think these were “good” values
  • Refine constrained optimisation to use actual starting values once, then use random start points and compare the two.
  • handle the case where a model failed in AIC adjustment term selection (previously the monotonicity check would throw an error)
  • assign $g(x)=0$ for $g(x)<0$ when integrating the detection function (but check post-optimisation that this is not a problem!)
  • fix bug in predict.ds when uniform key was used with binned data (Thanks to Noémie Cappelle for reporting this issue!)
  • dht now prints additional information about the variance estimators used
  • errors now thrown when more parameters than data (either unique distance values or bins)
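
For readers unfamiliar with the two Hermite conventions mentioned in the first item above, here is a small sketch in base R. It is an illustration of the definitions only, not the code used inside mrds; hermite_prob() and hermite_phys() are names made up for this example. The probabilists’ polynomials follow He_{k+1}(x) = x He_k(x) - k He_{k-1}(x), while the physicists’ polynomials follow H_{k+1}(x) = 2x H_k(x) - 2k H_{k-1}(x), and the two are related by H_n(x) = 2^(n/2) He_n(sqrt(2) x).

```r
# Probabilists' Hermite polynomial He_n(x): He_0 = 1, He_1 = x
hermite_prob <- function(x, n) {
  if (n == 0) return(rep(1, length(x)))
  prev <- rep(1, length(x))  # He_0
  curr <- x                  # He_1
  if (n >= 2) {
    for (k in 1:(n - 1)) {
      nxt  <- x * curr - k * prev   # He_{k+1}
      prev <- curr
      curr <- nxt
    }
  }
  curr
}

# Physicists' Hermite polynomial H_n(x): H_0 = 1, H_1 = 2x
hermite_phys <- function(x, n) {
  if (n == 0) return(rep(1, length(x)))
  prev <- rep(1, length(x))  # H_0
  curr <- 2 * x              # H_1
  if (n >= 2) {
    for (k in 1:(n - 1)) {
      nxt  <- 2 * x * curr - 2 * k * prev   # H_{k+1}
      prev <- curr
      curr <- nxt
    }
  }
  curr
}

# Compare the two conventions at order 4 on a few points
x <- c(0, 0.5, 1, 3)
cbind(x, prob = hermite_prob(x, 4), phys = hermite_phys(x, 4))
```

The physicists’ version carries extra powers of 2 in its coefficients, so its values grow faster with the order of the polynomial.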

Distance 1.0.3

  • fix bug in dht2 where warnings were thrown if object column was not in the flatfile
  • removed silent=TRUE in try() around model fitting to enable users to get error messages from mrds during fitting. Old behaviour can be recovered using quiet=TRUE argument to ds()
  • better handling of when models fail to converge during AIC adjustment term selection
  • documentation now in rmarkdown format
  • fix issue #85 when species was used in the detection function and for post-stratification. Thanks to jason-airst for reporting the bug.
  • fix dht2 bug where stratification="replicate" variance estimation was 0 due to order of operations
  • fix dht2 bug in stratification="effort_sum" encounter rate variance estimation, caused by incorrect grouping of transects into strata. Thanks to Samantha Ball and Jamie McKaughan for reporting this issue.
  • bootdht can now run in parallel via the foreach/doParallel packages, see the cores argument.
  • multiple multipliers can now be specified, for example to have different creation/decay rates for each stratum
  • new argument er.method to ds(), allows further refinement of encounter rate variance calculation. Default 2 is as before, use er.method=1 to get results which match Distance for Windows.
  • fix issues with Satterthwaite degrees of freedom calculations when geographical stratification was used with clustered observations
  • Sample fraction may now be specified as a data.frame if fractions are different for each transect
  • Fix various bugs in dht2 when stratification="replicate", thanks to Sam Ball and Jamie McKaughan for reporting issues and testing.