dsm
1.1.3 to 2.1.511 November 2013
mrds
parts of the analysis in this document, as far as I know there haven’t been any changes to mrds
interface.dsm
and have left all arguments named, so you know what’s what.Old call example:
dsm.14 < dsm.fit(ddfobject=ddf.2, response='indiv.abund',
model.defn=list(fn='gam',family='quasipoisson'),
wghts=NULL, link='log', formula=~s(longitude),
obsdata=obs.dat.14, segdata=sample.dat.14,
convert.units=.001)
In new dsm
this should be specified:
dsm.14.new < dsm(ddf.obj=ddf.2, engine='gam', group=FALSE,
family=quasipoisson(log), weights=NULL,
formula=N~s(longitude),
observation.data=obs.dat.14,
segment.data=sample.dat.14,
convert.units=.001)
response
and formula
have been merged. The LHS of formula
is a kind of pseudoformula that tells dsm
what to calculate. Whether group or individual abundance are estimated is now in the hands of group
(more below). In the example I have the mrds
model does not have covariates (hence N
as the response) and we want individual abundance (hence group=FALSE
, see below).The response can be one of the following:
Response  Description 

N , abundance

count in each segment 
Nhat , abundance.est

estimated abundance per segment, estimation is via a HorvitzThompson estimator. This should be used there are covariates in the detection function. 
D , density

density per segment 
presence 
interpret the data as presence/absence (remember to change the family argument to binomial() ) 
link
is now included in the family
as is standard for glm
/gam
specifications.reponse
(selecting whether group or individual abundance are estimated) has now been given to group
. If group
is TRUE
then we estimate group abundance.observation.data
and segment.data
are both specified in ?"dsmdata"
. See also ?dsm:::check.cols
(note on this in “Other stuff”, below).summary
on a dsm
object will give you exactly the summary you would get for the equivalent gam
object. Anything you want to extract should be obtained as given in ?mgcv:::gamObject
.dsm
object contains an extra element ddf
(if there was a detection function part of the model, of course) which is just the ddf.obj
that it was called with.family
can be supplied that can be supplied to gam
.availability
lets you give an availability bias, if you have one estimated (probably not necessary to include in Distance).engine
can take the values glm
, gam
, gamm
or bam
. There’s a certain amount of extra stuff you’d need for gamm
and bam
. If you want information on these options, I can provide it.Old call example:
dsm.predict.15 < dsm.predict(model=dsm.14,newdata=grid.15, off.set=444)
New call example:
dsm.predict.15 < predict(model=dsm.14,newdata=grid.15, off.set=444)
No real change here except that we now call down to the appropriate predict
method (for gam
, glm
etc). You could also ignore off.set
and have it as a variable in newdata
, if there is a column called off.set
in newdata
, then the function argument off.set
is ignored.
Returned object is exactly as before (vector of length nrow(newdata)
), so you need to sum()
for total abundance.
Looks like in the old regime, you had to do something funny merge
ing data into the model object (picking out relevant lines only):
var.dat < read.table(file='C:\\Users\\eric\\AppData\\Local\\Temp\\dst21303\\var.dat.r',
header=TRUE, sep='\t', comment.char='')
dsm.tmp < dsm.14
dsm.tmp$result$data < merge(dsm.tmp$result$data, var.dat)
I think this was to include the Bootstrap.Sample.Label
into the model data. If that was the case then you can use the code above jsut removing the $result
in the above.
Old call example:
dsm.var.16 < dsm.var.movblk(dsm.object=dsm.tmp, pred.data = grid.15,
n.boot=10, block.size=3, off.set=444,
samp.unit.name='Bootstrap.Sample.Label',
progress.file='C:\\Users\\eric\\AppData\\Local\\Temp\\dst21303\\bootprog.txt',
bar = FALSE)
New call example:
dsm.var.16 < dsm.var.movblk(dsm.object=dsm.tmp, pred.data = grid.15,
n.boot=10, block.size=3, off.set=444,
samp.unit.name='Bootstrap.Sample.Label',
progress.file='C:\\Users\\eric\\AppData\\Local\\Temp\\dst21303\\bootprog.txt',
bar = FALSE)
bs.file="<PATH>"
will save each bootstrap replicate to file.ds.uncertainty=TRUE
will incorporate detection function uncertainty directly into the bootstrap. See ?dsm.var.movblk
.summary
and plot
on the returned object to get useful statistics and a plot of the coefficient of variation. The plot
method uses ggplot2
and requires that there are width
and height
columns in the prediction data.The two other functions dsm.var.gam
and dsm.var.prop
also estimate the variance. The first by standard GAM theory (using the delta method to combine detection function and GAM uncertainties), the second using MVB’s trick. The respective manual pages for these functions should provide all the information you require.
gam.check
on the resulting dsm
object to get the usual plots. gam.check
now has a neat little “is k
big enough?” test that gives you some idea of whether you gave your splines enough “wigglyness”.rqgam.check
makes the standard gam.check
plots using randomised quantile residuals (Dunn and Smyth, 1996), which seem like a better option for DSMs.dsm.cor
to get a correlogram.method="REML"
to avoid nasty convergence issues. By default dsm
runs with method="GCV.Cp"
(the default for gam
). If you allow REML then you probably want to warn people that you can’t compare REML scores (and derived AICs etc) when the fixed effects are different....
of the dsm
call goes straight to gam
.dsm:::check.cols
will check that the columns that dsm
wants are in the data.dsm:::make.data
builds the data.frame
to pass to gam
.make.soapgrid
will make a grid of points inside a polygon, which can be useful for soap film smoother knot grids.Dunn, P K, and G K Smyth. “Randomized Quantile Residuals.” Journal of Computational and Graphical Statistics 5, no. 3 (1996): 236–244.