# Multivariate smoothing, model selection

David L Miller

### Recap

• How GAMs work
• How to include detection info
• Simple spatial-only models
• How to check those models

## Univariate models are fun, but...

### Ecology is not univariate

• Many variables affect distribution
• Want to model the right ones
• Select between possible models
  • Smooth term selection
  • Response distribution
• Large literature on model selection

## Tobler's first law of geography

“Everything is related to everything else, but near things are more related than distant things”

Tobler (1970)

### Implications of Tobler's law

Covariates are not only correlated (linearly)…

…they are also “concurve”: one covariate can be well approximated by a smooth function of the others

• Careful inclusion of smooths
• Fit models using robust criteria (REML)
• Test for concurvity
• Test for sensitivity
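
A minimal sketch of a concurvity check, assuming the `mgcv` package and simulated data (the covariates `x1`, `x2`, `x3` are hypothetical and not part of the analysis in these slides). Here `x2` is a smooth but nonlinear function of `x1`, so their linear correlation is weak while the concurvity between the two smooths is high.

```r
library(mgcv)

set.seed(1)
n <- 300
# Hypothetical covariates: x2 is (almost) a smooth function of x1
x1 <- runif(n, -1, 1)
x2 <- x1^2 + rnorm(n, sd = 0.05)
x3 <- runif(n)                        # unrelated to x1 and x2
y  <- rpois(n, exp(0.5 + sin(3 * x1) + x3))

b <- gam(y ~ s(x1) + s(x2) + s(x3), family = poisson(), method = "REML")

# Linear correlation between x1 and x2 is weak...
cor(x1, x2)

# ...but the concurvity measures (0 = none, 1 = total) tell a different story.
# Rows are the "worst", "observed" and "estimate" measures, columns are terms.
conc <- concurvity(b, full = TRUE)
round(conc, 2)
```

Setting `full = FALSE` instead returns pairwise matrices, which helps identify which smooths are concurve with each other.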

## Models with multiple smooths

• Already know that + is our friend
• Add everything then remove smooth terms?
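
```r
library(dsm)

# df_hr (detection function), segs (segment data) and obs (observation data)
# are assumed to have been created in the earlier parts of the analysis
dsm_all_tw <- dsm(count ~ s(x, y, bs = "ts") +
                          s(Depth, bs = "ts") +
                          s(DistToCAS, bs = "ts") +
                          s(SST, bs = "ts") +
                          s(EKE, bs = "ts") +
                          s(NPP, bs = "ts"),
                  ddf.obj = df_hr,
                  segment.data = segs, observation.data = obs,
                  family = tw(), method = "REML")
```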


## Now we have a huge model, what do we do?

### Smooth term selection

• Classically two main approaches:
  • Stepwise: suffers from path dependence
  • All possible subsets: computationally expensive
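
To give a feel for the cost of the all-subsets approach, the sketch below enumerates every candidate model that can be built from the six smooths in the model above. It uses base R only; the formulas are built as strings for illustration and nothing is fitted. The simplified term labels (without `bs=`) are an assumption made for readability.

```r
# Candidate smooth terms, named after those in the full model
smooths <- c("s(x, y)", "s(Depth)", "s(DistToCAS)",
             "s(SST)", "s(EKE)", "s(NPP)")

# One row per subset: TRUE/FALSE says whether each term is included
include <- expand.grid(rep(list(c(FALSE, TRUE)), length(smooths)))

# Build the right-hand side for each subset ("1" = intercept only)
formulas <- apply(include, 1, function(keep) {
  rhs <- if (any(keep)) paste(smooths[keep], collapse = " + ") else "1"
  paste("count ~", rhs)
})

length(formulas)   # 2^6 = 64 candidate models
```

Each extra candidate term doubles the number of models to fit, and that is before considering alternative response distributions.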

### Removing terms by shrinkage

• Remove smooths using a penalty that can shrink a term's EDF towards zero
• Basis "ts": thin plate regression splines with shrinkage
• “Automatic”: selection happens as part of model fitting
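
A small illustration of shrinkage, assuming the `mgcv` package and simulated data (the covariates `signal` and `noise` are hypothetical). The response depends only on `signal`, so the smooth of `noise` would typically have its EDF shrunk well below that of `signal`; exact values depend on the simulated data.

```r
library(mgcv)

set.seed(2)
n <- 400
signal <- runif(n)   # covariate that drives the response
noise  <- runif(n)   # covariate with no effect on the response
y <- rnorm(n, mean = sin(2 * pi * signal), sd = 0.3)

# Both smooths use the shrinkage basis, so either can be penalised away
b_ts <- gam(y ~ s(signal, bs = "ts") + s(noise, bs = "ts"),
            method = "REML")

# EDF per smooth term, named by term
edf <- summary(b_ts)$s.table[, "edf"]
round(edf, 3)
```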

### p-values

• $$p$$-values can be used
• They are approximate
• Reported in the model summary (“Approximate significance of smooth terms”)
• Generally useful though

## Let's employ a mixture of these techniques

### How do we select smooth terms?

1. Look at EDF
   • Terms with EDF<1 may not be useful
   • These can usually be removed
2. Remove non-significant terms by $$p$$-value
   • Decide on a significance level and use that as a rule

## Example of selection

### Selecting smooth terms


```
Family: Tweedie(p=1.277)

Formula:
count ~ s(x, y, bs = "ts") + s(Depth, bs = "ts") + s(DistToCAS,
    bs = "ts") + s(SST, bs = "ts") + s(EKE, bs = "ts") + s(NPP,
    bs = "ts") + offset(off.set)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -20.260      0.234  -86.59   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                   edf Ref.df     F  p-value
s(x,y)       1.888e+00     29 0.705 3.56e-06 ***
s(Depth)     3.679e+00      9 4.811 2.15e-10 ***
s(DistToCAS) 3.936e-05      9 0.000   0.6798
s(SST)       3.831e-01      9 0.063   0.2160
s(EKE)       8.196e-01      9 0.499   0.0178 *
s(NPP)       1.587e-04      9 0.000   0.8361
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =   0.11   Deviance explained =   35%
-REML = 385.04  Scale est. = 4.5486    n = 949
```
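
The two rules (EDF<1, then $$p$$-value) can be sketched against the table above. The EDF and $$p$$-values below are transcribed from that output. The helper `flag_smooths()` and the 0.05 significance level are assumptions made for illustration, not part of `dsm` or `mgcv`. For a fitted model, a matrix with the same `edf` and `p-value` columns is available as `summary(model)$s.table`.

```r
# EDF and p-values transcribed from the summary output above
s_table <- cbind(
  "edf"     = c(1.888, 3.679, 3.936e-05, 0.3831, 0.8196, 1.587e-04),
  "p-value" = c(3.56e-06, 2.15e-10, 0.6798, 0.2160, 0.0178, 0.8361)
)
rownames(s_table) <- c("s(x,y)", "s(Depth)", "s(DistToCAS)",
                       "s(SST)", "s(EKE)", "s(NPP)")

# Hypothetical helper: flag terms by EDF and by p-value
flag_smooths <- function(s_table, alpha = 0.05) {
  data.frame(
    term    = rownames(s_table),
    edf     = s_table[, "edf"],
    p_value = s_table[, "p-value"],
    low_edf = s_table[, "edf"] < 1,          # rule 1: EDF < 1
    non_sig = s_table[, "p-value"] > alpha,  # rule 2: not significant
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

flags <- flag_smooths(s_table)
flags

# Terms flagged by both rules
flags$term[flags$low_edf & flags$non_sig]
# "s(DistToCAS)" "s(SST)" "s(NPP)"
```

Note that `s(EKE)` has EDF<1 but is significant at the 0.05 level, so the two rules disagree for that term and a judgement call is needed.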