# Multivariate smoothing, model selection

### Recap

• How GAMs work
• How to include detection info
• Simple spatial-only models
• How to check those models

## Univariate models are fun, but...

### Ecology is not univariate

• Many variables affect distribution
• Want to model the right ones
• Select between possible models
• Smooth term selection
• Response distribution
• Large literature on model selection

## Models with multiple smooths

• Already know that `+` is our friend
• Add everything, then remove smooth terms?
```r
dsm_all <- dsm(count ~ s(x, y) +
                       s(Depth) +
                       s(DistToCAS) +
                       s(SST) +
                       s(EKE) +
                       s(NPP),
               ddf.obj=df_hr,
               segment.data=segs, observation.data=obs,
               family=tw())
```


## Now we have a huge model, what do we do?

### Smooth term selection

• Classically, two main approaches
• Both have problems
• Usually rely on $p$-values

Stepwise selection: path dependence

All possible subsets: computationally expensive ("fishing"?)

### p-values

• $p$-values are easy to calculate
• They test for zero effect of a smooth
• Approximate for GAMs (but useful)
• Reported in `summary`

### p-values example

```r
summary(dsm_all)
```

```
Family: Tweedie(p=1.25)

Formula:
count ~ s(x, y) + s(Depth) + s(DistToCAS) + s(SST) + s(EKE) +
    s(NPP) + offset(off.set)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -20.6369     0.2752     -75   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
               edf Ref.df     F p-value
s(x,y)       5.236  7.169 1.233  0.2928
s(Depth)     3.568  4.439 6.640 1.6e-05 ***
s(DistToCAS) 1.000  1.000 1.503  0.2205
s(SST)       5.927  6.987 2.067  0.0407 *
s(EKE)       1.763  2.225 2.577  0.0696 .
s(NPP)       2.393  3.068 0.855  0.4680
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.224   Deviance explained = 44.8%
-REML = 368.97  Scale est. = 3.9334    n = 949
```
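The smooth-term table above can also be pulled out programmatically rather than read by eye. A minimal sketch, assuming `dsm_all` has been fitted as above (`summary` on a `dsm`/`gam` object returns the smooth-term table in its `s.table` component in `mgcv`):

```r
# extract the "Approximate significance of smooth terms" table
sm <- summary(dsm_all)$s.table

# names of terms whose approximate p-value exceeds 0.05
rownames(sm)[sm[, "p-value"] > 0.05]
```

This is handy when comparing many candidate models, but remember the $p$-values are approximate.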


### Shrinkage or extra penalties

• Use a penalty to remove terms during fitting
• Two methods:
  • basis `s(..., bs="ts")`: thin plate regression splines with shrinkage
    • nullspace should be shrunk less than the wiggly part
  • `dsm(..., select=TRUE)`: extra penalty
    • no assumption of how much to shrink the nullspace

### Shrinkage example

```r
dsm_ts_all <- dsm(count ~ s(x, y, bs="ts") +
                          s(Depth, bs="ts") +
                          s(DistToCAS, bs="ts") +
                          s(SST, bs="ts") +
                          s(EKE, bs="ts") +
                          s(NPP, bs="ts"),
                  ddf.obj=df_hr,
                  segment.data=segs, observation.data=obs,
                  family=tw())
```


### Shrinkage example

```r
summary(dsm_ts_all)
```

```
Family: Tweedie(p=1.277)

Formula:
count ~ s(x, y, bs = "ts") + s(Depth, bs = "ts") + s(DistToCAS,
    bs = "ts") + s(SST, bs = "ts") + s(EKE, bs = "ts") + s(NPP,
    bs = "ts") + offset(off.set)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -20.260      0.234  -86.59   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                   edf Ref.df     F  p-value
s(x,y)       1.888e+00     29 0.705 3.56e-06 ***
s(Depth)     3.679e+00      9 4.811 2.15e-10 ***
s(DistToCAS) 9.339e-05      9 0.000   0.6797
s(SST)       3.827e-01      9 0.063   0.2160
s(EKE)       8.196e-01      9 0.499   0.0178 *
s(NPP)       3.570e-04      9 0.000   0.8359
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =   0.11   Deviance explained =   35%
-REML = 385.04  Scale est. = 4.5486    n = 949
```


### Extra penalty example

```r
dsm_sel <- dsm(count ~ s(x, y) +
                       s(Depth) +
                       s(DistToCAS) +
                       s(SST) +
                       s(EKE) +
                       s(NPP),
               ddf.obj=df_hr,
               segment.data=segs, observation.data=obs,
               family=tw(), select=TRUE)
```


### Extra penalty example

```r
summary(dsm_sel)
```

```
Family: Tweedie(p=1.266)

Formula:
count ~ s(x, y) + s(Depth) + s(DistToCAS) + s(SST) + s(EKE) +
    s(NPP) + offset(off.set)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -20.4285     0.2454  -83.23   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                   edf Ref.df     F  p-value
s(x,y)       7.694e+00     29 1.272 2.67e-07 ***
s(Depth)     3.645e+00      9 4.005 3.24e-10 ***
s(DistToCAS) 1.944e-05      9 0.000   0.7038
s(SST)       2.010e-04      9 0.000   0.8216
s(EKE)       1.417e+00      9 0.630   0.0127 *
s(NPP)       2.318e-04      9 0.000   0.5152
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =   0.16   Deviance explained = 40.8%
-REML =    378  Scale est. = 4.2252    n = 949
```


### EDF comparison

```
             allterms select     ts
s(x,y)          5.236 7.6936 1.8875
s(Depth)        3.568 3.6449 3.6794
s(DistToCAS)    1.000 0.0000 0.0001
s(SST)          5.927 0.0002 0.3827
s(EKE)          1.763 1.4174 0.8196
s(NPP)          2.393 0.0002 0.0004
```
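A table like this can be assembled directly from the three fits. A sketch, assuming `dsm_all`, `dsm_sel` and `dsm_ts_all` from above, and using the `edf` component that `mgcv`'s `summary` returns for the smooth terms:

```r
# per-smooth EDFs under the three fitting strategies
edf_tab <- data.frame(allterms = summary(dsm_all)$edf,
                      select   = summary(dsm_sel)$edf,
                      ts       = summary(dsm_ts_all)$edf,
                      row.names = rownames(summary(dsm_all)$s.table))
round(edf_tab, 4)
```

Terms shrunk to (effectively) zero EDF under `select` or `ts` are the candidates for removal.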


### Double penalty can be slow

• Lots of smoothing parameters to estimate
```r
length(dsm_ts_all$sp)
```

```
[1] 6
```

```r
length(dsm_sel$sp)
```

```
[1] 12
```

## Let's employ a mixture of these techniques

### How do we select smooth terms?

1. Look at the EDFs
  • Terms with EDF < 1 may not be useful
  • These can usually be removed
2. Remove non-significant terms by $p$-value
  • Decide on a significance level and use that as a rule

(In some sense, leaving “shrunk” terms in is more “consistent”, but it can be computationally annoying)
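Applying those rules to the shrinkage fit above: `s(DistToCAS)` and `s(NPP)` were shrunk to effectively zero EDF, and `s(SST)` was both shrunk and non-significant, so all three can be dropped and the model refitted. A sketch, assuming the objects from the earlier `dsm()` calls (the name `dsm_ts_final` is illustrative):

```r
# refit keeping only the terms that survived shrinkage
dsm_ts_final <- dsm(count ~ s(x, y, bs="ts") +
                            s(Depth, bs="ts") +
                            s(EKE, bs="ts"),
                    ddf.obj=df_hr,
                    segment.data=segs, observation.data=obs,
                    family=tw())
summary(dsm_ts_final)
```

Refitting matters: EDFs and $p$-values of the remaining terms change once the others are removed, so re-check the new summary before dropping anything else.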

## Example of selection

### Selecting smooth terms


```
Family: Tweedie(p=1.277)

Formula:
count ~ s(x, y, bs = "ts") + s(Depth, bs = "ts") + s(DistToCAS,
    bs = "ts") + s(SST, bs = "ts") + s(EKE, bs = "ts") + s(NPP,
    bs = "ts") + offset(off.set)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -20.260      0.234  -86.59   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                   edf Ref.df     F  p-value
s(x,y)       1.888e+00     29 0.705 3.56e-06 ***
s(Depth)     3.679e+00      9 4.811 2.15e-10 ***
s(DistToCAS) 9.339e-05      9 0.000   0.6797
s(SST)       3.827e-01      9 0.063   0.2160
s(EKE)       8.196e-01      9 0.499   0.0178 *
s(NPP)       3.570e-04      9 0.000   0.8359
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =   0.11   Deviance explained =   35%
-REML = 385.04  Scale est. = 4.5486    n = 949
```