The Eco-Stats package contains functions and data supporting the
Eco-Stats text (Warton, 2022, Springer), and solutions to exercises.
Functions include tools for using simulation envelopes in diagnostic
plots, also applicable to multivariate linear models, and a parametric
bootstrap function that can be used in place of an `anova`

call for hypothesis testing via simulation, for many types of regression
models. Datasets mentioned in the package are included here (where not
available elsewhere) and vignettes work through code chunks and
exercises from the textbook, one chapter at a time.

The command `plotenvelope`

will take a fitted object and
construct standard residual plots with global envelopes added around
points (for quantile plots) and around smoothers (for residual plots),
constructed via simulation. These are constructed using the
`GET`

package as global envelopes, which is important for
interpretation – it means that when model assumptions are correct, 95%
of quantile plot envelopes (at confidence level 95%) should contain the
data (or smoother) over the *whole* plot. (Pointwise envelopes
would have been easier to construct, but are much harder to interpret
because they don’t control the chance of missing the data globally,
across the whole plot. A 95% pointwise envelope, constructed when
assumptions are satisfied, might for example miss *some* of the
data 60% of the time.)

`plotenvelope`

will work on lots of different types of
fitted objects – pretty much anything that comes with a
`simulate`

function that behaves in a standard way. A
`simulate.mlm`

and `simulate.manyglm`

functions
have been written in this package specifically so that
`plotenvelope`

also works for multivariate models fitted
using `lm`

or `mvabund::manyglm`

.

```
library(ecostats)
data(iris)
Y = with(iris, cbind(Sepal.Length,Sepal.Width,Petal.Length,Petal.Width))
iris.mlm=lm(Y~Species,data=iris)
# check normality assumption:
par(mfrow=c(1,2),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))
plotenvelope(iris.mlm,n.sim=199)
```

For `mlm`

objects, this function will compute conditional
residuals and fitted values, that is, they are computed for each
response conditional on all other responses being observed, via the
`cpredict`

and `cresiduals`

functions. This is
done because the full conditionals of a distribution are known to be
diagnostic of joint distributions, hence any violation of the
multivariate normality assumption will be expressed as a violation of
assumptions of these full conditional models. The full conditionals are
well-known to follow a linear model, as a function of all other
responses as well as predictors.

The `qqenvelope`

function can be applied for a normal
quantile plot, with global envelope, to either a fitted model or a
sample of data:

`anova`

tests using a parametric bootstrapThe command `anovaPB`

computes analysis of variance (or
deviance) tables for two fitted model objects, but with the \({P}\)-value estimated using a parametric
bootstrap – by repeatedly simulating new responses from the fitted model
under the null hypothesis. This will work on lots of different types of
fitted objects – like `plotenvelope`

, it should work on
pretty much anything that comes with a `simulate`

function
that behaves in a standard way. These fitted models also need to respond
to either `anova`

or `logLik`

.

While the interface is written to be a lot like `anova`

,
it requires two fitted objects to be specified – the first being a fit
under the null hypothesis, and the second being the fit under the
alternative.

```
# generate random Poisson data and a predictor:
y = rpois(50,lambda=1)
x = 1:50
# fit a Poisson regressions with and without x:
rpois_glm = glm(y~x,family=poisson())
rpois_int = glm(y~1,family=poisson())
# use the parametric bootstrap to test for an effect of x (will usually be non-significant)
anovaPB(rpois_int,rpois_glm,n.sim=99, ncpus=1)
#>
#> Analysis of Deviance Table
#>
#> Model 1: y ~ 1
#> Model 2: y ~ x
#> Resid. Df Resid. Dev Df Deviance
#> 1 49 58.69
#> 2 48 57.97 1 0.724 0.38
#>
#> P-value calculated by simulating 99 samples from 1.
```

All datasets used in the Eco-Stats text, where not available elsewhere, are supplied in this package. For example:

```
data(aphids)
cols=c(rgb(1,0,0,alpha=0.5),rgb(0,0,1,alpha=0.5)) #transparent colours
with(aphids$oat, interaction.plot(Time,Plot,logcount,legend=FALSE,
col=cols[Treatment], lty=1, ylab="Counts [log(y+1) scale]",
xlab="Time (days since treatment)") )
legend("bottomleft",c("Excluded","Present"),col=cols,lty=1)
```