The ‘lmvar’ package fits a Gaussian linear model. It differs from a classical linear model in that the variance is not constant. Instead, the variance has its own model, comparable to the model for the expected value. The classical linear model is provided by the function ‘lm’ in the package ‘stats’.

Working with the package is a lot like working with ‘lm’. A fit with ‘lmvar’ results in an ‘lmvar’ object, which is a list. Accessor functions are provided to extract the list members, such as the fitted coefficients \(\beta\) and the log-likelihood. Various utility functions such as `residuals`

to calculate residuals, `AIC`

to calculate the AIC, `fitted`

to obtain expected values, standard deviations and confidence intervals, etc., are also provided by the package.

The package is intended for people who run a classical linear model and want to see what happens if the restriction of a constant variance is dropped. Questions in this context are: does the allowance of heteroscedasticity result in a better fit, lower values for the AIC or BIC, smaller prediction errors, etc.?

The package fits the following model. A vector \(Y\) of observations (also called ‘responses’) of length \(n\) is a stochastic vector. It is distributed according to a multivariate Gaussian distribution:

\[\begin{equation} Y \sim \mathcal{N}_n( \mu, \Sigma), \end{equation}\] where \(\mu\) is the vector of expected values and \(\Sigma\) the covariance matrix (also called the ‘variance-covariance matrix’). Just like in the standard linear model, the covariance matrix is taken to be a \(n \times n\) diagonal matrix but contrary to the standard linear model, the diagonal entries need not be all the same: \[\begin{equation} \Sigma_{ij} = \begin{cases} 0 & i \neq j\\ \sigma_i^2 & i=j \end{cases} \end{equation}\]

As in the classical linear model, the vector of expectation values \(\mu\) is given by \[\begin{equation} \mu = X_\mu \beta_\mu \end{equation}\] where \(X_\mu\) is the ‘model matrix’ or ‘design matrix’ for \(\mu\) and \(\beta_\mu\) the parameter vector for \(\mu\). \(X_\mu\) is a \(n \times k_\mu\) matrix and \(\beta_\mu\) a vector of length \(k_\mu\).

Let \(\sigma\) denotes the vector \((\sigma_1, \dots, \sigma_n)\). The model for \(\sigma\) is \[\begin{equation} \log \sigma = X_\sigma \beta_\sigma \end{equation}\] where \(\log \sigma\) stands for the vector \((\log\sigma_1, \dots, \log\sigma_n)\), \(X_\sigma\) is the ‘model matrix’ or ‘design matrix’ for \(\sigma\) and \(\beta_\sigma\) the parameter vector for \(\sigma\). The logarithm is taken to be the ‘natural logarithm’ with base \(e\). The dimensions of \(X_\sigma\) are \(n \times k_\sigma\) and \(\beta_\sigma\) is a vector of length \(k_\sigma\).

The vector of observations \(Y\) and the matrices \(X_\mu\) and \(X_\sigma\) are specified by the user. They must contain real values. The fit returns the maximum-likelihood estimators for \(\beta_\mu\) and \(\beta_\sigma\).

The model for both \(\mu\) and \(\sigma\) contains an intercept term by default. That means that the first column of both matrices is a column in which each matrix-element equals 1. The package will add this column to the user-suppplied matrices to ensure that the intercept term is present. There is no need for a user to include such a column in a user-supplied model-matrix. Intercept terms can be suppressed with the arguments `intercept_mu = FALSE`

and `intercept_sigma = FALSE`

of the function `lmvar`

.

After adding the intercept columns (if not suppressed), the package will check whether the resulting matrices are full rank. If not, columns will be removed from each matrix until it is full rank.

The addition of an intercept column and, possibly, the removal of columns to obtain a full-rank matrix, imply that the actual matrices used in the fit can be different from the user-specified matrices. The matrices that are actually used in the fit are returned as members of the `lmvar`

object.

Carrying out the fit boils down to solving a set of non-linear equations. By default this is carried out in the background by the function `maxNR`

from the package ‘maxLik’ but it is possible to use another function from the same package.

More mathematical details about the model can be found in the vignette ‘Math’ which comes with this package. It can be viewed with `vignette("Math")`

or `vignette("Math", package="lmvar")`

.

The main function in the package is `lmvar`

. It carries out a fit and returns an `lmvar`

object.

The user must specify a vector of observations and two model-matrices when calling `lmvar`

. They must meet a number of conditions, which are described in detail in the on-line help for `lmvar`

, to be viewed with the command `?lmvar`

. The most important conditions to keep in mind are:

- All observations and matrix elements must be real-valued.
- Missing values, values that are
`NaN`

etc., are not allowed. - A matrix either has column names for all columns, or no column names at all. The same column name can appear for both \(X_\mu\) and \(X_\sigma\) but column names should be unique within each matrix. The intercept column that is added by
`lmvar`

(although one can suppress it), is called`(Intercept)`

for \(X_\mu\) and`(Intercept_s)`

for \(X_\sigma\).

With each column in \(X_\mu\) corresponds an element in \(\beta_\mu\). The name of that element is the corresponding column name. The same is true for \(X_\sigma\) and \(\beta_\sigma\).

It can happen that `lmvar`

fails to solve the maximum-likelihood equations and exits with warnings. See here for the options you have in this case.

An `lmvar`

object is a list whose members are intended to be extracted with the supplied accessor and utility functions. The only members for which no such functions have been implemented are:

`y`

: the user-supplied vector of observations`X_mu`

: the actual, full-rank matrix \(X_\mu\) including the intercept term (if not suppressed) that is used in the fit`X_sigma`

: the actual, full-rank matrix \(X_\sigma\) including the intercept term (if not suppressed) that is used in the fit`slvr_log`

: log-output from the function`maxNR`

. Only added to the`lmvar`

object when requested.

Once `lmvar`

has run and an `lmvar`

-object has been created, one can obtain \(\beta_\mu\) and \(\beta_\sigma\) with the function `coef`

. The function `fitted`

allows one to obtain \(\mu\) and \(\sigma\). We refer to the package documentation (in particular the package index which can be viewed with `help(package = "lmvar")`

) for a list of all available functions and function details.

We demonstrate the package with the help of the dataframe `cats`

which can be found in the `MASS`

package.

```
# As example we use the dataset 'cats' from the library 'MASS'.
require(lmvar); require(MASS)
# A plot of the heart weight versus the body weight in the data set
plot(cats$Bwt, cats$Hwt, xlab = "Body weight", ylab = "Heart weight")
```

We want to regress the cats heart weight ‘Hwt’ onto the body weight ‘Bwt’.

```
# Create the model matrix. It only contains the body weight. An intercept term will be added by 'lmvar'.
X = model.matrix(~ Bwt - 1, cats)
# Carry out the fit with the same model matrix for mu (the expected heart weight) and for log sigma (the standard deviation)
fit = lmvar(cats$Hwt, X_mu = X, X_sigma = X)
```

We have now created the object `fit`

, which is our object of class `lmvar`

. To obtain a first impression of the fit, we look at the summary

```
## Call:
## lmvar(y = cats$Hwt, X_mu = X, X_sigma = X)
##
## Number of observations: 144
## Degrees of freedom : 4
##
## Z-scores:
## Min 1Q Median 3Q Max
## -2.4159 -0.7261 -0.0655 0.6703 2.5998
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.012179 0.679820 -0.0179 0.985707
## Bwt 3.904310 0.258964 15.0766 < 2.2e-16 ***
## (Intercept_s) -0.522274 0.337044 -1.5496 0.121244
## Bwt_s 0.315837 0.121843 2.5922 0.009537 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Standard deviations:
## Min 1Q Median 3Q Max
## 1.1156 1.2265 1.3916 1.5422 2.0330
##
## Comparison to model with constant variance (i.e. classical linear model)
## Log likelihood-ratio: 4.069774
## Additional degrees of freedom: 1
## p-value for difference in deviance: 0.00433
```

The first line shows the call that created `fit`

. Then, we are told something about the distribution of the standard scores (also called the z-scores).

Next, the summary shows the matrix with the coefficients \(\beta_\mu\) and \(\beta_\sigma\). The coefficients \(\beta_\mu\) are `(Intercept)`

, and `Bwt`

. The coefficients \(\beta_\sigma\) are `(Intercept_s)`

, and `Bwt_s`

. They are called this way to distinguish them from the coefficients for \(\beta_\mu\). In cases where there is no risk of confusion, the true names of the coefficients \(\beta_\sigma\) will be used, which are `(Intercept_s)`

and `Bwt`

.

The matrix with coefficients shows that `Bwt`

and `Bwt_s`

are statistically significant at the 5% level, but the intercept terms are not.

The next piece of information gives an impression of the distribution of the standard deviations \(\sigma\).

Finally the model is compared to a classical linear model with the same model matrix \(X_\mu\) but a standard deviation that is the same for all observations. The summary shows the difference in log-likelihood between the two models and the difference in degrees of freedom. Twice the difference in log-likelihood is the difference in deviance, for which a p-value is calculated. The p-value of 0.00433, indicates that the `lmvar`

fit is a better fit than the classical linear model at the 5% confidence level. I.e., it makes sense to let the standard deviation vary instead of keeping it fixed.

Another useful, high-level check of the fit is to look at a number of diagnostic plots with the command `plot (fit)`

.

The number of observations in the fit and the log-likelihood are

`## [1] 144`

`## 'log Lik.' -252.991 (df=4)`

The rank of the matrix \(X_\mu\) used in the fit is

`## [1] 2`

The value 2 is correct: the user-supplied matrix had 1 column and `lmvar`

added an intercept column. The two columns are linearly independent so no column had to be removed.

We run the regression again, but this time without intercept terms

The rank of \(X_\mu\) as used in the fit is now equal to 1:

`## [1] 1`

The summary no longer shows the intercept terms:

```
## Call:
## lmvar(y = cats$Hwt, X_mu = X, X_sigma = X, intercept_mu = FALSE,
## intercept_sigma = FALSE)
##
## Number of observations: 144
## Degrees of freedom : 2
##
## Z-scores:
## Min 1Q Median 3Q Max
## -2.2212 -0.6966 -0.0686 0.7168 3.1238
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## Bwt 3.903044 0.044278 88.1489 < 2.2e-16 ***
## Bwt_s 0.134489 0.021302 6.3135 2.728e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Standard deviations:
## Min 1Q Median 3Q Max
## 1.3086 1.3625 1.4378 1.5021 1.6896
```

The summary overview has left out the comparison with the classical linear model as well. The classical linear model and the model in `fit`

are no longer nested models because of the absence on an intercept term in \(X_\sigma\).

Let’s see whether the z-scores appear to be correlated with the body weight

```
sigma = fitted(fit, mu = FALSE)
plot(cats$Bwt, residuals(fit) / sigma, xlab = "Body weight", ylab = "z-score")
abline(0, 0, col = "red")
```

Let’s also plot the average heart weights versus the body weight, with a 95% confidence interval as error bar.

```
mu = fitted(fit, sigma = FALSE)
plot(cats$Bwt, mu, xlab = "Body weight", ylab = "Average heart weight", ylim = c(7, 16))
intervals = fitted(fit, interval = "confidence", level = 0.95)
lwr = intervals[, "mu_lwr"]
upr = intervals[, "mu_upr"]
segments(cats$Bwt, lwr, cats$Bwt, upr)
```

We carry out a classical linear fit, without an intercept term in the model matrix

and we compare AIC and BIC values

`## [1] 512.7867`

`## [1] 518.3905`

`## [1] 518.7263`

`## [1] 524.3301`

Both AIC and BIC favor the fit with `lmvar`

over the fit with `lm`

.

We plot again the expected heart weight versus the body weight, but now we add the predictions of the classical linear fit as red points

```
plot(cats$Bwt, mu, xlab = "Body weight", ylab = "Average heart weight", ylim = c(7, 16))
segments(cats$Bwt, lwr, cats$Bwt, upr)
points(cats$Bwt, fitted(fit_lm), col = "red")
```

Te plot shows that the predicted values from the classical linear model are nearly the same as the predictions from `lmvar`

. We can also plot the fitted standard deviations with their 95% confidence interval:

```
plot(cats$Bwt, sigma, xlab = "Body weight", ylab = "St dev of heart weight", ylim = c(1, 2))
lwr = intervals[, "sigma_lwr"]
upr = intervals[, "sigma_upr"]
segments(cats$Bwt, lwr, cats$Bwt, upr)
sigma_lm = summary(fit_lm)$sigma
abline(sigma_lm, 0, col = "red")
```

The red line is the standard deviation that is returned from the classical linear model.

The coefficients \(\beta_\mu\) and \(\beta_\sigma\) that were displayed in the summary-overview, are obtained by

```
## Bwt Bwt_s
## 3.9030442 0.1344892
```

The suffix ’_s’ is added to show that ‘Bwt_s’ is the coefficient \(\beta_\sigma\). If we ask for \(\beta_\sigma\) only, the real name of the coefficient is used

```
## Bwt
## 0.1344892
```

We conclude this demonstration with the covariance matrix for the combined coefficients \((\beta_\mu, \beta_\sigma)\)

```
## Bwt Bwt_s
## Bwt 0.00196053 0.0000000000
## Bwt_s 0.00000000 0.0004537699
```

Hopefully this demonstration has given an idea of how to work with the package. The documentation of the individual functions contains further examples.

We refer to the package index for a list of all available functions. The index can be viewed with `help(package="lmvar")`

. Various generic functions from the ‘stats’ package work for an ‘lmvar’ object as well. Examples are `BIC`

and `confint`

.

The function `plot.lmvar`

returns a number of diagnostic plots for an ‘lmvar’ object, similar to the function `plot.lm`

for an ‘lm’ object.

The function `plot_qq`

produces a QQ-plot for one or two fits. They can be of class ‘lm’ or class ‘lmvar’. It is also possible that one id of class ‘lm’ and the other of class ‘lmvar’.

The function `plot_qdis`

produces a plot of the distribution of quantiles. Like `plot_qq`

, one can specify two plots, each of class ‘lm’ or class ‘lmvar’.

The package provides the function `fwbw`

for a selection of the independent model-variables (also called the ‘predictors’ or ‘covariates’). It searches for an optimal subset of variables by means of a forward / backward-stepping algorithm. The function works on a ‘lmvar’ object but also on a ‘lm’ object. See the function documentation of `fwbw`

for details.

The package provides the functions `cv.lm`

and `cv.lmvar`

for cross-validation of prediction errors or any other function of choice. The former function works on an ‘lm’ object, the latter on a ‘lmvar’ object. See the function documentation for details.

The package ‘lmvar’is concerned mostly with objects of class ’lmvar’. However, it also contains the function `lmvar_no_fit`

which creates an object of class ‘lmvar_no_fit’. This class is like the class ‘lmvar’ but misses any information that is the result of a model fit.

The class ‘lmvar’ is an extension of the class ‘lmvar_no_fit’. This means that whenever an object of class ‘lmvar_no_fit’ is required, an object of class ‘lmvar’ can be used as well. An example is the function `nobs.lmvar_no_fit`

supplied by this package. It takes as input an object of class ‘lmvar_no_fit’. Therefore it will also accept an object of class ‘lmvar’.

It can happen that the function `lmvar`

exits with a warning that it had trouble carrying out the model fit. The warning will be like ‘Last step could not find a value above the current’ or ‘Log-likelihood appears not to be at a maximum!’. This is means that the iterative fitting procedure did not converge in a satisfactory manner. In our experience, this is most likely to happen when the model for \(\sigma\) has many degrees of freedom, i.e., the model matrix \(X_\sigma\) has many columns.

What are your options if this happens? You can try the following strategies.

Check whether the model for \(\sigma\) contains factor levels which occur only a few times, or which occur almost always. Such levels are columns in \(X_\sigma\) where nearly all elements are equal to 0 and a few equal to 1, or nearly all elements are equal to 1 and a few equal to zero. Remove these columns from \(X_\sigma\) and run

`lmvar`

again.Run

`lmvar`

with the argument`control = list(remove_df_sigma_post = TRUE)`

. With this option,`lmvar`

tries to remove degrees of freedom from the model for \(\sigma\) that prevent the fit to converge. See the vignette ‘Math’ for a mathematical background of this option.Create the model by starting with a model with only 2 degrees of freedom and gradually adding degrees of freedom, avoiding ones that prevent the fit to converge. This can be done by first running

`lmvar_no_fit`

(with the same arguments as`lmvar`

). The resulting object is then input for the function`fwbw`

. This function must be run with argument`fw = TRUE`

. The function`fwbw`

requires one to select a function that measures the goodness of fit. Reasonable choices for this function are e.g.`AIC`

or`BIC`

. The output of`fwbw`

contains a model-fit which is hopefully one you can work with.

There are several other package that can fit the same model as ‘lmvar’. Below, we mention the ones we are aware of. The ‘lmvar’ package has been developed as a relatively simple next step for users who typically run a linear regression, but want to gain experience with a heterscedastic model. Depending on one’s taste and needs, one might need or prefer another package though.

If we can, we demonstrate the alternatives with the same example we have used to demonstrate ‘lmvar’.

The function `remlscore`

from the package ‘statmod’ (Giner and Smyth 2016) fits our example as follows.

```
require(statmod)
# Run the regression including intercept terms
intercept = rep(1, nrow(cats))
fit = remlscore( cats$Hwt, cbind(intercept, X), cbind(intercept, X))
```

The coefficients \(\beta\) for the expected value can be obtained as `fit$beta`

. The coefficients for the logarithm of the standard deviation as `fit$gamma`

. By definition, the latter coefficients are twice the value calculated by `lmvar`

.

The function `crch`

from the package ‘crch’ (Messner, Mayr, and Zeileis 2016) fits our example as follows.

```
require(crch)
# Run the regression including intercept terms
fit = crch(Hwt ~ Bwt | Bwt , data = cats)
```

All coefficients \(\beta\) can be obtained with `coef(fit)`

.

The function `gam`

from the package ‘mgcv’ (Wood 2011) fits our example as follows.

```
require(mgcv)
# Run the regression including intercept terms
fit = gam(list(Hwt ~ Bwt, ~ Bwt) , data = cats, family = gaulss())
```

All coefficients \(\beta\) can be obtained with `coef(fit)`

.

The function `gamlss`

from the package ‘gamlss’ (Rigby and Stasinopoulos 2005) fits our example as follows.

```
require(gamlss)
# Run the regression including intercept terms
fit = gamlss(Hwt ~ Bwt, ~ Bwt, data = cats)
```

The coefficients \(\beta\) for the expected value can be obtained as `coef(fit)`

. The coefficients for the log of the standard deviation are obtained by `coef(fit, what = "sigma")`

.

Other functions that allow for a model of the dispersion are, e.g., `hglm`

in the package `hglm`

(Ronnegard, Shen, and Alam 2010) and `geese`

in the package `geepack`

(Hojsgaard, Halekoh, and Yan 2006). These models are more complicated though, and require a level of expertise not required by `lmvar`

.

We thank Prof. Dr. Eric Cator for his valuable comments and suggestions. Prof. Dr. Achim Zeileis brought the packages ‘crch, ’mgcv’ and ‘gamlss’ to our attention.

This package uses the package `maxLik`

(Henningsen and Toomet 2011) to find the maximum likelihood. The package `matrixcalc`

is used to check properties of the Hessian. The package `Matrix`

is used to support matrices of class ‘Matrix’.

Giner, Goknur, and Gordon K. Smyth. 2016. “Statmod: Probability Calculations for the Inverse Gaussian Distribution.” *R Journal* 8 (1): 339–51.

Henningsen, Arne, and Ott Toomet. 2011. “MaxLik: A Package for Maximum Likelihood Estimation in R.” *Computational Statistics* 26 (3): 443–58. https://doi.org/10.1007/s00180-010-0217-1.

Hojsgaard, Soren, Ulrich Halekoh, and Jun Yan. 2006. “The R Package Geepack for Generalized Estimating Equations.” *Journal of Statistical Software* 15/2: 1–11.

Messner, Jakob W., Georg J. Mayr, and Achim Zeileis. 2016. “Heteroscedastic Censored and Truncated Regression with crch.” *The R Journal* 8 (1): 173–81. https://journal.r-project.org/archive/2016-1/messner-mayr-zeileis.pdf.

Rigby, R. A., and D. M. Stasinopoulos. 2005. “Generalized Additive Models for Location, Scale and Shape,(with Discussion).” *Applied Statistics* 54.3: 507–54.

Ronnegard, Lars, Xia Shen, and Moudud Alam. 2010. “Hglm: A Package for Fitting Hierarchical Generalized Linear Models.” *The R Journal* 2 (2): 20–28. https://journal.r-project.org/archive/2010-2/RJournal_2010-2_Roennegaard~et~al.pdf.

Wood, S. N. 2011. “Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models.” *Journal of the Royal Statistical Society (B)* 73 (1): 3–36.