Robust (or "resistant") methods for statistical modelling have been
available in S from the very beginning in the 1980s, and then in R in
package stats.
Examples are median(), mean(*, trim = .), mad(), IQR(), or also fivenum()
(the statistic behind boxplot() in package graphics), or lowess() (and
loess()) for robust nonparametric regression, which were complemented
by runmed() in 2003.
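As a minimal illustration of why these base functions are called resistant, the following sketch compares them with their classical counterparts. The sample values are hypothetical and chosen only so that a single gross outlier is present.

```r
# Hypothetical sample: nine well-behaved values plus one gross outlier.
x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 100)

classical <- c(location = mean(x), scale = sd(x))
resistant <- c(location = median(x), scale = mad(x))
trimmed   <- mean(x, trim = 0.1)  # drops the lowest and highest 10% before averaging

print(classical)   # both values are dominated by the outlier
print(resistant)   # both values stay close to the bulk of the data
print(trimmed)
print(IQR(x))
print(fivenum(x))  # minimum, lower hinge, median, upper hinge, maximum
```

With ten observations, trim = 0.1 removes exactly one value from each end, which is enough here because only one observation is contaminated.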
Much further important functionality has been made available in the
recommended (and hence present in all R versions) package
MASS (by Bill Venables and Brian Ripley, see the book
Modern Applied Statistics with S).
Most importantly, they provide rlm() for robust regression and cov.rob()
for robust multivariate scatter and covariance.
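A short sketch of the two MASS functions named above, using the stackloss data that ships with R. The choice of data set and of the compared fits is an assumption made for illustration only.

```r
library(MASS)  # recommended package, shipped with standard R installations

# Least squares versus two robust fits on the built-in stackloss data.
fit_ls <- lm(stack.loss ~ ., data = stackloss)
fit_m  <- rlm(stack.loss ~ ., data = stackloss)   # Huber M-estimate (the default)

set.seed(1)    # the MM fit and cov.rob() both rely on random subsampling
fit_mm <- rlm(stack.loss ~ ., data = stackloss, method = "MM")

print(round(cbind(LS = coef(fit_ls), M = coef(fit_m), MM = coef(fit_mm)), 3))

# Robust location and scatter of the three predictors (default method: MVE).
rob <- cov.rob(stackloss[, 1:3])
print(rob$center)
print(rob$cov)
```

Because of the random subsampling, results can differ slightly between runs unless the seed is fixed.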
This task view is about R addon packages providing newer or faster,
more efficient algorithms and notably for (robustification of) new models.
Please send suggestions for additions and extensions to the
task view maintainer.
An international group of scientists working in the field of robust
statistics has made efforts (since October 2005) to coordinate several of
the scattered developments and make the important ones available
through a set of R packages complementing each other.
These should build on a basic package with "Essentials",
coined robustbase with (potentially many) other packages
building on top and extending the essential functionality to particular
models or applications.
Further, there is the quite comprehensive package
robust, a version of the robust library of S-PLUS,
as an R package now GPLicensed thanks to Insightful and Kjell Konis.
Originally, there was much overlap between 'robustbase'
and 'robust'; now robust depends
on robustbase, the former providing convenient routines for
the casual user where the latter will contain the underlying
functionality, and provide the more advanced statistician with a
large range of options for robust modeling.
We structure the packages roughly into the following topics, and
typically will first mention functionality in packages
robustbase and robust.
 Regression (Linear, Generalized Linear, Nonlinear Models,
incl. Mixed Effects):
lmrob() (robustbase) and lmRob() (robust), where the former uses the
latest of the fast-S algorithms and heteroscedasticity and autocorrelation
corrected (HAC) standard errors, the latter makes use of the M-S algorithm
of Maronna and Yohai (2000), automatically when there are factors
among the predictors (where S-estimators (and hence MM-estimators)
based on resampling typically badly fail).
The ltsReg() and lmrob.S() functions
are available in robustbase, but rather for comparison
purposes.
rlm() from MASS had been the first widely
available implementation for robust linear models, and also one of
the very first MM-estimation implementations.
robustreg provides very simple M-estimates for linear
regression (in pure R).
Note that Koenker's quantile regression package quantreg
contains L1 (aka LAD, least absolute deviations) regression as a
special case, doing so also for nonparametric regression via
splines.
Quantile regression (and hence L1 or LAD) for mixed effect models
is available in package lqmm, whereas an
MM-like approach for robust linear mixed effects modeling
is available from package robustlmm.
Package mblm's function mblm() fits
median-based (Theil-Sen or Siegel's repeated) simple linear models.
Package TEEReg provides trimmed elemental estimators for
linear models.
Generalized linear models (GLMs) are provided both via
glmrob() (robustbase) and glmRob() (robust),
where package robustloggamma focuses on generalized log
gamma models.
Robust ordinal regression is provided by
rorutadis (UTADIS).
Robust nonlinear model fitting is available through
robustbase's nlrob().
multinomRob fits overdispersed multinomial regression
models for count data.
rgam and robustgam both fit robust GAMs,
i.e., robust Generalized Additive Models.
drgee fits "Doubly Robust" Generalized Estimating Equations (GEEs).
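To make the median-based (Theil-Sen) idea mentioned in the Regression item concrete, here is a minimal base-R sketch written from the textbook definition (slope = median of all pairwise slopes). The function name theil_sen() and the data are hypothetical; this is not the code used by mblm, which should be preferred in practice.

```r
# Theil-Sen estimator for simple linear regression, from the textbook
# definition: the slope is the median of all pairwise slopes.
theil_sen <- function(x, y) {
  idx  <- combn(length(x), 2)                # all index pairs i < j
  dx   <- x[idx[2, ]] - x[idx[1, ]]
  dy   <- y[idx[2, ]] - y[idx[1, ]]
  keep <- dx != 0                            # skip pairs with tied x values
  slope     <- median(dy[keep] / dx[keep])
  intercept <- median(y - slope * x)         # one common choice of intercept
  c(intercept = intercept, slope = slope)
}

# Hypothetical data: an exact line y = 1 + 2x with one corrupted response.
x <- 1:10
y <- 1 + 2 * x
y[10] <- 100

est <- theil_sen(x, y)
print(est)
print(coef(lm(y ~ x)))  # least squares is pulled towards the outlier
```

Enumerating all pairs costs O(n^2) time and memory, so this sketch is only suitable for small samples.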
 Multivariate Analysis:
Here, the rrcov package, which builds ("Depends")
on robustbase, provides nice S4 class based methods,
more methods for robust multivariate variance-covariance estimation,
and adds robust PCA methodology.
It is extended by rrcovNA, providing robust multivariate
methods for incomplete or missing (NA) data, and by
rrcovHD, providing robust multivariate methods for
High Dimensional data. High dimensional data with an
emphasis on functional data are treated robustly also by roahd.
Here, robustbase contains a slightly more flexible
version, covMcd(), than robust's fastmcd(), and similarly for covOGK().
OTOH, robust's covRob() has automatically chosen
methods, notably pairwiseQC() for large dimensionality p.
Package robustX for experimental, or other not yet
established procedures, contains BACON() and covNCC(),
the latter providing the nearest
neighbor variance estimation (NNVE) method of Wang and Raftery (2002),
also available (slightly less optimized) in covRobust.
RobRSVD provides a robust Regularized Singular Value Decomposition.
mvoutlier (building on robustbase) provides
several methods for outlier identification in high dimensions.
GSE estimates multivariate location and scatter in the presence of missing data.
FRB performs robust inference based on Fast
and Robust Bootstrap on robust estimators, including
multivariate regression, PCA and Hotelling tests.
RSKC provides Robust Sparse
K-means Clustering.
robustDA for robust mixture Discriminant Analysis
(RMDA) builds a mixture model classifier with noisy class labels.
robcor computes robust pairwise correlations based on scale estimates,
particularly on FastQn().
covRobust provides the
nearest neighbor variance estimation (NNVE) method of Wang and
Raftery (2002).
Note that robust PCA can be performed by using standard
R's princomp(), e.g.,
X <- stackloss; pc.rob <- princomp(X, covmat = MASS::cov.rob(X))
See also the CRAN task views
Multivariate and
Cluster.
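The princomp() one-liner from the Multivariate Analysis item can be expanded into a small comparison. This is a sketch only: the side-by-side comparison with classical PCA, the fixed seed, and the chi-squared cutoff used to flag rows are illustrative assumptions, not recommendations from the packages above.

```r
library(MASS)

X <- as.matrix(stackloss)

set.seed(1)                        # cov.rob() uses random subsampling
rob <- cov.rob(X)                  # robust center and covariance (default: MVE)

pc.cl  <- princomp(X)              # PCA on the classical covariance
pc.rob <- princomp(X, covmat = rob)  # PCA on the robust covariance

print(round(pc.cl$sdev, 2))
print(round(pc.rob$sdev, 2))

# Squared robust Mahalanobis distances; large values indicate unusual rows.
d2 <- mahalanobis(X, rob$center, rob$cov)
print(which(d2 > qchisq(0.975, df = ncol(X))))
```

The same robust estimate is reused for the PCA and for the distances, so both views of the data are based on one fit.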
 Large Data Sets:
BACON() (in robustX)
should be applicable for larger (n,p) than traditional robust
covariance based outlier detectors.
OutlierDM detects outliers for replicated high-throughput data.
(See also the CRAN task view MachineLearning.)
 Descriptive Statistics / Exploratory Data Analysis:
boxplot.stats(), etc., mentioned above.
 Time Series:
 R's runmed() provides most robust
running median filtering.

Package robfilter contains robust regression and
filtering methods for univariate time series, typically based on
repeated (weighted) median regressions.

Package RobPer provides several methods for robust
periodogram estimation, notably for irregularly spaced time series.

Peter Ruckdeschel has started to lead an effort for a robust
time-series package, see robust-ts on R-Forge.

Further, robKalman, "Routines for Robust Kalman
Filtering - the ACM- and rLS-filter", is being developed, see
robkalman on R-Forge.
Note however that these (last two items) are not yet available from CRAN.
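A minimal sketch of running median filtering with runmed(), as mentioned in the first Time Series item. The series is hypothetical, and the running mean is included only as a non-robust point of comparison.

```r
# Hypothetical series: a smooth signal with two isolated spikes.
y <- sin(seq(0, 2 * pi, length.out = 50))
y[c(12, 35)] <- c(8, -8)

med5  <- as.numeric(runmed(y, k = 5))  # running median, window width 5
mean5 <- as.numeric(stats::filter(y, rep(1 / 5, 5), sides = 2))  # running mean

# Around the first spike the running median follows the signal,
# while the running mean smears the spike over the whole window.
print(round(cbind(y, med5, mean5)[10:14, ], 2))
```

A window of width 5 can absorb up to two aberrant values per window; longer outlier patches would need a larger k.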
 Econometric Models:
Econometricians tend to like HAC (heteroscedasticity and
autocorrelation corrected) standard errors. For a broad class of
models, these are provided by package sandwich.
Note that vcov(lmrob()) also uses a version of HAC
standard errors for its robustly estimated linear models.
See also the CRAN task view Econometrics.
 Robust Methods for Bioinformatics:
There are several packages in the Bioconductor project
providing specialized robust methods.
In addition, RobLoxBioC provides infinitesimally robust
estimators for preprocessing omics data.
 Robust Methods for Survival Analysis:
Package coxrobust provides robust estimation in the Cox
model.
OutlierDC detects outliers using quantile regression for
censored data.
 Robust Methods for Surveys:
On R-Forge only, package rhte provides a robust
Horvitz-Thompson estimator.
 Geostatistics:
Package georob aims at robust geostatistical
analysis of spatial data, such as kriging and more.
 Collections of several methodologies:
 WRS2 contains
robust tests for ANOVA and ANCOVA from Rand Wilcox's collection.
 robeth contains R functions interfacing to the extensive
RobETH Fortran library with many functions for regression,
multivariate estimation and more.
 Other approaches to robust and resistant methodology:

The package distr and its several child packages
also allow exploring robust estimation concepts, see e.g.,
distr on R-Forge.

Notably, based on these,
the project robast aims for the implementation of R
packages for the computation of optimally robust estimators and
tests as well as the necessary infrastructure (mainly S4 classes
and methods) and diagnostics; cf. M. Kohl (2005).
It includes the R packages
RandVar, RobAStBase, RobLox,
RobLoxBioC, RobRex, and further ROptEst and ROptRegTS.
 RobustAFT computes Robust Accelerated Failure
Time Regression for Gaussian and log-Weibull errors.
 wle (Weighted Likelihood Estimation) provides
robustified likelihood estimation for a range of models,
notably (generalized) regression, and time series (AR and
fracdiff).
 robumeta provides robust variance meta-regression.
 ssmrob provides robust estimation and inference in sample selection models.