greybox v2.0.1 (Release date: 2024-06-18)
==============
Changes:
* Corrected how the number of parameters is calculated in alm() for loss=c("MSE","MAE","HAM") and respective distribution=c("dnorm","dlaplace","ds").
* "NLOPT_LN_NELDERMEAD" is now used as the default algorithm for the model estimation instead of "NLOPT_LN_SBPLX". The latter produced biased estimates of parameters...
* New function for time series bootstrap, timeboot(), inspired by Maximum Entropy Bootstrap from meboot package.
* adi() function to identify type of demand under consideration.
* Likelihood for occurrence model is now calculated differently, depending on whether there are ARIMA elements or not. In the former case, it includes the differential entropy, while in the latter it doesn't.
* New loss function in alm(): "ROLE" - Robust Likelihood Estimator, which uses trimming for the scale parameters and the likelihood to avoid the impact of outliers on the estimates.
* New distribution in alm(): "dgeom" - Geometric distribution. This can be used for a model for demand intervals.
* Count distributions now return standardised residuals based on the empirical cumulative probability for each of the observations (sort of probability transform).
Bugfixes:
* lmCombine() and lmDynamic() now return loss parameter.
* Fixed a weird bug in vcov.alm(), where instead of inverting Fisher Information, the function would return it as is. This did not apply to all cases, but could have appeared in some situations, when Choleski decomposition would fail.
* Found some mistakes in the calculation of differential entropy in alm() with occurrence and some distributions.
* Fixed a bug in alm() with regularisation, which would not work in case of a simple linear regression.
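
The vcov.alm() fix above concerns the step where the Fisher Information is turned into a covariance matrix. As an illustration only (this is not the package's code), a minimal base R sketch of inverting an information matrix via the Cholesky decomposition, with a fallback to solve() when the decomposition fails, might look as follows. The helper name invertFI and the example matrix are hypothetical.

```r
# Hypothetical helper: invert a Fisher Information matrix.
# Tries the Cholesky route first and falls back to solve() if chol() fails.
invertFI <- function(FI){
    tryCatch(chol2inv(chol(FI)),
             error=function(e) solve(FI))
}

# A small symmetric positive definite matrix used as a stand-in for FI
FI <- matrix(c(4, 1,
               1, 3), 2, 2, byrow=TRUE)
vcovSketch <- invertFI(FI)

# The product with the original matrix should be close to the identity
round(vcovSketch %*% FI, 10)
```

The point of the sketch is that the returned object is the inverse of the information matrix in both branches, never the information matrix itself.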
greybox v2.0.0 (Release date: 2023-09-15)
==============
Changes:
* Starting from v2.0.0, the greybox package for R will be released under the LGPLv2.1 license. The source of the older version of the software under the GPL(>=2) is available here: https://github.com/config-i1/greybox/releases/tag/v1.0.9
greybox v1.0.9 (Release date: 2023-09-15)
==============
Changes:
* Documentation for distributions of greybox
* Extremity and Complex Extremity introduced, aligning them with the HAM paper.
* actuals() now also extracts actuals from the predict.greybox objects.
* accuracy() method for greybox classes.
greybox v1.0.8 (Release date: 2023-04-02)
==============
Changes:
* Smooth dynamic weights via LOWESS in lmDynamic.
* RIDGE is now properly implemented as in James et al. Introduction to Statistical Learning.
* LASSO works well now. The only issue is estimation in a small number of iterations. Currently ~2000 are needed to converge.
* Introduced parameter formula in stepwise(), lmCombine() and lmDynamic() to allow user to specify transformation of variables and the largest model to choose from.
* Also, subset is now available in stepwise(), lmCombine() and lmDynamic().
Bugfixes:
* PACF in plot.greybox() dropped lag 1 because of a typo.
greybox v1.0.7 (Release date: 2022-12-22)
==============
Changes:
* xtable() methods for greybox class to produce LaTeX tables from greybox outputs.
* alm() now accepts a vector of zeroes and ones in occurrence.
* Functions for Rectified Normal distribution.
* Rectified Normal distribution in alm().
Bugfixes:
* na.rm in hm et al. functions to get rid of NAs by default.
greybox v1.0.6 (Release date: 2022-09-29)
==============
Changes:
* cramer() now produces the unbiased estimates by default (small sample correction).
* New plots for greybox: ACF/PACF of squared residuals. Should be handy for GARCH diagnostics.
Bugfixes:
* Fixed a bug in coefbootstrap for the cases of additional parameters estimation (e.g. shape in GN distribution).
* cramer() would return NA if one of the rows/columns in contingency table contained zeroes. This is now fixed.
* Tuned QQ plot for Poisson in alm().
* Fix for alm() and names of variables that contain numbers only.
greybox v1.0.5 (Release date: 2022-03-24)
==============
Changes:
* Simplified the generics outlierdummy(), temporaldummy() and sm().
* qs() and qlaplace() now use qgnorm() function. We no longer need to rely on lamW package in R.
* New method, implant() for implanting scale model into location one.
* Use analytic vcov in case of loss="MSE".
* New parameter, points, in tableplot. It specifies, whether to produce points in the categories. Helps in reading the plots.
* The order of values in the which parameter of plot.greybox() now matters.
* Some plots in plot.greybox() will now produce plots for the scale model if it is estimated (which=c(2:6,8,9,13:14)).
greybox v1.0.4 (Release date: 2022-02-05)
==============
Changes:
* Renamed variable "model" into "object" in sm() generic. This is needed for consistency with more complicated models (e.g. adam()), where "model" means something else.
* We now import forecast method from generics package.
greybox v1.0.3 (Release date: 2022-01-27)
==============
Changes:
* Fixed initialisation of Poisson in alm(), which should hopefully result in more accurate estimates of parameters.
* New distribution in alm(): Exponential via "dexp".
* From now on, predict.greybox() will not calculate vcov if interval="none".
* sm() now returns meaningful residuals (standardised ones).
Bugfixes:
* Switched off all registration of methods from greybox in cases of forecast / fabletools packages.
* A fix in spread() for variables that are not factors and not numeric.
* Use match.call() instead of list() to expand ellipsis in scaler() function in order not to evaluate the content.
* Remove print_level from the call in vcov.alm() if it was used.
greybox v1.0.2 (Release date: 2021-12-01)
==============
Changes:
* Trying to resolve the long lasting issue with forecast generic again.
* alm() now returns a proper formula if user asked for something like "variable~."
Bugfixes:
* spread() would not produce correct labels in cases of missing values in numerical variables.
greybox v1.0.1 (Release date: 2021-09-21)
==============
Changes:
* Trying to get rid of dependence on Matrix - it consumes a lot of memory.
* GMRAE in error measures, introduced by Yves Sagaert.
* na.rm parameter in error measures.
Bugfixes:
* A fix for plot(predict.alm()) for several levels and h=1.
* Fixed a bug with OLS estimation of parameters in alm().
* Fixes for the conflict with forecast and fabletools packages.
greybox v1.0.0 (Release date: 2021-06-27)
==============
Changes:
* scale parameter in alm(), allowing modelling the scale of distribution (GAMLSS style).
* hatvalues() and rstandard() now use varying scale if scaleModel was done.
* (standardised residuals)^2 vs fitted and abs(standardised residuals) in plot.greybox() (which=c(13,14)). This should allow doing diagnostics in case of scale model.
* A new method for constructing a scale model for the already existing lm or alm model, which is called "sm" - "Scale Model". This now has vcov, confint and summary methods.
* occurrence in alm() now also accepts formula. It will construct logistic regression in this case.
* Proper working predict.scale() and predict.greybox() with scaleModel and all distributions.
* coefbootstrap() working for scale now.
* vcov() now passes ellipsis to the alm() / sm() functions.
* Renamed cbias() into asymmetry().
* Introduced extremity() coefficient, based on hm. The name is not final.
* scale model now works with occurrence.
Bugfixes:
* detectdst would not work correctly with half-hourly data.
* A fix in coefbootstrap() for normal and related distributions.
* A fix in occurrence=formula.
greybox v0.7.0 (Release date: 2021-05-18)
==============
Changes:
* plot(..., which=c(8,9)) now drops residuals for zero actuals if occurrence model is present. Should look better and provide a proper message.
* We finally do not depend on forecast package. forecast method is now dynamically loaded if forecast is present.
* \donttest{} instead of \dontrun{} in examples in the documentation.
* on.exit() restore par for plot functions.
* Print time elapsed for the print.greybox methods if it is available.
* Reset par() in rmcb only if it is the default one.
* Changed the default values for xlim and ylim in rmcb() for outplot="lines".
greybox v0.6.9 (Release date: 2021-04-17)
==============
Changes:
* hatvalues() now returns 1 -1E5 instead of 1. Otherwise things break...
* nvariate() method to get number of variates in the response variable in the model.
* Fixed formula for BICc for multivariate models (vars) and added a reference to appropriate paper.
* maxeval in alm() for the model with i>0 is now 40*k.
Bugfixes:
* mcor(), assoc() and spread() are now more robust to missing values.
* A fix in alm() for an exotic case with i>0 and a variable not having any variability after the t>i.
* A fix for coefbootstrap in case of i>0 (previously didn't work).
* A fix for spread() not working for low count variable. It now treats it as factor.
greybox v0.6.8 (Release date: 2021-03-12)
==============
Changes:
* MAE, MSE etc now have parameter "holdout" instead of "actual". This is needed for consistency purposes with measures() function.
* qqplot for alm() with Poisson and Negative Binomial distributions.
* alm() with distribution=c("dpois","dnorm") and ar>0 now relies on analytical covariance matrix.
* alm(), stepwise() and lmCombine() now return the elapsed time.
* stepwise() now reuses testModel (internal object) instead of creating new copies.
* The return of forecast package to avoid conflicts.
Bugfixes:
* A fix for an exotic case of zoo + data.table, resulting in destruction of response variable.
* predict.almari() now supports bootstrapped covariance matrix.
* Fix in stepwise() which sometimes would make it select the same variable in the loop.
greybox v0.6.7 (Release date: 2021-02-14)
==============
Changes:
* Gamma distribution in alm().
* coefbootstrap() now uses changeable sample size in case of ar!=0 in alm() and size not provided explicitly.
* Proper distributions in stepwise(), lmCombine() and lmDynamic().
* Scale the MSE part of LASSO and RIDGE in alm() by the sd(diff(y)).
* Parameter orders in alm() instead of ar and i. Currently supports only ARIMA(p,d,0). MA(q) is in future plans.
* Removed forecast package from imports. We now do not depend on it, but only suggest.
* alm() will not drop the variables with no variability anymore if fast=TRUE.
Bugfixes:
* temporaldummy() now does not return zoo from zoo objects. This would break the factors.
* Export print.bootstrap method.
* Added `use="complete.obs"` in cor() in alm(), just in case.
* Reset orders in coefbootstrap, because all orders are already in the object$data.
greybox v0.6.6 (Release date: 2021-01-07)
==============
Changes:
* Speed up the determination function for the numeric data.
Bugfixes:
* A bugfix in temporaldummy.POSIXt for days of week etc, not passing factors properly.
greybox v0.6.5 (Release date: 2021-01-04)
==============
Changes:
* Introduced stars in summary, indicating that we reject H0 on the specified level for some parameters.
* An option for temporaldummy() to return a factor instead of a matrix with dummies.
* terms in alm() and extractAIC method, so that step() works. extractAIC accepts ic parameter, so one can choose between the standard AIC, BIC, AICc and BICc.
* Don't produce covariance matrix of parameters if the interval="none" in predict.greybox.
* formula in alm() now also accepts "trend", fitting the global trend to the data.
* We now also use adam() in case smooth is installed and something needs to be predicted.
* Generalised Normal distribution functions, imported from gnorm package, but more efficient.
Bugfixes:
* A fix in qgnorm, pgnorm and rgnorm for very big values of beta - we use uniform approximation in that case.
greybox v0.6.4 (Release date: 2020-12-01)
==============
Changes:
* Updated the description of alm().
* plot.greybox(x, 1) now does not have the symmetric axes. This helps in reading the data in cases of fat tails.
* Updated vignette of ro() to reflect the new defaults.
* Reverting the hessian calculation without normalisation.
* New set of distribution functions: logitnorm.
* Confidence interval for the mean is now calculated based on the normal distribution (CLT assumption).
* Renamed alm() from "Advanced Linear Model" to "Augmented Linear Model".
* Cook's distance plot now has quantile lines for 0.5, 0.75 and 0.95.
Bugfixes:
* coefbootstrap() would not work in case of poorly prepared names of variables (e.g. with spaces and quotes).
* A fix for the issue with hessian calculation.
* A fix for an annoying bug with predict() applied without newdata.
* A fix of the bug with names with special characters in alm(). It works, assuming that the formula has "+" sign only. In general, don't use special characters in names, use make.names() function.
* logLik was not calculated correctly in alm() for non-likelihood losses.
greybox v0.6.3 (Release date: 2020-10-20)
==============
Changes:
* Corrections in the main greybox vignette - just fixed some typos and formatting issues.
* Examples in alm() now rely on subset parameter.
* xregExpander() now selects between "extrapolate" and "naive" in case of gaps="auto".
* The as.data.frame.summary.greybox() method, which produces the matrix of coefficients. This should be handy when printing the outputs.
* We now support texreg package, so that the ALM regression outputs can be printed in a fine quality.
* detectdst() now complains when it finds several missing hours.
Bugfixes:
* alm() with occurrence!="none" would not work if the response variable was not called "y".
* Corrected the table with levels for the predict function.
* A fix in alm() with ar and complicated names of variables.
* A fix in subset parameter for normal distribution.
* The parameters are now standardised for the hessian calculation in order to get more accurate vcov.
greybox v0.6.2 (Release date: 2020-09-02)
==============
Changes:
* Use make.names() instead of gsub() for names of variables.
* ro() now has co=TRUE by default.
* More appropriate normalisation in LASSO / RIDGE for alm() + a fix for the case with lambda=1 (i.e. return mean).
* alm() will not check multicollinearity for LASSO / RIDGE.
* Renamed df into nu for dt and dchisq.
* Don't import gnorm, use internal functions instead.
* temporaldummy() now has the method for POSIXt, not just for POSIXct.
* stepwise now treats all dummy variables as factors. This is needed for purposes of the calculation of measures of association.
* New method: coefbootstrap() - do the bootstrap for coefficients of the model in order to get covariance matrix and other cool thingies (e.g. vcov(), confint(), summary() and predict() functions).
* Two new methods detectdst() and detectleap(), which return start and end dates for dst change and leap year based on the provided object (POSIXt / Date / zoo / xts).
* predict() and forecast() methods now accept vector of levels, producing several bounds.
Bugfixes:
* Added abs() in mcor() in order to avoid rounding issues in R, which caused sqrt(-1e-16) to produce NaN.
* predict() and forecast() would not work if h=1 with ar>1 for alm().
* BLower & BUpper were needed in case of ARX model and provided B.
* Fixed bug in ro(), due to which it wouldn't work with data frames.
* stepwise() would not work if NA were returned in cor.
* alm() would not work correctly if parameters without names were provided.
* stepwise() would fail in case of factors selected in the model due to the incorrect naming of their expansion.
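
The mcor() rounding fix listed above can be illustrated in base R. This is only a sketch of the general guard, not the code used in mcor(); the value -1e-16 stands in for a quantity that is zero in exact arithmetic but comes out slightly negative in floating point.

```r
# A stand-in for a value that should be zero but is slightly negative
x <- -1e-16

# Without a guard the square root is NaN (R also issues a warning,
# which is suppressed here to keep the output clean)
unguarded <- suppressWarnings(sqrt(x))
is.nan(unguarded)

# With abs() the result is a tiny positive number instead of NaN
guarded <- sqrt(abs(x))
guarded
```

An alternative guard with the same purpose would be sqrt(max(x, 0)), which returns exactly zero for such values.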
greybox v0.6.1 (Release date: 2020-08-04)
==============
Changes:
* stepwise() now passes arguments to alm(), if they are provided (e.g. "ar" parameter).
* Switched off i parameter in alm() - it seems to be broken and the whole arima part needs to be redone anyway.
* Remove prediction intervals from plot.greybox(object, which=7).
* Let tableplot() get levels from the data if it is a factor.
* alm() now accepts parameter FI instead of vcovProduce, similar to how it is done in smooth functions. The calculation of vcov is then done in the respective method.
* alm() now also allows specifying more parameters for the optimiser.
* loss parameter in alm(). You can now specify your own.
* Updated alm vignette in order to reflect changes to loss.
* Generalised Normal distribution in alm.
* New method outlierdummy() - create dummy variables based on outliers in the residuals.
Bugfixes:
* A fix for the evaluation of the formula in alm().
* A correct name for the used data in the occurrence.
* A bugfix in rmcb(), due to which the wrong variance was calculated for the coefficients of the model.
greybox v0.6.0 (Release date: 2020-05-19)
==============
Changes:
* The first default plot from plot.greybox() now is the "Actuals vs Fitted" instead of "Fitted over time", which is arguably more useful.
* Residuals diagnostics instruments, including hatvalues() and cooks.distance(). These don't work perfectly for non-normal distributions, but at least they give an idea. Note that the values differ from lm() due to the different number of degrees of freedom.
* Residuals over time and cook's distance are now produced in plot.greybox() under numbers 8, 9 and 12.
* actuals() now has an explicit parameter all.
* sigma() now returns the value based on the non-zero data only.
* Make error measures a bit more robust towards what they process.
* Make ACF / PACF plots in plot.greybox() easier to read.
* graphmaker() now accepts matrices in lower and upper.
* parReset in graphmaker() now does not set its own par(), unless a legend is used.
* New method: temporaldummy() that creates matrix of dummy variables for the selected type of frequency of a period. e.g. this can produce dummies for weeks of year.
Bugfixes:
* stepwise() would fail if one of the variables contain a name with "`".
* lmCombine() and lmDynamic() would not work in cases with fat regression and stepwise giving more than 14 variables.
* Fix in alm() with "`" symbol.
* A fix in alm() for formula provided as alm(formula(ourModel), ...).
greybox v0.5.9 (Release date: 2020-03-29)
==============
Changes:
* A parameter lowess is added to the spread() function. It plots lowess lines on scatterplots and connects means on boxplots of the function.
* Inverse Gaussian now implies a pure multiplicative model - we take exponent of the (x'B). This way the mean is always positive.
* Moved to Hessian from pracma package instead of numDeriv. It is faster.
* rmc() is renamed into rmcb() and now focuses on doing regression on ranks. This is equivalent to MCB, but faster.
* determ() is now a method that returns determination for different classes.
* Removed dchisq from the alm().
* RMSSE in error measures().
* plot.rmcb() now has a "select" parameter.
* alm() now returns class "occurrence" if occurrence was used. This is needed in order to connect alm() with es() / mes() from smooth.
* Introduced a proper is.occurrence() method.
* Log Laplace and Log S distributions in alm.
Bugfixes:
* A fix of a bug in case the data is not provided explicitly.
* Fix for vcov, which used subset without evaluating it.
* A bugfix in rstandard and rstudent for distribution="dalaplace".
* graphmaker() would fail if start was not a vector.
* stepwise() would sometimes fail in case of distribution="dnorm", returning NULL qr.
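
RMSSE, mentioned in the changes above, is commonly defined as the root of the mean squared forecast error divided by the in-sample mean squared first difference. A minimal base R sketch under that assumption follows; the function rmsseSketch and its arguments are hypothetical and are not meant to reproduce the interface of the package.

```r
# Hypothetical helper, assuming the usual scaling by in-sample differences
rmsseSketch <- function(holdout, forecast, insample){
    # Scale: mean squared one-step difference of the in-sample data
    scale <- mean(diff(insample)^2)
    sqrt(mean((holdout - forecast)^2) / scale)
}

# Made-up numbers, used only to show the call
insample <- c(10, 12, 11, 13, 12)
holdout <- c(14, 13)
forecast <- c(12, 12)
rmsseSketch(holdout, forecast, insample)
```

Values below one indicate that the forecast errors are smaller, in the squared sense, than the in-sample one-step changes.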
greybox v0.5.8 (Release date: 2020-02-08)
==============
Changes:
* assoc() function now has a method parameter, which allows forcing the function to use a specific measure of association.
* assoc() can now produce just a vector or a non-square matrix, depending on the value of y.
* New function pcor() - partial correlations.
* Both assoc() and pcor() now give warnings in cases with categorical variables and pearson / spearman / kendall correlations. They will also convert the factors into the numeric values in order to be able to proceed with calculations.
* And a minor optimisation in assoc() - it now fully relies on cor.test(), ignoring cor() function.
* Use numDeriv for the calculation of Hessian for plogis and pnorm.
* Use non-MLE variance in the calculation of vcov in case of mixture distribution models.
* summary() now also calculates the R^2 and R^2-adj, but does not print them out.
* Minor improvements in speed of mcor() and determination().
* alm() now does not reestimate parameters if distribution=c("dnorm","dlnorm").
* Updated vignette of alm() with more reasonable explanation of the occurrence variable.
Bugfixes:
* alm(): AR model with provided parameters would not work correctly.
* alm(): fixed the entropy calculation for the dinvgauss.
* plot.greybox() did not work in case of the infinite bounds.
* A bugfix for lmCombine() + alm(), which caused alm() to drop some of the variables, failing the lmCombine(). Now if fast=TRUE, alm() will not do that.
* stepwise() did not calculate the number of parameters correctly in case of distribution="dnorm" and factor variables.
* Fix for alm() when data is not provided, but is in the environment.
greybox v0.5.7 (Release date: 2019-12-10)
==============
Changes:
* plot.greybox() now also produces LOWESS lines on the scatterplots. There is a parameter that regulates this.
* A bit of tuning of plot.greybox() graphs.
* Inverse Gaussian distribution for alm() is now available and works okay. Note that this is a non-conventional model. Check vignette for alm() function.
* Legend for tableplot() is now by default FALSE.
Bugfixes:
* alm() would be stuck in some cases, when determination returns NaNs.
* Fix of a bug in alm() for the new version of R, where class(matrix) is now c("matrix","array"), and not just "matrix".
greybox v0.5.6 (Release date: 2019-10-29)
==============
Changes:
* plot.greybox() now produces 4 plots instead of one: Fitted over time, Standardised residuals vs Fitted, Absolute Residuals vs Fitted and Q-Q plot with the specified distribution. Should be useful for model diagnostics.
* In fact, plot.greybox() can now also produce Residuals Squared vs Fitted, ACF and PACF. Choose your value!
* An update in the alm() vignette, explaining the Box-Cox Normal distribution.
* Updates in lmCombine and lmDynamic - now we produce the correct covariance matrix of the parameters from the functions.
* rstudent() is now also available, together with the plots. See plot.greybox() for more detail.
* Tuned the vcov calculation for alm(). Hopefully, it will result in more accurate covariance matrix of parameters.
* alm() now estimates df for the distribution="dt" if it is not provided.
Bugfixes:
* Fixed issues with lmCombine and lmDynamic, when no variables are selected.
* A bugfix in the interval construction for the Student's distribution.
* Don't round the importance in lmCombine!
greybox v0.5.5 (Release date: 2019-09-19)
==============
Bugfixes:
* Removed c.ts(), as it caused recursive behaviour.
greybox v0.5.4 (Release date: 2019-09-16)
==============
Changes:
* The function measures() now allows selecting between Naive and Arithmetic Mean benchmarks for relative measures.
* A small optimisation in alm(), removing the unnecessary data from the memory.
* New method - c.ts() - enhancing the work of ts() class of stats package. Combines several ts objects in one, stacking them one after another.
* lmDynamic() now reports the weights for the models.
Bugfixes:
* Fix for spread() and names of variables with spaces and special characters.
* Similar fix for stepwise().
* Fixes for the names with spaces and special characters in lmCombine() and lmDynamic().
* Fixed a bug for alm(), which sometimes produced "y" instead of the proper name of the response variable.
* A bugfix in ro() with co=FALSE - it produced the forecasts for the wrong samples. Thanks to Zhan Peng for reporting the bug! :)
greybox v0.5.3 (Release date: 2019-07-31)
==============
Changes:
* The summary is now more consistent with the ones from smooth package v2.5.2.
* Corrected typos in the outputs.
* Renamed RelMAE into rMAE and RelRMSE into rRMSE for consistency purposes. Similar renaming of RelAME into rAME and RelMIS into rMIS.
Bugfixes:
* lmCombine and lmDynamic did not work if there was only one variable to combine.
* A fix for lmCombine and lmDynamic in case of "dalaplace".
* A fix for the selected variable in occurrenceModel with data.table object.
* predict.greybox() did not work correctly in case of no newdata provided.
* determination, association and mcor did not work with "data.table" and "tbl" classes well. Now they just change the classes of these objects to data.frame as a temporary solution.
greybox v0.5.2 (Release date: 2019-06-15)
==============
Changes:
* Added references to the vignettes in the documentation.
* Functions for Box-Cox Normal distribution (based on the original 1964 paper): dbcnorm(), pbcnorm(), rbcnorm() and qbcnorm().
* alm() now supports Box-Cox Normal distribution (thus box-cox transform of the data).
* Tuned the initial parameters in alm() for recursive models.
* Tuning in the code for the recursive model. It should work approximately 2 times faster now...
* Cosmetic changes in rmc(). In case of "mcb", the methods in the same group as the selected one, have the same darker colour of the mean.
* Rolling back the "quiet" parameter to "silent".
* Switch off entropy of NegBin and Poisson in alm() for now... Otherwise it becomes too inflated and unreasonably difficult.
Bugfixes:
* Fixed a bug with the estimation of models, where some of the variables did not have any variability (i.e. in cases of occurrence model).
* alm() with distribution="plogis" sometimes would not work well with Arima() function.
* A bugfix in plot(forecast), where a list was used in some of the cases for the holdout.
* A bugfix for dropping dummies in case of a multicollinearity in reused regression.
greybox v0.5.1 (Release date: 2019-04-27)
==============
Changes:
* The parameter "silent" is now renamed into "quiet" in ro(), xregExpander(), xregMultiplier() and xregTransformer() functions. It only specifies the output in console now.
* Similarly, "bruteForce" is now "bruteforce" in lmCombine(), lmDynamic() and determ(),
* "B" is now "parameters" in alm(),
* "checks" is now "fast" (note, the opposite meaning),
* "style" is now "outplot" in rmc().
* All the functions now return "fitted" instead of "fitted.values". This is not a big deal, the fitted() method will work as always.
* forecast() function now accepts h - the forecast horizon. If h!=nrow(newdata), then the function will either cut off values, or produce forecasts for the explanatory variables.
Bugfixes:
* alm() would not do checks properly when only one explanatory variable was used.
* lmCombine() and lmDynamic() would not work because they would refer to non-existent "ourModel". Thanks leungi for the bug report!
* lmCombine() and lmDynamic() used to warn about the computational time after doing things, not before that.
* predict.alm() would not work well if only one-step-ahead prediction was needed.
greybox v0.5.0 (Release date: 2019-04-20)
==============
Changes:
* New function - polyprod(), returning the product of two polynomials.
* alm() now has parameters ar and i, which define the order of respective elements of ARIMA model.
* alm() now checks for stationarity of AR.
* alm() now produces the fitted for the zero observations.
* alm() now accepts parameters for the nloptr.
* In case of occurrence model, the expected entropy is added to the likelihood.
* Tuning of the initials for the recursive model.
* alm() with distribution=c("pnorm","plogis") and ARI(p,d) now produces adequate estimates of probability and forecasts.
* Error measures imported from smooth. Accuracy() function is renamed into measures().
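
The product of two polynomials, as introduced with polyprod() above, amounts to a convolution of their coefficient vectors. A rough base R sketch of the idea is given below. The helper polyMultiply and the ordering of coefficients (increasing powers) are assumptions made for illustration and need not match polyprod(). The example multiplies an AR(1)-style polynomial by a first-difference polynomial, which is only an illustrative choice.

```r
# Hypothetical helper: coefficients are given in increasing powers of x
polyMultiply <- function(a, b){
    result <- numeric(length(a) + length(b) - 1)
    for(i in seq_along(a)){
        for(j in seq_along(b)){
            # a[i]*x^(i-1) times b[j]*x^(j-1) contributes to x^(i+j-2),
            # which is stored in position i+j-1
            result[i+j-1] <- result[i+j-1] + a[i]*b[j]
        }
    }
    result
}

# (1 - 0.5x) * (1 - x) = 1 - 1.5x + 0.5x^2
polyMultiply(c(1, -0.5), c(1, -1))
```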
greybox v0.4.2 (Release date: 2019-03-10)
==============
Changes:
* nParam, nobs, sigma and AICc, BICc methods for the "varest" class of the functions for "vars" package. Should allow selection using the corrected AIC and BIC.
* graphmaker() no longer needs the forecast to start at the right place. If start(actuals)=start(forecast), it will place forecast at the end.
* graphmaker() now also allows not to reset par, so that you can add arbitrary elements to the graph.
* rmc() now returns the exponentiated values of means and intervals in case of distribution="dlnorm".
* nParam() method has been renamed into nparam().
* A new method - actuals(), which returns actuals from the model (similar to "getResponse" of forecast package).
* nobs() and actuals() now also have a hidden parameter all, which determines, whether to return all the values or only demand sizes in case of occurrence model. The default behaviour for nobs is FALSE, and for actuals is TRUE. This is driven by the functions relying on these methods.
* xregExpander() now allows specifying how to fill in the gaps for the lagged variables.
Bugfixes:
* Fixed a bug in stepwise(), due to which the wrong fitted values were generated.
* Annoying bug with occurrence!="none" models and factors.
* lmCombine and lmDynamic did not work correctly with NAs.
* alm produced errors for simple regressions and occurrence.
* stepwise would produce errors, when there was no variability in the data.
greybox v0.4.1 (Release date: 2019-01-27)
==============
Changes:
* The extended vignette on marketing analytics tools (tableplot, spread, cramer, mcor, assoc and determ).
* determination() can now be smart and use stepwise(). This might be especially useful for cases of fat regressions diagnostics.
* stepwise() now also works with factors. And faster than the previous version, when data is numeric.
* use parameter in mcor(), cramer() and assoc(). By default NAs are removed.
* lmDynamic and lmCombine now work with factors and with all the alm distributions.
* lmDynamic and lmCombine now also have the parameter parallel, which defines whether to make calculations in parallel or not.
* alm() now removes NaNs if they are present in the data.
Bugfixes:
* Fixes in mcor and assoc, which did not work correctly in some cases of factors provided as x or y.
* xregExpander() did not work appropriately when extrapolate=FALSE.
* Fixed an annoying bug in predict.greybox(), when the newdata did not contain the response variable.
* stepwise() sometimes produced data with wrong colnames. Now it doesn't.
greybox v0.4.0 (Release date: 2019-01-04)
==============
Changes:
* Added Burnham & Anderson in the library.bib in the vignettes.
* summary() now prints the name of the response variable.
* lmCombine now returns logLik adequate to the selected distribution.
* Sample size, estimated parameters and degrees of freedom are now also returned in the summary.
* Use Choleski decomposition in vcov.alm function instead of solve.
* xregMultiplier() function, allowing producing cross-products of variables.
* Two new cool functions: tableplot() - produces plots for the two categorical variables, showing graphically, where the most frequent values happen; spread() - plots a matrix of scatterplots / boxplots / tableplots, depending on the type of the provided variable.
* Added a clarification about the most efficient use of RMC (together with RelMAE / RelMSE).
* alm() now works with factors.
* spread() now allows doing log transforms of numerical data.
* New function: cramer() - that calculates Cramer's V and the according statistics. Good for measuring the association between the categorical variables.
* New function: mcor() - multiple correlation between the numerical and categorical variables.
* New function: association() aka assoc() - returns the matrix of measures of association (the values depend on the types of variables under consideration).
* xregExpander() now has 'extrapolate' parameter which allows deciding whether the missing values need to be extrapolated or not.
* graphmaker() now does not plot forecast if it is NA. In addition, the legend is now slightly more flexible.
* summary() now prints the df for dchisq and size for the dnbinom.
* tableplot() now also accepts dataframes, plotting the first two columns.
* rmc() now should work much faster in cases of distribution=c("dnorm","dlnorm").
* plot.rmc() now only resets par(), when style="lines". In the "mcb" style, it won't change par, so that the user can add any elements they want.
* alm() now estimates sigma parameter for dfnorm directly using likelihood.
Bugfixes:
* lmCombine() and lmDynamic() did not work well when the data.frame was provided as data.
* pointLik() did not work with lmCombine() because scale parameter was not available. Similarly, it did not work with stepwise() in case of normal distribution.
* predict.alm() was misbehaving in case of non-null occurrence.
* determination() now works with factors.
* Additional explanations for RMC.
* Bugfix in alm() for cases of occurrence and the provided factors.
* vcov.alm would not work when the occurrence model had a different set of variables than the sizes one.
* nParam() did not take the number of parameters in the occurrence part into account.
* plot.predict.alm() now works fine in case of newdata=NULL.
* Fixes in alm() for dalaplace, dnbinom and dchisq distributions and the usage of factors.
* predict.alm() would not work for the models with intercept only.
* In cases of distribution="plogis" or "pnorm", alm() sometimes could not produce errors correctly (due to exp(huge number)). Now it does.
greybox v0.3.3 (Release date: 2018-11-27)
==============
Changes:
* Student t distribution in alm.
* Beta distribution in alm. When will I stop? I guess, I'll do that when I stop procrastinating...
* New functions for three parameter log normal distribution.
* New function for the non-linear transformation of the provided variables - xregTransformer. Use with care!
* Renamed parameter "b" into "scale" in laplace, alaplace and s functions.
* lmCombine now returns a matrix with the selected variables and the respective information criteria.
Bugfixes:
* Corrected a typo of "plogos" in alm.
* is-functions for greybox now rely on "inherits" function.
* Some bugfixes in alm() with dchisq. But there's a lot of confusion there, including stuff in predict.alm.
* Stepwise was not calculating the number of degrees of freedom correctly in case of distribution="dnorm".
* Stepwise did not call for alm in case of distribution!="dnorm".
* Bugfix in rs, rlaplace and other r-functions, where the duplicates of the provided parameters were removed. This caused problems in cases of huge samples, when identical random numbers could have appeared.
greybox v0.3.2 (Release date: 2018-10-25)
==============
Changes:
* Updates in the vignette of alm.
* Although the square link is tricky in case of Chi Squared distribution, it is the correct thing to do. The alm() function now checks if the generated mu_t is positive, and if not, it returns a big number, forcing the solver to stick with the positive solution.
* If inverting Hessian fails, return very big values (meaning high uncertainty).
* predict.alm() now saves the original level of probability.
* New initialisation for plogis, pnorm, dpois and dnbinom in alm().
* dpois, plogis, pnorm and dnbinom now use maxeval=500. All the others have 100. This should improve the estimates of parameters in difficult cases.
* We now return only the data that was used in the model construction in alm().
* lmCombine and lmDynamic now should work with the distributions of alm(). The only two that are not 100% correct are dchisq and dfnorm - the fitted values of those are incorrect.
* removed getResponse.alm. Now getResponse.greybox does what is necessary.
* Residuals of dnbinom and dpois are now calculated as y - fitted.
* predict with distribution="dnbinom" in cases, when scale is not available, is now calculated based on the definition of scale via variance and mean.
* pointLik.ets() is now calculated differently, so that sum(pointLik) is close to the logLik produced by ets() function. The problem with logLik of ets() is that it is not calculated correctly, chopping off some parts of normal distribution. Total disaster!
* New set of distribution functions - for Asymmetric Laplace Distribution (ALD).
* alm() now estimates models with Asymmetric Laplace Distribution with predefined alpha parameter. This is equivalent to the quantile regression with alpha quantile, but is done from the likelihood point of view. It also allows estimating alpha in sample.
* The correct prediction and confidence intervals for the alm() with ALD.
* predict function now also works, when newdata is not provided (although why would you want to do that?).
Bugfixes:
* predict.alm() sometimes produced NAs in the lower bound.
* When having varying probability, plot.predict sometimes struggled to use the correct value.
* plot.predict.greybox() now passes values from ellipsis to graphmaker.
* The intervals for dnorm are now corrected for the cases of occurrence model.
* plot.predict works differently when there are Inf values in the bounds.
* predict() did not work correctly for simple linear regression.
* alm() returned a vector in data for cases of the model with intercept only.
* predict.alm() with distribution=c("dlaplace","ds","dfnorm") did not work in some cases of fixed level of probability.
* predict.alm() now writes lower and upper values in the existing elements of the list instead of creating the new ones.
* predict.alm() did not produce prediction intervals for "dnorm".
* plot.greybox() now checks whether there is a need to transform the data to the binary variable or not.
greybox v0.3.1 (Release date: 2018-09-07)
==============
Changes:
* Corrected some typos in README.md and added description of several functions.
* predict() and forecast() functions now produce confidence and prediction intervals for the provided holdout sample data. forecast() is just a wrapper around predict().
* Normal and log-normal distributions are now available in alm().
* rmc() now uses alm().
* stepwise(), lmCombine() and lmDynamic() can now also be constructed with distributions from alm(). They use lm() in case of "dnorm" and alm() otherwise.
* alm() now does not return vcov if you didn't ask for it (should increase speed of computation for large datasets).
* alm() can be constructed with the provided vector of parameters (needed for vcov method).
* We now use well-known analytical solutions for the cases of distribution="dnorm" of alm() and other functions.
* Code of lmCombine and lmDynamic is slightly simplified.
* We now use Choleski decomposition for the calculation of the inverse of matrices in alm.
* distribution="dlogis" is now available for alm().
* alm() now also supports logit and probit models, which are called using distribution="plogis" and distribution="pnorm" respectively (reference to the names of respective CDFs in R).
* alm() now has occurrence parameter, which allows dealing with zeroes in the data. In this case, a mixture distribution can be used.
* alm() with dlnorm now also returns analytical covariance matrix instead of hessian based one.
* stepwise(), lmCombine() and lmDynamic() now rely on .lm.fit() function, when distribution="dnorm", so the speed of calculation should be substantially higher.
* New functions for class checks: is.greybox(), is.alm(), is.greyboxC(), is.greyboxD(), is.rmc() and is.rollingOrigin().
* stepwise() now calculates only the necessary correlations. This allows further increasing the speed of computation.
* alm() uses its own mean function, so this should also increase its speed.
* Correct prediction intervals for the model with the occurrence part and a new parameter in prediction function - side - which allows producing one-sided PIs.
* stepwise() should now work better with big data.
* Further optimisation of stepwise in order to decrease the used memory.
* alm() and all the other functions now return "data" instead of "model" and don't produce terms and qr. This should save some space.
* vcov.alm() now uses call in order to reestimate the model.
* rmc() now returns groups of methods. This can be used for analytical purposes.
* alm() now uses more refined parameters for vcov calculation for "dchisq" and returns a slightly different call with vcov.
* pointLik.alm() method for alm class.
* alm() now extracts meaningful residuals depending on the distribution used. e.g. dnorm -> y - mu, dlnorm -> log(y) - mu
* stepwise() now allows defining occurrence model. So now you can do something like: stepwise(ourData, distribution="dlnorm", occurrence=stepwise(ourData, distribution="plogis"))
* predict function now returns probabilities for the lower and upper intervals. So if you had side="upper", then the lower will be "0", and the upper will be the specified level.
* dpois and dnbinom distributions in alm. alm() allows producing prediction intervals for both of them. But covariance matrix of parameters for dnbinom might be hard to calculate...
* The dispersion parameter of dnbinom in alm() is now estimated separately, which now solves a lot of problems.
* Renamed parameter A into B for alm(). Very serious thing!
* distribution="dchisq" in alm() now estimates the non-central Chi Squared distribution. The returned scale corresponds to the estimated number of degrees of freedom, while mu is the exponent of the expectation.
* rmc() now colours the lines depending on the number of groups. If there's only one, then there's one group and the differences are not significant.
* Started a new vignette for the alm() function.
* graphmaker() is now moved from smooth to greybox.
Bugfixes:
* Fixed a bug with the style="line" in rmc(), where the grouping would be wrong in cases, when one method significantly differs from the others.
* logLik previously was not calculated correctly for the mixture models.
* Bugfix in hessian calculation, when Choleski decomposition works...
* Bugfix in pointLik for the models with occurrence.
* predict() function failed with newdata with one observation.
* Initials of both Poisson and NegBin in case of non-zero data are now taken with logs. This leads to more robust starting points.
greybox v0.3.0 (Release date: 2018-08-05)
==============
Changes:
* New cool function - lmDynamic() - that constructs a dynamic linear regression based on point ICs.
* New set of functions for Folded normal distribution.
* New function - alm - Advanced Linear Model.
* Folded normal distribution for rmc() with value="a".
* Proper model for chi-squared distribution in alm and rmc.
* Renamed distributions in the alm function.
Bugfixes:
* determination() function did not work in cases of 2 variables.
* vcov() and confint() were misbehaving when nVars==1.
greybox v0.2.3 (Release date: 2018-08-02)
==============
Changes:
* determination() now automatically drops variables with no variability.
* New function - nemenyi() - imported from TStools with minor bugfixes and corrections.
* It appears that Nikos is against the move of nemenyi() function from TStools to greybox. This was a misunderstanding between the two of us. So no nemenyi() function here, nothing to see here, move along!
* New function for multiple comparison of methods based on regression analysis - rmc(). This is a parametric analogue of the Nemenyi test. The function works with errors, their absolute and squared values and relies on lm / glm.
* New methods imported from smooth: errorType, pointLik and pAIC.
Bugfixes:
* Plots of ro() were misaligned in case of co=FALSE.
* ro() now also returns the correct actual values (previously they could be cut off when ci=FALSE).
greybox v0.2.2 (Release date: 2018-05-25)
==============
Changes:
* New description of the package and badges in README.md
* New function - determination() - returns R-squares for the provided data. This can be useful when you need to analyse the multicollinearity effect.
* nParam method for logLik class.
* BICc - new method for the classes, implementing, guess what?
* Updated description of the package in the help file.
* ro() now returns a class and has print and plot methods associated with it.
* ro() is much more flexible now, returning whatever you want in an adequate format.
* New methods for the greybox functions: confint, vcov.
* Renamed "combiner" into "lmCombine", because it makes more sense. We will use "combine" name for a more general function that would combine forecasts from arbitrary provided models (e.g. smooth, forecast and lm classes).
Bugfixes:
* sigma() method returned the wrong standard error in cases of combined models.
greybox v0.2.1 (Release date: 2018-05-01)
==============
Changes:
* New description of the package and badges in README.md
* print.summary now specifies digits. Summary does not round up anything. This corresponds to the normal behaviour of these methods.
* Implemented Laplace distribution, which is useful when models are estimated using MAE.
* Sped up qs() and qlaplace() functions using the inverse cumulative functions.
* New function - ro() - Rolling origin.
Bugfixes:
* qs() returned weird values when several 0 and 1 were specified as probabilities.
greybox v0.2.0 (Release date: 2018-03-10)
==============
Changes:
* combiner now uses a more clever mechanism in case of bruteForce==FALSE.
* combiner now also checks if the provided data has ncol>nrow and sets bruteForce if it has.
* Use Kendall Tau as default in cor() for stepwise.
* Don't use Kendall Tau as default everywhere - only for fat regressions.
* New summary and print methods for models from stepwise. No statistical tests printed, only confidence intervals and ICs.
* AICc for smooth functions in case of iSS models should take only the demand sizes into account, not all the parameters.
greybox v0.1.1 (Release date: 2018-03-05)
==============
Changes:
* We now do not depend on smooth. We suggest it. It's smooth that should depend on greybox!
* New function imported from smooth - AICc.
* New functions for the S distribution (the maximisation of likelihood of which corresponds to the minimum of HAM): ds, ps, qs, rs.
* stepwise now returns the object of two classes: greybox and lm.
* combiner now returns three classes: greybox, lm and greyboxC.
* nParam is moved to greybox from smooth.
Bugfixes:
* If smooth is not installed, plot forecasts using simpler function.
* The forecasts are now produced for the combined models in cases of fat regressions.
greybox v0.1.0 (Release date: 2018-03-03)
==============
* Initial release. stepwise() and xregExpander() are imported here from smooth package.
* combiner() function that combines lm() models. This thing is in development right now.
* combiner() has a meaningful summary() now. Working to make it more accessible to lm functions.
* summary() for combiner now returns the list of values.
* stepwise() should now perform slightly better.
* combiner() can now be smart and use stepwise for the models pool creation.
* combined lm model can now be used together with predict() and forecast() functions.
* plot() and forecast() methods for the combined functions.