shapviz 0.9.3

sv_dependence(): Control over automatic color feature selection

How is the color feature selected, anyway?

If no SHAP interaction values are available, by default, the color feature v' is selected by the heuristic potential_interaction(), which works as follows:

  1. If the feature v (the one on the x-axis) is numeric, it is binned into nbins bins.
  2. Per bin, the SHAP values of v are regressed onto v' and the R-squared is calculated. Rows with missing v' are discarded.
  3. The R-squared values are averaged over bins, weighted by the number of non-missing v' values.

This measures how much variability in the SHAP values of v is explained by v', after accounting for v.
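The three steps above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the function name interaction_strength() is hypothetical, the quantile-based binning is an assumption (the text only says that v is binned into nbins bins), and so is the rule for skipping degenerate bins.

```r
# Hypothetical helper (not part of shapviz): heuristic interaction strength
# of a candidate color feature `v_prime` with respect to feature `v`.
interaction_strength <- function(shap_v, v, v_prime, nbins = 5) {
  # Step 1: bin v if it is numeric (assumption: quantile-based bins);
  # otherwise, use its distinct values as bins.
  if (is.numeric(v)) {
    probs <- seq(0, 1, length.out = nbins + 1)
    breaks <- unique(quantile(v, probs = probs, na.rm = TRUE))
    if (length(breaks) < 2) {
      bin <- factor(rep(1, length(v)))  # constant v: one single bin
    } else {
      bin <- cut(v, breaks = breaks, include.lowest = TRUE)
    }
  } else {
    bin <- factor(v)
  }

  # Step 2: per bin, regress the SHAP values of v onto v' and keep the R-squared.
  r2 <- numeric(0)
  w <- numeric(0)
  for (idx in split(seq_along(v), bin)) {
    ok <- idx[!is.na(v_prime[idx])]  # discard rows with missing v'
    # Assumption: skip bins that are too small or have constant SHAP values
    if (length(ok) < 3 || var(shap_v[ok]) == 0) next
    fit <- lm(shap_v[ok] ~ v_prime[ok])
    r2 <- c(r2, summary(fit)$r.squared)
    w <- c(w, length(ok))
  }
  if (length(w) == 0) return(NA_real_)

  # Step 3: average the R-squared values, weighted by the number of
  # non-missing v' values per bin.
  weighted.mean(r2, w)
}

# Simulated example: the SHAP values of v depend on z, but not on `noise`
set.seed(1)
n <- 1000
v <- runif(n)
z <- runif(n)
noise <- runif(n)
shap_v <- v * (z - 0.5) + rnorm(n, sd = 0.01)

s_z <- interaction_strength(shap_v, v, z)
s_noise <- interaction_strength(shap_v, v, noise)
```

In this simulation, z should receive a much higher value than noise, so z would be picked as the color feature.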

We have introduced four parameters to control the heuristic. Their defaults are in line with the old behaviour.

If SHAP interaction values are available, these parameters have no effect. In sv_dependence(), they carry the prefix ih_ (ih_nbin etc.).

This partly implements the ideas of Roel Verbelen in #119. Thanks a lot for your patient explanations!

Further plans?

We will continue to experiment with the defaults, which might change in the future. A good alternative to the current (naive) defaults could be:

Other user-visible changes

Small changes

Bug fixes

Milestone: Working with multiple ‘shapviz’ objects

Sometimes, you will find it necessary to work with several “shapviz” objects at the same time:

To simplify the workflow, {shapviz} introduces the “mshapviz” object (“m” like “multi”). You can create it in different ways:

The sv_*() functions use the {patchwork} package to glue the individual plots together.

See the new vignette for more info and specific examples.

Major improvement: SHAP interaction values

The following dependencies have been removed:

Less picky interface

The calculations behind sv_importance() are unchanged, but defaults and some plot aspects have been reworked.

This is the initial CRAN release.