Binning provides a unifying framework for categorical and continuous predictors, as well as binary and continuous targets.
The binning process supports both continuous and categorical predictors. Continuous predictors can be run through an auto-binning algorithm that returns bin breaks optimized for a specific target. Each unique value of a categorical predictor can remain in its own bin, or values can be combined into coarser bins. In either case, the preliminary steps of model development remain consistent across predictors and across projects.
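To make the two cases concrete, here is a minimal Python sketch. It is not the target-optimized auto-binning algorithm described above: it swaps in plain equal-frequency (quantile) breaks for the continuous predictor and a hand-written lookup table for the categorical one. The function names, the `ages` sample, and the `grouping` table are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from collections import Counter


def quantile_breaks(values, n_bins):
    """Return interior cut points that split the values into roughly equal-sized bins.

    Assumption: equal-frequency breaks stand in for a target-optimized algorithm.
    """
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for k in range(1, n_bins):
        cut = ordered[(k * n) // n_bins]
        if not breaks or cut > breaks[-1]:  # skip duplicate cut points
            breaks.append(cut)
    return breaks


def assign_bin(value, breaks):
    """Bin index for a continuous value; bin i covers [breaks[i-1], breaks[i])."""
    return bisect_right(breaks, value)


def coarse_category(value, grouping, default="OTHER"):
    """Map a raw categorical value to its coarse bin; unseen values go to `default`."""
    return grouping.get(value, default)


# Continuous predictor (hypothetical sample data).
ages = [22, 25, 31, 38, 44, 47, 52, 58, 63, 70, 29, 35]
breaks = quantile_breaks(ages, 3)
age_bins = [assign_bin(a, breaks) for a in ages]
bin_counts = Counter(age_bins)

# Categorical predictor (hypothetical grouping of raw values into coarser bins).
grouping = {"RENT": "NON_OWNER", "LODGER": "NON_OWNER", "OWN": "OWNER", "MORTGAGE": "OWNER"}
residence_bins = [coarse_category(v, grouping) for v in ["RENT", "OWN", "LODGER", "UNKNOWN"]]
```

Once both predictors are expressed as bin labels, every later step (counting observations per bin, computing bin-level measures) can treat them identically, which is the consistency the paragraph above describes.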
Binning provides generalization in terms of both predictors and targets:
Note that, for continuous targets, bin-level predictive assessment is based on Normalized Mean, and variable-level assessment is based on R².
Weight of Evidence derives its numeric value from the distribution of observations within each principal set (for a binary target, the two outcome groups, such as goods and bads) rather than from the overall population odds. As shown below, even when population odds are multiplied by a factor of 10, the relationship between predictor and binary target remains unchanged.
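The invariance can be checked numerically with a short Python sketch. It assumes the common definition of Weight of Evidence for a bin, the natural log of the bin's share of goods divided by its share of bads; the sign convention, the function name, and the counts are assumptions made for illustration, and every bin is assumed to contain at least one good and one bad.

```python
import math


def weight_of_evidence(goods, bads):
    """Per-bin WoE = ln((goods_i / total_goods) / (bads_i / total_bads)).

    Assumes every bin has a nonzero count of goods and of bads.
    """
    total_goods = sum(goods)
    total_bads = sum(bads)
    return [
        math.log((g / total_goods) / (b / total_bads))
        for g, b in zip(goods, bads)
    ]


# Hypothetical counts for a three-bin predictor.
goods = [100, 300, 600]
bads = [50, 30, 20]

odds_base = sum(goods) / sum(bads)                # population odds of 10 to 1
woe_base = weight_of_evidence(goods, bads)

# Multiply the number of goods in every bin by 10: population odds become 100 to 1.
goods_x10 = [g * 10 for g in goods]
odds_scaled = sum(goods_x10) / sum(bads)
woe_scaled = weight_of_evidence(goods_x10, bads)

# The WoE of each bin is unchanged, because only within-set distributions enter the formula.
unchanged = all(math.isclose(a, b) for a, b in zip(woe_base, woe_scaled))
```

Because each count is divided by the total of its own principal set, the factor of 10 cancels in every bin.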
Weight of Evidence provides for normalization:
Information Value provides for normalization:
This normalization provides a consistent basis for making comparisons. The variable-level Information Value can be used to compare the predictive strength of variables within a project, and also across projects. Predictors with higher Information Values have greater predictive strength than those with lower values:
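As an illustration of such a comparison, the following Python sketch ranks two hypothetical predictors by Information Value. It assumes the usual definition, the sum over bins of (share of goods minus share of bads) times the bin's Weight of Evidence; the predictor names and counts are invented for the example.

```python
import math


def information_value(goods, bads):
    """IV = sum over bins of (pct_goods - pct_bads) * ln(pct_goods / pct_bads).

    Assumes every bin has a nonzero count of goods and of bads.
    """
    total_goods = sum(goods)
    total_bads = sum(bads)
    iv = 0.0
    for g, b in zip(goods, bads):
        pct_goods = g / total_goods
        pct_bads = b / total_bads
        iv += (pct_goods - pct_bads) * math.log(pct_goods / pct_bads)
    return iv


# Hypothetical binned counts (goods, bads) for two predictors on the same population.
predictors = {
    "utilization": ([100, 300, 600], [50, 30, 20]),
    "account_age": ([330, 330, 340], [35, 33, 32]),
}

iv_by_predictor = {name: information_value(g, b) for name, (g, b) in predictors.items()}
ranked = sorted(iv_by_predictor, key=iv_by_predictor.get, reverse=True)

# Scaling one principal set (here, ten times as many goods) leaves IV unchanged,
# which is why values from projects with different population odds stay comparable.
goods, bads = predictors["utilization"]
iv_scaled = information_value([g * 10 for g in goods], bads)
```

In this made-up data, `utilization` separates goods from bads far more sharply than `account_age`, so it ranks first.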
Similarly, for continuous targets, Normalized Mean and R² are both invariant to the population mean. Predictors with higher R² values have greater predictive strength than those with lower values, which gives projects based on a continuous target a consistent basis for comparison as well.
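A small Python sketch can illustrate the R² half of this claim. It assumes that the variable-level R² of a binned predictor is the share of target variance explained by the bin means, which may differ in detail from the measure used in any particular tool. Normalized Mean is not computed here because its formula is not given in this post. The function name and data are hypothetical.

```python
def binned_r_squared(bin_labels, targets):
    """R² when each observation is predicted by the mean target of its bin.

    Assumption: R² = 1 - (within-bin sum of squares) / (total sum of squares).
    """
    overall_mean = sum(targets) / len(targets)
    sums, counts = {}, {}
    for label, y in zip(bin_labels, targets):
        sums[label] = sums.get(label, 0.0) + y
        counts[label] = counts.get(label, 0) + 1
    bin_means = {label: sums[label] / counts[label] for label in sums}
    ss_total = sum((y - overall_mean) ** 2 for y in targets)
    ss_within = sum(
        (y - bin_means[label]) ** 2 for label, y in zip(bin_labels, targets)
    )
    return 1.0 - ss_within / ss_total


# Hypothetical binned predictor and continuous target.
bins = ["low", "low", "low", "mid", "mid", "mid", "high", "high", "high"]
spend = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0, 30.0, 32.0, 34.0]

r2_base = binned_r_squared(bins, spend)

# Shift every target value by a constant: the population mean changes, R² does not.
r2_shifted = binned_r_squared(bins, [y + 1000.0 for y in spend])
```

Both the total and the within-bin sums of squares are measured relative to means, so adding a constant to every target value cancels out.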
For more details, watch the recording of my webinar: FICO Webinar: Why Use Binned Variables in Predictive Models?