
Practical Issues Solved by Binning

Blog Post created by suehubbard Advocate on Feb 21, 2018

Binning helps address operational and regulatory constraints, palatability issues, and, where required, the computation of reason codes.

 

Binning allows for applying constraints across levels:

[Figure: practical issues 1.png]

Pairwise constraints can be applied between any predictor bins, including those containing special or missing values. This lets model coefficients follow the monotonically increasing or decreasing pattern that is expected, or preserve any Weight of Evidence pattern inherent in the data. Individually selected bins can also be constrained to receive a “neutral” coefficient.
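
To illustrate the kind of pattern such constraints are meant to preserve, here is a minimal Python sketch that computes Weight of Evidence per bin and checks whether it is monotonic across ordered bins. The bin labels and counts are made up, and the sign convention (log of the share of “1”s over the share of “0”s) is an assumption; it is not taken from any particular tool.

```python
import math

def weight_of_evidence(bin_counts):
    """Return {label: WoE} for a list of (label, n_ones, n_zeros) tuples.

    Assumed convention: WoE = ln(share of 1s in bin / share of 0s in bin).
    Bins with a zero count would need smoothing; that is not handled here.
    """
    total_ones = sum(ones for _, ones, _ in bin_counts)
    total_zeros = sum(zeros for _, _, zeros in bin_counts)
    return {
        label: math.log((ones / total_ones) / (zeros / total_zeros))
        for label, ones, zeros in bin_counts
    }

def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

# Hypothetical ordered bins for one predictor: (label, count of 1s, count of 0s)
bins = [("low", 20, 180), ("medium", 50, 150), ("high", 130, 70)]
woe = weight_of_evidence(bins)
ordered_woe = [woe[label] for label, _, _ in bins]
print(woe)
print("monotonic:", is_monotonic(ordered_woe))
```

If the check fails for a predictor that is expected to be monotonic, that is a candidate for a pairwise constraint between the offending bins.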

 

The computation of Reason Codes helps explain model predictions at the level of an individual observation. Binned predictors allow a comparison between the maximum contribution each predictor could add to the final score and the actual contribution based on the observation’s predictor value. Predictors with the largest difference can be cited as those most responsible for an observation’s below-average score.
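
The comparison described above can be sketched in a few lines of Python. The scorecard, predictor names, and point values below are hypothetical, and returning the top two predictors is an arbitrary choice for the example.

```python
# Hypothetical scorecard: predictor -> {bin label: points}
scorecard = {
    "utilization": {"low": 40, "medium": 25, "high": 5},
    "delinquencies": {"none": 50, "one": 30, "two_or_more": 0},
    "account_age": {"under_2y": 10, "2_to_10y": 20, "over_10y": 30},
}

def reason_codes(scorecard, observation_bins, top_n=2):
    """Rank predictors by (maximum possible points - points actually received)."""
    shortfalls = []
    for predictor, bin_points in scorecard.items():
        max_points = max(bin_points.values())
        actual_points = bin_points[observation_bins[predictor]]
        shortfalls.append((predictor, max_points - actual_points))
    # Largest shortfall first: these predictors cost the observation the most points
    shortfalls.sort(key=lambda item: item[1], reverse=True)
    return shortfalls[:top_n]

# The bin each predictor falls into for one hypothetical observation
observation = {
    "utilization": "high",
    "delinquencies": "one",
    "account_age": "over_10y",
}
top = reason_codes(scorecard, observation)
print(top)  # [('utilization', 35), ('delinquencies', 20)]
```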

 

Binning makes it easy to explain the model to non-modelers.

Predictor binning is a precursor to building classed models. Once model development is complete, scorecard coefficients can be scaled and presented as integer values. The result provides an at-a-glance understanding of the relationship between each predictor and the target.
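
As a rough sketch of the scaling step, the Python snippet below multiplies each bin coefficient by a constant factor and rounds to an integer. The coefficients are invented, and the factor uses a “points to double the odds” convention (20 points here), which is one common choice and an assumption on my part rather than the only way to scale.

```python
import math

# Hypothetical bin coefficients on the log-odds scale for one predictor
coefficients = {"low": 0.8, "medium": 0.1, "high": -0.9}

# Assumed convention: 20 points doubles the odds, so factor = 20 / ln(2)
POINTS_TO_DOUBLE_ODDS = 20
factor = POINTS_TO_DOUBLE_ODDS / math.log(2)

# Scale and round; the ordering of the bins is preserved
scaled = {label: round(coef * factor) for label, coef in coefficients.items()}
print(scaled)  # {'low': 23, 'medium': 3, 'high': -26}
```

Because every coefficient is multiplied by the same positive factor, the integer weights keep the same ordering as the original coefficients, apart from ties introduced by rounding.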

 

Classed models are easy to interpret and understand:

[Figure: practical issues 2.png]

 

The scaled Weight assigned to each bin preserves the underlying relationship between predictor and target. Above-average values generally suggest a higher propensity to be a “1”, and below-average values generally suggest a higher propensity to be a “0”. The average (or neutral) weight for each predictor is assigned to the bin labeled “otherwise”, which captures observations with missing or unknown information. This makes the model highly transparent and easy to interpret.

 

Not only do weight assignments depict positive versus negative traits when compared to the neutral value, but they also show the relative magnitude of the predictive content that each variable contributes to the model. Predictors with the most extreme weight values and the widest range around the average can have the most influence on the final score.
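
One simple way to read relative influence off a scorecard is to compare the spread of weights within each predictor. The Python sketch below does this for a hypothetical two-predictor scorecard; the names and weights are made up, and using the plain range (maximum minus minimum) is just one reasonable summary.

```python
# Hypothetical scorecard; "otherwise" holds the neutral (average) weight
weights = {
    "utilization": {"low": 55, "medium": 42, "high": 18, "otherwise": 40},
    "account_age": {"under_2y": 36, "2_to_10y": 40, "over_10y": 45, "otherwise": 40},
}

def weight_range(bin_weights):
    """Spread between the highest and lowest weight a predictor can assign."""
    return max(bin_weights.values()) - min(bin_weights.values())

def deviations_from_neutral(bin_weights):
    """Each bin's weight relative to the neutral 'otherwise' weight."""
    neutral = bin_weights["otherwise"]
    return {label: w - neutral for label, w in bin_weights.items() if label != "otherwise"}

# Predictors with the widest range can move the final score the most
ranked = sorted(weights, key=lambda p: weight_range(weights[p]), reverse=True)
for predictor in ranked:
    print(predictor, weight_range(weights[predictor]), deviations_from_neutral(weights[predictor]))
```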

 

Most existing deployment systems support classed models.

Classed models can be coded in a variety of programming languages, which makes them compatible with most deployment systems.
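
To show why classed models port so easily, here is a minimal Python sketch of a scoring function: it is nothing more than a table lookup and a sum. The scorecard values are hypothetical, and falling back to the “otherwise” weight for missing or unrecognized bins is my reading of how the neutral bin is used.

```python
# Hypothetical scorecard; "otherwise" is the neutral bin for missing or unknown values
points_table = {
    "utilization": {"low": 55, "medium": 42, "high": 18, "otherwise": 40},
    "account_age": {"under_2y": 36, "2_to_10y": 40, "over_10y": 45, "otherwise": 40},
}

def score(points_table, observation_bins):
    """Sum the weight of the bin each predictor falls into."""
    total = 0
    for predictor, bin_weights in points_table.items():
        label = observation_bins.get(predictor)
        # Missing or unrecognized bins receive the neutral "otherwise" weight
        total += bin_weights.get(label, bin_weights["otherwise"])
    return total

# account_age is missing for this observation, so it scores the neutral 40
applicant = {"utilization": "high", "account_age": None}
print(score(points_table, applicant))  # 58
```

The same logic translates directly into a lookup table or a chain of if/else statements in whatever language the deployment system uses.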

 

For more details, watch the recording of my webinar: FICO Webinar: Why Use Binned Variables in Predictive Models?
