Binning allows the grouping of any outlier with its neighbors in order to minimize its impact.
For continuous predictors, an observation with an extremely high value will be treated as all other observations contained in the highest bin. Likewise, an observation with an extremely low value will be grouped into the lowest bin, and will be less likely to artificially influence model results.
Binning extracts predictive signal from missing and/or special values.
Missing and/or special values can be placed in their own unique bin, and can be treated as any other level. There is often predictive signal associated with these two categories, indicating either a positive or negative trait in relation to the target. Binning helps incorporate this signal into the final model, rather than losing this information by assuming a neutral relationship.
For more details, watch the recording of my webinar: FICO Webinar: Why Use Binned Variables in Predictive Models?