Variations on a Data Theme

Blog Post created by tgoering@fico.com on Mar 10, 2017

Decision management solutions use data—we all agree on that. But exactly what data, in what form, and at what time, varies dramatically depending on your role in developing the solution.

[Figure: DataSegments.png]

If your role is to develop a predictive model, your job is to explore any data that might yield analytic insight into the problem being modeled. A typical modeling project starts with data engineering processes that extract the modeling dataset(s) from whatever data sources are of interest. This usually requires a whole set of ETL tools and skills just to get the data to where it can be analyzed. (If you don’t know what ETL means, look it up. The point of this post is that all these people who are supposedly working together have no idea what the heck the others are talking about, or how they get their jobs done.) Then of course the data scientists have a whole bag of tricks that nobody outside their close-knit fraternity knows about, which they use to twist and manipulate the data into incomprehensible models that miraculously predict the future.

If instead your role is to develop the business rules that implement your decision policies, you take a much more abstract view of the data. The rules are written about the things in the decision domain, and all you need as a rule writer is to know what those things are and what properties they have. Until you’re ready for testing, you don’t even need any actual data, just the business terms (i.e., the data model) that describe the things. If you want to write a rule such as “if the applicant’s age is under 18, then deny the application,” all you need to know is that you have business terms for the applicant’s age and the status of the application.
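To make that concrete, here is a minimal sketch of the age rule written purely against business terms, with no real data in sight. The names (`Applicant`, `Application`, `deny_underage_applicants`) are made up for illustration and do not come from any particular rules product; a real rules tool would express this in its own rule language rather than in Python.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    """Business term: the person applying (hypothetical name)."""
    age: int


@dataclass
class Application:
    """Business term: the thing being decided (hypothetical name)."""
    applicant: Applicant
    status: str = "pending"


def deny_underage_applicants(application: Application) -> Application:
    """Rule: if the applicant's age is under 18, then deny the application."""
    if application.applicant.age < 18:
        application.status = "denied"
    return application


# Hand-made values are enough to try the rule out; no production data needed.
minor = deny_underage_applicants(Application(Applicant(age=17)))
adult = deny_underage_applicants(Application(Applicant(age=18)))
print(minor.status, adult.status)  # prints: denied pending
```

Notice that the rule writer never has to say where `age` comes from. That question gets handed to someone else.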

This all gets interesting when the data scientist and the business analyst each take their creations down the hall to their hapless IT developer and say “put these into production!”

Production data is far more constrained than the unlimited possibilities of the modeling data. Instead of being applied by a trained data scientist to a carefully prepared dataset, the transformations and models must be applied programmatically to the production data. They usually need to be re-coded, in different languages, by different people. It often takes months. Data that was available for modeling may not be available in production. Time-series data that continuously evolves needs to be carefully persisted and managed so that up-to-date values are always available to the model. Legally sensitive data needs to be redacted. There’s almost no end to the fun (or the potential for introducing errors) a developer can have turning a mess of arcane modeling code into something that will actually run.

Production data also consists of very real transactions instead of abstract business terms. That data must be meticulously mapped to the business terms so that the rules can be applied across all the data streams, message queues, transaction requests, or whatever else needs to be processed.
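As a rough illustration of that mapping step, the sketch below turns one raw transaction into the business terms a rule would expect. Everything specific here is an assumption made for the example: the field names (`APPL_DOB`, `APPL_STAT_CD`, `TXN_DT`), the status codes, and the helper functions are hypothetical, and a real system would have far more fields and far messier ones.

```python
from datetime import date

# Hypothetical raw message, as it might arrive from a queue or request.
raw_transaction = {
    "APPL_DOB": "2000-06-15",
    "APPL_STAT_CD": "P",
    "TXN_DT": "2017-03-10",
}

# Hypothetical code table; real systems define their own.
STATUS_CODES = {"P": "pending", "A": "approved", "D": "denied"}


def age_on(birth: date, as_of: date) -> int:
    """Whole years elapsed between birth and as_of."""
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    return as_of.year - birth.year - (0 if had_birthday else 1)


def to_business_terms(raw: dict) -> dict:
    """Map raw production fields onto the terms the rules are written in."""
    birth = date.fromisoformat(raw["APPL_DOB"])
    as_of = date.fromisoformat(raw["TXN_DT"])
    return {
        "applicant_age": age_on(birth, as_of),
        "application_status": STATUS_CODES[raw["APPL_STAT_CD"]],
    }


terms = to_business_terms(raw_transaction)
print(terms)  # prints: {'applicant_age': 16, 'application_status': 'pending'}
```

Even in this toy case, the business term "age" does not exist in the transaction. It has to be derived from a date of birth and a transaction date, and somebody has to decide exactly how.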

Once the models and rules finally get deployed out into production, there’s one more role that’s interested in the data: the business manager who is on the hook for showing that the solution is actually performing as expected. A whole different set of ETL operations pulls data out of the production data stores, plus any other sources that might have data useful for evaluating the KPIs of interest (again, look it up; I’m not your nanny), and puts it in whatever form is required for whatever business intelligence tool is being used. Yet another set of skills can then be put to work turning that into a set of beautiful reports that the board thinks look great, even if they don’t understand what any of it means.

Ultimately the success of a decision management solution depends on whether all the model and rule development projects actually get deployed into production and achieve the expected benefits, as shown in the reports. All the people filling the various roles need to collaborate to produce those results. Enabling and supporting that collaboration, while allowing the individual roles to focus on their areas of expertise, is a critical aspect of any decision management project.
