
FICO Community Blogs


"Go to your room.”


“Because I said so.”


This classic reasoning, oh so familiar to most of us, just does not cut it when it comes to artificial intelligence. When receiving instructions, advice, or even descriptions, questioning the reasons behind a decision is essential to understanding it. It is not enough to simply predict an outcome or recommend a best action without showing the connection to the data used to reach it.


All decisions are made by analyzing some data and determining an outcome; ‘because I said so’ almost always has an explanation, whether it is shared or not. Knowing the inputs used to reach the final decision is helpful beyond understanding the specific situation. In the context of decisions and analytics, receiving an explanation is valuable for several reasons, including: validation of models, trusting the outcomes from a model, and establishing that a model meets fairness and regulatory criteria.


Being aware of the data used to make a decision is instrumental to delivering consistent outcomes. This data can be used to build predictive models that move beyond prescriptive analytics to actionable insights.


When data and decision models are exposed, the reason the outcome was reached is suddenly illuminated and the process is no longer a black box.


“We are hosting a dinner party and you are a child. We do not want you to interrupt, therefore you must go to your room.”


“We are watching TV and you have not finished your homework. You must finish your homework therefore you must go to your room.”


The more data used to train a model, the more important it is to make the model explainable. While most personal decisions can be justified with a simple recount of a thought process, large-scale, automated decisions are made by first analyzing thousands of data points to create a training model, and then applying insights from that data to drive desirable outcomes. This type of machine learning efficiently provides answers, but can leave the user ignorant of the data, such as event flows or clusters, that drives the final decision.


Explainable AI is useful in various scenarios. Now, I’ll review some examples of explainable AI applications and the corresponding benefits. I look at this from two key perspectives. First, I explore how a data scientist can obtain confidence that a model provides reasonable outcomes. This is called validation. Second, I examine what is necessary for an end user to trust the outcome of a model.




To trust the outcome of a model, a human must understand why an algorithm comes to a conclusion. Machine learning can be applied to anything if there’s a corresponding data set, but not every algorithm should be trusted. Models study data to identify similarities that, when detected together, lead to a consistent conclusion. However, correlation does not always equal causation. The reasons must be validated to avoid computers classifying similar things together for the wrong reasons.


Researchers at Shanghai Jiao Tong University built an automated inference system for criminality that analyzes face images, with the goal of identifying criminals and non-criminals based solely on still pictures. The study used machine learning and claimed that criminals could indeed be identified through facial analysis. While their classification algorithms seemed to work, the researchers could not begin to explain why these people were classified as criminals beyond nose angle or the distance between eyes. The algorithms found similarities to classify criminals, but these data sets alone are not enough to lock someone away. As machine learning progresses, there will be countless algorithms that work for the wrong reasons. People cannot trust just any algorithm without validating the process. To truly trust a model, there needs to be an easy way to extract the reasons behind a decision through efficient validation, which begins with understanding and explaining the model.


Trusting Outcomes

Consider a doctor making a diagnosis. As a trained professional, a doctor gains insight from their experiences. Once they have seen just a couple of kids with chickenpox, they can draw on those experiences to quickly diagnose a similar case. Adding artificial intelligence to the loop expands a single doctor’s experience tenfold. A model can be trained with diagnostic data to assess a single patient’s symptoms and attributes, compare them to historical data, and produce a diagnosis. However, a doctor cannot simply trust an algorithm to make a potentially lifesaving, or life-ending, diagnosis. The machine learning that produced the diagnosis cannot be a black box.


In order to trust the outcome, a doctor must be able to investigate the reasons a diagnosis was made. A human must be able to collaborate with the machine making the decisions. When providing treatment, the results of a machine learning algorithm cannot be taken at face value. An end user must be able to scrutinize the reasons behind a decision, shedding light on the inner workings of the processes that create the artificial intelligence.


Traditionally, these kinds of systems have been limited to use models that can be easily explained. This has, however, limited the types of models that can be used and also the extent of data that can be leveraged.


If the AI is explainable, it is possible for a human to investigate, understand, and ultimately trust a decision even if a sophisticated model such as a neural network is used.
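To make this concrete, here is a minimal sketch in the spirit of local surrogate methods such as LIME (Ribeiro et al., 2016): perturb the inputs around one decision and estimate how sensitive the black box is to each feature near that point. The model and the numbers are hypothetical; real explainers fit a full local surrogate model rather than this covariance shortcut.

```python
import random

def black_box(x):
    # Hypothetical stand-in for an opaque model: a nonlinear decision rule.
    return 1.0 if 0.3 * x[0] + 0.7 * x[1] ** 2 > 1.0 else 0.0

def local_explanation(f, point, n_samples=2000, scale=0.5, seed=7):
    """Estimate per-feature influence near `point` by sampling perturbations.

    Returns one slope per feature: cov(feature, output) / var(feature),
    which approximates a local linear surrogate's coefficients when the
    perturbations are independent. Larger |slope| = more local influence.
    """
    rng = random.Random(seed)
    dims = len(point)
    xs, ys = [], []
    for _ in range(n_samples):
        x = [point[j] + rng.gauss(0, scale) for j in range(dims)]
        xs.append(x)
        ys.append(f(x))
    y_mean = sum(ys) / n_samples
    slopes = []
    for j in range(dims):
        col = [x[j] for x in xs]
        m = sum(col) / n_samples
        cov = sum((c - m) * (y - y_mean) for c, y in zip(col, ys)) / n_samples
        var = sum((c - m) ** 2 for c in col) / n_samples
        slopes.append(cov / var)
    return slopes

slopes = local_explanation(black_box, [1.0, 1.0])
```

Near this point the second feature dominates the decision, and the surrogate slopes surface that fact even though the black box itself only returns a 0/1 answer.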


“Why Should I Trust You?” Explaining the Predictions of Any Classifier. M.T. Ribeiro et al., 2016


Fairness and Regulatory Criteria

Legislation is already being put in place to ensure responsible use of artificial intelligence. The European Union Parliament passed the EU General Data Protection Regulation (GDPR) on April 14, 2016. The GDPR will be enforced beginning on May 25, 2018. This regulation represents an important step that must be observed as computer driven algorithms gain a larger part in making decisions that affect human beings. Article 22 of the GDPR states that people “have the right not to be subject to a decision based solely on automated processing,” meaning that a data controller must investigate the machine’s process before any of its decisions are used to decide a legal matter.


While the intent of the law is clear, the execution of explainable AI is not straightforward nor consistently possible. In order to effectively use machine learning we need to understand it. As the years go by, more and more systems will be making decisions about people with less and less human intervention.


What must be done to responsibly use this powerful technology? Machine learning needs to be made understandable and explainable. Professor Daniel Dennett describes the explainable AI problem well: “If [a machine] can’t do better than us at explaining what it’s doing…then don’t trust it.” The DARPA Explainable AI program outlines a program to address these challenges.


This problem reaches everyone: let’s keep the conversation going. I will continue my discussion of explainable AI in a follow up blog about the scope and approaches to xAI. In the meantime, how are you going to ensure that AI is explainable? What other risks will occur if it is not?

The opioid abuse problem in America is growing out of control. Over 2 million Americans have a substance use disorder involving prescription pain relievers, and 91 Americans die every day from opioid abuse.


While there are many contributing factors to this epidemic, including the highly addictive nature of the drugs themselves, one leading cause is particularly nefarious. We call them ‘Pill Mills’: clinics and medical providers known for readily providing opioids, directly and through prescription. In other words, while the majority of physicians and caregivers fulfill their Hippocratic duty, there are a few bad apples who intentionally facilitate drug-seeking behavior for increased profits.


Figure 1 - Total opioid overdose hospitalization costs and admissions, 2011-12. HOD = heroin overdose; POD = prescription opioid overdose


How does a Pill Mill work? A clinic creates a very low barrier to receive opioids. The doctors, physician assistants (PA), or nurse practitioners (NP), employed at the clinic show a cavalier attitude towards dispensing narcotic medicine, generously believing a patient’s description of pain with minimal evidence. Easy acceptance of patients’ stated allergies to non-narcotic medicine like ibuprofen, naproxen sodium, and acetaminophen puts stronger drugs, like Fentanyl, on the table for even minor pains.


The motivating factor for the clinic is money: as more patients use the clinic, word of mouth spreads knowledge of the provider’s willingness to dispense drugs, so that even more patients seek out this provider. Now the clinic has many more patient visits to bill insurance companies for. Often the prescriptions will be written in very low amounts, requiring the patient to come in for another office visit to receive more.


Pill Mills cost insurance companies an estimated $50-70 billion per year, and detecting them is difficult. Insurance companies don’t want to falsely accuse hardworking legitimate providers. So to proactively tamp down on America’s opioid epidemic, as well as protect their bottom line, insurers are turning to technology and data driven approaches to find the needles in the haystack.


While most organizations’ policies can detect a single provider billing narcotic prescriptions at an unusual rate, collusion is more difficult to identify. This is a weakness that Pill Mills are able to capitalize on; they will leverage shared provider IDs across the clinic to normalize the rate of narcotic claims by any single provider.


For example, a clinic with two doctors, three PAs, and two NPs could all be colluding together as a Pill Mill. One doctor and one PA are dedicated to providing quick and easy access to opioids, while the others see a typical patient workload. As the clinic bills the insurance company, the billing ID of the true opioid provider is replaced by the ID of one of the other providers. This shows a slightly higher than normal opioid prescription load across the board, but nothing high enough to trigger any red flags.
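A toy illustration of why this works, and how clinic-level aggregation counters it. The claim records and thresholds below are invented: each provider at the colluding clinic stays under a per-provider rule, while the clinic as a whole still stands out against a normal peer.

```python
from collections import defaultdict

# Hypothetical claims: (clinic, provider, is_opioid_rx). Clinic A spreads a
# heavy opioid load across all seven provider IDs; Clinic B is a normal clinic.
claims = []
for i in range(7):
    claims += [("clinicA", f"A{i}", True)] * 4 + [("clinicA", f"A{i}", False)] * 6
    claims += [("clinicB", f"B{i}", True)] * 1 + [("clinicB", f"B{i}", False)] * 9

def opioid_rate(claims, key):
    """Fraction of claims that are opioid prescriptions, grouped by `key`."""
    total, opioid = defaultdict(int), defaultdict(int)
    for clinic, provider, is_opioid in claims:
        k = clinic if key == "clinic" else provider
        total[k] += 1
        opioid[k] += is_opioid
    return {k: opioid[k] / total[k] for k in total}

PROVIDER_CUTOFF = 0.5   # made-up per-provider rule: catches only extreme prescribers
CLINIC_CUTOFF = 0.25    # made-up clinic-level rule: catches unusual aggregate rates

provider_flags = [k for k, r in opioid_rate(claims, "provider").items() if r > PROVIDER_CUTOFF]
clinic_flags = [k for k, r in opioid_rate(claims, "clinic").items() if r > CLINIC_CUTOFF]
# No single provider trips the per-provider rule, but the colluding clinic does.
```

Every Clinic A provider bills opioids at a rate of 0.4, safely under the per-provider cutoff, yet the clinic-wide rate of 0.4 versus Clinic B’s 0.1 is exactly the signal the shared-ID trick cannot hide.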


Figure 2 - Two providers spread their opioid claims across all providers at the clinic


World-class analytics like those run through FICO Decision Management Suite (DMS) could prove unmatched in detecting pill mills. How does it work?


FICO Identity Resolution Engine (IRE) reads across the depth and breadth of claims data, provider data, and more to understand and cluster unique providers, including those with multiple IDs, name variations, and operating in multiple facilities. With advanced search, entity resolution, and link analytic capabilities, IRE enables the connection and analysis of entities and entity relationships across disparate internal and external data. The end result is a graphical analysis of the interconnections in data, which can aid investigations as well as analytically detect patterns of cross-provider fraud, waste, and abuse.


As a real world example, one leading insurance consortium partnered with FICO and found dozens of providers involved in multiple different provider fraud schemes. One such provider had 27 different instances of himself with various names, facilities, specialties, and provider IDs operating across 5 different states.


Detecting this huge fraud was achieved using statistical pattern recognition to understand the characteristics of legitimate behavior versus the patterns and outliers that flag abusive practices, based on linkages within a claim. By sifting through thousands upon thousands of claims, these analytics detect the claims, clinics, and personnel that depart from legitimate behavior, and determine the precise indicators of suspicious activity. Because the models constantly analyze new, incoming data, they adapt to shifts in behavior; this facilitates detection of new and evolving fraud types, whether these changes occur abruptly or evolve subtly over time. These models return scores and “reason codes” with which investigators can understand why a claim or provider received a high potential fraud score and launch targeted investigations into the suspicious behavior.
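The idea of returning reason codes alongside a score can be sketched in a few lines: if a score is a sum of feature contributions, the reason codes are simply the largest contributors. The feature names, weights, and claim values here are invented for illustration, not taken from any real model.

```python
def score_claim(features, weights):
    """Toy fraud score: a weighted sum of risk features, returned together
    with 'reason codes' -- the features contributing most to the score."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    score = sum(contributions.values())
    reasons = sorted(contributions, key=contributions.get, reverse=True)
    return score, reasons[:2]

# Hypothetical risk features and weights, invented for illustration.
weights = {"opioid_rate": 5.0, "new_patient_ratio": 2.0, "avg_claim_amount": 0.01}
claim = {"opioid_rate": 0.8, "new_patient_ratio": 0.9, "avg_claim_amount": 120.0}

score, reason_codes = score_claim(claim, weights)
# reason_codes tells an investigator *why* this claim scored high,
# so an investigation can start from the dominant risk factors.
```

Production models derive contributions very differently, but the investigator-facing contract is the same: a score plus the ranked reasons behind it.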


Investigators then use these models to build, review, and test new strategies to combat Pill Mills. Driven by analytic scores and highly targeted rules, investigators can rapidly develop, adapt, and execute rules that address newly discovered emerging schemes. Providers and clinics operating suspiciously are ranked for insurers to review and investigate. As a result, pill mills can be uncovered even as they start to develop, cutting off the supply ring before it can take hold.


This same solution could be used by Prescription Monitoring Programs run by the state. By detecting fraud at the state level, even Pill Mill providers running outside of insurance could be detected. In this case, IRE and DMS would process all the prescriptions given out by both pharmacies and providers to detect collusion.


Pill Mills represent a particularly ugly side of an already cruel public health problem. Combating these complex systems—and defending the people they harm—requires no less than the same advanced detection software that’s used to protect consumers from identity theft. In strong, data-driven solutions, we have an important ally to help overcome this problem, and keep Americans healthy and safe.


By Neil Stickels & Anya Vanecek

FICO has been using analytics and decision management for 60 years, but it’s still a relatively new concept to the industry at-large. FICO develops analytic applications to automate some of the world’s largest businesses including major financial institutions, airlines, and telecommunications companies plus industry leaders in retail, manufacturing, insurance, and energy. Tune in to the podcast to learn how these various industries are taking advantage of advances in analytics, optimization, and decision management technologies.


Whether you have a technical or business role, these podcasts will impact the way you look at the latest trends in analytics today.


A Taste of the Podcasts


Less Artificial, More Intelligent – How can we decode analytics to make them more applicable to everyday business problems? Scott Zoldi, Chief Analytics Officer at FICO, discusses how artificial intelligence has evolved over time to now solve business issues and make smarter decisions.


“Try to understand what analytics means. How do they work?
To derive a level of comfort or even an intuition that makes it less of a black box.”



Live from the Gartner Data and Analytics Show – Benjamin Baer, Sr. Director of Product Marketing at FICO, interviews some of the leading experts in Big Data and Analytics live on the showroom floor in Grapevine, TX, March 6 – 9, 2017.


“We’re having a lot of in depth conversations around what is
visualization versus what is visual analytics.” Tableau



IoT and Real Time Streaming – Conceptually, the Internet of Things has been around for a while. However, with the introduction of streaming technologies, use cases have become extremely popular. How do we relate IoT to analytics? How can combining IoT and Streaming Analytics help drive better business decisions? Shalini Raghavan, Sr. Director of Analytic Product Management at FICO, discusses how technology is gearing up to handle the Internet of Everything.


“Bring together the data & use analytics to evaluate what behavior might be captured
in that data & then make business decisions with those 2 combined.”



Streaming Analytics and Stateful Event Processing

Why is streaming analytics so important? George Vanecek, FICO’s VP and Chief Technical Architect, discusses the advantages of event stream processing of data in motion. Many streaming analytics problems are fundamentally stateful, and must be considered with real-time, historical, and derived contexts in order to provide insights. Like moving from still photography to live video, George explains how high speed event processing is Big Data’s next milestone.


“If you ignore the states and the context it’s similar to showing a single frame from a movie.
Kind of like a picture without showing the previous frame in motion.”



Subscribe and listen to these podcasts on iTunes or Blubrry.


About Decoding Analytics

Companies of all sizes are increasingly forced by economic and competitive pressure to step up their analytics game. This means using data-informed insights to have more robust customer interactions, improve the bottom line, and digitally transform their businesses. From decision modeling to predictive analytics to supply chain optimization to pricing strategy, we explore how 21st Century businesses decode and apply analytics software to significantly impact the way their businesses operate. These quick 10 – 30 minute podcasts feature industry-leading experts who will help educate you about the advanced technology thousands of customers are leveraging today.


Every few weeks we will be rolling out fresh episodes to our subscribers. All you need to do is subscribe to Decoding Analytics on iTunes or visit our dedicated podcasting website. By subscribing to either of the feeds, you will be notified of the latest program as soon as it is published.


For more information on FICO Decoding Analytics please email us at and subscribe to the podcast at either: iTunes or Blubrry.

The Big Data promise continues to be just that, a promise. This is not because the technologies are flawed, but because effort is not focused in the right place.


8 in 10 organizations confessed to Gartner they are not able to exploit Big Data for competitive advantage. Following the Gartner Hype Cycle, Big Data travelled through the Technology Trigger from 2011-2012 into the Peak of Inflated Expectations in 2013, then down into the Trough of Disillusionment in 2014.


Can machine learning save Big Data from the Trough of Disillusionment? It’s no secret that machine learning is of great interest right now; it was identified by Gartner to be at the Peak of Inflated Expectations in 2016. With all this hype, everyone wants to do it, but few are implementing it well.


The formula for effective machine learning goes beyond just Big Data + Open Source Libraries + Data Scientists. Currently, poor data leads to lack of operational results, and this is causing tension between executive expectation and technical staff. To close this gap and effectively operationalize machine learning, companies need clean data, focused decision frameworks, and innovative analytic approaches such as self-calibrating AI models.


All About the Data

All informed decisions begin with data, so companies must invest in data quality and data governance. High quality results start with high quality data. That means that data gathered from multiple sources is aligned, monitored, and refreshed. The objective of the analytics must be top of mind when creating a data-stream (or creek) vs data-lake. Specific objectives require specific relevant data and domain knowledge. For example, analyzing available credit card balance data is not helpful if one attempts to detect fraud because it is a lagging indicator of fraud. When fraud is detected and flagged, it’s too late; the fraud already occurred and the majority of funds were successfully removed! Careful review of the specific data elements is needed, and they must be relevant to the analytic objective. This type of data collection represents a required shift in focus from the algorithm to the decision parameters necessary to solve the problem.


Adaptive Models

Collecting clean, relevant data is only the first step. An entire process must follow to operationalize the data. Machine learning is introduced to offer a high level of automation where the machine learns complex features from raw data that drive prediction or detection, instead of the expert knowledge of the human. However, this is not to say expert knowledge is not essential. Domain knowledge is used in the identification of data and operationalizing scores through rule strategies, which are then used to ensure delivery of business value. Optimization is further applied to improve the process and strategies around the analytic scores.


Analytic models manifest real-world wisdom, but the real world is constantly changing. The fluidity of the real world requires models to be continually updated, as illustrated in the image below. “Build it and forget it” will not work in the long run. Traditional models are trained on historical data and model weights are frozen thereafter. However, if customer behaviors or data patterns change, there is no mechanism in these traditional methods to “adjust” the model weights in real time, leaving the model degraded when the environment changes. This lack of adaptation leads the decision system to under-deliver on the value promise. This is where machine learning and artificial intelligence can improve the process.


Machine learning and artificial intelligence make the decision-making process cyclical, as it is constantly improving based on new data and events


Companies must invest in adaptive and streaming models that learn from each new data point to optimize their predictions. Today, many companies recognize that these shifts in behavior require AI that adapts, such as Multi-Layered Self-Calibrating models. These models self-learn to identify risky feature values and create latent features on the fly, in real time, to combat rapidly changing environments.
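The flavor of self-calibration can be shown with a toy outlier scorer. This is an illustration of the general idea, not FICO’s Multi-Layered Self-Calibrating models: each feature’s outlier threshold is an online quantile estimate that keeps recalibrating as events stream in, so no frozen training-time threshold is needed.

```python
class SelfCalibratingScorer:
    """Toy self-calibrating outlier score (illustrative, not a product model).

    Tracks an online estimate of each feature's 95th percentile and scores a
    value by how far it exceeds that moving threshold, so the scorer keeps
    recalibrating itself as the data distribution shifts.
    """
    def __init__(self, features, step=0.05):
        self.q = {f: 0.0 for f in features}  # running 95th-percentile estimates
        self.step = step

    def update_and_score(self, event):
        score = 0.0
        for f, value in event.items():
            score += max(0.0, value - self.q[f])  # exceedance over the threshold
            if value > self.q[f]:
                self.q[f] += self.step * 0.95     # 5% of mass should sit above
            else:
                self.q[f] -= self.step * 0.05
        return score

scorer = SelfCalibratingScorer(["amount"])
for _ in range(500):
    scorer.update_and_score({"amount": 1.0})      # normal traffic calibrates it

normal = scorer.update_and_score({"amount": 1.0})
outlier = scorer.update_and_score({"amount": 10.0})
```

After a stream of ordinary events the threshold settles near the data itself, so a typical value scores near zero while a sudden spike scores high, with no offline retraining step in between.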


In addition to this real time adjustment, adaptive analytics technologies reflect real-time feedback from analysts working cases. This results in a self-learning and adapting model that is constantly responding to the production environment.


Further, other forms of AI are increasingly utilized to find the needles in the haystack of knowledge. As an example, auto-encoder technology is used by leading corporations to monitor how data and features change between development and production environments; the reconstruction error from these models points to new patterns. Implementing self-learning AI surfaces new information, patterns, and predictive features, which allows data scientists to detect and plan for changes in the future. Consequently, models are improved in a targeted and efficient way.
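The monitoring logic can be sketched with a deliberately simplified stand-in for an auto-encoder, using invented records: “reconstruct” each record from per-feature statistics learned in development, and raise an alert when production records reconstruct poorly. A real auto-encoder learns a compressed nonlinear reconstruction, but the drift signal works the same way.

```python
import math

# Hypothetical development-time records with two features.
train = [[1.0, 10.0], [1.2, 9.5], [0.8, 10.5], [1.0, 10.0]]

# Toy "auto-encoder": reconstruct every record as the per-feature means
# learned in development; the error measures how unfamiliar a record is.
means = [sum(col) / len(train) for col in zip(*train)]

def reconstruction_error(record):
    return math.sqrt(sum((x - m) ** 2 for x, m in zip(record, means)))

baseline = max(reconstruction_error(r) for r in train)  # calibrated on dev data

in_pattern = reconstruction_error([1.1, 10.2])  # looks like development data
drifted = reconstruction_error([4.0, 3.0])      # a shifted production record
drift_alert = drifted > 2 * baseline            # made-up alerting threshold
```

Records resembling the development data reconstruct with low error; a shifted record does not, and that rise in reconstruction error is what directs a data scientist toward the new pattern.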



Self-Learning AI in Action

As an example, self-learning AI is leveraged by Stanford University to help address grant spending compliance. An original list of expert rules was compiled to capture domain knowledge as an intelligent base for the system and its basic features. Multi-Layered Self-Calibrating AI models then learned behaviors that were not captured by the established rules, surfacing latent features and optimal ways to calibrate the detection of non-compliance. Further, the self-calibrating analytics are constantly adjusting the models and the output is continuously monitored to identify high-risk outlier invoices for human review. This system identifies non-compliance and new schemes to help analysts organize and trace follow-ups.


An Integrated System

This integrated system was only possible with the use of a robust end-to-end platform. An example of such a platform is the FICO Decision Management Suite (DMS), which has the power to weave intelligence throughout the entire decision process, from analyzing data to making decisions that drive profitable action. DMS handles the variety and velocity of data needed to enable deployment of innovative self-learning AI-powered decisions. Continuous improvements can be made through optimization tools and integrated, self-service collaborative development. This is all tied together with universal model governance and model management.


To achieve the promise of Big Data and machine learning, one must look beyond just an algorithm. The desired business value proposition must be kept top of mind from the inception of a project all the way through execution, with intelligence injected every step of the way. This is available in the FICO Decision Management Platform and corresponding applications; try it for free here.


Read other articles by Scott Zoldi about machine learning, Big Data, analytics, and more on the FICO Blog.

Drug addiction is a powerful and destructive debilitation. Death rates associated with the growing Opioid Epidemic rose an astonishing 72.2% from 2014 to 2015 across the country, according to data from the Centers for Disease Control and Prevention. A total of 33,091 Americans died from opioid overdose in 2015; 91 people every day. These addictions cost families and the American economy dearly.


There is a great need for innovation in our medical system to address this growing failure of care. Fortunately, advanced analytic technology is rising to the occasion.


Drug-seeking behavior can be hard to identify. Though drug addicts can be predictable, their patterns of behavior and medical backstories are sophisticated. Wherever and however they can access their drug of choice, they will. And increasingly, that is not some deserted alley in the dead of night – it’s the emergency rooms of hospitals.


Emergency rooms are an ideal target because they are required to treat, or at least stabilize, every patient admitted. That said, walking into a hospital to get a drug fix is a bold move, and executing it takes guile. Variations of personas, with different combinations of first and last names, addresses, symptoms, and medical histories, enable addicts to cycle through multiple hospitals over time. The highly private nature of health information means that networks rarely compare their records. So when denied Fentanyl at one hospital, an addict can simply drive across town to a hospital in another network. Under a variation of her fluid, internalized pseudonym, she is just a patient needing immediate care. It is remarkable what one can achieve when desperate.


Of course, not everything can be faked, and this is where technology can help. The first tip-off to a drug-seeker persona is that a patient received a controlled drug. See if you can identify the others…


A woman, born March 16th, 1958, with grey hair and a medium build, walks into Hospital A complaining of lower back pain. She’s in a lot of pain: on a scale of 1-10, she’s at an 8, reporting that she’s been missing work, unable to sit at her desk for any amount of time. In fact, she took the bus into Cincinnati all the way from her home an hour away, just for the opportunity to stand. In her case, morphine is out of the question: she’s highly allergic. As a new patient with no insurance card in hand, her medical records cannot be accessed. The doctor offers her a shot of Fentanyl to get her through the day, and encourages her to seek treatment from a primary care physician.


One week later, on the other side of Cincinnati, a grey-haired woman enters Hospital B complaining of back pain. Her new patient form states that she was born August 8th, 1960, works as a store clerk, and is allergic to morphine. Her back pain is excruciating: she’s been out of work all week, simply unable to stand for more than a few minutes. The ER here is crowded, and the diagnosis is clear enough. The clinician provides her a low dose of Fentanyl and sends her home to rest her apparently herniated disks.


Did you identify the patterns? These women are one and the same. A single drug seeker, much like the one in our example, cost one provider over $500k in unbilled medical expenses. Advanced analytic technology helped prove it, and enabled those hospitals to intervene.


A custom-built solution leveraged cluster analysis to identify and map seemingly unique individual characteristics, and definitively tied them to the same person.


Cluster analysis initially matched iterations of a single identity by sorting through the hospital’s data records and pairing seemingly redundant data. Additional information like emergency contact information, clinical data like height and weight, and ultimately the care that was rendered, sharpened the edges of fuzzy clusters into tangible shared personas. A few key tip-offs merged what had seemed like one-time visits into patterns of drug seeking behavior: the morphine allergy, a clever trick to get a stronger opioid; the age range; the lack of an insurance card; the general geographic location. Analytic tools then scored these drug-seeking identities, and provided the hospital a ranked list of potential drug-seeking patient personas to review.
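A heavily simplified sketch of this kind of identity clustering, with invented records and arbitrary thresholds (real entity-resolution engines use far richer matching): fuzzy name similarity plus shared tip-offs link records pairwise, and a union-find pass merges the links into personas.

```python
from difflib import SequenceMatcher

# Hypothetical ER visit records: (name, birth_year, allergy, has_insurance_card)
visits = [
    ("Mary Johnson", 1958, "morphine", False),
    ("Marie Johnsen", 1960, "morphine", False),
    ("Tom Baker", 1975, "none", True),
]

def same_persona(a, b):
    """Pairwise match: fuzzy name similarity plus shared drug-seeker tip-offs."""
    name_sim = SequenceMatcher(None, a[0].lower(), b[0].lower()).ratio()
    tip_offs = (
        (abs(a[1] - b[1]) <= 3)          # birth years within a small range
        + (a[2] == b[2])                 # same claimed allergy
        + (not a[3] and not b[3])        # both "forgot" the insurance card
    )
    return name_sim > 0.7 and tip_offs >= 2

def cluster(records):
    """Union-find: transitively merge matching records into personas."""
    parent = list(range(len(records)))
    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if same_persona(records[i], records[j]):
                parent[find(j)] = find(i)
    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

personas = cluster(visits)
```

The two “different” women collapse into a single persona on the strength of the near-matching names, the shared morphine allergy, and the missing insurance cards, while the unrelated patient stays separate.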


All in all, 39 ER visits (and 39 treatments) were linked to the same person. The medical network issued a warning to their regional facilities, showing an image of this woman and listing her aliases. One week later, she stopped by another network hospital. This time, she was not “treated.”


While this data goes a long way towards identifying and stopping drug-seeking behaviors within hospital networks, more can and must be done to address this growing public health crisis. Fortunately, there is much more technology can do.


Each state owns and runs a Prescription Drug Monitoring Program, which operates as a database of every Schedule II or III narcotic prescribed and filled. These are checked prior to every opioid prescription, to ensure prescriptions don’t go to those with a history of abuse. Of course, this only succeeds if the patient is using their real name.


Fortunately, drug-seeking behavior is patterned. Even without clues as obvious as names, link analysis programs could identify problematic providers and patients using data from a state’s Prescription Drug Monitoring Program files. Through analysis, the programs could help uncover ‘pill mills’, or groups of people working together to accumulate and then sell prescription drugs from numerous providers. They could also catch areas in which pills are given too freely or prescribed in unreasonably high amounts.


This problem is more common, and less sinister, than you might expect. No one wants to falsely cut someone off from a drug they need, so the default becomes letting the bad actors go, on the assumption that they represent a small minority. Yet, as the opioid crisis explodes, such gut decisions are becoming more dangerous.


One thing is clear: there is a great need for innovation in our medical system. With the network-level view that technology can provide comes greater accountability and success. Analytics take on the responsibility of identifying potentially suspicious behavior, so that doctors and nurses can focus on providing life-saving care. In an industry that swears to “first, do no harm,” technology is the ultimate ally. By cutting down on missed connections and avoidable mistakes, link analytics and scoring technology help us fulfill that promise.

If you are following Streaming Analytics, you have likely observed a movement in big data systems towards event stream processing to continuously analyze events, score them, and make decisions in real time. This movement is also adopting online ML (machine learning) and predictive analytics based on incremental algorithms that learn from, and adapt to, changes reflected in the event data.


My contribution to this movement has been architecting streaming analytics platforms that are better at managing stateful information. As the time for ingesting and analyzing each event shrinks from weeks and days to minutes and seconds, purely stateless event processing that relies on external systems to handle stateful information becomes less practical. With state management platforms specifically designed to support streaming analytics, development of analytic solutions accelerates rapidly and solution complexity is reduced.


When I refer to a streaming analytics platform, I am referring to a platform with processing, messaging, coordinating, and state-persisting core services that operate in concert. This platform executes solutions as several jobs, either scheduled or long-lived, and often some of these jobs will be stateful.


State management allows the stateful jobs to define states configured for the specific needs of a solution. This includes profile sets containing profiles with name/value properties that may also be based on moving windows such as time interval or count. The state management of these profile sets can be configured to provide immediate or delayed persistence to guarantee state retention between jobs or after system failures. It can also be configured to provide consistency of the states for a distributed process when states may be shared between processes. Without expanding on this here, let me say that state consistency coupled with guarantees of exactly-once, at-least-once, or at-most-once processing in a distributed data flow architecture is one of the key features of a well-designed state management system.
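As a concrete illustration, a profile with a moving time-interval window might look like the minimal Python sketch below. The class and method names are hypothetical, not a real platform API; a production state management system would add persistence, distributed consistency, and processing guarantees on top of this in-memory core.

```python
from collections import deque
import time

class TimeWindowProfile:
    """Hypothetical sketch of one profile in a profile set: name/value
    properties plus a moving time-interval window over recent events."""

    def __init__(self, entity_id, window_seconds):
        self.entity_id = entity_id
        self.window = window_seconds
        self.properties = {}   # simple name/value state
        self.events = deque()  # (timestamp, value) pairs, oldest first

    def add_event(self, value, ts=None):
        ts = time.time() if ts is None else ts
        self.events.append((ts, value))
        self._expire(ts)

    def _expire(self, now):
        # drop events that have fallen out of the moving window
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()

    def window_sum(self, now=None):
        now = time.time() if now is None else now
        self._expire(now)
        return sum(v for _, v in self.events)
```

A count-based window would work the same way, expiring by deque length instead of timestamp.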


In my last two blogs, Stateful Event Processing and The Need for States and Models in Event Processing, I outlined the methods and benefits of adding state management to event stream processing solutions designed as data flows on distributed architectures. In this blog, I look at the creation and use of models in streaming analytics as they relate to stateful processing.


For the most part, analytics models are trained periodically and offline, using batch processing systems that analyze entire datasets. Data is first collected into datasets over a period of time, and a model’s needed configuration is identified using a variety of ML algorithms. This results in models that encapsulate statistical distributions, cluster geometry, constraint-based rule sets, decision trees, scorecards, etc. These models are then periodically embedded in long-lived event processing solutions to classify, score, and make decisions about events as they arrive. This dual-phased approach is convenient given that the ML algorithms typically require large datasets to be analyzed in multiple passes, something that is hard to do directly in an event processing flow.


However, for some solutions, it is becoming possible to embed this training process directly into the event stream processing data flow. Reworking the multi-pass offline ML algorithms to learn and adapt incrementally lowers computational cost and shortens processing times. Redesigning the offline algorithms to use incremental techniques can render the collection and storage of entire datasets unnecessary by enabling the same models to be derived in-the-moment as new events arrive.


Consider the (slightly oversimplified) example of computing a moving average or standard deviation of a particular value, such as the average age of customers or the standard deviation of each customer’s purchases as tracked by their bank. While these measures alone are not models, they may be used by models, and they must be updated continuously to keep those models relevant. This is achieved by tracking, for each entity, the number of events, the sum of the values, and the sum of the squares of the values. From these three quantities, the measures of interest may be computed and refreshed at every new event without needing the entire dataset.
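This incremental computation can be sketched in a few lines of Python. The class below tracks exactly the three quantities described above (event count, sum, and sum of squares) and refreshes the mean and variance on each new event; the class and attribute names are illustrative only.

```python
class RunningStats:
    """Track count, sum, and sum of squares so that mean and variance
    can be refreshed on every new event without revisiting the dataset."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.total_sq = 0.0

    def update(self, value):
        self.n += 1
        self.total += value
        self.total_sq += value * value

    @property
    def mean(self):
        return self.total / self.n

    @property
    def variance(self):
        # population variance: E[X^2] - (E[X])^2
        m = self.mean
        return self.total_sq / self.n - m * m

stats = RunningStats()
for amount in [10.0, 20.0, 30.0]:
    stats.update(amount)
print(stats.mean)      # 20.0
print(stats.variance)  # 200/3, about 66.67
```

The same pattern extends to moving windows by also subtracting the contribution of events that expire out of the window.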



Contextual information flowing in and out of moving boxes, much like stateful event processing.


Even without states, the data in the events themselves provides real-time context: events readily offer information such as location, weather, or time. But decisions also need stateful information based on historical context added to profiles from observations in the moment, context such as what was spent in the last 12 months. Contextual profiles must also encompass stateful information based on derived context, in addition to the real-time and historical contexts. Derived context is obtained by analyzing many related events over a period of time; say, how many times have we called this customer in the last month, or the ratio of failed transactions in a one-hour interval. Examining all three types of contextual information enables organizations to build decision solutions for use cases such as marketing, customer service, origination, fraud, or risk, without being limited to them.
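As a small illustration of derived context, the ratio of failed transactions in a one-hour interval can be computed from recent events alone. This sketch assumes events arrive as (timestamp, success) pairs; in practice the window itself would live in a managed profile rather than be rescanned on every call.

```python
def failed_ratio(events, now, window_seconds=3600):
    """Derived context: the ratio of failed transactions within the
    last `window_seconds`. `events` is an iterable of (timestamp,
    success_bool) pairs; timestamps are seconds."""
    recent = [ok for ts, ok in events if now - ts <= window_seconds]
    if not recent:
        return 0.0
    failures = sum(1 for ok in recent if not ok)
    return failures / len(recent)

events = [(0, True), (1800, False), (3000, False), (3500, True)]
print(failed_ratio(events, now=3600))  # 0.5
```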


Now, consider that the platform offers stateful APIs that allow the steps in the processing data flow to identify the entity of interest from an event and understand the entity’s state through its profile that holds such contextual information. If this can be made performant, reliable and consistent across a distributed system, the solution and online algorithm developers need not waste time solving such problems.


One issue in architecting a platform arises when deciding how much of the state management support should be left to the design and implementation of the solution developers, and how much should be provided out-of-the-box as stateful APIs in the underlying platform. In platforms that adopt data-flow architectures based on distributed computing to increase throughput through parallelism, while offering fault tolerance and processing guarantees such as exactly-once processing, such efforts are non-trivial.


One thing I have learned at FICO is that having solution developers correctly design and implement their own state management in a distributed stream processing environment is time consuming for them and hard to get right. This argues that the support should be provided by the platform and not left to the solution developers. It is my belief that enhancing the underlying platforms with state management will accelerate the adoption of online ML and incremental algorithms in streaming analytics.


FICO’s DMP Streaming (DMP-S) is an evolving platform that offers its hosted solutions capabilities such as stateful APIs. The platform thus allows solutions to extract and track contextual information from events, be they real-time, near-time, or batched, and to persist that information as profiles, thereby improving the analytics built upon them.


With this perspective, we see that state management provided in event stream processing platforms such as FICO DMP Streaming reduces the overall complexity of stateful solutions, allowing developers to focus on their designs and their use of algorithms, analytics, and models. What currently depends on several phases and different tools, technologies, and methodologies to separately analyze data, train and retrain models, and process live streams may be converging into simpler big data ecosystems. From a more distant perspective, the ecosystems behind data discovery, data mining, machine learning, online machine learning, and the use of models in real-time event filtering, cleansing, analysis, scoring, and complex event determination are slowly coalescing into effective suites of interoperable tools and execution environments. With stateful event processing platforms providing streaming analytics with contextual information and online analytics, solution developers can start at a much higher level of abstraction and reduce the number of tools and execution environments they need.


Yet as the abstraction layer rises, the burden is now on us platform architects to address performance issues, scalability, latency constraints, resource costs, reliability and consistency to name a few challenges. For more on that, stay tuned for my next blog.

When I analyze data and decisions with customers and colleagues, I often hear the question: “why bin the data?” In this blog series, I’ll explore several benefits of binning, starting with its value for data exploration and insight generation. Examples in this blog are drawn from retail marketing, but they apply across business contexts.


To answer the burning question, why bin?, I’ll first explain what binning is. For numeric variables, it’s the process of dividing the continuous range of a variable into adjacent ranges, slices, groups, classes or “bins” (there are so many names). For example, you can group customers by Age into ranges like 18-25, 26-35, etc. With discrete variables, binning is the process of combining raw data values into similar groups, like binning State codes CA, OR, and WA into “West Coast”.
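Both kinds of binning can be sketched in a few lines of Python. The bin edges, labels, and state groupings below are illustrative, not taken from any real model:

```python
import bisect

# Illustrative bin definitions (hypothetical, not from a real model)
AGE_EDGES = [25, 35, 50]  # inclusive upper bounds of each bin but the last
AGE_LABELS = ["18-25", "26-35", "36-50", "51+"]
STATE_GROUPS = {"CA": "West Coast", "OR": "West Coast", "WA": "West Coast"}

def bin_age(age):
    # bisect_left treats each edge as the inclusive top of its bin,
    # so 25 lands in "18-25" and 26 lands in "26-35"
    return AGE_LABELS[bisect.bisect_left(AGE_EDGES, age)]

def bin_state(state):
    # discrete binning: collapse raw codes into coarser groups
    return STATE_GROUPS.get(state, "Other")

print(bin_age(23))      # 18-25
print(bin_age(35))      # 26-35
print(bin_state("WA"))  # West Coast
```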


Binning has some terrific properties for predictive modeling and is often associated with scorecards that thrive on discretized numeric predictors. But even short of formal model construction, binning can help people quickly explore data and unlock the signals and surprises otherwise hiding in datasets. The key elements of binning are the calculations of information value and weight of evidence, which immediately quantify associations and allow for visualization of the relationships between any variable and a business outcome.


In retail marketing, an effective, automated binning algorithm quickly surfaces key insights into customer behaviors. Retail marketers typically measure Recency, Frequency, and Monetary (R,F,M) with a target outcome of response performance. In this blog, we bin and analyze these same variables, using FICO Scorecard Professional, to demonstrate the insights binning can provide with a real data set. The data are summarized to determine Weight of Evidence (WoE), which in this dataset indicates the relative likelihood of responding, allowing marketers to quickly identify patterns in a customer’s likelihood to buy.


Recency (R)

The most immediately appealing aspect of binning is the ability to visualize the relationship between predictors and performance. Using Scorecard Professional’s binning capabilities to look at Recency (R) allows marketers to see the relationship between time since the customer’s last purchase and how likely they are to respond to an offer.


Recency Binning



As shown in the table above, the differences in WoE across binned groups of recency values are visualized in the rightmost column, providing quick insight from the data. We see a positive WoE for customers who shopped within the last 10 months, meaning this set of customers is more likely to respond. Customers who have not shopped in 10-18 months have a slightly negative WoE, indicating this group is a little less likely than the average customer to buy again. If a customer has not purchased in more than 18 months, the WoE gets considerably more negative. This bears out an oft-repeated truth in marketing – a “hot” shopper is worth mailing to, but once it’s been 18 months since you’ve brought someone in the door, you are much less likely to see them again. Because the data was binned and then analyzed, this inverse relationship between time and response is immediately evident, providing actionable insight for marketers to effectively target outreach.


Frequency (F)

Looking at Frequency (F) in the table below, we see that the number of times someone has purchased over their lifetime has a positive relationship with WoE. One of the common plagues for retailers is the dreaded ‘one-time buyer,’ and the binned data shows that one-time buyers represent 42% of the population and are clearly the least likely to respond to offers. In general, this is a difficult insight for retailers to harvest because there is very little behavior upon which to base a marketing campaign. Understanding that customers who have only bought once are not likely to respond to offers, a retailer might choose to minimize marketing spend on these customers, or seek to take advantage of the little data they do have (channel, type of merchandise, etc.) to better target them. If the spend is high enough, they might choose to invest in “off-us” information to gain more insight.


Frequency Binning



Monetary (M)

Grouping customers by Monetary (M), as shown in the table below, reveals a positive relationship between lifetime dollar purchases and response to offers: the more a customer has spent in the past, the more likely they are to respond, and spend, in the future. Binning customer data by spending habits makes use of readily available information, which can help marketers target customers who are likely to spend more based on their past behavior.


Monetary Binning



Categorical Channel Data

When predictors are already in categorical form, automated binning may not be necessary. Still, visualizing this data adds insight to the information we gathered from binning the R, F, and M variables. In this case, retailers have categorical data about the channels their customers use to find and purchase their products. In the table below, (--R) stands for retail shopping in store, (C--) stands for catalog shopping ordered through the phone or mail, (-W-) stands for web purchases, and (---) stands for shoppers who have not been active in the last 12 months. If a shopper used more than one channel, they are assigned more than one value. Analyzing this channel data is a simple matter of counting the number of responders and non-responders by channel and calculating the WoE to display the results; no binning algorithm is needed.
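That counting-and-WoE calculation can be sketched as follows, where the WoE of a bin is the natural log of the share of responders falling in that bin divided by the share of non-responders falling in it. This is a bare-bones illustration with made-up labels; real tools also smooth or flag thin bins, which the naive division below does not.

```python
import math
from collections import Counter

def weight_of_evidence(rows):
    """rows: iterable of (bin_label, responded_bool) pairs.
    Returns {bin_label: WoE}, where
    WoE = ln(% of all responders in bin / % of all non-responders in bin).
    Note: a bin with zero responders or non-responders would need
    smoothing; this sketch assumes every bin has both."""
    resp, nonresp = Counter(), Counter()
    for label, responded in rows:
        (resp if responded else nonresp)[label] += 1
    total_r = sum(resp.values())
    total_n = sum(nonresp.values())
    return {
        label: math.log((resp[label] / total_r) / (nonresp[label] / total_n))
        for label in set(resp) | set(nonresp)
    }
```

For example, a channel with 3 of 4 responders but only 1 of 4 non-responders gets WoE = ln(3) ≈ 1.10, i.e. well above-average response.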

Channel Usage Binning


In FICO Scorecard Professional, color is used to indicate the thickness of the data. Red or grey warns that data is getting thin, so response rates and WoE may be unreliable; bright blue indicates lots of data.


Analyzing this channel data reveals that shoppers who have been inactive for at least a year are very unlikely to respond to a new mailing. Web customers follow close behind inactive shoppers in non-responsiveness, while retail only shoppers are very responsive and catalog shoppers fall somewhere in between. The best response rates, however, can be seen from cross-channel customers, meaning that it is most effective to target shoppers who purchase through multiple channels.


Binning customer data allows retailers to quickly detect patterns based on a diverse set of variables. Visually representing the WoE calculated for different variables provides insight into customer behavior through several different lenses. This binning summarizes the data, providing intuitive visualizations and making it easy for marketers to effectively tailor campaigns to groups that are most likely to respond based on past behavior. By binning customer data across specific variables, like R, F, and M, marketers can easily see patterns in the relative responsiveness for each group. Binning organizes the data and presents the analyst with information that results in actionable insights.

You can analyze the data mentioned in this blog and test out the binning process for yourself with a free trial of FICO Scorecard Professional.


In future blogs we’ll address further applications of binning in data discovery, the advantages of binning for dealing with non-linear relationships and missing values, and the use of binning for scorecard development and strategy design.

Doctor Andrés L. Medaglia is a professor in the Department of Industrial Engineering’s Center for Optimization and Applied Probability (COPA, for its Spanish acronym) at Universidad de los Andes, Colombia. As an active member of the FICO Academic Partner Program (APP), Dr. Medaglia and his students have complimentary access to FICO Xpress Optimization Suite to solve real-world problems and conduct meaningful research.


"For me one of the strongest features of Xpress is its flexibility to have a wide range of uses and users with different abilities."

Doctor Andrés L. Medaglia, Universidad de los Andes, Colombia


COPA helps organizations in Latin America better design and improve their systems using advanced analytics. Their work involves a wide range of systems from health to infrastructure. However, despite the fast pace of growth in the region, many companies continue to lack the analytical power to solve the variety of complex new challenges that continually arise.


Medaglia started using Xpress back in 2002. At the time, he was an avid user of AMPL, and while he still thinks it is an adequate modeling language, he was thrilled when he first used Xpress. The elegant architecture allowed him to play different roles in the development process of optimization-based decision support systems, and that caught his attention.


One of the first problems he solved with Xpress required slimming down the Colombian coffee supply network, while continuing to provide a high level of service to the coffee growers. The report is titled Solution methods for the bi-objective (cost-coverage) unconstrained facility location problem with an illustrative example, and the abstract describes the project as follows:


"The Colombian coffee supply network, managed by the Federación Nacional de Cafeteros de Colombia (Colombian National Coffee-Growers Federation), requires slimming down operational costs while continuing to provide a high level of service in terms of coverage to its affiliated coffee growers. We model this problem as a biobjective (cost-coverage) uncapacitated facility location problem (BOUFLP). We designed and implemented three different algorithms for the BOUFLP that are able to obtain a good approximation of the Pareto frontier. We designed an algorithm based on the Nondominated Sorting Genetic Algorithm; an algorithm based on the Pareto Archive Evolution Strategy; and an algorithm based on mathematical programming. We developed a random problem generator for testing and comparison using as reference the Colombian coffee supply network with 29 depots and 47 purchasing centers. We compared the algorithms based on the quality of the approximation to the Pareto frontier using a nondominated space metric inspired on Zitzler and Thiele's. We used the mathematical programming-based algorithm to identify unique tradeoff opportunities for the reconfiguration of the Colombian coffee supply network. Finally, we illustrate an extension of the mathematical programming-based algorithm to perform scenario analysis for a set of uncapacitated location problems found in the literature."


While that is just one project in which Xpress was used, COPA students and researchers have worked on over 30 Xpress related projects. Medaglia likes to use Xpress for its flexibility. What captured his attention was “the nice and elegant architecture that allowed [him] to play different roles in the development process of optimization-based decision support systems.” The powerful Mosel language on Xpress-IVE provides a friendly development environment for teaching beginners. “Stepping up the ladder, in a more advanced (or research-oriented) setting, [he] use[s] the iterative capability of Mosel that goes beyond a declarative language.” These features are quite useful for developing “more complex solutions that require advanced decomposition techniques, column generation, or the use of callbacks to customize the branch-and-bound procedure.” Then, after the prototyping phase when performance is of the utmost importance, one can always “rely on invoking the optimizer from a general-purpose language like Java.”


Xpress Optimization Suite enables operations research professionals, analysts, and consultants to quickly find the mathematically best solution for industry problems. Students and researchers at COPA have been using Xpress for 14 years. Medaglia notes that “it is always comforting to see how well the Xpress-MP optimization engine has evolved with the years.” When evaluating FICO’s complete optimization suite, Medaglia praises the advantages of using a flexible framework on top of a very strong engine.


You too can benefit from the power, flexibility, and ease-of-use that Xpress offers. More information here:

Decision management solutions use data—we all agree on that. But exactly what data, in what form, and at what time, varies dramatically depending on your role in developing the solution.




If your role is to develop a predictive model, your job is to explore any data that might yield analytic insight into the problem that is being modeled. A typical modeling project starts with data engineering processes that extract the modeling dataset(s) from whatever data sources are of interest. This usually requires a whole set of ETL tools and skills just to get the data to where it can be analyzed. (If you don't know what ETL means, look it up—the point of this post is that all these people who are supposedly working together don’t know what the heck each other is talking about, or how they get their jobs done.) Then of course the data scientists have a whole bag of tricks that nobody outside of their close-knit fraternity know about, which they use to twist and manipulate the data into incomprehensible models that miraculously predict the future.


If instead your role is to develop the business rules that implement your decision policies, you take a much more abstract view of the data. The rules are written about the things in the decision domain, and all you need as a rule writer is to know what those things are and what properties they have. Until you’re ready for testing, you don’t even need any actual data—just the business terms (i.e. the data model) that describe the things. If you want to write a rule such as “if the applicant’s age is under 18, then deny the application” all you need to know is that you have business terms for the applicant’s age and the status of the application.
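As a sketch, such a rule needs nothing more than those two business terms. The Python below uses hypothetical term names and return values for illustration; it is not a real Blaze Advisor data model or rule syntax.

```python
def decide_application(applicant):
    """Minimal sketch of one business rule written against business
    terms. The term names ('age') and statuses are illustrative
    placeholders, not a production data model."""
    if applicant["age"] < 18:
        return "denied"
    return "pending"  # no rule fired; later rules in the policy decide

print(decide_application({"age": 17}))  # denied
```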


This all gets interesting when the data scientist and the business analyst each take their creations down the hall to their hapless IT developer and say “put these into production!”


Production data is far more constrained than the unlimited possibilities of the modeling data. Instead of a trained data scientist applying the transformations and models to a carefully prepared dataset, they must be applied programmatically to the production data. The transformations and models usually need to be re-coded, in different languages, by different people. It often takes months. Data that was available for modeling may not be available in production. Time-series data that continuously evolves needs to be carefully persisted and managed so that up-to-date values are always available to the model. Legally sensitive data needs to be redacted. There’s almost no end to the fun (or the potential for introducing errors) a developer can have turning a mess of arcane modeling code into something that will actually run.


Production data also consists of very real transactions instead of abstract business terms. That data must be meticulously mapped to the business terms so that the rules can be applied across all the data streams, message queues, transaction requests, or whatever else needs to be processed.


Once the models and rules finally get deployed out into production, there’s one more role that’s interested in the data, and that’s the business manager who is on the hook for showing that the solution is actually performing as expected. A whole different set of ETL operations pulls data out of the production data stores, plus any other sources that might have data that is useful for evaluating the KPIs (again, look it up—I’m not your nanny) that are of interest, and puts it in whatever form is required for whatever business intelligence tool is being used. Yet another set of skills can then be put to work turning that into a set of beautiful reports that the board thinks looks great, even if they don’t understand what any of it means.


Ultimately the success of a decision management solution depends on whether or not all the model and rule development projects actually get deployed out into production, and achieve the expected benefits, as shown in the reports. All the various efforts, by all the people filling the various roles, need to collaborate to produce the results. Enabling and supporting that collaboration, while allowing the individual roles to focus on their areas of expertise, is a critical aspect of any decision management project. 


Journey Through the IEEE

Posted by robin.d Feb 28, 2017

Three years ago, I became a member of the IEEE. It started when I assisted with the production of a Watson conference at a local university. At the conference, I was impressed by the IEEE’s mission to advance technology for the benefit of humanity. At first, I was driven by the need to network as much as the desire to drive change for the better. I started volunteering as a webmaster in a local Computer Society chapter when I was a graduate student. I liked it so much that when I was offered a Vice Chair (VC) position, advocated for by other local members, I happily accepted. That was in August of 2015, and I have been involved ever since in activities such as creating local conferences on computer topics and attending local IEEE executive meetings.


Being Vice Chair of the Computer Society in Quebec is all about creating events with and for the local members. That was fun, but I wanted to do more, so I became the social media chair for the upcoming CCECE2018 conference after being chosen by the general chair of the event. Three years have passed on this journey, and now I have taken another role in the organization: in January 2017, I started my new role as IEEE Canada Vice Chair of Industry Relations. This is a unique opportunity to bring industry and academia closer together, driving and encouraging collaboration for the greater good. This takes the form of encouraging chapters to get in touch with their local industry, organizing industry round tables at IEEE conferences, and any other activities that strengthen the link between the IEEE and industry.


How did I get there? I met Mr. Kexing Liu, chair of Industry Relations, at the Government Technology Exhibition Conference (GTEC), a Canadian federal conference where I was on booth duty for FICO! After a brief talk about the IEEE, I volunteered to be his assistant. I knew Kexing from previous IEEE events, so I sent a CV to IEEE Canada and was selected. My switch from graduate student to senior member, in March 2016, helped; I used my previous 10+ years of experience to earn senior status.



IEEE Industry Relations helps researchers and professionals gain exposure to real-world contexts


Industry does not traditionally keep pace with academia on innovation. By establishing relations with future researchers and professionals through its numerous student branches, the IEEE helps them interact and gain exposure to real-world contexts. By volunteering, students can not only gain great experience, but can also access the IEEE networking experience throughout the profession. From a wider perspective, the IEEE helps research findings make their way into production faster.


For more information about IEEE, you can refer to our Canadian page at:

Visit our Facebook page for information about meetings and activities:

A lot of our members in the Quebec section are academics. We encourage everyone to connect with industry and participate in the discussion about what they are giving back to the community. You may also be interested in the Women in Engineering group ( ) and our student branch ( ). I look forward to hearing your ideas.

I’m sure you shower nearly every day, washing your hair with the same shampoo. But have you ever given a single thought to the process that got that bottle of shampoo into your hand? We are living in a time when the bulk of our interactions are secretly driven by artificial intelligence (AI). Not secret in the Big Brother sense, but secret because the technology is constantly working behind the scenes to make human lives easier. As a consumer, your purchase choices are being influenced by technology before you even step into the store.


To allow one person to simply choose a bottle of shampoo takes a diverse set of choices, ranging from R&D protocol to chemical transfer practices to shelf space optimization. Sure, we all have our brand and our store. But if you zoom out past your decision to pick up that bottle and throw it in your cart, you discover that your one personal decision to buy shampoo could not have happened without an absurd amount of analytics.


AI is used to inform several decisions before a shopper makes the choice to buy


Let’s start in the lab. That shampoo you picked up is just a refined mix of chemicals. While you as a consumer may only really care about the ingredients that make your hair smell like roses (or summer breeze or shea butter), the manufacturers must perfect the concentration of ingredients to comply with various regulations. What can be sold in Montreal cannot necessarily be sold in Los Angeles or Paris. You can imagine the stress of attempting to make one universal formula that conforms to the codes of different countries. To ease and hasten this arduous process of defining and complying with the different rule sets of each country, a leading global cosmetics company uses FICO Blaze Advisor rules management software.


Think of a global company selling their shampoo in a dozen countries; that can easily require them to apply tens of thousands of rules to determine if their product can even be sold. There are more than 6,000 rules dictating chemical combinations and concentrations in Japan alone. That’s a lot of rules, and it makes for quite a complex undertaking. Typically, the legal team understands the intricacies of legislation and the IT team understands how to code rules. With Blaze Advisor, the legal experts can communicate directly with the rules system, effectively eliminating the need for IT to translate and thereby freeing them up to focus on innovation instead of maintenance. This subtle perk of software automation creates immense agility since legal experts can communicate with legal teams all over the world in their own language. Once this is done, it is up to R&D to tweak the ingredients so that the shampoo can be sold everywhere.


Now that the formula has been optimized for sale in different countries, the shampoo must actually be made. This means the transportation of chemicals must be coordinated: not an easy task. Chemical companies must consider several variables, like the type of truck and time of day, before they ship the raw ingredients of your shampoo. With unpredictable variations in supply chain configuration, it is difficult for chemical companies to keep their material master data updated. For example, a chemical company simplified this logistical nightmare by preparing an accurate and responsive set of master material data using a combination of FICO Blaze Advisor and SAP Workflow. The implementation of this technology allowed the company to create a tailored rule set to consistently eliminate inaccurate and incomplete requests from their system. This system intelligently takes all available data about best practices of chemical and gas transfer to help the company decide how and when to transfer the ingredients, effectively cutting a 45-day process down to less than one day.



Chemical companies implement Blaze Advisor business rules software to help coordinate the transfer of chemicals


At this point, the shampoo formula has been established and approved, all the chemicals have been safely transferred, and we’re not even halfway to your cart. Once everything inside the bottle is safe and legal, shampoo companies must get to work, or rather let rules and optimization software get to work on their behalf, to sort out packaging and shelf placement. More on that in the next blog in this series.


by Fernando Donati Jorge and Makenna Breitenfeld

It is time to refresh the standard benchmarking library for mixed integer programming - MIPLIB ( so that it continues to reflect the state-of-the-art. We want you to play an active part in the process. You can contribute now and help shape the future of mixed integer programming by submitting your instances at


Since its first release in 1992, MIPLIB has become a standard test set used to compare the performance of mixed integer linear optimization software and to evaluate the computational performance of newly developed algorithms and solution techniques. It has been a crucial driver for the impressive progress we have seen over the last decades, but seven years have passed since the last update in 2010.


While 134 of the 361 problems were unsolved at the release of MIPLIB 2010, only 76 unsolved instances remain as of this writing, and the number keeps falling. The latest release of FICO Xpress solved the previously unsolved instance pigeon-19 in less than a second and moved three more instances from “hard” to “easy”. New challenges are needed!



Submit your instances at


MIPLIB 2017 will be the sixth edition of the Mixed Integer Programming LIBrary. To continue the diversity and quality standards of the previous editions, the MIPLIB committee is looking for interesting and challenging (mixed-)integer linear problems from all fields of Operations Research and Combinatorial Optimization, ideally ones built to model real-world problems. The goal of MIPLIB is, and has always been, to give broad coverage of the many different areas in which mixed integer programming is used to improve decision making.


We believe that there are many great MIP models out there which could fit the purpose of MIPLIB. Medium-hard MIPs that solve to optimality within a few minutes, hours or one or two days are the best candidates.


While certainly interesting, the following are not well-suited for MIPLIB:

  • Simple models that solve within milliseconds to seconds
  • Hard models where even the root LP problem cannot be solved within an hour
  • Numerically challenging models such as those with huge coefficients, lots of singular bases etc., which are often not well-suited for benchmarking because of the sensitivity of potential solutions to numerical tolerances
  • Pure LPs, quadratic problems and instances with SOSs - these don't qualify as true MIPs in the MIPLIB sense
  • Specific instances of problems that exhibit strange behavior - if you have a number of instantiations of the same or similar models, with a mix of simple to medium to hard, it is best to submit them all, or at least a representative sample of them, but not only the outliers


The submission deadline is February 28, 2017.


MIPLIB 2010 was the first MIPLIB release to be assembled by a larger committee with members from academia and industry. With MIPLIB 2017, the group of participating institutions has grown even bigger; it is a joint initiative of Arizona State University, COIN-OR, CPLEX, FICO, Gurobi, Matlab, MIPCL, MOSEK, NuOPT, SAS, SCIP, and Zuse Institute Berlin. We believe it is a great achievement to have such a variety of optimization experts working on a joint project.


Please contact Timo Berthold with any questions about MIPLIB. Timo is one of the main developers of the FICO Xpress Optimizer and a committee member of both MIPLIB 2010 and MIPLIB 2017. Check out his blog about The Two Faces of Parallelism.

Everyone makes decisions, every day. But the institutional history of decision making is vast and varied. Decision-making systems of the 1980s attempted to simulate the knowledge and analytical skills of human experts by structuring decisions as a hierarchy of goals, and then working backwards from these goals; a process known as backward chaining in AI lingo. The late 1990s saw the emergence of the “Business Rules Revolution,” turning the Expert System approach of the 80s on its head. The new prevailing thought was that decision-making should be defined primarily using declarative, modular and independent rules, which would be applied to data until a goal was reached, in a process known in AI lingo as forward chaining. The phase we are currently in began in the late 2000s. This phase shifts the focus from rules to decisions, reviving some of the expert system’s ideas, but preserving the benefits of the business rules approach: rules are still declarative, modular and independent, but they only have meaning in the context of a decision.
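The forward-chaining process described above can be sketched in a few lines: declarative if-then rules are applied to known facts, repeatedly, until no rule can fire. The facts and rules below are invented purely for illustration.

```python
# Toy forward chaining: each rule is (set of required facts, conclusion).
# Rules are declarative and independent; the engine decides the order.
rules = [
    ({"smoker", "age_over_50"}, "high_risk"),
    ({"high_risk"}, "schedule_screening"),
]

def forward_chain(facts: set) -> set:
    """Apply rules to the facts until a fixed point is reached."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(forward_chain({"smoker", "age_over_50"}))
# the result now also contains "high_risk" and "schedule_screening"
```

Backward chaining would run the same rules in reverse: start from the goal ("schedule_screening") and work back to the facts needed to support it, which is exactly the Expert System style of the 1980s.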

Decision making has evolved from hierarchical expert simulation, to bottom-up rules-based strategies, to the top-down decision modeling of today


Let’s double click on the latest “wave” in the decision management world. How does it help you better codify and articulate analytically powered decision-making processes? Well, the success of the Business Rules Revolution meant not only that we had great software like FICO Blaze Advisor, but also that a whole practice emerged around requirements gathering with a focus on business rules. The problem was that projects approached with this rules-first mentality were prone to scope creep, caused by the very nature of the Business Rules Revolution itself. As the name implies, the focus was on the rules, which naturally led to a bottom-up approach to requirements gathering.


The issue with focusing solely on business rules is simple: lack of context. Business rules are only relevant within the context of the greater decision that needs to be made. To improve upon this, the old approach was turned on its head and decision modeling was created. Now, context comes first, and only then are the relevant rules harvested. Hence, a top-down approach was born: Decision Requirements Analysis, outlined through the creation of a Decision Requirements Diagram (DRD for short). Instead of starting with the individual business rules, or trying to force domain experts to think about everything required to make a decision up front, a DRD starts with the decision you want to make and works backward from that “goal”.


A Decision Requirements Diagram begins with a top level decision and maps the data and knowledge needed to get there. Made with FICO DMN Modeler.


So you have a top-level decision; what next? You need a way to work back from that goal. What is required to make this decision? Data and business knowledge for sure, but also the outcomes of other decisions. These three components make up the structure of a DRD and can be represented visually in a diagram. By systematically decomposing all necessary decisions, stopping only when there are no decisions left to decompose, a DRD provides all the needed context for the knowledge required by the decisions.
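One hypothetical way to represent that decomposition in code: each decision names the data, knowledge, and sub-decisions it requires, and a walk of the tree collects every input the top-level decision ultimately depends on. All of the decision and input names below are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """A node in a toy Decision Requirements Diagram."""
    name: str
    data: list = field(default_factory=list)       # input data
    knowledge: list = field(default_factory=list)  # business knowledge
    requires: list = field(default_factory=list)   # sub-decisions

eligibility = Decision("Determine eligibility",
                       data=["applicant record"],
                       knowledge=["eligibility policy"])
top = Decision("Approve application",
               data=["application form"],
               knowledge=["approval rules"],
               requires=[eligibility])

def all_inputs(d: Decision) -> set:
    """Walk the DRD top-down, collecting every data and knowledge input."""
    inputs = set(d.data) | set(d.knowledge)
    for sub in d.requires:
        inputs |= all_inputs(sub)
    return inputs

print(sorted(all_inputs(top)))
# ['applicant record', 'application form', 'approval rules', 'eligibility policy']
```

The decomposition stops, just as described above, when a decision has no sub-decisions left: at that point every rule harvested has a home in the diagram, and its context is explicit.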


Decision Modeling is the latest and greatest in analytically driven decision making, and FICO helped develop the standards and tools so businesses can take advantage of it. We were one of the first to leverage personal computers to put the power of decision-making systems in the hands of domain experts in the 1980s. In the 2000s we empowered business users to manage business logic directly with the introduction of the award-winning Business Rules Management System Blaze Advisor. Now, we have introduced FICO DMN Modeler, our way of learning from the past – and the manifestation of the evolution in decision-making technology. Anyone can use DMN Modeler and create their own DRDs; try the product for free on the FICO Analytic Cloud and join us in the decision modeling revolution.


By Fernando Donati Jorge & Makenna Breitenfeld

Preventative healthcare profits from keeping people healthy and well. For a concept so simple, it’s remarkably complex to execute. Designing programs that prevent illness and complications requires complex models to determine the care best suited to each patient’s medical history, lifestyle, environment, and unique personality. Designing such models is worth the effort: According to one company known for its preventative health care programs, people who report high well-being cost 20% less to treat and are up to $20,000 more productive. Not to mention they’re, well, better.


Brilliant as the human mind may be, hard-coding preventative health care decision logic for even one individual is tremendously costly, both financially and in terms of human error – and the United States has some 326 million people in it. Millions of these people are hospitalized for chronic conditions, and many leave uneducated, confused, and unprepared to manage their condition. In 2004, some 20% of the 12 million Medicare hospital discharges resulted in readmission within 30 days. This number has improved only 1-2% since, and as a result the American economy loses billions of dollars each year.


It would be overly simplistic and unfair to blame health care providers alone for these high readmission rates. Health care is immensely complex. Health information is transitory and high-stakes. To manage it all across millions of unique patients something a bit more artificially intelligent is necessary. Fortunately, emerging technologies are offering solutions that cut costs even as they improve the quality of preventative patient care.


Chief among these upcoming technologies are well-designed analytic models that can adapt to an individual patient more efficiently and accurately than humans. Using massive amounts of patient data to design the right set of questions, a machine can learn about the patient, storing and correlating data as needed to answer those questions, analyze the outcome of various care decisions, and suggest actions better suited to improve that patient’s health conditions.


FICO® Blaze Advisor provides a platform to build such sets of rules and analytical models. A carefully designed solution could analyze an individual’s health profile and environmental conditions to make health recommendations or flag potential concerns, and advise the health practitioner on the best way to engage the patient to pass along that knowledge or treatment plan. It could remind practitioners to follow up with patients flagged as at high risk of prematurely quitting their treatment regimen or otherwise impairing their own recovery.


Technology that enables healthcare providers to identify each patient’s unique best post-treatment care can go a long way towards empowering change. One rules implementation that FICO® Blaze Advisor helped create is already in place and making an impact for a health initiatives provider covering nearly 70 million lives. The initiative uses predictive analytics to identify patients at high risk of readmission post-treatment, then invokes a collaborative care model to inform discharge planning and follow-up. In its initial phase, 22% fewer patients were readmitted than prior to the system’s implementation as a result of the more tailored care. Applied to the 2004 Medicare figures above, where roughly 2.4 million discharges led to readmission, that’s more than 500,000 fewer people returning to the hospital, and that many more people receiving better care.


The system described is essentially a brilliant repurposing of those familiar business rules that Blaze is designed to formulate, execute, and test. Tuned to health concerns, the same technologies that drive targeted marketing campaigns and shopping recommendations can enable health care providers to approach every individual and their condition effectively. Within an arena as variable and complex as healthcare, an efficient and simple rules management platform can empower doctors and nurses to apply patient care rules within their everyday practice. These practitioners’ broad knowledge of medicine and acute view of their patients can be used to build out a system that first deeply understands the patient profile and then applies health care models (rules) to generate data-driven healthcare decisions.


With all due respect to Grey’s Anatomy, each patient’s condition is far more than the sum of their symptoms, and requires a more adaptive guide. Patients’ lives deeply influence both what illnesses they might contract and how likely they are to respond to a certain intervention or treatment. Designing a system to help keep people healthy, then, means computing a person’s best treatment based on a ‘360° view of the patient.’ This includes factors beyond the scope of their health records; it is also relevant where they live and with whom, their financial resources, their personality, and more.


A rules system can evaluate virtually unlimited external factors to determine a patient’s health risk, and enable health professionals to adopt or adapt an effective treatment model for that unique patient. Using a data-informed system grounded in designed logic, treatment can be more consistent and intervention can be thoughtfully executed when necessary, improving the patient's chances for sustained health. Such understanding and agility is unparalleled by anything a single health care provider can provide unaided.


That’s not to suggest that healthcare decisions should be exported to technology. Non-tech savvy clinical employees – that is, those closest to the patient – are still the best people to utilize models to improve patient care. After all, these are the creative and nimble ones holding the practical knowledge to implement the best possible methods of care. So it stands to reason that these doctors, nurses, and clinicians should have an analytical execution environment in which they can organize data-informed rules and rule flows to run predictive models and maintain rules themselves – without IT intervention.


With such a platform, these healthcare providers can locally deploy campaigns to uncover new information about their patients. They could build macro-level analyses of patients to, for instance, allow practitioners to assess post-recovery challenges common within the hospital and thus more likely to affect new patients. Or, they could use an individual patient’s record and profile to deliver more specific care for one injury, based on how that patient’s body has responded to treatments in the past.


With a deep understanding of the patient, a system can determine the best course of action to keep that patient well. It can similarly determine, deploy, and monitor the allocation of resources necessary to provide that care. It’s a complex set of rules and variables that no mind should have to attempt to calculate alone. For the sake of our population and economy, not to mention the health and sanity of our healthcare providers, it’s a problem worth conquering.


I’ll admit: I’m a sucker for banner ads. Their algorithms just know me so well, and offer me such compelling products. As they should: these advertising algorithms know me. As I browse social media sites, store fronts, blogs – virtually any place on the internet – I generate huge amounts of clickstream data. We all do. Marketing agencies buy and analyze that data in order to build what is called a “360° view of the consumer.” This scope enables them to deliver more targeted ads and services. Now, every time anyone loads a page, they receive an experience that is totally unique. The internet knows what to show, specifically, because it knows what that person, specifically, wants and needs.


We call that process “scoring.” Whether through browsing or just living life, each of us generates data, and that data often follows predictable patterns of behavior. AI (artificial intelligence) models make those predictions; the more adaptive the models are to new and changing data, the better they predict. Mapped across populations, they can discern patterns that identify individuals who score highly in the model. These high scorers are the audience. Certain actions, like ads, can be designed to appeal to that audience and then executed for the individual, who will ideally behave as planned: they’ll click the ad. But to limit this power to the commercial realm misses its potential.


What if our health care did what marketing does already? The algorithms that tailor ads could similarly empower doctors with a multi-dimensional, 360° view of their patients. With it, health care providers could offer their patients the care they need, when they need it, in the method they’re most receptive to.


Consider the powerful link between smoking and lung cancer. That a patient smokes is data. That they’ve smoked for 20 years is data. That this 20 years of smoking puts them at high risk of cancer is a model. Reaching out to them about cancer screenings is a decision. Deciding whether and how to reach out to change a patient’s behavior is a strategy. This is the same formula that an ad agency uses when determining what banner ad to show you, but instead of motivating you to buy a gadget (or in my case, organic meal delivery services), it can maximize the efficacy of the healthcare you receive.
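The data-model-decision-strategy chain above can be sketched end to end. Every weight, threshold, and field name below is invented for illustration; a real risk model would be trained on data, not hand-coded.

```python
def risk_model(patient: dict) -> float:
    """Model: turn raw patient data into a risk score (a made-up heuristic)."""
    score = 0.0
    if patient.get("smoker"):
        score += 0.4  # invented base weight for smoking
        # invented contribution that grows with years of smoking, capped at 30
        score += min(patient.get("years_smoking", 0), 30) / 30 * 0.4
    return score

def decision(score: float) -> str:
    """Decision: whether to recommend a screening (invented threshold)."""
    return "recommend_screening" if score >= 0.6 else "no_action"

def strategy(patient: dict, action: str) -> str:
    """Strategy: how to reach out, based on what we know about the patient."""
    if action == "no_action":
        return "none"
    return "phone_call" if patient.get("prefers_phone") else "email"

patient = {"smoker": True, "years_smoking": 20, "prefers_phone": True}
score = risk_model(patient)
action = decision(score)
print(score, action, strategy(patient, action))
# a 20-year smoker crosses the threshold and gets a phone call
```

Swap the invented heuristic for a trained predictive model and the if-statements for managed business rules, and this is structurally the same pipeline an ad platform runs, pointed at health outcomes instead of clicks.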


It’s not perfect in its predictive power, of course, but with the ability to piece together seemingly inane data into a 360° view of a patient, the same artificial intelligence that advertises products can be an extremely valuable tool for improving health care and outcomes.


That’s easy enough to theorize; implementation takes more effort. But this isn’t some fantastical scheme – the analytic tools needed to collect the data, process it according to certain rules, and return a viable health care solution already exist. At FICO, we call it a Propensity Score. With it, healthcare providers can use a proprietary combination of rich third-party data to build a 360° view of the patient and determine the patient’s environmental health risks, likelihood to engage, and level of responsible healthcare consumption. This data can be far more reliable than a patient’s self-reporting.


After all, whether one is likely to quit their treatment protocol halfway through a prescription isn’t the sort of information that a new patient questionnaire can capture. It’s not really the sort of information that can be captured at all. Health information is, after all, extremely private, and not always obvious even to the patient. Patients don’t necessarily know that certain environmental conditions are strongly correlated with diabetes and obesity (note: diet and physical activity are strong indicators of overall health, including one’s risk for diabetes and obesity). That’s what health indicators – data beyond the scope of a patient’s secured records and self-reporting – are for, and fortunately, these can be found virtually anywhere analysis occurs.


Once risk is identified, providing the right messages at the right times via the right medium can dramatically improve a patient’s receptiveness to treatment. If a patient isn’t demonstrably concerned with their health, a form email is not the way to get them into the office. Direct contact, like a phone call or home visit, might be more impactful. How a system measures a patient’s likelihood to engage with a doctor’s chosen method of communication is determined by how the question is defined, but could be informed by virtually anything: medical histories, lifestyle, level and focus of education, or even online habits. Knowing how to engage the patient is almost as important as identifying health risks, and in this way can be just as data-informed.


Once the patient is willingly through the doors and seated on the crinkly paper-covered exam bench, care in the truest sense of the word must still be provided. Again, data processing can help the provider determine the most successful treatment plans. Now that the patient is here, a practitioner can know how to allocate their effort and resources to best serve their patient, based on that patient’s expected behavior. A skeptical patient might benefit from a few extra minutes frankly discussing risks of their illness. A complicated treatment plan might be more successful with careful take-home instructions, and a follow-up call one week into treatment. This well-informed plan of action can extend the period of wellness that follows.


For health insurance providers, providing effective care to patients is core to the job. Yet it can be incredibly complicated, because patients are diverse. Every patient has a different physical body, medical history, lifestyle, and personality. Data surrounds every facet of that being and can therefore provide tremendous insight. Knowing what one does about a patient – descriptive data – can predict that patient’s respective risk for any ailment – predictive data. Determining the appropriate treatment and method of communication is an act of decision modeling. United in a single program, this information and decisioning power can help actually modify patients’ behavior and return better health outcomes.


Such technology is already at work in the advertisements that convince me to buy overpriced organic popcorn. Imagine how lives could improve if healthcare shared that power.