
Model Monitoring - Can You Afford Not to? Part 2


In a recent blog entry, I discussed the important topic of model monitoring. That entry examined the case for model monitoring, covered some fundamentals, and addressed several business considerations. In this installment, I discuss model monitoring from a tactical perspective, including a few methods to consider for a successful model monitoring program.

Depending on the number and complexity of models an insurer has in use, model monitoring can quickly become overwhelming. A good first step in building a solid model monitoring program is to catalog the models in use, including any state-specific versions, and to assess the relative importance of each. Importance can be gauged in a variety of ways, including the number of policies a model potentially impacts or the premium volume it influences. An insurer may also want to assess each model's complexity and potential stability. Combining these characteristics helps determine which models are most important to monitor first (relative priority).
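
To make the prioritization concrete, here is a minimal sketch of how those characteristics might be combined into a single relative-priority score. The Model fields, the 1-to-5 judgmental scales, and the weights are all hypothetical; an insurer would substitute its own factors and judgment.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    policies_impacted: int      # policies the model potentially touches
    premium_influenced: float   # premium volume the model influences
    complexity: int             # judgmental scale, 1 (simple) to 5 (complex)
    stability_risk: int         # judgmental scale, 1 (stable) to 5 (volatile)

def priority_score(m: Model, max_policies: int, max_premium: float) -> float:
    """Blend business exposure with model risk into a 0-to-1 priority.
    The 60/40 weighting is illustrative, not prescriptive."""
    exposure = 0.5 * (m.policies_impacted / max_policies) \
             + 0.5 * (m.premium_influenced / max_premium)
    risk = (m.complexity + m.stability_risk) / 10.0
    return 0.6 * exposure + 0.4 * risk

inventory = [
    Model("Auto rating - State A", 120_000, 95e6, 4, 3),
    Model("Claims triage", 40_000, 30e6, 2, 2),
]
max_p = max(m.policies_impacted for m in inventory)
max_d = max(m.premium_influenced for m in inventory)
for m in sorted(inventory, key=lambda m: priority_score(m, max_p, max_d), reverse=True):
    print(f"{m.name}: priority {priority_score(m, max_p, max_d):.2f}")
```

However the score is constructed, the point is a defensible ranking that tells the monitoring team where to start.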

Once that is done, an insurer can turn to defining what to monitor for each model. This requires a good understanding of the model's purpose and its expected outcomes, such as average score, percentage of business automated, or percentage of claims that pass through without manual handling. The Population Stability Index (PSI) is a well-accepted way to measure how much a modeling population has changed over time, and whether the change is enough to raise concern.
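
PSI itself is simple to compute: split the score range into bins (often deciles of the baseline population), then sum (Ai − Ei) × ln(Ai / Ei) across bins, where Ei and Ai are the shares of the baseline and current populations in bin i. A common rule of thumb reads values below 0.10 as stable, 0.10 to 0.25 as a moderate shift worth watching, and above 0.25 as a significant shift. Here is a minimal sketch; the function name, bin count, and the small floor that guards against empty bins are all illustrative choices, and it assumes roughly continuous scores.

```python
import numpy as np

def population_stability_index(expected, actual, n_bins=10):
    """PSI between a baseline (expected) and a current (actual) sample of scores."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Bin edges come from baseline quantiles so both samples are binned on the
    # same scale; stretch the outer edges to catch current scores that fall
    # outside the baseline's range.
    edges = np.quantile(expected, np.linspace(0.0, 1.0, n_bins + 1))
    edges[0] = min(edges[0], actual.min())
    edges[-1] = max(edges[-1], actual.max())

    expected_pct = np.histogram(expected, bins=edges)[0] / expected.size
    actual_pct = np.histogram(actual, bins=edges)[0] / actual.size

    # Floor the shares so an empty bin cannot produce log(0) or divide by zero.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)

    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))
```

Run monthly or quarterly against the scores from the model-build data, this yields a single drift number per model that is easy to track and to set tolerances around.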

While it is ideal to monitor both the inputs and outputs of a model, this can be overwhelming at the beginning. I'd recommend starting small and building up. In that spirit, it is best to focus on the final outcome and start by monitoring the output. Doing so keeps the ultimate business purpose of the model under review. It does not rule out something going wrong inside the model, but at least the end result is being watched. Over time, more aspects of the model can be added to the monitoring plan.

Once monitoring metrics are defined, an insurer must assess what data is available. Unfortunately, this can force a monitoring plan to be altered, since the necessary data is sometimes not fully available, or not available at the desired frequency. Poor data quality and data latency can both limit what can be monitored and increase the time and effort a monitoring program requires.

Building a repeatable process, supported by the proper tools, is also a key aspect of developing a successful monitoring plan. Being able to dive deeper into the data when necessary is important, as is the ability to visualize it. A number of software tools currently provide this functionality.

It’s ideal to automate as much of the monitoring as possible. The concept is to set tolerances on the data and have alerts or flags trigger when a value falls “out of tolerance.” This can be done for both inputs and outputs. A major benefit of this approach is that it focuses the attention of the employees responsible for model monitoring as efficiently as possible.

Instead of having to manually review every result and decide whether deeper analysis is needed, alerts point employees to the results of most concern. It is a way to “look at more without having to look at more.” Setting tolerances is a balance of many considerations, including the workload capacity of the model monitoring team. See below for a few examples of wider versus tighter tolerances.

 

[Two graphs: daily average scores plotted against tolerance bands; the top graph shows a wider tolerance and the bottom graph a tighter one, with flagged values marked by black arrows.]

In the top graph, the tolerance is set wider, which causes fewer results to flag (denoted by the black arrows) than with the tighter tolerance in the bottom graph. Volume of data can and should also be considered when determining the most effective tolerances. There may also be situations where a certain value triggers an immediate review. For example, if the average daily score above were zero for a day, that could signify an issue in the scoring system, and an immediate alert could trigger at that point.
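
A minimal sketch of such a check follows. The band here is the baseline mean plus or minus k standard deviations, with a hard rule for a zero average; both the three-standard-deviation default and the zero rule are illustrative choices, not a prescription. Tightening k flags more days, widening it flags fewer, which is the trade-off shown in the graphs above.

```python
import numpy as np

def flag_out_of_tolerance(daily_averages, baseline_mean, baseline_std, k=3.0):
    """Flag days whose average score falls outside baseline_mean +/- k * baseline_std,
    plus an immediate flag for a zero average, which likely signals a scoring-system
    failure rather than ordinary drift."""
    daily_averages = np.asarray(daily_averages, dtype=float)
    lower = baseline_mean - k * baseline_std
    upper = baseline_mean + k * baseline_std
    drift = (daily_averages < lower) | (daily_averages > upper)
    hard_stop = daily_averages == 0.0
    return drift | hard_stop

# Example: baseline average score of 0.62 with std 0.04; day 3 drifts high, day 5 reads zero.
print(flag_out_of_tolerance([0.61, 0.63, 0.80, 0.62, 0.0], 0.62, 0.04))
# [False False  True False  True]
```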

As we’ve discussed before, model monitoring is a critical part of the modeling lifecycle, though it is often overlooked. It is important to devote the proper resources and attention to it, including from a technical standpoint. In fact, a model monitoring program may even help shape an insurer’s data strategy. If you are interested in strengthening your model monitoring program, we at Pinnacle would appreciate the opportunity to discuss your plans with you.
