How To Build an Explainable ML Model
Ensemblex’s comprehensive approach to explainability goes beyond modeling techniques to deliver explainable models that consistently pass rigorous validation and review by government regulators, including review for ECOA compliance. Multiple clients successfully use these models today, across business lines and geographies.
In May 2022, the CFPB issued a circular clarifying its stance on machine learning (ML) models:
“…ECOA and Regulation B do not permit creditors to use complex algorithms when doing so means they cannot provide the specific and accurate reasons for adverse actions.”
This shouldn’t be a surprise. Explainability has been the standard since the ECOA was enacted. New technology—in this case ML—doesn’t change that requirement. It just means that lenders need to be mindful of how they build ML underwriting models.
In the wake of the CFPB’s circular, much of the discussion around explainability has concerned the precise techniques used to generate reliable adverse action reasons for ML models. In our view, explainability is more than a calculation. It’s a foundational principle that’s built into our entire development process.
If you’re considering incorporating ML into your underwriting model, or you’re concerned about your model’s explainability, here are some ideas to keep in mind:
1. Start with approved variables
Delivering a compliant model means starting with a list of variables that are FCRA-compliant and intuitive predictors of credit.
Much of the buzz around ML and other advanced technologies centers on “big data” and an ever-growing range of data sources. However, when it comes to underwriting credit risk, boring data is better.
We work with our clients to derive maximum power from data, but that data comes from the client’s own business and other industry-respected sources. We’ve proved time and again that it’s possible to build an incredibly powerful machine learning model without drawing from exotic data sources.
2. Reduce complexity
We never chase complexity for its own sake. While ML models can ingest more data than traditional techniques, a cornerstone of our modeling approach is carefully balancing simplicity with performance.
Throughout model development, we iteratively assess all available data across multiple dimensions to ensure that we include the minimum number of variables needed to deliver the desired performance improvements. Often, only a handful of variables deliver the bulk of the predictive power. Removing less effective variables reduces the complexity of the model, and potentially its cost, while making the model easier to explain.
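The pruning idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any production workflow: the `evaluate` function is a hypothetical stand-in for retraining the model on a variable subset and measuring holdout performance, and the variable names, lift values, and `max_loss` tolerance are all invented for the example.

```python
def prune_variables(variables, evaluate, max_loss=0.005):
    """Greedy backward elimination: repeatedly drop the variable whose
    removal costs the least performance, and stop once the total loss
    relative to the full model would exceed max_loss."""
    kept = list(variables)
    baseline = evaluate(kept)  # performance with every variable included
    while len(kept) > 1:
        # Score each candidate model that omits exactly one variable.
        candidates = [
            (evaluate([v for v in kept if v != drop]), drop) for drop in kept
        ]
        best_score, drop = max(candidates)
        if baseline - best_score > max_loss:
            break  # every remaining variable is worth keeping
        kept.remove(drop)
    return kept


# Hypothetical per-variable lifts standing in for a real retrain-and-validate
# step. In practice, evaluate() would refit the model and score a holdout set.
LIFTS = {
    "utilization": 0.15,
    "delinquencies": 0.10,
    "months_on_book": 0.04,
    "inquiries": 0.002,
    "num_accounts": 0.001,
}


def evaluate(variables):
    # Toy performance metric: a 0.5 floor plus each included variable's lift.
    return 0.5 + sum(LIFTS[v] for v in variables)


all_variables = list(LIFTS)
kept = prune_variables(all_variables, evaluate)
print(kept)
```

In this toy setup the two weakest variables are dropped and the three strongest are kept, which mirrors the point that a handful of variables often carry most of the predictive power. With a real model the lifts are not additive, so each call to `evaluate` would need an actual refit.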
3. Get comfortable explaining how the model works
Lenders should always be able to explain how their models work, especially if they’re developed by an outside firm. At Ensemblex, we partner with clients throughout the entire process, so that they’re intimately familiar with all aspects of our model and its decisions.
That partnership starts with variable selection and data preparation. We consult closely with clients on every design and technical choice and then conduct a rigorous joint analysis of all model decisions. For clients with little previous ML experience, we take extra care to explain our methods at each stage of the process and to tie our modeling decisions to meaningful business metrics. The result is an ML model that our clients feel comfortable owning and explaining.
4. Balance advancement with understanding
As the techniques and algorithms for both ML development and explainability have evolved in recent years, we’ve stayed at the forefront of understanding and deploying those tools in our own models and on behalf of clients. This includes using Shapley values for both general model explainability and applicant-level adverse action reasons, a technique on which much of the conversation around explainability now centers. Rooted in cooperative game theory, Shapley values assign each variable an additive contribution to an applicant’s model score, relative to a given benchmark.
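To make the Shapley idea concrete, here is a minimal Python sketch that computes exact Shapley values by enumerating every coalition of variables. Everything in it is an assumption made for illustration: the toy `score` function (treated as a risk score where higher is worse), the variable names, the applicant and benchmark values, and the rule of reporting the largest positive contributions as candidate adverse action reasons. Exact enumeration is only practical for a few variables; real models typically rely on approximation libraries.

```python
from itertools import combinations
from math import factorial


def score(x):
    # Hypothetical toy risk model (higher = riskier) with one interaction term.
    return (0.5 * x["utilization"]
            + 0.25 * x["delinquencies"]
            + 0.2 * x["utilization"] * x["inquiries"])


def shapley_values(model, applicant, benchmark):
    """Exact Shapley values: variables outside a coalition take the
    benchmark's value, variables inside it take the applicant's value."""
    features = list(applicant)
    n = len(features)

    def value(coalition):
        mixed = {f: applicant[f] if f in coalition else benchmark[f]
                 for f in features}
        return model(mixed)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def adverse_action_reasons(phi, top_n=2):
    # Rank variables by how much they pushed the risk score above benchmark.
    ranked = sorted(phi, key=phi.get, reverse=True)
    return [f for f in ranked if phi[f] > 0][:top_n]


applicant = {"utilization": 0.9, "delinquencies": 2, "inquiries": 4}
benchmark = {"utilization": 0.3, "delinquencies": 0, "inquiries": 1}

phi = shapley_values(score, applicant, benchmark)
reasons = adverse_action_reasons(phi)
print(phi)
print(reasons)
```

The contributions always sum to the difference between the applicant’s score and the benchmark’s score, which is what makes them usable as an additive explanation. The choice of benchmark matters: the same applicant can receive different reasons against a different reference point.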
With that said, we work hard to balance advances in explainability against both industry acceptance and our clients’ understanding. While we applaud those at the cutting edge of further refining ML explainability tools, we’re careful to make sure any techniques we employ are clearly understood by our clients and accepted by industry professionals and regulators. This measured approach has allowed us to launch and manage multiple ML underwriting models, as both business owners and consultants, with zero enforcement actions from regulators.
Bottom line: Make explainability the core of the process, not a feature of the model.
ML models can be safely and transparently developed and explained with an array of established tools, as long as you have the right partner to guide you along the way. We at Ensemblex are uniquely qualified to be that partner. If there’s anything we can do to help, please don’t hesitate to reach out.