
FinRegLab, a nonprofit focused on innovations that advance responsibility and inclusiveness in the financial sector, has released the results of a broad research project entitled Explainability and Fairness in Machine Learning for Credit Underwriting (1). Noteworthy for its range of empirical testing and numerous collaborators, including many leading financial technology companies, the paper is a comprehensive review of the tools available to help lenders use ML for credit underwriting responsibly and compliantly, namely by ensuring that ML underwriting models are both explainable and fair.

The stakes are high. ML underwriting is already driving many credit decisions, and that share is growing. With its ability to handle lots of data and give more accurate decisions, ML has the potential to improve business results and expand access to credit. At the same time, “the very quality that fuels ML models’ greater predictive power—their ability to detect more complex data patterns than prior generations of credit algorithms—makes them more difficult to understand and increases concerns that they could exacerbate inequalities and perform poorly in changing data conditions.” (FinRegLab)

FinRegLab’s review concludes that many of the emerging tools for improving accuracy and reducing disparities in ML models show great promise. But despite that promise, there is still no easy answer to these concerns. Critically, FinRegLab found “no ‘one size fits all’ technique or tool that performed the best across all regulatory tasks.” Accordingly, the paper concludes that more industry and regulatory guidance is needed to help stakeholders navigate the available tools and their associated trade-offs.

3 Key Facets of Compliance

While much of FinRegLab’s latest project focuses on fairness and inclusion, their overview helpfully buckets compliance into three areas: adverse action, model risk management, and fair lending.

Adverse Action

Adverse action refers to the regulations that require lenders to disclose the key factors that led to either a denial of credit or negative effects on pricing. For traditional regression-based scorecards, the process is relatively simple. Point values from an applicant’s scorecard are compared to some baseline values, and the differences in the applicant’s scores for specific attributes are ranked. These scorecard attributes are mapped to reason statements that group features together and state their meaning in accessible language.
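To make the scorecard process concrete, here is a minimal sketch of the compare, rank, and map steps. The attribute names, point values, baseline, and reason statements are all hypothetical and chosen only for illustration; a real scorecard would define its own baseline (for example, the maximum attainable points per attribute) and its own approved reason statements.

```python
from typing import Dict, List

# Hypothetical points earned by one applicant on each scorecard attribute.
applicant_points: Dict[str, int] = {
    "utilization": 20,
    "payment_history": 55,
    "account_age": 30,
    "recent_inquiries": 5,
}

# Hypothetical baseline points for the same attributes.
baseline_points: Dict[str, int] = {
    "utilization": 60,
    "payment_history": 70,
    "account_age": 35,
    "recent_inquiries": 25,
}

# Hypothetical mapping from attributes to plain-language reason statements.
reason_statements: Dict[str, str] = {
    "utilization": "Balances are too high relative to credit limits",
    "payment_history": "History of late payments",
    "account_age": "Length of credit history is too short",
    "recent_inquiries": "Too many recent credit inquiries",
}


def adverse_action_reasons(
    applicant: Dict[str, int],
    baseline: Dict[str, int],
    statements: Dict[str, str],
    top_n: int = 2,
) -> List[str]:
    """Rank attributes by shortfall versus baseline and return the top reasons."""
    shortfalls = {name: baseline[name] - applicant[name] for name in baseline}
    ranked = sorted(shortfalls, key=lambda name: shortfalls[name], reverse=True)
    # Only attributes where the applicant falls short of baseline count as reasons.
    return [statements[name] for name in ranked[:top_n] if shortfalls[name] > 0]


reasons = adverse_action_reasons(applicant_points, baseline_points, reason_statements)
for reason in reasons:
    print(reason)
```

In this toy case the largest shortfalls are on utilization and recent inquiries, so those two statements are returned.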

For ML credit models, coefficients or point values for specific features are not readily available. Moreover, ML models use more data and find interactions in that data that more traditional models miss. ML models therefore require different adverse action methods. Many methods are already in use across the industry. FinRegLab’s research is a valuable, comprehensive comparison of the performance of these tools for different compliance tasks. Our decade of experience supports FinRegLab’s conclusion: the best tool will vary by use case, but the sound performance of popular methods is reassuring and promising.

Model Risk Management

Model Risk Management (“MRM”) is a broader category that refers to the oversight and governance of the entire model lifecycle. Put simply, organizations need to have a deliberate framework that ensures their business fully understands their credit model and that it performs as expected. As with adverse action, explainability is a critical component of MRM, and ML models demand the use of more advanced explainability techniques. That said, much of sound MRM is unchanged by the evolution of ML.

Fair Lending

Of the three areas of compliance framed by FinRegLab, fair lending is unquestionably the one generating the most attention across the lending industry. Fair lending refers to both the specific prohibition of using race, gender, or other protected characteristics in underwriting models and the use of other superficially neutral features that still lead to disparate impacts in decisions for protected classes. Traditional fair lending compliance has revolved around ensuring that obviously discriminatory features are excluded from your model build, then testing and removing features post hoc. Typically, this testing is grounded in comparing a measure of model performance (for example, Area Under the Receiver Operating Characteristic Curve, or “AUC”) and a measure of disparate impact (for example, Adverse Impact Ratio, or “AIR”, which measures the ratio of approval rates between protected classes and benchmarks).
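As a rough illustration of the two measures, the sketch below computes AIR and AUC on a handful of made-up applicants. The data, function names, and the convention that a higher score and a label of 1 both mean "good" are assumptions for this example only. AUC is computed directly from its pairwise definition rather than with a library.

```python
from typing import Sequence


def adverse_impact_ratio(approved: Sequence[int], is_protected: Sequence[int]) -> float:
    """Approval rate of the protected group divided by that of the benchmark group."""
    protected = [a for a, p in zip(approved, is_protected) if p]
    benchmark = [a for a, p in zip(approved, is_protected) if not p]
    protected_rate = sum(protected) / len(protected)
    benchmark_rate = sum(benchmark) / len(benchmark)
    return protected_rate / benchmark_rate


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Share of (good, bad) pairs where the good account has the higher score.

    Ties count as half a win. This is the pairwise definition of AUC.
    """
    good = [s for s, y in zip(scores, labels) if y == 1]
    bad = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if g > b else 0.5 if g == b else 0.0 for g in good for b in bad)
    return wins / (len(good) * len(bad))


# Made-up decisions: the first four applicants are in the protected group.
approved = [1, 0, 1, 0, 1, 1, 1, 0]
is_protected = [1, 1, 1, 1, 0, 0, 0, 0]

# Made-up model scores and outcomes (1 = repaid, 0 = defaulted).
scores = [0.9, 0.4, 0.8, 0.3, 0.7, 0.6, 0.85, 0.2]
labels = [1, 0, 1, 0, 1, 1, 0, 0]

air = adverse_impact_ratio(approved, is_protected)
model_auc = auc(scores, labels)
print(f"AIR = {air:.3f}, AUC = {model_auc:.4f}")
```

Here the protected group is approved at 50% against 75% for the benchmark, giving an AIR of about 0.667. Comparing candidate models on both numbers at once is what exposes the trade-off between performance and disparity.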

Naturally, many lenders have been applying this same technique to ML models. But because of the complex structure of ML models, with much longer feature lists and interactions between features, the efficacy of this method is questionable. In our experience, post hoc feature removal on ML models has minimal impact on AIR. This is no surprise given the great care taken to remove potentially problematic features throughout the development process. Indeed, FinRegLab found that post hoc feature removal had minimal impact on the actual treatment of protected classes by ML models while often negatively impacting performance.

An alternative approach for fair lending and the search for less disparate alternatives (“LDAs”) is evolving in the ML space. It takes a proactive approach toward fairness and inclusion, incorporating it into the model development process. Automated platforms scan a wide array of available model features and iterate on potential models with the aim of identifying fairer model versions. According to FinRegLab’s research, this method is far more effective at reducing disparity. However, that does not mean this method is overall a better approach to fair lending compliance. There are significant tensions between this method and modeling best practices which must be considered.

Fair Lending Trade-offs

If an automated search for LDAs is conducted as a post hoc process, for example by a separate compliance team or vendor, then it’s in tension with the fundamentals of MRM. Sound MRM, even for advanced ML models that can handle more data, demands careful stewardship of data: examining feature importance, correlations, special values, stability over time, and monotonicity. In our engagements, tools and algorithms support feature selection, but manual processes and discussions are invaluable.

Removing selected features from your carefully selected list as part of post hoc compliance review is one thing. But pivoting to an LDA discovered through an automated de-biasing search, which may completely overhaul your chosen features, is another matter entirely. It negates much of the work done throughout the development process and could produce a poorly performing model divorced from your business goals.

The obvious way to avoid this dilemma is to incorporate automated de-biasing methods and the search for LDAs into the development process itself. Much of the MRM outlined above could take place in concert with the search for LDAs. But as FinRegLab points out, this would be a fundamental departure from the safeguards lenders have traditionally put into place to avoid disparate treatment in their models. Put simply, a basic way to ensure that data such as race or gender is not blatantly used in underwriting is to wall off demographic data and other personal information from model development. FinRegLab notes: “A threshold question is whether specific de-biasing techniques are permissible under fair lending laws to the extent that they use data about protected class membership in different ways than traditional mitigation approaches.” So, until more regulatory guidance or clarity is forthcoming, lenders using traditional safeguards will have to conduct these LDA de-biasing tests separately from core model development. As a result, the tension between these techniques and more holistic MRM will remain.

No One-Size-Fits-All Approach

FinRegLab’s approach and conclusions resonated with our team at Ensemblex. We helped pioneer the use of ML for credit underwriting over a decade ago. When we put our first ML underwriting model into production, ML in financial services was the domain of a few edgy start-ups. We now see widespread acceptance of ML’s potential in financial services, with techniques for understanding and managing ML growing in tandem. These tools show tremendous promise and are continually improving.

But after deploying ML underwriting models for the past decade across a range of geographies and verticals, one of our key lessons is that there is indeed no “one-size-fits-all” approach to ML, explainability, and fairness. Each business must make the best use of available tools for their specific products, data, and business considerations. If you’d like to understand more about these trade-offs and how to navigate them, give us a call. We’d love to hear from you.


(1) Unless noted otherwise, all quotations and other references to FinRegLab are drawn from Machine Learning Explainability & Fairness: Insights from Consumer Lending and/or Explainability & Fairness in Machine Learning for Credit Underwriting: Policy & Empirical Findings Overview.

Leland Burns

Ensemblex’s comprehensive approach to explainability goes beyond modeling techniques to deliver explainable models that consistently pass rigorous validation and reviews by government regulators, including ECOA compliance. Multiple clients successfully use these models today, across business lines and geographies.

In May, the CFPB issued a circular that clarified its stance on machine learning (ML) models:

“…ECOA and Regulation B do not permit creditors to use complex algorithms when doing so means they cannot provide the specific and accurate reasons for adverse actions.”

This shouldn’t be a surprise. Explainability has been the standard since the ECOA was enacted. New technology—in this case ML—doesn’t change that requirement. It just means that lenders need to be mindful of how they build ML underwriting models.

In the wake of the CFPB’s circular, much of the discussion around explainability has concerned the precise techniques used to generate reliable adverse action reasons for ML models. In our view, explainability is more than a calculation. It’s a foundational principle that’s built into our entire development process.

If you’re considering incorporating ML into your underwriting model, or you’re concerned about your model’s explainability, here are some ideas to keep in mind:

1. Start with approved variables

Delivering a compliant model means starting with a list of variables that are FCRA-compliant and intuitive predictors of credit.

Much of the buzz around ML and other advanced technologies centers on the use of “big data” and an ever-growing range of data sources. However, when it comes to underwriting credit risk, boring is better for data.

We work with our clients to derive maximum power from data, but that data comes from the client’s own business and other industry-respected sources. We’ve proved time and again that it’s possible to build an incredibly powerful machine learning model without drawing from exotic data sources.

2. Reduce complexity

We never chase complexity for its own sake. While ML models can ingest more data than traditional techniques, a cornerstone of our modeling approach is carefully balancing simplicity with performance.

Throughout model development, we iteratively assess all available data across multiple dimensions, to ensure that we are including the minimal number of variables needed to deliver desired performance improvements. Oftentimes, only a handful of variables deliver the bulk of the predictive power. Removing less effective variables will reduce the complexity of the model — and potentially the cost — while making the model easier to explain.
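One simple way to see how few variables carry most of the predictive power is to rank them by importance and keep the smallest set that reaches a target share. The sketch below assumes you already have a per-variable importance number from your model; the variable names, importance values, and 80% target are hypothetical. Importance is only one of the dimensions mentioned above, so treat this as a first-pass screen, not a full selection process: the reduced model still needs to be re-fit and validated.

```python
from typing import Dict, List

# Hypothetical importance scores from a fitted model (they sum to 1.0 here).
importances: Dict[str, float] = {
    "utilization": 0.40,
    "payment_history": 0.30,
    "account_age": 0.12,
    "income": 0.08,
    "recent_inquiries": 0.05,
    "num_accounts": 0.03,
    "months_at_address": 0.02,
}


def smallest_covering_set(scores: Dict[str, float], target_share: float) -> List[str]:
    """Return the fewest variables whose combined importance reaches target_share."""
    total = sum(scores.values())
    selected: List[str] = []
    running = 0.0
    # Walk the variables from most to least important.
    for name, value in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(name)
        running += value
        if running / total >= target_share:
            break
    return selected


selected = smallest_covering_set(importances, target_share=0.80)
print(f"{len(selected)} of {len(importances)} variables: {selected}")
```

In this made-up example, three of the seven variables account for 82% of total importance.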

3. Get comfortable explaining how the model works

Lenders should always be able to explain how their models work, especially if they’re developed by an outside firm. At Ensemblex, we partner with clients throughout the entire process, so that they’re intimately familiar with all aspects of our model and its decisions.

That partnership starts with variable selection and data preparation. We carefully consult with them through all design and technical choices and then conduct a rigorous partnered analysis into all model decisions. For clients with little previous ML experience, we take extra care to explain our methods at each stage of the process and to tie our modeling decisions to meaningful business metrics. The result is an ML model that our clients feel comfortable owning and explaining.

4. Balance advancement with understanding

As the techniques and algorithms for both ML development and explainability have evolved in recent years, we’ve stayed at the forefront of understanding and deploying those tools in our own models and on behalf of clients. This includes using Shapley values for both general model explainability and applicant-level adverse action reasons, a technique upon which much of the conversation around explainability is now centered. Rooted in cooperative game theory, Shapley values define the additive contribution of each variable to an applicant’s model score, relative to a given benchmark.
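To show what "additive contribution relative to a benchmark" means, here is a minimal sketch that computes exact Shapley values by enumerating every subset of variables. The toy scoring function, variable names, applicant, and benchmark are all hypothetical. Exact enumeration grows exponentially with the number of variables, so this only works for tiny examples; it is not how production models are explained, where approximation methods are typically used instead.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet

Features = Dict[str, float]


def toy_score(x: Features) -> float:
    """Hypothetical score with an interaction between utilization and delinquencies."""
    return (
        700.0
        - 120.0 * x["utilization"]
        - 40.0 * x["delinquencies"]
        - 30.0 * x["utilization"] * x["delinquencies"]
        + 5.0 * x["account_age_years"]
    )


def shapley_values(
    model: Callable[[Features], float], applicant: Features, benchmark: Features
) -> Dict[str, float]:
    """Exact Shapley contribution of each variable, relative to the benchmark."""
    names = list(applicant)
    n = len(names)

    def value(coalition: FrozenSet[str]) -> float:
        # Variables in the coalition take the applicant's values;
        # all others stay at the benchmark's values.
        mixed = {k: applicant[k] if k in coalition else benchmark[k] for k in names}
        return model(mixed)

    contributions: Dict[str, float] = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        contributions[name] = total
    return contributions


applicant = {"utilization": 0.9, "delinquencies": 2.0, "account_age_years": 3.0}
benchmark = {"utilization": 0.2, "delinquencies": 0.0, "account_age_years": 8.0}

contributions = shapley_values(toy_score, applicant, benchmark)
gap = toy_score(applicant) - toy_score(benchmark)
for name, value in sorted(contributions.items(), key=lambda kv: kv[1]):
    print(f"{name:<20} {value:+.1f}")
print(f"{'total':<20} {sum(contributions.values()):+.1f} (score gap {gap:+.1f})")
```

The contributions add up to the gap between the applicant's score and the benchmark's score, which is the additive property described above. Sorting them from most negative upward gives a natural ordering for adverse action reasons, and the interaction effect is split between the two variables involved.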

With that said, we work hard to balance explainability with both industry acceptance and our clients’ understanding. While we applaud those at the cutting edge of further refining ML explainability tools, we’re careful to make sure any techniques we employ are clearly understood by our clients and sanctioned by industry professionals and regulators. This measured approach has allowed us to launch and manage multiple ML underwriting models, as both business owners and consultants, with zero enforcement actions from regulators.

Bottom line: Make explainability the core of the process, not a feature of the model.

ML models can be safely and transparently developed and explained with an array of established tools, as long as you have the right partner to guide you along the way. We at Ensemblex are uniquely qualified to be that partner. If there’s anything we can do to help, please don’t hesitate to reach out.

A few years ago, I had a conversation with the CEO of a promising credit startup that crashed and was sold for parts just one year after a big raise.

“We raised too much money,” he said. “I thought we could figure out the economics later.”

It turns out the team had pursued an aggressive growth strategy — acquiring lots of customers without focusing on lifetime value and chasing several new opportunities at once.

That decision proved to be fatal for his company. We see a lot of companies that have been doing the same over the last several years. In an environment where the next raise appears to be a given, that strategy can work – for a while.

We’re entering a time when a number of high-growth fintechs are going to experience the sudden impact of a changing economy. Investors are pulling back. Consumers are feeling increasing pain on a number of fronts. The combination means, for some, the runway may be shorter than they expected. Fintechs need to learn how to drive profitability. Now.

My goal today is to help managers assess their position and make the most of the runway they have left. Here are some of the questions managers should be asking themselves, along with some advice based on a few decades of experience in credit, both as a consultant and an operator.

How are we doing?

Don’t look at your P&L. Look at your unit economics.

At Capital One we often talked about “speedboating.” Growing businesses are often outrunning (speedboating) their metrics, especially in the credit industry where lenders have high upfront costs and recoup their investment over long periods of time. Speedboating can make a sustainable business look bad on a P&L statement (e.g. acquisition costs outpace earnings) or a bad business look good (e.g. revenues precede losses). The only way to understand whether your business model is sustainable is to look at the unit economics.
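A small simulation makes the speedboating effect easier to see. The sketch below uses made-up numbers (acquisition cost, monthly margin, customer lifetime, and growth rate are all assumptions) for a business where every customer is profitable over their lifetime, yet the monthly P&L stays negative because each new cohort's upfront cost outruns the earnings from earlier cohorts.

```python
from typing import List

ACQUISITION_COST = 100.0  # paid up front per customer (hypothetical)
MONTHLY_MARGIN = 15.0     # earned per customer per month (hypothetical)
LIFETIME_MONTHS = 12      # months each customer keeps paying (hypothetical)
GROWTH = 1.30             # each monthly cohort is 30% larger than the last

# Unit economics: lifetime earnings minus upfront cost, per customer.
unit_profit = MONTHLY_MARGIN * LIFETIME_MONTHS - ACQUISITION_COST


def monthly_pnl(months: int, first_cohort: float = 100.0) -> List[float]:
    """Monthly P&L for a business adding ever-larger cohorts of customers."""
    cohorts = [first_cohort * GROWTH**t for t in range(months)]
    pnl = []
    for t in range(months):
        # Customers acquired in the previous LIFETIME_MONTHS months are still paying.
        active = sum(cohorts[max(0, t - LIFETIME_MONTHS):t])
        pnl.append(MONTHLY_MARGIN * active - ACQUISITION_COST * cohorts[t])
    return pnl


pnl = monthly_pnl(24)
print(f"Profit per customer over lifetime: {unit_profit:.0f}")
print(f"Months with positive P&L: {sum(1 for x in pnl if x > 0)} of {len(pnl)}")
```

With these assumptions the P&L is negative in every month even though each customer earns back more than they cost. Flip the numbers (low upfront cost, losses arriving late) and the same mechanics can make a bad business look good.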

What drives our economics?

Identify your key profit drivers and use them to manage your business. For example, if you have a buy now, pay later (BNPL) business, you probably care a lot about first pay defaults and repeat rates. You may be able to sustain a loss on the first loan (due to high upfront costs) if you get enough profitable repeats. If not, you need to bring the economics of the first loan into the black.

How resilient are our economics?

The ability to absorb “hits” is crucial to building a sustainable business. We apply stress scenarios to the key drivers. For example, if repeats stay at the same level, what’s the highest first pay default (FPD) rate that the business can tolerate? Once you understand the bounds, diligently monitoring those key metrics is essential to surviving.
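The stress question above can be sketched with a deliberately simple unit economics model. Everything here is an assumption for illustration: the loan size, fee, acquisition cost, repeat rate, and value of repeat business are made up, a first pay default (FPD) is treated as a total loss of principal, and only customers who repay the first loan can go on to repeat.

```python
def unit_profit(
    fpd_rate: float,
    principal: float,
    fee: float,
    cac: float,
    repeat_rate: float,
    repeat_value: float,
) -> float:
    """Expected profit per acquired customer under the simplified model."""
    good = 1.0 - fpd_rate
    # First loan: fee earned from good customers, principal lost on defaults,
    # acquisition cost paid for everyone.
    first_loan = good * fee - fpd_rate * principal - cac
    # Repeat business: only good customers can come back.
    repeats = good * repeat_rate * repeat_value
    return first_loan + repeats


def break_even_fpd(
    principal: float, fee: float, cac: float, repeat_rate: float, repeat_value: float
) -> float:
    """Highest FPD rate at which unit_profit is still zero or better.

    Solves (1 - d) * upside - d * principal - cac = 0 for d.
    """
    upside = fee + repeat_rate * repeat_value
    return (upside - cac) / (upside + principal)


# Hypothetical BNPL assumptions.
assumptions = dict(principal=200.0, fee=10.0, cac=20.0, repeat_rate=0.6, repeat_value=50.0)

max_fpd = break_even_fpd(**assumptions)
first_loan_only = unit_profit(0.0, **{**assumptions, "repeat_rate": 0.0})
print(f"First loan alone, even with zero defaults: {first_loan_only:+.2f}")
print(f"Highest tolerable FPD rate at current repeats: {max_fpd:.1%}")
```

Under these assumptions the first loan loses money on its own, and repeats carry the business up to an FPD rate of roughly 8.3%. Re-running the calculation with a lower repeat rate shows how quickly that cushion shrinks.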

How do we guide the business to profitability?

Credit companies never start out profitable. Acquisition channels take time to figure out. Operating costs improve with scale. Attrition takes work to solve. All of this should improve with time, capital, and lots of hard work.

Lifetime value bridges are a great way to visualize that work, one step at a time.

Figure - Lifetime Value Bridge

To make the bridge work, each step must be realistic and actionable. With some steps, cost savings are relatively easy to assess and realize. For example, you may have contracts that give discounts for scale. Other steps will be much tougher. Reducing attrition, for instance, is typically much easier to imagine than to actualize.
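If you don't have a chart handy, the numbers behind a bridge are just a running total. The sketch below is one hypothetical way to lay them out; the starting lifetime value, step names, and uplift amounts are invented for illustration.

```python
from typing import List, Tuple


def build_bridge(
    starting_ltv: float, steps: List[Tuple[str, float]]
) -> List[Tuple[str, float]]:
    """Return (label, lifetime value after this step) rows for a bridge."""
    rows = [("Current LTV", starting_ltv)]
    running = starting_ltv
    for label, uplift in steps:
        running += uplift
        rows.append((label, running))
    return rows


# Hypothetical steps, ordered from easiest to hardest to realize.
steps = [
    ("Volume discounts on vendor contracts", 8.0),
    ("Lower acquisition cost", 12.0),
    ("Reduced attrition", 15.0),
]

bridge = build_bridge(starting_ltv=-20.0, steps=steps)
for label, ltv in bridge:
    print(f"{label:<40} {ltv:>8.2f}")
```

In this example the business only reaches a positive lifetime value once the hardest step, reducing attrition, is delivered, which is exactly the kind of dependency a bridge is meant to surface.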

How do we organize the team to deliver results?

Create teams that are focused on delivering results against each step in the bridge. For example, if lowering attrition is one of the steps in the bridge, one team should be assigned to the task of identifying opportunities, executing tests, and delivering results. In this way, each team is focused on a single task and accountable for delivering results.

How do we get buy-in with leadership?

You can’t build a profitable business on the back of unprofitable products. Unit economics provide a great framework for that conversation. The entire team should be able to take in the data and internalize the steps they need to take to drive the business to profitability.

Get started today

The environment has changed. Investors are reining in investments, valuations are coming down, and a new era is beginning. As in 2009, strong businesses will survive, compelling new businesses will start, and innovation will continue. This environment creates opportunities for managers who can adapt to the new reality.

Focus on unit profitability, manage your economic drivers, understand your resilience, and use a bridge. In the end, a profitable business is a sustainable business. And the easiest raise is the one you don’t need to do.

You’re not alone. You’re part of a team. You have investors. They all have networks. Ensemblex is a part of those networks and we’re here to help.
