How Do You Transition from Human Credit Analysts to Empirical Models When Some Judgment Can’t Be Automated?

Brandon Homuth
Apr 20

When lenders move from judgmental credit decisions to empirical models, the biggest challenge isn’t the math. It’s the messy middle.

A client recently told us:

“Our analysts don’t just apply policy — they fix data. They interpret incomplete bank statements, fractional credit card statements, and half-finished proofs of income. How can a model handle that?”

It’s a fair question — and one we hear often.

In manual underwriting shops, analysts wear many hats: decision-maker, data validator, and sometimes detective. They bring judgment, but they also fill gaps in messy data and unstructured documents.

That mix works — until you try to scale.

Why Judgmental Decisioning Becomes a Bottleneck

In early-stage lending businesses, human analysts are the heart of risk management. They see patterns, weigh nuance, and adapt to imperfect information. But as the business grows, judgment becomes the constraint.

Even great analysts differ in how they interpret the same applicant. One might overlook a missing bank statement page; another might decline the same file. The result is unpredictable outcomes, inconsistent loss rates, and a process that’s expensive to maintain.

And unlike a model, you can’t tune a human being.

You can’t tell an analyst to “be 20% more aggressive” or “tighten by 10%.” You can’t measure feature importance or calibrate consistency.

Empirical models don’t just replace people; they introduce control, scalability, and explainability. They let you see why a decision was made and what happens to approvals and losses if you change a cutoff or a rule.
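
To make the "tighten by 10%" point concrete, here is a minimal Python sketch of tuning a score cutoff. The scores, the 0.60 baseline cutoff, and the helper names are all hypothetical, and "10%" is read here as ten percentage points of approval rate; a real scorecard would be tuned on far more data.

```python
# Hypothetical model scores for ten applicants; higher = lower predicted risk.
scores = [0.91, 0.84, 0.77, 0.69, 0.62, 0.55, 0.48, 0.40, 0.33, 0.21]

def approval_rate(scores, cutoff):
    """Share of applicants whose score meets or exceeds the cutoff."""
    return sum(1 for s in scores if s >= cutoff) / len(scores)

def cutoff_for_rate(scores, target_rate):
    """Score of the last applicant approved when approving the top target_rate share."""
    ranked = sorted(scores, reverse=True)
    n_approve = max(1, round(target_rate * len(ranked)))
    return ranked[n_approve - 1]

baseline_cutoff = 0.60  # assumed starting policy
baseline_rate = approval_rate(scores, baseline_cutoff)

# "Tighten by 10%": approve ten percentage points fewer applicants.
tighter_cutoff = cutoff_for_rate(scores, baseline_rate - 0.10)
tighter_rate = approval_rate(scores, tighter_cutoff)

print(f"baseline: cutoff={baseline_cutoff}, approval rate={baseline_rate:.0%}")
print(f"tighter:  cutoff={tighter_cutoff}, approval rate={tighter_rate:.0%}")
```

The same adjustment applied to a team of analysts would require retraining and would still be applied unevenly; with a model it is a single, auditable parameter change.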

The Hidden Fear: What About the Hard-to-Automate Parts?

The team we worked with was ready to build their first empirical model — but nervous about losing the analysts’ “last mile” judgment on messy inputs.

Some examples:

- Bank statements arrived in inconsistent formats, sometimes missing totals or transaction headers.
- Credit card statements came in fragments: page 1 this month, page 2 next.
- Proof of income (POI) documents were sometimes mis-uploaded or incomplete.

These weren’t credit policy issues — they were data quality issues. But the analysts were quietly solving them on the fly, often without recording what they fixed.

So when they thought about automation, they weren’t just worried about accuracy — they were worried about blindness.

If analysts were no longer catching these edge cases, would the model make wrong decisions without anyone noticing?

How to Know What’s Worth Fixing

Our advice was: don’t try to automate everything at once.

Instead, use the transition to measure where human fixes actually matter.

We helped the client:

- Keep humans in the loop, but post-decision. Let the model render a decision, then have analysts review a sample of approvals and declines. Record what they’d have changed, and why.
- Tag and categorize the issues. Was it a data problem (e.g., misread POI), a policy gap (e.g., model doesn’t know how to handle self-employed income), or a true credit judgment?
- Quantify the impact. For each issue, ask:
  - How often does this happen?
  - Does it affect top predictive variables?
  - Does it change the decision outcome?
  - Would fixing it materially change portfolio risk or approval volume?
- Prioritize fixes based on value, not annoyance. If a messy bank statement format occurs 3% of the time and impacts low-importance variables, it’s not worth an automation sprint. But if a recurring document issue affects income accuracy, a top-3 driver in your model, that’s worth solving early.
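
The tagging and quantifying steps above can be sketched in a few lines of Python. Everything here is illustrative: the review log, the issue names, the sample size of 200, and the feature-importance ranks are hypothetical, and ranking by "share of sampled decisions that would flip" is one reasonable scoring choice rather than a prescribed method.

```python
from collections import defaultdict

# Hypothetical analyst review log: one record per flagged application in the sample.
reviews = (
    [{"issue": "poi_misread", "category": "data", "variable": "income", "decision_changed": True}] * 3
    + [{"issue": "poi_misread", "category": "data", "variable": "income", "decision_changed": False}]
    + [{"issue": "bank_format", "category": "data", "variable": "avg_balance", "decision_changed": False}] * 6
    + [{"issue": "self_employed_income", "category": "policy", "variable": "income", "decision_changed": True}] * 2
)
sample_size = 200                                # assumed number of applications reviewed
feature_rank = {"income": 2, "avg_balance": 9}   # assumed importance ranks, 1 = strongest driver

def prioritize(reviews, sample_size, feature_rank, top_n=3):
    """Summarize each issue by frequency, decision flips, and whether it hits a top driver."""
    stats = defaultdict(lambda: {"count": 0, "flips": 0, "top_driver": False, "category": None})
    for r in reviews:
        s = stats[r["issue"]]
        s["count"] += 1
        s["flips"] += int(r["decision_changed"])
        s["category"] = r["category"]
        if feature_rank.get(r["variable"], float("inf")) <= top_n:
            s["top_driver"] = True
    rows = [
        {
            "issue": issue,
            "category": s["category"],
            "frequency": s["count"] / sample_size,   # how often does this happen?
            "flip_rate": s["flips"] / s["count"],    # does it change the outcome?
            "top_driver": s["top_driver"],           # does it touch a top variable?
        }
        for issue, s in stats.items()
    ]
    # Top-driver issues first, then by share of all sampled decisions that would flip.
    rows.sort(key=lambda r: (r["top_driver"], r["frequency"] * r["flip_rate"]), reverse=True)
    return rows

ranked = prioritize(reviews, sample_size, feature_rank)
for row in ranked:
    print(row)
```

With this made-up log, the misread POI issue ranks first because it touches income and usually flips the decision, while the bank statement format issue (3% of the sample, no flips, low-importance variable) lands last.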

This approach turns a vague fear (“what if the data’s wrong?”) into a quantitative roadmap for automation investment.

Building Confidence in the Model

Once they had visibility into what analysts were catching, confidence in the model grew fast.

The next step was to run the model and human analysts in parallel for a few months.

By comparing model vs. human decisions on the same applications, they could see:

- Where outcomes differed
- How those applicants actually performed
- Whether the model’s early delinquency rates, such as first payment default (FPD) and delinquency at three months on book (MOB3), held steady or improved
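
A parallel-run comparison like this can be sketched as a simple agreement table. The records below are hypothetical, and the sketch assumes the human decision was the one actually booked during the parallel period, so first-payment performance is only observable for human approvals (an `fpd` of `None` means no loan was booked).

```python
from collections import Counter

# Hypothetical parallel-run log: both decisions per application, plus FPD where observed.
records = (
    [{"model": "approve", "human": "approve", "fpd": True}]
    + [{"model": "approve", "human": "approve", "fpd": False}] * 5
    + [{"model": "decline", "human": "approve", "fpd": True}]
    + [{"model": "decline", "human": "approve", "fpd": False}]
    + [{"model": "approve", "human": "decline", "fpd": None}]
    + [{"model": "decline", "human": "decline", "fpd": None}]
)

def compare(records):
    """Return overall agreement, counts per (model, human) pair, and FPD rate per pair."""
    quadrant_counts = Counter((r["model"], r["human"]) for r in records)
    agreement = sum(n for (m, h), n in quadrant_counts.items() if m == h) / len(records)
    fpd_by_quadrant = {}
    for quadrant in quadrant_counts:
        observed = [
            r["fpd"] for r in records
            if (r["model"], r["human"]) == quadrant and r["fpd"] is not None
        ]
        # None when no loans in this quadrant were booked, so nothing can be measured.
        fpd_by_quadrant[quadrant] = sum(observed) / len(observed) if observed else None
    return agreement, quadrant_counts, fpd_by_quadrant

agreement, quadrant_counts, fpd_by_quadrant = compare(records)
print(f"agreement: {agreement:.0%}")
for quadrant, rate in fpd_by_quadrant.items():
    print(quadrant, quadrant_counts[quadrant], rate)
```

The disagreement quadrants are the informative ones: if loans the model would have declined but humans approved default at a higher rate than the loans both approved, that is evidence in the model's favor. The model-approve, human-decline quadrant stays unmeasured under this assumption unless some of those loans are booked as a test.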

The results spoke for themselves:

The model’s performance matched — and soon exceeded — human outcomes, with faster approvals and dramatically lower cost per decision.

Even more important, the leadership team learned where automation mattered and where it didn’t.

From Judgment to System

Transitioning from manual underwriting to empirical models is not about replacing human judgment; it’s about systematizing what works and quantifying what doesn’t.

Human analysts are invaluable — but they’re also inconsistent, expensive, and hard to scale.

Models bring precision, repeatability, and transparency.

And for all those messy document cases? The goal isn’t perfection — it’s progressive automation. Fix the high-impact gaps first, measure continuously, and let data — not fear — decide where human judgment should remain.

That’s how lenders move from craft to discipline — without losing what made them good in the first place.