
Are You Overfitting to a Weird Economic Period?

  • Writer: Leland Burns & Jim McGuire
  • Apr 27
  • 4 min read

It’s a question we hear often during model builds:


“Are we overfitting to a weird period in the data?”

Sometimes the concern is macro — COVID, stimulus, rate shocks.


Other times it’s more internal — a change in product, channel, or geography.


Either way, the underlying issue is the same:


Does the data we trained on actually reflect the environment we’re about to operate in?

That’s not always an easy question to answer. But it’s one you have to ask.



What “Overfitting” Means in This Context


In the textbook sense, overfitting means a model has learned noise in its training data, so it performs noticeably better on that data than on unseen data.


But in credit modeling, the more subtle version shows up over time.


A model can:


  • Perform well on training data

  • Perform well on in-time test data

  • Still degrade in production


Not just in AUC — but in business outcomes:


  • Approval strategies miss targets

  • Loss rates drift higher than expected

  • Pricing assumptions break down


That’s often a sign the model has learned patterns tied to a specific period — not patterns that generalize.



Why This Happens More Than You Think


Time-based overfitting isn’t just about macro shocks. It can come from a wide range of changes.


1. Macro and Market Conditions


COVID is the obvious example:


  • Stimulus programs changed borrower behavior

  • Lender policies shifted dramatically

  • Bureau data itself behaved differently


Interest rate changes can have similar second-order effects:


  • Different borrower selection into products

  • Changes in repayment behavior

  • Shifts in underlying risk


Even robust datasets can behave differently under these conditions.



2. Changes in Your Own Business


Sometimes the bigger risk isn’t the economy — it’s you.


Common examples:


  • Expanding into new geographies

  • Adding new retail or channel partners

  • Launching new products

  • Changing underwriting or marketing strategy


In one case we saw, a lender had strong performance with a narrow set of retail partners but struggled as it expanded. The original model had effectively learned a very specific customer profile that didn’t generalize.


That’s not a modeling bug. It’s a data reality.



There’s No Perfect Fix — But There Is a Process


One of the most important things to acknowledge:


There’s rarely a clean solution.


You’re almost always balancing tradeoffs:


  • More data vs more relevant data

  • Longer history vs cleaner regimes

  • Stability vs recency


The goal isn’t perfection. It’s awareness and control.



What We Do to Guard Against It


1. Out-of-Time Validation


This is the most important check.


We hold out the most recent portion of data — and don’t touch it during model development. That dataset becomes a proxy for “tomorrow.”


If performance drops meaningfully there relative to the in-time test data, it’s a clear signal something isn’t generalizing.
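
As a rough illustration of the idea, the sketch below splits accounts by origination date and compares AUC on each side of the cutoff. The field names (`orig_date`, `score`, `bad`), the cutoff, and the toy rows are all hypothetical. The scores are hardcoded for brevity; in a real build they would come from a model fit only on the in-time rows.

```python
from datetime import date

def auc(labels, scores):
    """Share of (bad, good) pairs where the bad account scores higher; ties count half."""
    bads = [s for y, s in zip(labels, scores) if y == 1]
    goods = [s for y, s in zip(labels, scores) if y == 0]
    if not bads or not goods:
        raise ValueError("AUC needs at least one bad and one good account")
    wins = sum((b > g) + 0.5 * (b == g) for b in bads for g in goods)
    return wins / (len(bads) * len(goods))

def split_out_of_time(rows, cutoff):
    """Rows before `cutoff` are for development; later rows are held out untouched."""
    in_time = [r for r in rows if r["orig_date"] < cutoff]
    out_of_time = [r for r in rows if r["orig_date"] >= cutoff]
    return in_time, out_of_time

def auc_of(rows):
    return auc([r["bad"] for r in rows], [r["score"] for r in rows])

# Hypothetical toy data: higher score means higher predicted risk.
rows = [
    {"orig_date": date(2022, 1, 15), "score": 0.9, "bad": 1},
    {"orig_date": date(2022, 3, 1),  "score": 0.2, "bad": 0},
    {"orig_date": date(2022, 6, 10), "score": 0.7, "bad": 1},
    {"orig_date": date(2022, 8, 5),  "score": 0.4, "bad": 0},
    {"orig_date": date(2023, 2, 1),  "score": 0.6, "bad": 0},
    {"orig_date": date(2023, 3, 1),  "score": 0.5, "bad": 1},
    {"orig_date": date(2023, 4, 1),  "score": 0.8, "bad": 1},
    {"orig_date": date(2023, 5, 1),  "score": 0.3, "bad": 0},
]

in_time, out_of_time = split_out_of_time(rows, date(2023, 1, 1))
print(f"in-time AUC:     {auc_of(in_time):.2f}")
print(f"out-of-time AUC: {auc_of(out_of_time):.2f}")
```

On this toy data the in-time AUC is 1.00 and the out-of-time AUC is 0.75. How large a gap counts as "meaningful" is a judgment call that depends on sample size and the business context.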



2. Feature-Level Stability Checks


We don’t just evaluate the model — we evaluate the inputs.


For each key feature, we ask:


  • Is the distribution stable over time?

  • Does its relationship with risk hold?

  • Are reporting definitions changing?


A simple example: bureau inquiry counts have changed meaning over time due to reporting shifts. A feature that once signaled risk may gradually lose that meaning.
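
One common way to quantify the first question (is the distribution stable?) is a population stability index, or PSI. The sketch below is a minimal version that assumes a numeric feature, hand-picked bin edges, and two hypothetical samples of inquiry counts. It says nothing about whether the feature's relationship with risk still holds, which needs a separate check.

```python
import math

def psi(baseline, recent, edges, eps=1e-6):
    """Population stability index of `recent` vs `baseline` over shared bin edges.

    A value v falls in bin k, where k is the number of edges strictly below v.
    Empty bins are floored at `eps` so the logarithm stays defined.
    """
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v > e for e in edges)] += 1
        return [max(c / len(values), eps) for c in counts]

    base, rec = shares(baseline), shares(recent)
    return sum((r - b) * math.log(r / b) for b, r in zip(base, rec))

# Hypothetical inquiry counts from a development window and a recent window.
baseline_inquiries = [0, 0, 1, 1, 2, 2, 4, 5]
recent_inquiries = [0, 1, 2, 2, 3, 4, 5, 6]
edges = [1, 3]  # bins: <=1, 2-3, >3 (an assumption for this example)

print(f"PSI vs itself: {psi(baseline_inquiries, baseline_inquiries, edges):.3f}")
print(f"PSI vs recent: {psi(baseline_inquiries, recent_inquiries, edges):.3f}")
```

A PSI of zero means the binned distributions match; larger values mean more drift. Any alert threshold is a rule of thumb and should be chosen for the portfolio at hand rather than read off this example.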



3. Segment-Level Performance


We test models across meaningful slices:


  • Channels

  • Products

  • Geographies

  • Customer types


A model that looks strong overall can break down in specific segments — often where the business is evolving.
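
A minimal sketch of a segment check might look like the following: compute the same metric overall and per slice, then look for slices that lag. The segment names, field names, and rows are hypothetical, and real segments need enough accounts of each outcome for the metric to be trustworthy.

```python
from collections import defaultdict

def auc(labels, scores):
    """Share of (bad, good) pairs where the bad account scores higher; ties count half."""
    bads = [s for y, s in zip(labels, scores) if y == 1]
    goods = [s for y, s in zip(labels, scores) if y == 0]
    if not bads or not goods:
        raise ValueError("AUC needs at least one bad and one good account")
    wins = sum((b > g) + 0.5 * (b == g) for b in bads for g in goods)
    return wins / (len(bads) * len(goods))

def auc_by_segment(rows, segment_key):
    """Group rows by `segment_key` and compute AUC within each group."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[segment_key]].append(r)
    return {
        seg: auc([r["bad"] for r in grp], [r["score"] for r in grp])
        for seg, grp in groups.items()
    }

# Hypothetical rows: an established channel and a newly added one.
rows = [
    {"channel": "established", "bad": 1, "score": 0.8},
    {"channel": "established", "bad": 1, "score": 0.7},
    {"channel": "established", "bad": 0, "score": 0.3},
    {"channel": "established", "bad": 0, "score": 0.2},
    {"channel": "new_partner", "bad": 1, "score": 0.4},
    {"channel": "new_partner", "bad": 1, "score": 0.6},
    {"channel": "new_partner", "bad": 0, "score": 0.5},
    {"channel": "new_partner", "bad": 0, "score": 0.3},
]

overall = auc([r["bad"] for r in rows], [r["score"] for r in rows])
by_channel = auc_by_segment(rows, "channel")
print(f"overall: {overall:.4f}")
for channel, value in by_channel.items():
    print(f"{channel}: {value:.2f}")
```

Here the overall AUC is 0.9375, while the new channel sits at 0.75 and the established one at 1.00, so the headline number hides the weaker slice.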



4. Baseline Comparisons


We almost always benchmark against a stable reference (e.g., a bureau score).


Not because it’s perfect — but because it provides context.


If both the baseline and new model shift similarly over time, the issue may be the data environment.


If only the new model degrades, that’s a stronger signal of overfitting.
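
The comparison can be sketched as tracking the same metric by period for both scores and looking at how much each one moves. Everything named below (`period`, `model_score`, `baseline_risk`, the rows) is hypothetical. Both scores are assumed to be oriented so that higher means riskier, so a real bureau score, where higher usually means safer, would need its sign flipped first.

```python
def auc(labels, scores):
    """Share of (bad, good) pairs where the bad account scores higher; ties count half."""
    bads = [s for y, s in zip(labels, scores) if y == 1]
    goods = [s for y, s in zip(labels, scores) if y == 0]
    if not bads or not goods:
        raise ValueError("AUC needs at least one bad and one good account")
    wins = sum((b > g) + 0.5 * (b == g) for b in bads for g in goods)
    return wins / (len(bads) * len(goods))

def auc_by_period(rows, score_key):
    """AUC of the score stored under `score_key`, computed separately per period."""
    result = {}
    for period in sorted({r["period"] for r in rows}):
        subset = [r for r in rows if r["period"] == period]
        result[period] = auc([r["bad"] for r in subset], [r[score_key] for r in subset])
    return result

# Hypothetical rows; both scores are oriented so that higher means riskier.
rows = [
    {"period": "2022", "bad": 1, "model_score": 0.9, "baseline_risk": 0.6},
    {"period": "2022", "bad": 1, "model_score": 0.8, "baseline_risk": 0.3},
    {"period": "2022", "bad": 0, "model_score": 0.2, "baseline_risk": 0.4},
    {"period": "2022", "bad": 0, "model_score": 0.1, "baseline_risk": 0.1},
    {"period": "2023", "bad": 1, "model_score": 0.6, "baseline_risk": 0.6},
    {"period": "2023", "bad": 1, "model_score": 0.2, "baseline_risk": 0.3},
    {"period": "2023", "bad": 0, "model_score": 0.5, "baseline_risk": 0.4},
    {"period": "2023", "bad": 0, "model_score": 0.3, "baseline_risk": 0.1},
]

model_auc = auc_by_period(rows, "model_score")
baseline_auc = auc_by_period(rows, "baseline_risk")
model_drop = model_auc["2022"] - model_auc["2023"]
baseline_drop = baseline_auc["2022"] - baseline_auc["2023"]
print(f"model drop:    {model_drop:.2f}")
print(f"baseline drop: {baseline_drop:.2f}")
```

In this toy case the model's AUC falls by 0.50 while the baseline does not move, which is the "only the new model degrades" pattern. If both had dropped by similar amounts, the data environment would be the more likely explanation.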



5. Targeted Diagnostics (When Needed)


When we suspect a specific issue — like COVID-era distortion — we go deeper:


  • Compare feature importance across time periods

  • Use tools like SHAP to see what’s driving predictions

  • Analyze how relationships change between “normal” and “disrupted” periods


This helps distinguish between:


  • A model that’s broken

  • And a world that’s changed
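
As a lightweight stand-in for a full SHAP or feature-importance comparison, the sketch below scores each feature on its own in a "normal" and a "disrupted" period and reports the change. This is a simpler technique than SHAP, not a substitute for it. The period labels, feature names, and rows are hypothetical.

```python
def auc(labels, scores):
    """Share of (bad, good) pairs where the bad account scores higher; ties count half."""
    bads = [s for y, s in zip(labels, scores) if y == 1]
    goods = [s for y, s in zip(labels, scores) if y == 0]
    if not bads or not goods:
        raise ValueError("AUC needs at least one bad and one good account")
    wins = sum((b > g) + 0.5 * (b == g) for b in bads for g in goods)
    return wins / (len(bads) * len(goods))

def univariate_auc_shift(rows, features, before, after):
    """For each feature, AUC of the raw feature in `after` minus AUC in `before`."""
    def period_auc(period, feature):
        subset = [r for r in rows if r["period"] == period]
        return auc([r["bad"] for r in subset], [r[feature] for r in subset])
    return {f: period_auc(after, f) - period_auc(before, f) for f in features}

# Hypothetical rows: inquiries lose ranking power in the disrupted period,
# while utilization ranks risk equally well in both.
rows = [
    {"period": "normal",    "bad": 1, "inquiries": 5, "utilization": 0.9},
    {"period": "normal",    "bad": 1, "inquiries": 4, "utilization": 0.4},
    {"period": "normal",    "bad": 0, "inquiries": 1, "utilization": 0.5},
    {"period": "normal",    "bad": 0, "inquiries": 0, "utilization": 0.2},
    {"period": "disrupted", "bad": 1, "inquiries": 3, "utilization": 0.9},
    {"period": "disrupted", "bad": 1, "inquiries": 0, "utilization": 0.4},
    {"period": "disrupted", "bad": 0, "inquiries": 2, "utilization": 0.5},
    {"period": "disrupted", "bad": 0, "inquiries": 1, "utilization": 0.2},
]

shift = univariate_auc_shift(rows, ["inquiries", "utilization"], "normal", "disrupted")
for feature, delta in shift.items():
    print(f"{feature}: {delta:+.2f}")
```

On these rows, inquiries lose 0.50 of univariate AUC while utilization is unchanged. A feature whose standalone signal collapses in one period is a candidate for a closer look with model-based tools.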



Real-World Tradeoffs


Example: Modeling Through COVID


In one indirect auto project, we had no clean way to avoid COVID-era data:


  • Going further back introduced outdated dynamics

  • Excluding the period reduced sample size and relevance


So we included it — but with heavy scrutiny:


  • Validated across pre- and post-COVID periods

  • Stress-tested assumptions

  • Built in conservative guardrails


The model held up — but only because we treated the data with caution, not blind trust.



Example: Expanding a Business Beyond Its Roots


In another case, a lender expanding into new geographies saw performance deteriorate quickly.


Their model wasn’t wrong — it was just trained on a narrow, highly specific customer base.


We adjusted:


  • Training strategy

  • Policy design

  • Segmentation approach


The fix wasn’t just technical. It required acknowledging that the future didn’t look like the past.



Final Thoughts


Overfitting to a time period isn’t always obvious. And it’s not always avoidable.


But it is manageable — if you approach model development with the right mindset:


  • Be skeptical of in-sample performance

  • Validate against the future, not just the past

  • Understand your data, not just your metrics

  • And most importantly, align the model with how the business is evolving


Because the real risk isn’t that your model is “wrong.”


It’s that it’s perfectly tuned to a world that no longer exists.
