The Predictive Playbook – Week 2 RB Projections: Results & Lessons

🔗 GitHub Repo: https://github.com/ndbryant21-eng/fantasy-football-ml
📓 Notebooks folder: https://github.com/ndbryant21-eng/fantasy-football-ml/tree/main/notebooks


🚀 TL;DR

  • Built the RB twin of my QB pipeline: train → project → pull actuals → score.
  • Training set: 9,742 rows · Out-of-fold (OOF): MAE 4.91, R² 0.246.
  • Week 2 (2025) results (RBs only):
    • Top-12 overlap: 50.0%
    • Top-24 overlap: 87.5%
    • Top-36 overlap: 100%
    • “Start” tier (projected top-24) — Precision: 0.833, Recall: 0.625, F1: 0.714
  • Calibration (actual ≈ 5.037 + 0.526·proj) nudged fit:
    • MAE: 5.315 → 5.220
    • RMSE: 6.516 → 6.171
    • R²: 0.019 → 0.120

🧠 Model (RB)

  • Regressor: tree-based (non-linear, robust to mixed scales) + post-hoc linear calibration.
  • Features:
    • Player form: r3/r5 rolling means (shifted) for rushing & receiving volume/TDs.
    • Team context: team points r3/r5, home/away, opponent.
    • Vegas implied team total (computed from spread & total when available).
  • Why this mix? RB scoring = usage + TD variance. Team environment and Vegas lines help with TD odds; short rolling windows capture role/health shifts faster.
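
A minimal sketch of how shifted rolling means like r3/r5 can be built with pandas. This is not the repo's code: the column names (player_id, season, week, carries) and the helper name add_shifted_rolling are assumptions for illustration. The shift(1) is what keeps a week's own stat out of its own feature.

```python
import pandas as pd

def add_shifted_rolling(df, cols, windows=(3, 5), key="player_id"):
    """Add r3/r5-style rolling means that only use weeks before the current row."""
    out = df.sort_values([key, "season", "week"]).copy()
    for col in cols:
        for w in windows:
            # shift(1) inside each player group, then average the last w prior weeks
            out[f"{col}_r{w}"] = out.groupby(key)[col].transform(
                lambda s, w=w: s.shift(1).rolling(w, min_periods=1).mean()
            )
    return out

# Toy data (made up): player "a" has four weeks, player "b" has two
weekly = pd.DataFrame({
    "player_id": ["a"] * 4 + ["b"] * 2,
    "season": [2025] * 6,
    "week": [1, 2, 3, 4, 1, 2],
    "carries": [10, 20, 30, 40, 5, 7],
})
feats = add_shifted_rolling(weekly, ["carries"])
print(feats[["player_id", "week", "carries", "carries_r3", "carries_r5"]])
```

A player's first week has no history, so its rolling features stay NaN, which tree models can usually handle or which can be filled with a prior.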

🧪 Data Flow

  • Primary tables via nfl_data_py: weekly, schedules, lines.
  • Fresh-week fallback: when the weekly table isn’t posted yet, derive actuals from PBP.
  • RB-only guardrails:
    • Prefer rosters (GSIS IDs) → else build a historical weekly position map → always exclude likely QBs (≥5 pass attempts).
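
The guardrail chain above can be sketched roughly like this. The function name rb_only and the columns player_id, position, and attempts (pass attempts) are assumptions, and the fallback here is applied per player rather than all-or-nothing, which may differ from the repo.

```python
import pandas as pd

def rb_only(stats, rosters=None, position_map=None, qb_attempts=5):
    """Keep likely RBs: roster position first, historical map second, then drop passers."""
    out = stats.copy()
    pos = pd.Series(index=out.index, dtype="object")
    if rosters is not None:
        # One position per player ID from the roster table
        roster_pos = rosters.drop_duplicates("player_id").set_index("player_id")["position"]
        pos = out["player_id"].map(roster_pos)
    if position_map is not None:
        # Fill anything the roster missed from the historical weekly map
        pos = pos.fillna(out["player_id"].map(position_map))
    out["position"] = pos
    is_rb = out["position"].eq("RB")
    likely_qb = out["attempts"].fillna(0) >= qb_attempts
    return out[is_rb & ~likely_qb]

# Toy data (made up IDs)
stats = pd.DataFrame({"player_id": ["r1", "r2", "q1", "x1"], "attempts": [0, 0, 30, 6]})
rosters = pd.DataFrame({"player_id": ["r1", "q1"], "position": ["RB", "QB"]})
position_map = {"r2": "RB", "x1": "RB"}  # x1 is mislabeled and gets caught by the pass-attempt rule
kept = rb_only(stats, rosters, position_map)
print(kept)
```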

🔧 Week-of Fixes (that mattered)

  • Home/Away: pulled straight from the target-week schedule slice (fixed earlier all-zero bug).
  • Vegas implied: attaches as soon as lines exist (stays NaN safely if not).
  • Team r3/r5 backfill: fills context when the current week is thin.
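
A rough sketch of the home/away flag and the Vegas implied total coming from one schedule slice, with a left merge so missing lines stay NaN. Assumptions: the columns home_team, away_team, total_line, and spread_line, and the sign convention that a positive spread_line means the home team is favored. Flip the signs if your source uses the opposite convention.

```python
import numpy as np
import pandas as pd

def implied_team_totals(sched):
    """One row per team: opponent, home flag, and implied total = total/2 ± spread/2."""
    home = pd.DataFrame({
        "team": sched["home_team"],
        "opponent": sched["away_team"],
        "is_home": 1,
        # Assumes spread_line > 0 means the home team is favored by that many points
        "implied_total": sched["total_line"] / 2 + sched["spread_line"] / 2,
    })
    away = pd.DataFrame({
        "team": sched["away_team"],
        "opponent": sched["home_team"],
        "is_home": 0,
        "implied_total": sched["total_line"] / 2 - sched["spread_line"] / 2,
    })
    return pd.concat([home, away], ignore_index=True)

# Toy schedule (made-up teams and lines); the second game has no lines posted yet
sched = pd.DataFrame({
    "home_team": ["AAA", "CCC"],
    "away_team": ["BBB", "DDD"],
    "total_line": [47.0, np.nan],
    "spread_line": [3.0, np.nan],
})
context = implied_team_totals(sched)
players = pd.DataFrame({"player": ["x", "y"], "team": ["AAA", "DDD"]})
merged = players.merge(context, on="team", how="left")
print(merged)
```

Because is_home comes from the same schedule rows, it is populated even when the betting lines are not.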

📊 Headline Results (Week 2, 2025 · RBs)

  • Top-N overlap vs actual ranks
    • Top-12: 50.0%
    • Top-24: 87.5%
    • Top-36: 100.0%
  • Tier performance (did “Start” actually finish top-24?)
    • Start: 83.3% hit rate
    • Stream: 100% (small sample)
    • Stash: 80%
    • Sit: n/a (no scored sits)
  • Classification view (Start = positive)
    • Predicted Start: 15 finished top-24, 3 finished 25+
    • Predicted Not-Start: 9 finished top-24, 1 finished 25+
    • Precision: 0.833 (15/18) · Recall: 0.625 (15/24) · F1: 0.714
  • Calibration
    • Fit: actual ≈ 5.037 + 0.526·proj
    • Gains this week: MAE 5.315 → 5.220, RMSE 6.516 → 6.171, R² 0.019 → 0.120
    • Saved: models/rb_calibrator_w2_2025.joblib
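
For reference, a small sketch of how metrics like these can be computed. The overlap definition (shared names divided by the size of the actual top-N) is an assumption and may not match the notebook exactly. The precision/recall call uses the counts from the classification view: 15 hits, 3 Start calls that finished 25+, and 9 top-24 finishers that were not tagged Start.

```python
def top_n_overlap(projected, actual, n):
    """Share of the actual top-n that also appears in the projected top-n."""
    top_proj = set(sorted(projected, key=projected.get, reverse=True)[:n])
    top_act = set(sorted(actual, key=actual.get, reverse=True)[:n])
    return len(top_proj & top_act) / max(len(top_act), 1)

def precision_recall_f1(tp, fp, fn):
    """Standard definitions with 'Start' as the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=15, fp=3, fn=9)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.833 0.625 0.714
```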

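Read: the baseline model ranks reasonably, but the raw scale runs hot at the top. With a slope of 0.526 and an intercept of 5.037, projections above about 10.6 points get pulled down and those below get nudged up. Because the slope is positive, this light calibration tightens the scale without reordering anyone.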
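
A minimal sketch of a post-hoc linear calibration of this form, using numpy.polyfit for the least-squares line. The helper names are hypothetical, and the sample projections are made up; only the two coefficients come from the Week 2 fit.

```python
import numpy as np

def fit_linear_calibrator(proj, actual):
    """Least-squares fit of actual ≈ intercept + slope * proj."""
    slope, intercept = np.polyfit(
        np.asarray(proj, dtype=float), np.asarray(actual, dtype=float), deg=1
    )
    return float(intercept), float(slope)

def apply_calibrator(proj, intercept, slope):
    """Rescale raw projections with a fitted line."""
    return intercept + slope * np.asarray(proj, dtype=float)

# Week 2 coefficients: actual ≈ 5.037 + 0.526·proj
raw = np.array([8.0, 12.0, 20.0])  # made-up raw projections
cal = apply_calibrator(raw, 5.037, 0.526)
print(cal)

# A positive slope is a monotone transform, so the rank order is unchanged
print(np.array_equal(np.argsort(raw), np.argsort(cal)))
```

One caution: fitting and scoring the calibrator on the same week flatters the gains, so the honest test is applying it to the following week.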


📉 Biggest Misses: Why They Happened & What I’m Changing

A few RB outcomes swung hard on role/TD variance and game script. Here are the largest errors, grouped by under-projected spikes (too low) and over-projected duds (too high), plus concrete fixes entering Week 3.

🔺 Under-projected spikes (actual ≫ projected)

  • Javonte Williams (DAL) +15.2 — Goal-line/role bump: bigger short-yardage share or positive-script carries than r3/r5 implied.
    Fix: add inside-5 carry share (r3/r5), team red-zone rush rate, and a script weight from spread/total.
  • Rhamondre Stevenson (NE) +8.6 — Receiving usage spike / consolidated snaps in neutral/negative script.
    Fix: track route rate & two-minute-drill snaps via proxies (target share r3, long-down-and-distance carries).
  • Christian McCaffrey (SF) +7.4 — Top-end bias: model under-credited TD equity despite high implied team total.
    Fix: couple Vegas implied with player TD share priors (inside-10 carry share; RB TD share r5).
  • Kenneth Walker (SEA) +7.3 — Explosive run or goal-line tilt vs split expectations with Charbonnet.
    Fix: add explosive-run rate allowed (defense r3/r5) and RB1 vs RB2 carry share separation.
  • De’Von Achane (MIA) +6.8 — Efficiency outlier (big plays) and high-leverage touches.
    Fix: include team pace/plays, boom-rate (10+ yd run share r5) + opponent missed-tackle proxy.
  • J.K. Dobbins (DEN) +6.3 / Jaylen Warren (PIT) +4.8 / James Conner (ARI) +5.0 — Role concentration inside the 10 + script alignment.
    Fix: explicit goal-line ownership metric and neutral-script run rate by team.
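
Several of the fixes above lean on an inside-5 carry share. Here is a sketch of deriving it from play-by-play, assuming nflfastR-style columns (week, posteam, rusher_player_id, rush_attempt, yardline_100). The function name is hypothetical and the toy plays are made up.

```python
import pandas as pd

def inside5_carry_share(pbp):
    """Per team-week, each rusher's share of carries from the opponent's 5-yard line or closer."""
    gl = pbp[(pbp["rush_attempt"] == 1) & (pbp["yardline_100"] <= 5)]
    carries = (
        gl.groupby(["week", "posteam", "rusher_player_id"])
        .size()
        .rename("inside5_carries")
        .reset_index()
    )
    team_total = carries.groupby(["week", "posteam"])["inside5_carries"].transform("sum")
    carries["inside5_share"] = carries["inside5_carries"] / team_total
    return carries

# Toy plays: r1 takes three of four goal-line carries; r2's run from the 40 is ignored
pbp = pd.DataFrame({
    "week": [2] * 5,
    "posteam": ["AAA"] * 5,
    "rusher_player_id": ["r1", "r1", "r1", "r2", "r2"],
    "rush_attempt": [1] * 5,
    "yardline_100": [1, 3, 5, 2, 40],
})
shares = inside5_carry_share(pbp).set_index("rusher_player_id")["inside5_share"]
print(shares)
```

To use this as a feature it would still need the same shifted r3/r5 treatment as the other usage stats, so the current week never leaks in.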

🔻 Over-projected duds (projected ≫ actual)

  • Rico Dowdle (CAR) −11.4 — Committee gravity and/or pass-heavy script; rolling usage overstated early-down share.
    Fix: add usage-volatility penalty (variance of carry/target share r3) and an opponent pass-funnel flag.
  • Aaron Jones (MIN) −10.9 — Snap/rotation cap or limited routes; older-RB volatility can burn volume priors.
    Fix: decay older-season priors faster and weight recent snap share above touches.
  • Zach Charbonnet (SEA) −10.4 — RB2 usage fell vs expectation (Walker dominance).
    Fix: depth-chart-aware features: RB1/RB2 shares and crowding index (RBs ≥25% touch share r3).
  • Breece Hall (NYJ) −6.5 — Tough front and/or flow suppressed rush attempts & targets.
    Fix: opponent RB-defense allowed (r3/r5): rush EPA/success allowed & RB receiving yards allowed.
  • Jahmyr Gibbs (DET) −6.7 — Still solid; receiving/TD expectation ran hot.
    Fix: tier-aware shrinkage at the top + keep calibration on.
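
The committee-related fixes can be prototyped with two simple backfield measures: a Herfindahl index of touch shares and a count of backs at or above a share threshold. This sketch works on single-week touches for brevity, where the crowding index described above uses r3 shares; column and function names are assumptions.

```python
import pandas as pd

def backfield_concentration(touches, threshold=0.25):
    """Per team-week: Herfindahl index of RB touch shares and number of RBs >= threshold."""
    df = touches.copy()
    team_touches = df.groupby(["team", "week"])["touches"].transform("sum")
    df["touch_share"] = df["touches"] / team_touches
    grouped = df.groupby(["team", "week"])["touch_share"]
    return pd.DataFrame({
        # 1.0 means one back owns the backfield; lower means a committee
        "hhi": grouped.apply(lambda s: (s ** 2).sum()),
        "crowding": grouped.apply(lambda s: int((s >= threshold).sum())),
    }).reset_index()

# Toy backfields: AAA is a three-way split, BBB has a single workhorse
touches = pd.DataFrame({
    "team": ["AAA", "AAA", "AAA", "BBB"],
    "week": [2, 2, 2, 2],
    "player_id": ["a1", "a2", "a3", "b1"],
    "touches": [15, 10, 5, 20],
})
conc = backfield_concentration(touches).set_index("team")
print(conc)
```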

🧪 Quick diagnostic checklist I run on each miss

  • Usage: rush att, targets, snap%, route%, two-minute-drill snaps (proxy via targets)
  • Leverage: inside-10/inside-5 carries, red-zone opps
  • Script: actual team run rate by quarter vs expected from spread/total
  • Opponent: RB rush/receiving allowed r3/r5, stuff%, explosive-run% allowed

If two or more light up (e.g., snap% + GL share), it’s a modeling gap—not just randomness.
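
The two-or-more rule is easy to encode so every miss gets the same treatment. A tiny sketch, with hypothetical names; how each check decides that it "lights up" is left to whatever thresholds you choose.

```python
def classify_miss(signals, min_signals=2):
    """signals maps a check name to True when that check lights up for the miss."""
    lit = sorted(name for name, flagged in signals.items() if flagged)
    verdict = "modeling gap" if len(lit) >= min_signals else "likely variance"
    return verdict, lit

verdict, lit = classify_miss(
    {"usage": True, "leverage": True, "script": False, "opponent": False}
)
print(verdict, lit)  # modeling gap ['leverage', 'usage']
```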


🧱 Tier Takeaways

  • Stream looks centered but volatile (that’s streaming).
  • Start captures most of the right names; small tier-aware calibration trims hot projections at the very top without killing rank signal.

🛣️ What’s Next

  • Opponent RB-D allowed (r3/r5): rushing EPA/success, RB receiving yards/target, explosive-run% allowed.
  • Goal-line & red-zone priors: inside-5 carry share (player & team), team RZ rush rate.
  • Committee & volatility features: RB Herfindahl index, r3 variance of carry/target share.
  • Script weighting: expected plays + run rate from Vegas baked into usage expectation.
  • Auto-calibration in inference: load last known calibrator per season/week if present.
  • A tiny weekly runner that saves outputs/rb_week<week>_<season>_scored.csv and appends to the metrics log.
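
A sketch of the lookup behind auto-calibration and the runner's output path. The file patterns follow the names already used in this post (models/rb_calibrator_w2_2025.joblib and outputs/rb_week<week>_<season>_scored.csv); the function names, and the choice to only accept calibrators fitted before the target week, are assumptions.

```python
import re
from pathlib import Path

CAL_PATTERN = re.compile(r"rb_calibrator_w(\d+)_(\d{4})\.joblib$")

def latest_calibrator(models_dir, season, week):
    """Return the newest calibrator path fitted before (season, week), or None."""
    best = None
    for path in Path(models_dir).glob("rb_calibrator_w*_*.joblib"):
        match = CAL_PATTERN.search(path.name)
        if not match:
            continue
        key = (int(match.group(2)), int(match.group(1)))  # (season, week)
        # Strictly earlier than the target week, so inference never uses its own actuals
        if key < (season, week) and (best is None or key > best[0]):
            best = (key, path)
    return best[1] if best else None

def scored_output_path(season, week, out_dir="outputs"):
    """Path for the weekly scored file."""
    return Path(out_dir) / f"rb_week{week}_{season}_scored.csv"

# Usage idea (not run here):
#   path = latest_calibrator("models", 2025, 3)
#   calibrator = joblib.load(path) if path else None
print(scored_output_path(2025, 3))
```

Returning None when nothing is found lets inference fall back to raw projections instead of failing.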

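🔗 Project Links

  • GitHub Repo: https://github.com/ndbryant21-eng/fantasy-football-ml
  • Notebooks folder: https://github.com/ndbryant21-eng/fantasy-football-ml/tree/main/notebooks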

If you’ve got ideas on opponent context or usage priors you want to see, drop an issue in the repo. On to Week 3. 🏈📈
