How We Predict Votes: Issue-Level Stance Scoring
A deep dive into how we moved from abstract ideology axes to specific policy issue stances for predicting legislative votes in the South Dakota 101st session — and why it produces better predictions.
The Problem with Left-Right Axes
Our first prediction model scored every bill on two abstract axes: economic (-1 = free-market, +1 = interventionist) and social (-1 = traditional, +1 = progressive). We then built a 2D ideology profile for each legislator from their voting history, and predicted votes using the dot product between bill and legislator vectors.
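The old model's core operation can be sketched in a few lines. This is a hypothetical reconstruction, not the production code: the vectors and function name are illustrative, using the same (economic, social) axis conventions described above.

```python
import numpy as np

# Sketch of the old 2-axis model: each bill and each legislator gets an
# (economic, social) vector; the sign of their dot product predicts the vote.
# Vectors and names here are illustrative, not actual data.
def predict_vote_2axis(bill_vec, legislator_vec):
    alignment = float(np.dot(bill_vec, legislator_vec))
    return "Yea" if alignment > 0 else "Nay"

bill = np.array([-0.8, 0.0])         # a strongly free-market bill
legislator = np.array([-0.3, -0.1])  # the mildly conservative profile from above

print(predict_vote_2axis(bill, legislator))  # dot = 0.24 -> "Yea"
```

The opacity is visible right in the sketch: the prediction reduces to a single signed number, with no record of which policy question drove it.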
This worked reasonably well at the bill level (87.7% accuracy on floor vote pass/fail), but had fundamental limitations:
- Too coarse. Knowing a legislator is “economically conservative” doesn't predict their vote on a datacenter tax incentive bill.
- Poor confidence distribution. The 2-axis model crammed 27,000+ predictions into “medium confidence” with only 84.8% accuracy.
- Not interpretable. An ideology score of (-0.3, -0.1) doesn't explain why we predict a certain vote.
Issue-Level Stance Scoring
Instead of two abstract axes, we define 84 specific policy issues organized under 25 topic categories. Each issue is a directional action phrase:
For a property tax bill, we directly compare a legislator's property tax stance to the bill's property tax stance. No abstract mapping required.
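The direct comparison might look like the following sketch. The relevance-weighted combination and all dictionary keys are assumptions for illustration; the production weighting scheme may differ.

```python
# Hypothetical alignment for one legislator/bill pair: relevance-weighted
# agreement between the legislator's per-issue stances and the bill's
# per-issue stances. The weighting scheme is an assumption.
def alignment(bill_issues, legislator_stances):
    """bill_issues: {issue: (stance, relevance)};
    legislator_stances: {issue: stance in [-1, +1]}."""
    num = den = 0.0
    for issue, (bill_stance, relevance) in bill_issues.items():
        if issue in legislator_stances:
            # Same-sign stances push toward Yea; opposite signs toward Nay.
            num += relevance * bill_stance * legislator_stances[issue]
            den += relevance
    return num / den if den else 0.0

bill = {"property tax relief": (0.8, 1.0)}   # Yea strongly favors relief
leg = {"property tax relief": 0.5}           # legislator leans pro-relief
print(alignment(bill, leg))                  # 0.4: positive alignment
```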
How It Works
Data flows from bill text through Claude scoring to per-legislator stance profiles, which drive whip count predictions.
Step 1: Score Bills on Policy Issues
Claude Haiku reads each bill's title, description, AI summary, and topic tags, then identifies 1-5 relevant policy issues from our canonical list. For each issue, it scores:
- Stance (-1.0 to +1.0): What does a Yea vote mean for this issue?
- Relevance (0.0 to 1.0): How central is this issue to the bill?
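One plausible shape for a single scored issue is sketched below. The class and field names are illustrative assumptions, not the actual schema; only the two score ranges come from the description above.

```python
from dataclasses import dataclass

# Hypothetical record for one issue score attached to a bill.
# Field names are illustrative; the ranges match the scoring rubric.
@dataclass
class IssueScore:
    issue: str        # directional action phrase from the canonical list
    stance: float     # -1.0..+1.0: what a Yea vote means for this issue
    relevance: float  # 0.0..1.0: how central the issue is to the bill

    def __post_init__(self):
        # Reject out-of-range scores early rather than downstream.
        assert -1.0 <= self.stance <= 1.0, "stance out of range"
        assert 0.0 <= self.relevance <= 1.0, "relevance out of range"
```

A bill then carries between one and five of these records, one per relevant issue.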
Example: HB1261 (Property Tax Relief Act)
We scored all 1,202 bills across both the 2025 and 2026 sessions. 1,064 bills received at least one stance; the remaining 138 are procedural bills with no policy substance.
Issue Coverage
Claude identified 138 unique issues across all bills (84 from the canonical list plus 54 novel issues it proposed). The top 12 issues by bill count:
Step 2: Derive Legislator Stances
For each legislator and each issue, we aggregate their Yea/Nay votes on all bills involving that issue:
Recency weighting ensures current-session votes count more than historical ones. The confidence threshold is 15 votes (vs. 50 for the old model).
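The aggregation can be sketched as a recency-weighted mean of vote directions. The half-life decay form is an assumption; the 15-vote confidence threshold comes from the text.

```python
# Illustrative stance aggregation: recency-weighted mean of vote
# directions (+1 Yea, -1 Nay) for one legislator on one issue.
# The half-life decay is an assumption, not the production formula.
def derive_stance(votes, current_session, half_life=1.0):
    """votes: list of (direction, session), direction in {+1, -1}."""
    num = den = 0.0
    for direction, session in votes:
        age = current_session - session
        weight = 0.5 ** (age / half_life)  # current-session votes count more
        num += weight * direction
        den += weight
    confident = len(votes) >= 15  # threshold from the text (vs. 50 previously)
    return (num / den if den else 0.0), confident

# Ten Yeas this session, ten Nays last session -> net positive stance.
votes = [(+1, 2026)] * 10 + [(-1, 2025)] * 10
print(derive_stance(votes, current_session=2026))
```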
Result: 110 legislators with stance profiles across 11,290 total legislator-issue pairs.
Step 3: Predict Votes
The logistic sigmoid is centered at +0.10 rather than 0 — a legislator needs meaningful positive alignment to predict Yea, not just any positive value. This reflects the reality that in a ~90% Republican legislature, most alignment values skew positive.
Earlier versions blended stance alignment with party loyalty (70/30 or 60/40), but backtesting showed the loyalty component hurts accuracy — it can't predict direction without knowing the outcome. The current model uses stance alignment alone, falling back to loyalty only when no stance data exists.
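The prediction step described above might be sketched as follows. The +0.10 center and the loyalty fallback come from the text; the sigmoid steepness `k` is an illustrative assumption.

```python
import math

CENTER = 0.10  # decision boundary from the calibration described above

def predict_yea_probability(alignment, k=5.0):
    # Logistic sigmoid shifted so alignment must exceed +0.10
    # before the model leans Yea; k (steepness) is an assumption.
    return 1.0 / (1.0 + math.exp(-k * (alignment - CENTER)))

def predict(alignment=None, party_loyalty=None):
    # Stance alignment alone drives the prediction; party loyalty is
    # used only as a fallback when no stance data exists.
    if alignment is not None:
        return predict_yea_probability(alignment)
    return party_loyalty

print(predict(alignment=0.10))  # exactly at the center -> 0.5
```

Note that a mildly positive alignment of, say, +0.05 now yields a below-even Yea probability, which is exactly the behavior the calibration is after.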
Backtest Results
We backtested all model variants against 799 floor votes (25,228 individual vote predictions) from the 2025-2026 sessions.
The Calibration Problem
Earlier stance models (70/30, 50/50) achieved strong overall accuracy but were badly miscalibrated: they predicted 98.6% of bills would pass, correctly identifying only 6% of actual failures. In a ~90% Republican legislature where most bills align with majority positions, raw alignment values skew heavily positive.
The calibrated sigmoid addresses this by shifting the decision boundary to +0.10 and dropping the party loyalty blend (which can't predict direction without knowing the outcome).
Bill-Level Accuracy
The calibrated model trades some overall accuracy (81.9% vs 88.1%) for dramatically better failure detection — the metric that matters most for a useful whip count.
Failure Detection
The key improvement: the calibrated model correctly predicts 27.8% of bill failures (20/72) while maintaining 87.6% pass recall (598/683). The previous model caught only 6% of failures. A whip count that says “likely passes” for every bill isn't useful — catching contested votes is the whole point.
Individual Vote Accuracy
Individual accuracy is 79.7% with the calibrated model. The raw stance models score higher here, but their predictions cluster too tightly around “yea” to produce useful bill-level confidence spread.
Confidence Calibration
The five confidence buckets are calibrated so that actual yea rates match expectations:
“Likely Yea” predictions are correct 96.7% of the time. “Likely Nay” predictions are correct 93.7% of the time. The tossup bucket sits at 56.4% yea — close to a genuine coin flip.
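The bucketing step might look like the sketch below. Only “Likely Yea,” “Likely Nay,” and the tossup bucket are named in the text; the two “Lean” bucket names and all cutoff values are illustrative assumptions.

```python
# Illustrative mapping from a Yea probability to the five confidence
# buckets. Cutoffs and the "Lean" names are assumptions; only
# Likely Yea / Tossup / Likely Nay are named in the text.
def confidence_bucket(p_yea):
    if p_yea >= 0.85:
        return "Likely Yea"
    if p_yea >= 0.60:
        return "Lean Yea"
    if p_yea > 0.40:
        return "Tossup"
    if p_yea > 0.15:
        return "Lean Nay"
    return "Likely Nay"

print(confidence_bucket(0.95), confidence_bucket(0.50), confidence_bucket(0.05))
```

Calibration then means checking that, within each bucket, the observed Yea rate matches the bucket's nominal range — e.g., votes bucketed “Likely Yea” actually landing Yea about 96.7% of the time.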
Limitations
- AI scoring subjectivity. Claude's stance assignments are probabilistic, not deterministic.
- Novel issue normalization. 54 issues beyond the canonical 84 need periodic review and merging.
- Historical session limitations. Only 2025 and 2026 sessions are available.
- False alarms. The calibrated model predicts “fail” for 85 bills that actually passed, mostly lopsided or procedural votes where stance alignment happens to be near-neutral.
Summary
| Metric | Stance (uncalibrated) | Calibrated Sigmoid |
|---|---|---|
| Bill-level accuracy | 88.1% | 81.9% |
| Failure recall | ~6% | 27.8% |
| Pass recall | ~99% | 87.6% |
| Individual vote accuracy | 82.6% | 79.7% |
| “Likely Yea” actual yea rate | n/a | 96.7% |
| “Likely Nay” actual yea rate | n/a | 6.3% |
The calibrated stance model is now live on leg.noahbrown.io. The backtest was run against 799 floor votes (25,228 individual predictions) from the 2025-2026 sessions.