# Energy Decision Stack Scoring: Sensitivity Analysis

## Executive Summary

This sensitivity analysis tests the robustness of the Energy Decision Stack's core finding: **that above-field (non-Physical) layers represent approximately 90% of the wage bill at risk**. The analysis asks: how much does this headline metric shift if compressibility scores are systematically or randomly off by ±1 or ±2 points?

**Key Findings:**
- The above-field wage bill share is **highly robust** to scoring uncertainty.
- Even extreme directional errors (e.g., all Physical operations scores wrong by ±1) shift the metric by at most 2.7 percentage points (-2.57 pp to +2.72 pp).
- Monte Carlo testing with 1,000 random perturbations each at ±1 and ±2 shows little spread (std dev of 0.49 pp and 0.71 pp, respectively).
- The directional finding—that most exposed dollars are above field—survives all reasonable scoring scenarios.

---

## Methodology

### Wage Bill Calculation

The wage bill for each role is computed as:

```
wage_bill = (employment × layer_median_comp_proxy × compressibility_score) / 10
```

Where:
- **employment**: Estimated headcount from BLS or industry source
- **layer_median_comp_proxy**: Median compensation for the layer ($58K–$180K)
- **compressibility_score**: 0–10 scale, where 10 = fully compressible (exposed)
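
The formula can be sketched as a small Python function. This is a minimal illustration rather than the production code: the function name and the example inputs (10,000 workers, a $100K layer comp proxy, a score of 5.0) are hypothetical.

```python
def wage_bill(employment: float, layer_median_comp_proxy: float,
              compressibility_score: float) -> float:
    """Exposed wage bill in dollars, following the formula above."""
    if not 0 <= compressibility_score <= 10:
        raise ValueError("compressibility_score must be on the 0-10 scale")
    return employment * layer_median_comp_proxy * compressibility_score / 10


# Hypothetical role: 10,000 workers, $100K layer comp proxy, score 5.0
example = wage_bill(10_000, 100_000, 5.0)
print(f"${example / 1e9:.2f}B")  # $0.50B
```

A score of 10 returns the full payroll for the role and a score of 0 returns nothing, so the result is best read as exposed dollars rather than total compensation.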

### Layer Categorization

**Above-Field (Non-Physical):**
- Technical operating intelligence (139 positions)
- Corporate and administrative (96 positions)
- Capital markets and investing (41 positions)
- Governance and oversight (28 positions)
- Advisory, legal, and assurance (57 positions)

**Physical Operations:**
- Physical operations (43 positions)

### Above-Field Wage Bill Share

Defined as:

```
share = (wage_bill of above-field roles / total wage_bill) × 100%
```
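
As a sketch of how the share might be computed from per-position rows, the snippet below sums wage bills by group. The three rows and their dollar values are made up for illustration; only the layer names come from the categorization above, and the function name is hypothetical.

```python
def above_field_share(rows):
    """rows: list of (layer, wage_bill) pairs; returns the share in percent."""
    above = sum(wb for layer, wb in rows if layer != "Physical operations")
    total = sum(wb for _, wb in rows)
    if total == 0:
        raise ValueError("total wage bill is zero")
    return above / total * 100


# Hypothetical wage bills in $ billions
rows = [
    ("Technical operating intelligence", 6.0),
    ("Corporate and administrative", 3.0),
    ("Physical operations", 1.0),
]
share = above_field_share(rows)
print(f"{share:.1f}%")  # 90.0%
```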

---

## Baseline Results

| Metric | Value |
|--------|-------|
| **Above-field wage bill share** | **90.25%** |
| Above-field wage bill | $212.7 billion |
| Total wage bill | $235.6 billion |
| Physical operations wage bill | $23.0 billion (9.75%) |

---

## Scenario Analysis: Directional Shocks

### Scenario 1: All Physical Operations +1

**Interpretation:** Physical operations scores are systematically underestimated; field work is more compressible than believed.

| Metric | Value | Change |
|--------|-------|--------|
| Above-field share | 87.68% | **-2.57 pp** |

**Top 5 Roles Affected (by wage bill change):**

| Role | Layer | Original Score | New Score | Original Wage Bill | New Wage Bill | Wage Bill Change |
|------|-------|---|---|---|---|---|
| Lineman | Physical | 2.4 | 3.4 | $1.67B | $2.37B | +$696M |
| Roughneck/driller crew | Physical | 2.5 | 3.5 | $1.02B | $1.42B | +$406M |
| Roustabout | Physical | 2.5 | 3.5 | $0.80B | $1.12B | +$319M |
| Field mechanic | Physical | 3.3 | 4.3 | $0.96B | $1.25B | +$290M |
| Data center technician | Physical | 4.5 | 5.5 | $1.17B | $1.44B | +$261M |
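
Because the wage bill is linear in the score, a row of this table can be cross-checked from its original wage bill and score alone. A quick sketch (the helper name is hypothetical, and the inputs are the rounded figures from the Lineman row, so results match only to rounding precision):

```python
def shocked_wage_bill(original_wage_bill: float, original_score: float,
                      delta: float) -> float:
    """Rescale a wage bill for a score shift, clamping the score to 0-10.

    original_score must be greater than zero.
    """
    new_score = min(10.0, max(0.0, original_score + delta))
    return original_wage_bill * new_score / original_score


# Lineman: $1.67B at score 2.4, shocked by +1
new_bill = shocked_wage_bill(1.67, 2.4, +1)
print(f"new: ${new_bill:.2f}B, change: ${(new_bill - 1.67) * 1000:+.0f}M")
# new: $2.37B, change: $+696M
```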

---

### Scenario 2: All Physical Operations -1

**Interpretation:** Physical operations scores are systematically overestimated; field work is less compressible than believed.

| Metric | Value | Change |
|--------|-------|--------|
| Above-field share | 92.98% | **+2.72 pp** |

**Top 5 Roles Affected (by wage bill change):**

| Role | Layer | Original Score | New Score | Original Wage Bill | New Wage Bill | Wage Bill Change |
|------|-------|---|---|---|---|---|
| Lineman | Physical | 2.4 | 1.4 | $1.67B | $0.97B | -$696M |
| Roughneck/driller crew | Physical | 2.5 | 1.5 | $1.02B | $0.61B | -$406M |
| Roustabout | Physical | 2.5 | 1.5 | $0.80B | $0.48B | -$319M |
| Field mechanic | Physical | 3.3 | 2.3 | $0.96B | $0.67B | -$290M |
| Data center technician | Physical | 4.5 | 3.5 | $1.17B | $0.91B | -$261M |

---

### Scenario 3: All Non-Physical Scores +1

**Interpretation:** Above-field compressibility is underestimated; office roles are more exposed.

| Metric | Value | Change |
|--------|-------|--------|
| Above-field share | 91.24% | **+0.99 pp** |

**Top 5 Roles Affected (by wage bill change):**

| Role | Layer | Original Score | New Score | Original Wage Bill | New Wage Bill | Wage Bill Change |
|------|-------|---|---|---|---|---|
| Landman | Advisory | 8.3 | 9.3 | $2.44B | $2.73B | +$294M |
| Petroleum engineering group | Technical | 7.1 | 8.1 | $1.55B | $1.77B | +$218M |
| Land technician | Advisory | 8.1 | 9.1 | $1.43B | $1.61B | +$176M |
| Geologist | Technical | 6.9 | 7.9 | $1.18B | $1.35B | +$171M |
| Production engineer | Technical | 7.3 | 8.3 | $1.18B | $1.34B | +$161M |

---

### Scenario 4: All Non-Physical Scores -1

**Interpretation:** Above-field compressibility is overestimated; office roles are less exposed.

| Metric | Value | Change |
|--------|-------|--------|
| Above-field share | 89.01% | **-1.24 pp** |

**Top 5 Roles Affected (by wage bill change):**

| Role | Layer | Original Score | New Score | Original Wage Bill | New Wage Bill | Wage Bill Change |
|------|-------|---|---|---|---|---|
| Landman | Advisory | 8.3 | 7.3 | $2.44B | $2.15B | -$294M |
| Petroleum engineering group | Technical | 7.1 | 6.1 | $1.55B | $1.33B | -$218M |
| Land technician | Advisory | 8.1 | 7.1 | $1.43B | $1.25B | -$176M |
| Geologist | Technical | 6.9 | 5.9 | $1.18B | $1.01B | -$171M |
| Production engineer | Technical | 7.3 | 6.3 | $1.18B | $1.02B | -$161M |

---

## Monte Carlo Testing: Random Perturbations

### ±1 Point Perturbation (1,000 iterations)

To simulate realistic scoring uncertainty, we randomly assigned each of the 404 positions a ±1 perturbation to their compressibility score and recalculated the above-field share. This was repeated 1,000 times to understand the distribution.

| Statistic | Above-Field Share |
|-----------|-------------------|
| Mean | 90.24% |
| Median | 90.23% |
| Std Deviation | 0.49% |
| 5th Percentile | 89.45% |
| 95th Percentile | 91.07% |
| Min | 88.93% |
| Max | 91.91% |

**Interpretation:** Even with fully random ±1 errors across all 404 positions, the above-field share stays within a narrow band (88.93%–91.91%, or -1.3 to +1.7 percentage points from baseline), and 90% of outcomes fall between 89.45% and 91.07%, roughly **±0.8 percentage points** from baseline.
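
A minimal sketch of this kind of simulation, using only the standard library, is shown below. Two things are assumptions rather than facts from the analysis: the error model (each score moves by +delta or -delta with equal probability) and the toy dataset. All names are hypothetical, and the printed statistics describe the toy data, not the production dataset.

```python
import random
import statistics


def clamp(score, lo=0.0, hi=10.0):
    return max(lo, min(hi, score))


def simulate_shares(positions, delta, iterations=1000, seed=42):
    """positions: list of (is_physical, per_point_wage_bill, score).

    Each iteration shifts every score by +delta or -delta with equal
    probability (an assumed error model), clamps to 0-10, and recomputes
    the above-field share in percent.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    shares = []
    for _ in range(iterations):
        above = physical = 0.0
        for is_physical, per_point, score in positions:
            bill = per_point * clamp(score + rng.choice((-delta, delta)))
            if is_physical:
                physical += bill
            else:
                above += bill
        shares.append(above / (above + physical) * 100)
    return shares


# Hypothetical toy data: (is_physical, wage bill per score point, score)
toy = [(False, 1.0, 7.5)] * 9 + [(True, 1.0, 3.5)] * 2
shares = simulate_shares(toy, delta=1)
ordered = sorted(shares)
print(f"mean={statistics.mean(shares):.2f}%  sd={statistics.stdev(shares):.2f}")
print(f"approx 5th={ordered[49]:.2f}%  approx 95th={ordered[949]:.2f}%")
```

Here `per_point_wage_bill` stands for employment × comp proxy / 10, so multiplying by the perturbed score reproduces the wage bill formula.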

---

### ±2 Point Perturbation (stress test, 1,000 iterations)

To stress-test the robustness, we applied even larger random perturbations (±2 points) to each role's score.

| Statistic | Above-Field Share |
|-----------|-------------------|
| Mean | 90.26% |
| Median | 90.26% |
| Std Deviation | 0.71% |
| 5th Percentile | 89.11% |
| 95th Percentile | 91.45% |
| Min | 87.91% |
| Max | 92.35% |

**Interpretation:** Even under this stress scenario with larger ±2 point errors, 90% of outcomes fall between 89.11% and 91.45%, roughly **±1.2 percentage points** from baseline, and the full range (87.91%–92.35%) stays within about 2.3 percentage points of it. The metric remains robust.

---

## Key Insights

### 1. Physical vs. Non-Physical Asymmetry

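The magnitude of the effect differs substantially by layer:
- **Physical operations ±1** shifts the metric by -2.6 to +2.7 percentage points (the larger directional impact)
- **Non-physical ±1** shifts the metric by +1.0 to -1.2 percentage points (the smaller directional impact)

The asymmetry comes from where the scores sit. Physical operations scores cluster in the 2–5 range (median ~3.5), so a ±1 shift is a large relative change: the reported shares imply the Physical wage bill moves by roughly 30%. Non-physical scores are higher (median ~7.5), so the same ±1 shift moves the above-field wage bill by only about 12.5%. Because Physical operations represents just 9.75% of the total wage bill, even a 30% swing in that layer translates to a shift of under 3 percentage points in the overall share.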
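
A rough back-calculation from the reported shares shows how large a relative wage bill change each ±1 shock implies. This sketch uses only the rounded shares from the scenario tables, so the results are approximate; the helper name is hypothetical.

```python
def physical_to_above_ratio(share_pct: float) -> float:
    """Physical wage bill divided by above-field wage bill, given the share."""
    return (100 - share_pct) / share_pct


base = physical_to_above_ratio(90.25)

# Physical shocks leave the above-field bill fixed, so the ratio scales
# with the Physical wage bill.
phys_up = physical_to_above_ratio(87.68) / base - 1
phys_down = physical_to_above_ratio(92.98) / base - 1

# Non-physical shocks leave the Physical bill fixed, so the inverse ratio
# scales with the above-field wage bill.
above_up = base / physical_to_above_ratio(91.24) - 1
above_down = base / physical_to_above_ratio(89.01) - 1

print(f"Physical bill: {phys_up:+.1%} / {phys_down:+.1%}")
print(f"Above-field bill: {above_up:+.1%} / {above_down:+.1%}")
# Physical bill: +30.1% / -30.1%
# Above-field bill: +12.5% / -12.5%
```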

### 2. Extreme Scenarios Are Unrealistic

The worst-case scenarios (all Physical +1 or all Physical -1) require *systematic* bias in scoring, where every single Physical operations role is misrated in the same direction. Random, uncorrelated errors are the more plausible case, and the Monte Carlo testing shows that they produce minimal variance.

### 3. The Directional Finding Is Robust

Across **all score-perturbation scenarios**, above-field roles represent **87.7% to 93.0%** of the wage bill. Even in the most pessimistic scenario (Physical ops all underscored), above-field exceeds 87%. In the most optimistic (Physical ops all overscored), it reaches 93%.

The core claim—**that most AI-exposed wage dollars are above field**—holds under any reasonable scoring perturbation.

---

## Robustness Summary by Scenario

| Scenario | Share | Change | Interpretation |
|----------|-------|--------|-----------------|
| **Baseline** | 90.25% | — | Observed |
| Physical +1 | 87.68% | -2.57 pp | Field work more compressible (worst case) |
| Physical -1 | 92.98% | +2.72 pp | Field work less compressible (best case) |
| Non-physical +1 | 91.24% | +0.99 pp | Office work more compressible |
| Non-physical -1 | 89.01% | -1.24 pp | Office work less compressible |
| MC ±1 (5th %ile) | 89.45% | -0.80 pp | Random errors, pessimistic outcome |
| MC ±1 (95th %ile) | 91.07% | +0.82 pp | Random errors, optimistic outcome |
| MC ±2 (5th %ile) | 89.11% | -1.14 pp | Stress test, pessimistic outcome |
| MC ±2 (95th %ile) | 91.45% | +1.20 pp | Stress test, optimistic outcome |

---

## Sensitivity Range: -2.6 to +2.7 percentage points

The **total sensitivity range** is from 87.68% (Physical all +1, the direction that minimizes above-field share) to 92.98% (Physical all -1, the direction that maximizes it). This is a spread of **5.3 percentage points**, with the baseline (90.25%) roughly in the middle.

For practical purposes, readers can expect the true above-field share to lie within **87% to 93%** under realistic scoring uncertainty. This is a tight range for a metric affecting strategic workforce planning.

---

## Methodology FAQ

### Why use ±1 perturbations?

Compressibility scores are 0–10 scales, typically assigned by subject-matter experts reviewing role descriptions and task structures. A ±1 point error reflects reasonable uncertainty in judgment—not a gross misestimate, but a plausible oversight or disagreement. ±2 is more conservative, capturing cases where domain experts significantly miscalibrate a score.

### Why test Physical and non-Physical separately?

The two groups have different baseline scores and employment bases. Physical operations roles are lower-scored (median ~3.5), so a one-point error is a large relative change. Non-physical roles are higher-scored (median ~7.5) and represent the bulk of the wage bill. Testing separately reveals which group drives sensitivity.

### Why use Monte Carlo?

Directional shocks (all Physical +1) ask "what if we're systematically wrong?" Monte Carlo asks "what if we have random errors?" The latter is more realistic in practice, since scoring teams rarely agree on direction, and errors tend to scatter. 1,000 iterations produce stable statistics (small standard errors) for the percentiles.
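
To put a rough number on "stable": the standard error of a Monte Carlo mean is the standard deviation divided by the square root of the iteration count. The sketch below applies that textbook formula to the reported standard deviations. Standard errors for the percentiles are larger and depend on the shape of the distribution, so treat this as a rough guide only; the function name is hypothetical.

```python
import math


def standard_error_of_mean(std_dev: float, iterations: int) -> float:
    """Standard error of the sample mean for independent draws."""
    return std_dev / math.sqrt(iterations)


for label, std_dev_pp in (("±1", 0.49), ("±2", 0.71)):
    se = standard_error_of_mean(std_dev_pp, 1000)
    print(f"{label}: standard error of the mean ≈ {se:.3f} pp")
# ±1: standard error of the mean ≈ 0.015 pp
# ±2: standard error of the mean ≈ 0.022 pp
```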

### Are negative compressibility scores allowed?

No. If a scenario would push a score below 0, that value is clamped at 0. In practice, few Physical operations roles have scores below 1.0, so lower clamping is rare. For upper bounds, scores above 10 are clamped at 10. Upper-end clamping does occur: 11 non-Physical positions already score above 9, so a +1 directional shock on Non-Physical roles necessarily clamps these at 10.
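
The clamping rule amounts to a one-line helper. A minimal sketch (the function name and the example scores are illustrative):

```python
def clamp_score(score: float, low: float = 0.0, high: float = 10.0) -> float:
    """Keep a perturbed compressibility score on the 0-10 scale."""
    return max(low, min(high, score))


print(clamp_score(9.3 + 1))  # 10.0 (upper clamp)
print(clamp_score(0.5 - 1))  # 0.0 (lower clamp)
print(clamp_score(2.5 + 1))  # 3.5 (unchanged)
```

One consequence of clamping is that a +1 shock adds slightly fewer exposed dollars than a -1 shock removes whenever some scores start above 9.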

---

## Parametric Stress Grid: Employment and Compensation Assumptions

The score perturbation tests above vary compressibility scores while holding employment and compensation fixed. A separate question: what if the employment and compensation assumptions for Physical operations are wrong?

The claims ledger (C21) cites a range of ~76%–~96% under "broad stress testing of field headcount (0.5×–2×) and comp (0.75×–1.5×)." This section documents that calculation.

### Method

Hold all above-field positions at their baseline values. Multiply Physical operations employment by a headcount multiplier (0.5× to 2.0×) and Physical operations layer compensation proxy by a comp multiplier (0.75× to 1.5×). Recompute the above-field wage bill share under each combination.
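
Because the wage bill is a product of employment, compensation, and score, applying uniform multipliers to every Physical operations role scales the whole Physical wage bill by the product of the two multipliers. Under that reasoning, the grid can be sketched from the baseline share alone. The function name is hypothetical, and the baseline value is the canonical figure reported in this document.

```python
BASELINE_SHARE = 90.2518  # percent, canonical baseline from this document


def stressed_share(hc_mult: float, comp_mult: float,
                   baseline_share: float = BASELINE_SHARE) -> float:
    """Above-field share (%) after scaling Physical headcount and comp."""
    above = baseline_share
    physical = (100 - baseline_share) * hc_mult * comp_mult
    return above / (above + physical) * 100


hc_mults = (0.5, 0.75, 1.0, 1.25, 1.5, 2.0)
comp_mults = (0.75, 0.875, 1.0, 1.125, 1.25, 1.5)
for hc in hc_mults:
    row = " ".join(f"{stressed_share(hc, c):5.1f}" for c in comp_mults)
    print(f"{hc:.2f}x {row}")

print(f"{stressed_share(2.0, 1.5):.1f}")   # 75.5
print(f"{stressed_share(0.5, 0.75):.1f}")  # 96.1
```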

### Results: Above-Field Wage Bill Share (%)

| HC mult \ Comp mult | 0.750× | 0.875× | 1.000× | 1.125× | 1.250× | 1.500× |
|---------------------|--------|--------|--------|--------|--------|--------|
| **0.50×** | 96.1% | 95.5% | 94.9% | 94.3% | 93.7% | 92.5% |
| **0.75×** | 94.3% | 93.4% | 92.5% | 91.6% | 90.8% | 89.2% |
| **1.00× (baseline)** | 92.5% | 91.4% | **90.3%** | 89.2% | 88.1% | 86.1% |
| **1.25×** | 90.8% | 89.4% | 88.1% | 86.8% | 85.6% | 83.2% |
| **1.50×** | 89.2% | 87.6% | 86.1% | 84.6% | 83.2% | 80.4% |
| **2.00×** | 86.1% | 84.1% | 82.2% | 80.4% | 78.7% | 75.5% |

**Range under stated stress bounds (0.5×–2.0× headcount, 0.75×–1.5× comp):**

- Minimum: **75.5%** (HC=2.0×, comp=1.5× — field employment doubled and comp increased 50%)
- Maximum: **96.1%** (HC=0.5×, comp=0.75× — field employment halved and comp reduced 25%)
- Displayed as: **~76% to ~96%**

The minimum scenario (75.5%) is extreme: it assumes field employment is twice what we estimated AND field compensation is 50% higher. Even then, above-field still accounts for three-quarters of the exposed wage bill.

### Role-Only Analysis (373 roles, excluding 24 workflows + 7 artifacts)

Excluding workflow and artifact rows and applying the same parametric stress:

- Baseline role-only above-field share: **89.4%**
- Stress range (0.5×–2.0× HC, 0.75×–1.5× comp): **73.8% – 95.8%**
- Displayed as: **~74% to ~96%**

The lower bound (73.8%) is close to the BLS-only anchor-row floor (74.2%, described in the next section): the most pessimistic parametric stress on roles alone lands within half a percentage point of the measured-only figure.

### BLS-Only Anchor Floor

Using only the 60 BLS/industry-sourced rows (no formulaic estimates): above-field share = **74.2%**. This is the floor — the share computed with zero reliance on modeled employment estimates.

---

## Reconciliation: Three Sensitivity Methodologies

Three different tests produce three different ranges. Each answers a different question:

| Test | What it varies | Range | Question answered |
|------|---------------|-------|-------------------|
| **Score perturbation** (directional shocks) | Compressibility scores ±1 for all roles in one layer | 87.7% – 93.0% | What if we're systematically wrong about one layer's compressibility? |
| **Monte Carlo** (random perturbation, ±2) | Each role gets a random ±2 score change, 1,000 iterations | 87.9% – 92.4% (full range); 89.1% – 91.5% (central 90% of outcomes) | What if every role has random scoring error? |
| **Parametric stress grid** (employment & comp) | Physical ops headcount 0.5×–2.0×, comp 0.75×–1.5× | 75.5% – 96.1% | What if our employment and compensation assumptions for the field are wrong? |

The parametric grid produces the widest range because it varies the input assumptions most aggressively (doubling field employment is a large structural change). The score perturbation and Monte Carlo tests are tighter because they vary scores within a narrower band (±1 or ±2 points on a 0–10 scale).

All three tests confirm the directional finding: above-field layers carry the majority of AI-exposed wage dollars under any tested specification.

### Canonical Baseline Figure

The exact computed baseline is **90.2518%** (from the production dataset). This is displayed as:

- **~90%** in headline text and hero sections (rounded for readability)
- **90.25%** in sensitivity analysis tables (two decimal places for precision)
- **~90.3%** in HTML tooltips (one decimal place with tilde to indicate rounding)

All three refer to the same underlying calculation. The differences are display rounding, not different computations.
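
The three display forms follow from standard format specifiers applied to the same number. A small sketch (variable names are illustrative):

```python
baseline = 90.2518

headline = f"~{baseline:.0f}%"  # headline text and hero sections
table = f"{baseline:.2f}%"      # sensitivity analysis tables
tooltip = f"~{baseline:.1f}%"   # HTML tooltips
print(headline, table, tooltip)  # ~90% 90.25% ~90.3%
```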

---

## Conclusion

Sensitivity analysis confirms that the Energy Decision Stack's headline finding—**approximately 90% of AI-exposed wage dollars are in above-field layers**—is **robust** across three independent testing methodologies:

1. **Score perturbation:** 87.7%–93.0% (directional shocks of ±1 point)
2. **Monte Carlo:** 87.9%–92.4% (random ±2 perturbations, 1,000 iterations)
3. **Parametric stress grid:** 75.5%–96.1% (field employment 0.5×–2.0×, comp 0.75×–1.5×)

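Even the most extreme combination (doubling field employment, increasing field compensation by 50%, and applying systematic +1 score errors to all Physical operations roles) would still leave above-field at approximately 70% of the exposed wage bill. The +1 shock alone implies a Physical-to-above-field wage bill ratio of about 0.14 (12.32 / 87.68); tripling the Physical wage bill raises that ratio to about 0.42, for a share of 1 / 1.42 ≈ 70%. The directional finding survives every tested specification.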

---

## Data

- **Dataset:** energy_decision_stack_dataset_v32_augmented.csv
- **Total positions analyzed:** 404 (373 roles + 24 workflows + 7 artifacts)
- **Above-field positions:** 361
- **Physical operations positions:** 43
- **Analysis date:** 2026-04-01 (score perturbation and Monte Carlo); 2026-04-02 (parametric stress grid)

---

*Score perturbation and Monte Carlo analyses were generated computationally using pandas, NumPy, and Python 3. Parametric stress grid was computed on the same dataset using the same wage bill formula. Monte Carlo iterations use a fixed random seed for reproducibility.*
