Charts on this page visualize AI exposure scores across 404 energy positions. Each exhibit can be explored interactively. Data is available in the downloadable CSV dataset.

The people and the money are in different buildings.

We scored every position in energy by AI exposure. The biggest workforce is where the smallest dollars are. The industry has been staring at the wrong building.

404 positions · 6 layers · BLS-sourced employment · Inspired by Karpathy

2.5
Roughneck
Pipe, mud, steel. AI doesn't touch this job.
8.5
Production accountant
Same spreadsheet. Same format. Rebuilt from scratch every month.
Above the field
90%

of every AI-exposed dollar in energy is spent on people who never touch a wellhead [Modeled: Exposure-weighted wage bill = employment × layer comp proxy × compressibility score ÷ 10. Employment: 60 anchor rows (BLS/industry-sourced), 344 formulaic estimates (layer base × tier multiplier, decoupled from compressibility score). Compensation: six layer-level proxies ($58K–$180K). "Above the field" = technical, corporate, advisory, capital markets, and governance layers. Under broad stress testing (0.5×–2× field headcount, 0.75×–1.5× field comp), the above-field share ranges from ~76% to ~96%; role-only range: ~74% to ~96%. Under an earlier model that coupled employment to compressibility score, the baseline was ~81%. The decoupled model gives 90.25% (exact: 90.2518%), displayed as ~90%. The direction holds across all tested specifications.] Not roughnecks. Not linemen. Not the control room. Analysts, packagers, preparers — the people who build the paperwork the field runs on.
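To make the headline arithmetic concrete, a minimal sketch in Python. The layer headcounts, the mid-range comp proxies, and the per-layer average scores are hypothetical stand-ins (the published model scores all 404 positions row by row); only the formula, the $58K and $180K endpoints, and the ~29% field headcount share come from the methodology notes above.

```python
# Minimal sketch of the exposure-weighted wage bill, using hypothetical
# layer-level aggregates. The real model scores all 404 positions row by
# row; these headcounts, mid-range comp proxies, and average scores are
# illustrative stand-ins, not benchmark data.

LAYERS = {
    # layer: (headcount in thousands, comp proxy in $K, avg score 0-10)
    "physical":        (290,  58, 3.5),   # ~29% of heads, per the benchmark
    "technical":       (160,  95, 6.8),
    "corporate":       (200,  88, 7.0),
    "advisory":        (150, 120, 7.5),
    "capital_markets": (120, 150, 7.8),
    "governance":      ( 80, 180, 7.0),
}

def exposed(heads, comp, score):
    """Exposure-weighted wage bill = employment x comp proxy x score / 10."""
    return heads * comp * score / 10

total = sum(exposed(*v) for v in LAYERS.values())
above = total - exposed(*LAYERS["physical"])
print(f"above-field share of exposed dollars: {above / total:.1%}")
# ~91% with these stand-ins; the row-level benchmark computes 90.3%.
```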

Five findings in 30 seconds

01

The mismatch is the story. Roughly 29% of energy headcount works in the field [Derivation: Physical operations layer = ~29% of estimated headcount across 404 positions. Employment: 60 BLS-sourced + 344 formulaic (layer base × tier multiplier). Layer bases: Physical 22K, Technical 8K, Corporate 10K, Advisory 7K, Capital markets 5K, Governance 3K.]. Less than 10% of the exposure-weighted wage bill is there [Derivation: Exposure-weighted wage bill = Σ(employment × layer comp proxy × compressibility score ÷ 10). Physical layer: ~29% of heads, but its comp proxy is $58K (lowest) and its average score is ~3.5/10. Non-field layers have higher comp ($72K–$180K) and higher scores (6.5–8+). Result: ~90% of the exposed wage bill sits above the field.]. Twenty-nine percent of the headcount. Ten percent of the exposed dollars.

02

Titles survive. Hours don't. The board still signs. The engineer still attests. What compresses is the prep stack underneath — drafting, reconciliation, packaging. The assemblers are exposed, not the decision-makers.

03

The upside isn't efficiency. Cheaper analysis doesn't speed up the same memo. It lets you run the forty scenarios you've been skipping — because each one costs three engineer-weeks, and nobody has three spare engineer-weeks.

04

Three workflows already pencil. Treasury/lender readiness, ownership and title, regulatory planning [Methodology: Selected as beachheads because (1) the company controls the prep work and document assembly, (2) the cycle recurs (redet, title curative, rate case = annual or semi-annual), (3) there is a clear budget owner (Treasurer, Land Manager, VP Regulatory), and (4) compressibility scores run 6.5–8.2 across constituent roles. See Exhibit 8 for full economics. Note: treasury prep is company-controlled; title/curative and regulatory are semi-controlled (external parties govern the outcome, but the company owns the analytical prep).]. Company-controlled prep work. No regulator, lender, or board permission needed to start the analytical pilot.

05

Energy is AI's own bottleneck. 2,300 GW sitting in U.S. interconnection queues [Measured: LBNL "Queued Up" 2025 Edition. Total active capacity in U.S. interconnection queues as of end-2024.]. An estimated 60–70% of that wait is document and regulatory work [Modeled: Sunya decomposition of FERC Order No. 2023 interconnection timelines into phases — document-intensive phases (feasibility study, system impact study, facilities study, IA negotiation) vs. physical construction. Not observed from project-level data; directional estimate.]. Compress the paperwork and more data centers come online. More data centers mean more AI compute. AI is waiting in line behind its own permitting stack.

Eighteen months ago, models couldn't reliably parse a 200-page reserve report. They can now. The window between "models can do this" and "your competitor already is" is closing faster than it did for cloud, mobile, or the internet.

All scores are modeled estimates. Every claim is tagged by confidence level. Full methodology below.
Exhibit 1 · 404 positions across 6 layers

The people are in the field. The money isn't.

Roughly 29% of estimated energy employment works in physical operations [Modeled: Sunya classification of 404 positions (373 roles, 24 workflows, 7 artifacts) into 6 organizational layers. Employment estimated from BLS OES (60 anchor rows) and formulaic estimates (344, using layer base × tier multiplier, decoupled from compressibility score). Percentages computed across all 404 rows. Excluding workflows/artifacts changes the above-field share by ~1 percentage point.]. Less than 10% of the exposure-weighted wage bill is there [Modeled: Exposure-weighted wage bill = employment × layer comp proxy × compressibility score ÷ 10, computed across all 404 positions. The physical layer has ~29% of headcount but ~9% of the exposed wage bill. The gap widens at the wage-bill level because physical roles have both lower compensation proxies ($58K vs. $95K–$180K) and lower compressibility scores (avg ~3.5 vs. ~6.5 above-field).]. Most people in the industry sense this. Now there are numbers on it.

A Permian Basin engineering team has twenty reservoir engineers. The people who actually assemble the reserve report? Three analysts in a windowless conference room on the second floor. The engineers make the decisions. The analysts make the package. Ask the CEO which group matters more. Then ask which group is still doing the same job in five years.
Pattern observed across E&P engineering departments

Left: where the people are. Right: where the exposed dollars concentrate. The lines cross. Ninety cents of every AI-exposed wage dollar sits above the field — seventy-four cents even when restricted to BLS-anchored rows only.

What could break this

Faster packets ≠ faster decisions. A compressed borrowing-base package still waits for the VP's calendar, the lender committee, and the bank's own reserve engineers. Maybe 30% of cycle time is work product. The rest is organizational friction AI doesn't touch.

Bad source data breaks everything. Well logs from the 1970s were hand-transcribed. Conflicting spreadsheets, stale type curves, vintage assumptions. AI on bad data produces bad results faster.

Weakly trusted outputs increase review burden. If the model's work isn't trusted, reviewers spend more time checking than they saved. The net effect can be negative until confidence builds.

How robust is the ~90% figure?

Below: what happens when we stress field-layer headcount and compensation — the two inputs most likely to be underestimated. Non-field employment estimates and compensation proxies are held constant; stressing those would require a broader sensitivity test. The pattern holds across the range tested because the compensation gap between layers dominates the math.

Rows: field employment multiplier (how much larger or smaller the physical-layer headcount might actually be). Columns: field compensation adjustment. The base case uses the current model assumptions. Even at the extremes — 2× field headcount at 1.5× field pay — the majority of exposed dollars still sits outside physical operations.
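A sketch of the grid's mechanics, reusing the same hypothetical layer stand-ins as the earlier sketch; the published grid runs on the row-level benchmark data, so the cell values here are illustrative.

```python
# Sketch of the field-stress grid: scale physical-layer headcount and comp,
# hold the non-field layers fixed, and recompute the above-field share.
# Layer tuples are hypothetical stand-ins, not benchmark data.

FIELD = (290, 58, 3.5)                    # heads (K), comp ($K), avg score
NON_FIELD = [(160, 95, 6.8), (200, 88, 7.0), (150, 120, 7.5),
             (120, 150, 7.8), (80, 180, 7.0)]

def exposed(heads, comp, score):
    return heads * comp * score / 10      # employment x comp x score / 10

above = sum(exposed(*layer) for layer in NON_FIELD)
for head_mult in (0.5, 1.0, 2.0):         # field headcount multiplier (rows)
    cells = []
    for comp_mult in (0.75, 1.0, 1.5):    # field comp multiplier (columns)
        field = exposed(FIELD[0] * head_mult, FIELD[1] * comp_mult, FIELD[2])
        cells.append(f"{above / (above + field):.0%}")
    print(f"field heads x{head_mult}: {cells}")
# Worst cell (2x heads, 1.5x comp) still leaves ~77% above the field,
# consistent with the published ~76-96% range.
```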

Extended sensitivity: above-field parameter stress

What happens when we vary the parameters the above-field layers use? Rows: multiplier on all non-Physical layer bases. Columns: multiplier on all non-Physical compensation proxies. Multipliers below 1× shrink the above-field layers and weaken the result; multipliers above 1× strengthen it.

What drives the ~90% — and what survives stress

Three factors build the above-field concentration. The decomposition below shows how much each contributes (under the score-neutral employment model), and the stress tests show what happens when we deliberately weaken the most attackable assumptions.

Factor decomposition

Employment alone
70.5%
+ compensation proxies
79.4%
+ compressibility scores
90.3%

Even with uniform compensation and uniform compressibility, ~71% of estimated headcount sits above the field (under score-neutral employment). Compensation proxies add ~9 percentage points. Compressibility scores add another ~11. The direction is established by headcount distribution alone — the other factors amplify it.
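The same decomposition as a sketch, toggling one amplifier at a time. With the hypothetical stand-ins used above, the intermediate values land near, not exactly on, the benchmark figures quoted in the comments.

```python
# Sketch of the factor decomposition: turn on one amplifier at a time,
# from headcount alone, to headcount x comp, to the full formula.
# Layer tuples are hypothetical stand-ins; benchmark values in comments.

LAYERS = {"physical": (290, 58, 3.5), "technical": (160, 95, 6.8),
          "corporate": (200, 88, 7.0), "advisory": (150, 120, 7.5),
          "capital_markets": (120, 150, 7.8), "governance": (80, 180, 7.0)}

def above_share(use_comp, use_score):
    def weight(heads, comp, score):
        return heads * (comp if use_comp else 1) * (score if use_score else 1)
    total = sum(weight(*v) for v in LAYERS.values())
    return (total - weight(*LAYERS["physical"])) / total

print(f"employment alone:         {above_share(False, False):.1%}")  # benchmark: 70.5%
print(f"+ compensation proxies:   {above_share(True, False):.1%}")   # benchmark: 79.4%
print(f"+ compressibility scores: {above_share(True, True):.1%}")    # benchmark: 90.3%
```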

Stress tests

Scenario · Above field
Baseline — all 404 positions, score-neutral employment · 90.3%
Flatten compensation to $80K for all layers (removes the pay-gap amplifier between Physical at $58K and Governance at $180K) · 85.2%
Cap compressibility at 7.0 for all rows (assumes above-field roles are less compressible than scored) · 89.0%
Both: flat compensation + capped scores (harshest combination — removes both amplifiers) · 83.4%
Anchor rows only (60 of 404, BLS/industry-sourced; excludes all formulaic employment estimates) · 74.2%
Roles only — exclude workflows and artifacts (373 of 404; addresses the mixed-ontology objection) · 89.4%

The above-field share ranges from 74–96% across all stress tests. Even under the most conservative specification (BLS-anchored rows only, excluding all formulaic estimates), roughly three of every four exposed wage dollars still sit above the field. The directional finding is not an artifact of the employment model, the compensation proxies, or the scoring method.

What could break this

344 of 404 employment estimates use a deterministic formula (layer base × tier multiplier) rather than observed data. Three tier bands (0.75×, 1.0×, 1.25×) assigned by stable hash provide structural spread, not an economic model. The compensation proxies are layer-level averages, not role-specific wages. If physical-layer headcount is substantially larger than modeled, or if above-field compensation proxies are too high, the ~90% figure compresses — but the stress grid shows it stays above 74% even when physical headcount doubles and physical comp increases 50%. The anchor-row test (74.2%, using 60 BLS/industry-sourced rows only) provides the hardest floor. The direction holds across all specifications. The exact percentage is scaffold-dependent — treat it as an interval (74–96%), not a point estimate.

Compressibility scores
404 / 404
All positions scored
Augmented axes
304 / 404
Decision criticality, reasoning demand, company control
Employment anchoring
60 / 404
BLS-sourced · 344 use formulaic estimates

What this does not claim

Modeled, not observed. Every score is informed estimation. No before/after workflow data exists yet. Validation against deployed systems is v33.

Exposure is not replacement. A high compressibility score means the prep stack compresses. The title, the signoff, and the judgment remain human.

Scores rank directionally. A 0.4-point difference between two roles is noise, not signal. Read tiers, not decimals.

Employment and comp are proxies. 60 of 404 positions use BLS-sourced employment. The remaining 344 use a formulaic estimate (layer base × tier multiplier) — structural scaffolding, not survey data. Compensation uses six layer-level proxies ($58K–$180K), not role-specific wages. The layer-level pattern is the claim; individual role numbers are not precise.

External approvals stay external. Compressing a borrowing-base packet does not make the lender approve faster. Compressing a rate-case filing does not make the PUC rule sooner. Company-controlled loops compress. Externally governed outcomes do not.

Energy is finance in 1975. [Methodology: The analogy: pre-quant finance relied on individual judgment, manual analysis, and relationship-driven capital allocation. Systematic approaches (Black-Scholes 1973, index funds, algorithmic trading) didn't replace finance — they restructured who captured value. Energy today: $3.3T invested annually, largely driven by individual analyst judgment, manual scenario analysis, and relationship-driven deal flow. The pattern match is structural, not predictive.]

$3.3 trillion in annual investment [Measured: IEA World Energy Investment 2025. Total global energy investment including clean energy, fossil fuels, and grids.]. Almost all of it driven by individual judgment, individual spreadsheets, individual analysts running four scenarios when they should run forty.

The firms doing this in 1975 — Kidder Peabody, Drexel, Salomon Brothers — don't exist anymore.

Exhibit 2 · 404 positions

404 positions. Same story every time.

Area = headcount. Color = AI exposure. The biggest boxes are the lightest — most people work in roles AI barely touches. The dark boxes are small but expensive. Every strategy deck worries about the big light boxes. The money is in the small dark ones.

The mismatch from Exhibit 1 holds everywhere. Which roles matter most? Rank by score and you get one answer. Rank by dollars and the list reshuffles.

Exhibit 3 · Wage-bill ranking

Score tells you what's compressible. Dollars tell you where to start.

Trade-exception processing and non-op accounting score highest on the AI exposure index. But they don't dominate the economic pool. Landman, lineman, petroleum engineering, land tech, payroll, geologist, production engineer — those dominate the wage bill. Score tells you what's compressible. Dollars tell you where to start.

Top 20 roles ranked by estimated exposure-weighted wage bill. Line length = relative wage share. Circle color = compressibility score.

Exhibit 4 · Scenario analysis (hypothesis)

The system surfaces the decision. The engineer applies judgment.

Two versions of this story. The incremental one: AI makes the same analysts faster. The structural one: the system runs forty scenarios, flags the three that actually change the decision, and delivers them before anyone asks. The bottleneck shifts from evidence assembly to judgment. That's where it should be.

A shale E&P ran four development scenarios last year. They should have run forty. Each one required a reservoir engineer working three weeks. Nobody ever decided to skip the analysis. It was just too expensive. The board approved a $2B capital program based on four scenarios. Everyone signed off. Everyone knew it wasn't enough. That's not incompetence. It's how the industry works. For now.
Pattern observed across shale development teams

Karpathy's framing applies here directly: "To get the most out of the tools that have become available now, you have to remove yourself as the bottleneck." [Observation: Andrej Karpathy, No Priors podcast, January 2026. In context, Karpathy was discussing AI research workflows, but the principle generalizes: when preparation cost drops, the binding constraint shifts from assembly capacity to decision quality.] In energy, "the bottleneck" is currently evidence assembly. When that compresses, the constraint shifts to judgment — exactly where it should be.

Two paths (hypothesis — each requires validation)

Path A · Copilot

Same workflow, faster. Analysts run 4 scenarios in 3 days instead of 3 weeks. The decision stays the same. Cost savings: 10–20% on analytical labor.

Path B · Intelligence

Different structure entirely. The system composes 40 scenarios from capabilities, flags the 3 that change the decision, and delivers them before the cycle starts. The decision changes. Value: 10–100× the labor savings.

The gap today: No published evidence for either path in energy. Financial trading precedent supports Path A. v33 targets the question: do additional scenarios actually change the final decision, or does the team anchor on the first?

Annual model-token scenarios per operating unit. Hypothetical — not observed.

The Jevons trap — and why G&A is the wrong denominator

Most companies will aim AI at the safest, lowest-value work first. Email drafts. Meeting summaries. Formatting. That's natural — nobody gets fired when a model hallucinates a meeting recap. But it misses the point by 40:1.

A mid-size E&P spends $50–100M on G&A [Derivation: Range based on public filings. Pioneer Natural Resources reported ~$200M G&A in 2023 (including merger costs; ~$100M excluding). Diamondback Energy runs ~$0.69/boe cash G&A. Mid-market operators with 50–100K boe/d production typically fall in the $50–100M range depending on accounting treatment.]. It spends $2–6B on development capital. The ratio is 30-to-1 or worse. The entire AI-for-energy pitch is aimed at the smaller number.

This is the Jevons pattern applied to reasoning. When Jevons observed that cheaper coal didn't reduce coal consumption — it increased it — he identified a dynamic that maps directly onto analytical work. Make analysis cheap and the company doesn't just do the old analysis faster. It asks questions it couldn't previously afford to ask. Four scenarios become forty. The team runs the sensitivity work it was skipping because each additional case cost three engineer-weeks. Total demand for analytical work goes up, not down.

That's why this is not an efficiency story. Saving 20% of G&A on $75M is $15M. A 2% improvement in capital allocation on a $3B development program is $60M. The industry is optimizing the line item that doesn't matter.
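The same comparison as arithmetic, using the paragraph's mid-case figures:

```python
# The denominator argument as arithmetic, using the mid-case figures above.
ga_savings = 0.20 * 75e6   # 20% compression of a $75M G&A line  -> $15M
capex_gain = 0.02 * 3e9    # 2% better allocation of $3B capex   -> $60M
print(f"G&A play: ${ga_savings/1e6:.0f}M vs capital play: ${capex_gain/1e6:.0f}M")
```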

In 2019, Concho Resources drilled 23 wells at 230-foot spacing on the Dominator pad [Measured: Concho Resources Q2 2019 earnings. Dominator project: 23 horizontal wells across five Wolfcamp targets in the Delaware Basin, 230-ft horizontal spacing vs. an area average of ~600 ft, assembled with 7 rigs from 6 contractors and 5 frac fleets.] — roughly a third of the spacing the rest of the Delaware Basin was running. The thesis was more wells per section, more resource recovery. What happened was well interference: production was tracking 38% below the rest of Concho's Wolfcamp projects — and headed toward 45% below pre-drill estimates [Modeled: Dominator cumulative production tracked 38% below the rest of Concho's Wolfcamp projects (measured). Independently developed type curves projected wells would fall short of initial estimates by as much as 45% (SPE JPT, Oct 2019). Well-to-well decline of 38% between first and final wells on the pad. Sources: SPE Journal of Petroleum Technology; Concho Q2 2019 investor presentation.]. The stock dropped 22% in a single day — $4.4 billion in market cap [Measured: Concho Resources stock fell 22% on August 1, 2019, following Q2 earnings disclosing Dominator underperformance; single-day market capitalization decline of approximately $4.4B. Sources: Bloomberg; securities litigation filings (Labaton Sucharow).]. The team that selected the spacing — reservoir engineers, geologists, a VP, a CEO — earned a combined $3–5M a year. They destroyed a thousand times their own compensation in one decision with inadequate scenario coverage.
Concho Resources, Delaware Basin, 2019

That's the leverage structure of this industry. A handful of knowledge workers control a capital budget that dwarfs their salaries by orders of magnitude. The Dominator wasn't a failure of intelligence. It was a failure of scenario coverage — and scenario coverage was rationed because analysis was expensive. Make analysis cheap, and the question changes from "can we afford to run more cases?" to "can we afford not to?"

This leverage structure isn't unique to oil. A frontier AI lab decides which $100M+ training run to fund [Reported: Sam Altman stated GPT-4 training cost "more than $100 million." Meta's Llama 3.1 405B reportedly cost ~$170M. Projected trajectory: largest training runs expected to exceed $1B by 2027. Sources: Fortune, Epoch AI.] — five people in the room. TSMC allocates $52–56B in fab capex for 2026 [Reported: TSMC 2026 capital expenditure guidance of $52–56B, a ~30% increase over 2025, with 70–80% allocated to advanced process technologies. Sources: TrendForce, TSMC investor relations.]. Hyperscalers are deploying $600B+ in data center infrastructure this year [Reported: Combined 2026 capex projections: Amazon ~$200B, Alphabet $175–185B, Microsoft $120B+, Meta $115–135B, Oracle $50B. Approximately 75% tied to AI infrastructure. Sources: IEEE ComSoc, CreditSights, company earnings guidance.], 75% of it for AI — and the analysis that determines whether that capacity comes online in 2028 or 2031 runs through the same energy permitting workflows mapped in this benchmark. Same structure every time: small team, massive capital budget, decision quality as the binding constraint. Energy is the industry where we counted the roles.

Mid-size E&P ($75M G&A, $3B development capital). G&A savings assume 20% compression. Capital allocation improvement assumes same acreage, better scenario coverage. Dominator reference: Concho Resources, Delaware Basin, 2019. Cross-industry capex figures from company guidance and analyst estimates (2025–2026).

The ownership problem

If the value is in the capital allocation decision — not the meeting summary, not the email draft, not the reformatted slide deck — then the person who owns that decision has to own the AI deployment. Not "approve it." Not "sponsor it." Own it. Be the user.

Right now, most energy companies delegate AI to IT or to a "digital innovation" team two levels removed from capital decisions. That's how you get chatbots for the help desk and summarizers for the weekly meeting. Useful. Worth maybe $15M a year in time savings. And completely disconnected from the $3B development program where the real leverage sits.

The CFO who reviews four scenarios that someone else built is a different person than the CFO who directs a system that runs forty and flags the three that change the decision. The first CFO is a consumer of analysis. The second is an operator of intelligence. Same title. Completely different leverage on the capital budget.

The VP of Development at a Permian operator told us: "Our IT team pitched an AI tool that auto-generates my weekly status report. I don't need a faster status report. I need to know whether our 660-foot spacing assumption holds in the Wolfcamp B before I sign a $400M AFE. That's the AI project I'd pay for. Nobody's pitching me that one."
Pattern observed across E&P capital planning teams

This is why the Concho loss is structural, not anecdotal. The team that selected 230-foot spacing had the domain expertise. What they lacked was scenario coverage — and scenario coverage was rationed because the analytical cost came out of G&A, while the decision it informed controlled billions in capital. The budget owner for the analysis and the budget owner for the decision were in different buildings. AI doesn't fix that misalignment. Organizational design does.

The practical version: every energy company needs someone in the room where capital gets allocated who also controls the AI deployment that feeds that decision. Call it what you want — Chief Decision Officer, VP of Decision Intelligence, the CFO who actually uses the tools. The title doesn't matter. What matters is that the person directing AI and the person signing the AFE are the same person, or sit next to each other. When they're three org layers apart, AI gets aimed at the G&A line. When they're the same person, it gets aimed at the capital budget.

That's the 40:1 gap restated as an org chart problem. The technology is ready. The question is whether the person who benefits from better decisions is the same person who controls the AI budget. At most energy companies today, they're not.

DRI case · Treasury and lender readiness

A mid-market E&P with a $1.2B reserve-based lending facility runs two borrowing-base redeterminations per year. Each cycle: 3 analysts, 3 weeks, rebuilding the same variance tables and lender Q&A packets from scratch. The lender asks the same fifteen questions every cycle. Nobody structured last cycle's work product for reuse.

One owner. One cycle. One budget line. Direct labor: ~$35K/year. AI inference: ~$2K/year at API pricing (~$20K with orchestration overhead). The direct savings are modest. The real value: freed analysts run 15 additional sensitivity scenarios on the next development program — scenarios the team has been skipping because each one cost three engineer-weeks. On a $500M–2B facility, a 1% improvement in borrowing-base utilization from better scenario coverage = $5–20M.

What stays human: negotiation posture, lender relationship, representation of downside scenarios, signoff. What the system composes: source comparison, variance tables, covenant extraction, Q&A draft generation, package assembly — delivered before the cycle starts, not rebuilt from scratch. Full beachhead breakdown →

The role that's easiest to automate is almost never the role where automation matters most.

Compressibility scores rank technical exposure. Wage-bill share ranks economic consequence. They don't agree.

Exhibit 5 · Value-risk frontier · 30 roles

In energy, the biggest AI upside lives next to real downside.

Everybody wants the lower-right corner: high value, low risk. Look at the chart. It's empty. Every role that creates massive value also carries meaningful downside. In energy, value and consequence travel together. They have historically, and the pattern holds across the workflows modeled here. The deploy-first corridor isn't where risk is zero. It's where the math works anyway.

A VP of Engineering at a midstream operator put it this way: "If your AI misroutes a marketing email, you lose a lead. If it misroutes a pipeline pressure reading, you lose a town." That's why the easy-win quadrant is empty. The work worth automating in energy is work where getting it wrong actually matters.
Pattern observed across midstream and pipeline operations teams

X = value creation potential. Y = asymmetric downside risk. Size = automation exposure. The empty lower-right is the chart's most important feature — you can't deploy AI where it's most valuable without accepting real downside. That makes this a governance problem, not a technology problem.

Three dimensions of AI impact · 30 roles

One score hides the answer. Three scores reveal it.

A reservoir engineer scores moderate on automation exposure but enormous on value-creation scope. A roughneck scores low on automation but catastrophic on downside risk. Score on one axis and you automate the wrong things. Score on three and you see which roles are worth touching first. These are the 30 we modeled in depth.

Deployment

Start where AI already pays for itself.

Not a vision deck. Three workflows you can start next quarter. Inputs named, gates identified, compression measurable. You own the document assembly. The outcomes pass through external parties (lenders, counterparties, regulators), but the analytical pilot requires no external approval. Just someone willing to run it.

The gap between what models can do and what energy companies actually use them for is the early-mover advantage. It's closing. These three workflows are where to start — not because they're the flashiest, but because they pay for themselves in the first cycle.

Beachhead 01 · Semi-controlled

Start with the packet every lender asks for

Borrowing-base support, covenant monitoring, lender Q&A, amendment packages
The lender sends the same fifteen questions every redetermination cycle. The treasury team rebuilds the answers from scratch because most teams haven't structured last cycle's work product for reuse. The questions don't change. The assembly starts over.
Pattern observed across RBL borrower teams
What feeds in: Reserve reports, production exports, price decks, hedge schedules, covenant definitions, prior lender memos
What compresses: Source comparison, variance tables, covenant extraction, Q&A draft generation, package assembly
What stays human: Negotiation posture, lender relationship management, representation of downside scenarios, signoff
Success metrics: Package cycle time, analyst hours per cycle, questions answered inside 24 hours, error escapes
Decision context: Borrowing-base readiness · $500M–2B credit facility
The first-principles question: The borrowing-base package exists because lenders can't continuously monitor collateral. If AI makes continuous monitoring cheap, the semi-annual cycle disappears — the lender gets a live dashboard and the "package" becomes unnecessary. You can make the cycle 5x faster. Or you can ask why the cycle exists at all.
Back-of-envelope economics (not audited): 3 analysts × 3 weeks × 2 cycles/year ≈ $35K in allocated labor [Derivation: 3 analysts × $100K fully loaded annual cost × (6 weeks ÷ 52 weeks) ≈ $34.6K/year allocated to this workflow; per analyst per cycle, $100K × 3/52 ≈ $5.8K. Assumes 100% utilization during cycle weeks — actual may be lower (analysts have other duties) or higher (overtime during crunch). The analysts' full annual cost is ~$300K; only the time allocated to this workflow is counted here. The deal readiness memo uses $200–400K per cycle, which represents the full departmental loaded cost including overhead, management, tools, and allocated time for other duties. Both figures are directional.]. AI inference cost ≈ $2K–20K/year [Derivation: ~60M tokens/cycle × 2 cycles/year = ~120M tokens/year × $8/M blended rate ≈ $960 at raw API pricing. With orchestration overhead (retrieval, re-ranking, iteration loops), practical cost is ~$2K/year. The $20K upper bound includes retrieval infrastructure, fine-tuning amortization, and production overhead (10–20× raw API). The range reflects uncertainty in production architecture; token estimates from the Exhibit 8 mid-case scenario.]. Direct savings: modest. The real value isn't the labor delta — it's the 15 additional scenarios the freed analysts can run [Methodology: Current practice is ~3–5 scenarios per redet cycle (base, upside, downside, stress, bank). With AI-assisted assembly, each incremental scenario adds hours instead of days; 15 scenarios = 3× current capacity. The credit committee sees better coverage of tail risk, not just faster packaging.], and what that does to the quality of the credit decision on a $500M–2B facility. A 1% improvement in borrowing-base utilization on $1B = $10M [Derivation: Direct arithmetic: 1% × $1B = $10M. "Utilization" = ratio of drawn amount to borrowing base. Better scenario coverage → more confident draw → higher utilization. This is the decision-quality argument, not cost savings.].
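The same envelope as a sketch, with every input taken from the derivation notes above; treat the outputs as directional, not audited.

```python
# Back-of-envelope treasury economics, restated from the derivation notes.
# All inputs are the directional figures above; nothing here is audited.

analysts, loaded_cost = 3, 100_000            # fully loaded $/yr per analyst
weeks_per_cycle, cycles_per_year = 3, 2
labor = analysts * loaded_cost * (weeks_per_cycle * cycles_per_year / 52)

tokens_per_cycle = 60e6                       # Exhibit 8 mid-case
blended_rate = 8 / 1e6                        # $8 per million tokens
raw_api = tokens_per_cycle * cycles_per_year * blended_rate   # ~$960/yr
practical = 2_000                             # with orchestration overhead
upper_bound = 20_000                          # full production infrastructure

decision_value = 0.01 * 1e9                   # 1% utilization on a $1B base
print(f"labor ~${labor:,.0f}/yr, raw API ~${raw_api:,.0f}/yr, "
      f"practical ~${practical:,}-{upper_bound:,}/yr, "
      f"decision value ~${decision_value:,.0f}")
```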
Beachhead 02 · Semi-controlled

Attack the paper trail that keeps cash and decisions stuck

JOAs, AFEs, JIB review, division orders, curative, title chains
A non-op analyst reconciles JIBs against division orders across 3,000 tracts. Most of the work is pattern matching — comparing operator charges against working interest, flagging discrepancies, routing exceptions. The judgment calls are 10% of the hours. The comparison is 90%.
Pattern observed across non-op and land administration teams
What feeds in: JOAs and amendments, lease and deed chains, title opinions, curative notes, division orders, AFEs, election notices
What compresses: Clause extraction, document comparison, exception queue triage, owner-response drafts, curative clustering
What stays human: Legal interpretation on edge cases, negotiation with counterparties, escalation on title risk
Success metrics: Suspense resolution time, title queue aging, JIB dispute cycle time, % auto-triaged
Decision context: Net revenue interest accuracy · $50–500M working-interest exposure
What would prove this wrong: If AI can't reliably parse the clause structure of a 40-page JOA — specifically conditional consent provisions and pooling elections — the comparison step doesn't compress. Current models handle this for standardized JOAs but struggle with pre-2010 custom agreements. Test it on your worst JOA before committing.
Beachhead 03 · Semi-controlled

Shorten the prep stack before the next rate case

IRPs, rate-case exhibits, testimony support, discovery responses
A state commission sends 400 data requests during a rate case. Miss one inconsistency with a filing from 2019, and the intervenor uses it in cross-examination to unravel the entire case. The regulatory team's real job isn't analysis. It's making sure page 847 of this year's filing doesn't contradict page 312 of the filing from three years ago.
Pattern observed across IOU regulatory affairs teams
What feeds in: Forecast workbooks, generation scenarios, depreciation schedules, historical filings, commission discovery requests
What compresses: Exhibit assembly, testimony draft support with citations, discovery routing, consistency checks
What stays human: Policy judgment, regulatory strategy, witness preparation, external advocacy
Success metrics: Turnaround on data requests, revision count, witness-prep load, consistency escapes
Decision context: Allowed return on equity · $1–10B+ regulated rate base
What would prove this wrong: If a PUC rejects AI-assisted testimony or discovery responses — not because they're wrong, but because the commission won't accept the process. Regulatory conservatism is the binding constraint. Test with a non-controversial filing first, not the general rate case.

Request a workflow audit

We map the top 3–5 decision loops in your organization, score them for compressibility, and identify the one workflow where AI pencils first. Two weeks. One working session with your team.

Get in touch →

Six things AI won't fix. No matter what the vendor tells you.

External clocks · AI cannot speed up the regulator's calendar

You can assemble the rate-case exhibits in two days instead of two months. The PUC still takes fourteen months to issue an order. AI compresses time to prepare. Time to decide (agency review, public comment, commission deliberation) runs on a calendar you don't control.

Organizational friction · It will not stop people from waiting on each other

The borrowing base takes three weeks because the VP is traveling, the bank wants a different format, and the geologist is arguing with the reservoir engineer about type curves. Maybe 30% of cycle time is work product. The rest is waiting. AI compresses the 30%.

Data quality · It will not rescue bad source data

Well logs from the 1970s were hand-transcribed onto paper by a guy named Earl. AI on bad data produces bad results faster. Garbage in, garbage out — now at the speed of light.

Independence · The machine is not the signer

Reserve auditors exist because lenders require independent attestation. The bank doesn't care how smart your model is. They care that a human with a PE license signed the page.

Never alone · Some calls stay human because the downside is physical

Final legal opinions, auditor conclusions, safety-critical field calls. These stay human. Not because the industry is slow to change. Because a wellhead blowout doesn't have an undo button.

The downcycle · In a downturn, the value proposition flips

This analysis is implicitly mid-cycle. In a downturn, the prep stack gets gutted by layoffs before AI touches it. The value proposition flips: not "compress the work" but "maintain analytical capability after the RIF." The company that cut 40% of finance in 2020 and has AI can still run the analysis. The one that cut 40% without it can't.

What actually changes inside a role

Inside every role, the same split.

Some tasks shrink. Some disappear. Some become more valuable because the prep bottleneck is gone. The table below decomposes five roles into their task layers — then shows how time and value restructure when the assembly work gets cheap.

Role · Tasks AI compresses · Tasks AI amplifies · Tasks that stay human · Time shift (compress/amplify/human, pre → post)

Reservoir engineer · Score 7.1
Compresses: Decline curve fitting, type-curve generation, reserve report assembly, data gathering from production databases, variance commentary drafts
Amplifies: Scenario comparison (can now run 40 instead of 4), sensitivity analysis across price/decline/spacing, pattern recognition across analogue wells
Stays human: Subsurface judgment calls, well spacing decisions, reserve certification signoff, risk framing for the board
Time shift: 60/15/25 → 10/45/45

Treasury analyst · Score 8.2
Compresses: Borrowing-base packet assembly, covenant compliance checks, lender Q&A drafts, amendment redlining, data reconciliation across systems
Amplifies: Exception detection (catches covenant breaches earlier), cross-cycle comparison (persistent memory across redeterminations), scenario stress testing on covenants
Stays human: Lender negotiation posture, downside framing, final representations, signoff authority, relationship management
Time shift: 70/10/20 → 15/40/45

Land / title analyst · Score 7.8
Compresses: Division order calculation, lease abstraction, JIB exception screening, curative document comparison, ownership chain reconciliation
Amplifies: Cross-asset title pattern matching (flags similar defects across properties), historical exception memory (recalls curative outcomes from prior cycles)
Stays human: Curative negotiation, title opinion judgment calls, counterparty relationship management, legal liability decisions
Time shift: 65/10/25 → 10/40/50

Regulatory analyst (utility) · Score 6.9
Compresses: Exhibit assembly, discovery response drafting, data request compilation, testimony support document preparation, precedent citation lookup
Amplifies: Cross-docket pattern analysis (identifies commissioner tendencies), IRP scenario modeling (more alternatives evaluated), consistency checks across multi-year filing history
Stays human: Regulatory strategy, testimony delivery, commissioner relationship management, settlement negotiation, policy judgment
Time shift: 55/15/30 → 10/40/50

Lineman · Score 2.4
Compresses: Paperwork only (daily job briefing forms, time entry, incident reporting templates, outage documentation)
Amplifies: Predictive routing (AI optimizes storm response dispatching), outage pattern recognition (learns from historical restoration sequences)
Stays human: Climbing, switching, grounding, live-line work, safety assessment, crew leadership, storm response decisions
Time shift: 10/5/85 → 3/10/87

Time-split estimates (compress / amplify / stays-human) are modeled from FERC procedure timelines, E&P workflow audits, and utility rate-case filing breakdowns. Pre-AI splits represent the approximate share of a full-time role's hours. Post-AI splits assume production deployment, not pilot.

The role does not vanish. The economics of the role change.

The time-shift column tells a story that aggregate compressibility scores miss. When a reservoir engineer's time allocation moves from 60/15/25 to 10/45/45, the role doesn't disappear — it restructures from an assembly job into a judgment job. The economic consequence: the value per hour of that person's work increases because each hour now involves scenario evaluation or decision-making rather than data compilation. A role that was 60% commodity work and 25% judgment work becomes 45% judgment work. The judgment didn't get cheaper — it became a larger share of the output.

The amplification paradox, at the role level. A treasury analyst who spent 70% of their hours assembling borrowing-base packets was being paid mostly for document work. Post-compression, that same person spends 45% of their time on lender negotiation and downside framing. Same hourly rate. Different economic content per hour. The role's strategic value to the organization goes up even as headcount stays flat.

The lineman row is the control case. When only 10% of a role's time is in the compress column, AI changes the paperwork and the dispatch routing but not the job. The time shift (10/5/85 to 3/10/87) is real but marginal. This is why physical-layer roles cluster below 3.0 in the benchmark — there's not enough compressible time to restructure the role's economics.

Failure mode to monitor

The "stays human" column is only durable if organizations actually invest the freed-up time in judgment and analysis rather than simply reducing headcount. The risk: a company compresses its five-person treasury team to two people, but keeps the same workflow volume — so each remaining person rubber-stamps AI outputs at 3× speed instead of reviewing them at appropriate depth. The amplification column only works if people have the time budget to fill it. If compression translates to headcount cuts rather than role restructuring, the centaur model degrades into an automation model with a human-shaped rubber stamp at the end. This is an organizational design choice, not a technology constraint.

The pattern across all five: the compress column is document assembly and data reconciliation — it's where the hours are today and where AI makes the most immediate difference. The amplify column is scenario depth and pattern recognition — things the human could always do but couldn't afford to because prep consumed their capacity. The stays human column is judgment, liability, relationships, and physical execution. The time shifts quantify the centaur trade: organizations buy judgment capacity by spending less time on assembly. The question is whether they use that capacity or simply eliminate it.

Workforce · Scenario · Three-part framework

Work doesn't disappear. It migrates.

Don't ask which jobs get replaced. Ask what part of the work compresses, what human capability gets amplified, and what new control work appears because AI now exists. Broad labor evidence points toward redesign and churn, not a simple extinction story [Reported: WEF Future of Jobs 2025 projects 170M jobs created and 92M displaced by 2030 across macro trends; for AI/information-processing tech specifically, +11M created, −9M displaced. LinkedIn Economic Graph 2024: 70% of skills in most jobs will change by 2030; 10%+ of professionals hired today have titles that didn't exist in 2000. Anthropic Economic Index 2025: no clear unemployment impacts in AI-exposed jobs yet; routine API work trending toward automation, interactive use toward augmentation.]. Functions appear before titles do.

Compresses

The assembly layer inside the job

Drafting, reconciliation, packaging, routing, repetitive evidence assembly. Not "the whole job." The prep work underneath the judgment. Where four analysts assembled borrowing-base packets, two review AI-assembled packets and spend the freed hours on scenario analysis.

Analysts, packagers, coordinators, memo-builders, context routers, reconciliation specialists

Amplifies

Judgment seats get more leverage

Judgment, negotiation, relationship management, signoff, testimony, exception handling, field decision-making, political and regulatory sense-making. These people don't disappear. Their leverage increases because the prep work around them gets cheaper. As analysis gets cheap, judgment gets expensive.

Approvers, negotiators, operators, witnesses, relationship owners, field decision-makers

Emerges

A new control layer around the machine

Every workflow AI automates creates a new control problem. Someone has to teach, audit, route, and sign for machine output. The first new jobs are not sci-fi. They are the people who ensure the machine's work is trustworthy enough to act on.

Evidence architects, exception managers, eval/QA leads, workflow owners, provenance leads, apprenticeship stewards

Don't ask whether AI replaces the job

Ask which part of the seat was the job.

Each tile is a task inside one role. Some flow to AI. Some stay with the human. A few recombine into new functions. The seat is being unbundled — not eliminated.

AI doesn't delete the org chart

It redraws the center.

The old pyramid thins in the middle. The decision layer stays. The field layer stays or grows. A new thin control layer appears around the machine.

The hours don't disappear

They move to higher-consequence work.

Hours leave drafting, reconciliation, and packet assembly. They arrive in exception handling, scenario exploration, and AI governance. The hours don't vanish — they migrate upward.

You can automate the training ground out of existence

Then who makes the decisions in 2035?

Junior manual work builds pattern recognition, which builds judgment, which earns signoff authority. AI slices out the junior layer. The firm now has a problem: how does it produce future decision-makers?

New work functions, not job forecasts

We estimate new work, not new jobs. Functions appear before titles do. The demand model:

Scenario expansion: Work generated by running 40 scenarios instead of 4. In treasury, this creates an evidence architect who makes each redetermination cycle cheaper than the last.
Exception adjudication: The 5% of cases that create 95% of delays. In title/non-op, this is the exception manager who handles curative failures that AI flags but cannot resolve.
Eval / governance: Trust infrastructure. In regulatory, this is the provenance/QA lead who prevents citation escapes and witness-prep chaos before a rate case.
Knowledge maintenance: Keeping the institutional memory alive as senior staff retire. The workflow owner who ensures AI-assisted processes encode domain knowledge, not just patterns.
Training / apprenticeship: Across the company, the apprenticeship steward who keeps junior staff learning judgment, not just becoming prompt operators.

These functions deliver business value, not soft benefits. An evidence architect brings compounding memory. An exception manager brings downside protection. An eval lead brings the trust that lets you actually deploy. A workflow owner brings adoption and budgeted ROI. An apprenticeship steward brings future judgment supply.

The net employment math

A mid-size operator creates 3–7 FTEs across emerging functions while restructuring 20–50 above-field positions. Net headcount likely declines. Per-person value rises. Total payroll may stay roughly flat. The IEA's 1.7-to-1 retirement-to-entrant ratio means in many cases AI-driven compression doesn't cause layoffs — it prevents the capability gap from widening. A 50-person technical team that loses 8 to retirement and replaces 5 (with AI assistance) maintains roughly the same output.

2025–2026: Pilot phase — emerging functions handled as side responsibilities. 2027–2028: Production phase — workflow architect and eval roles become distinct positions. 2029+: Regulatory pressure formalizes governance and provenance roles. The firms that benefit most treat this as workforce restructuring, not simple automation.

Confidence: scenario

These work functions are projected from benchmark structure and cross-industry labor evidence, not observed in operating companies today. The pattern — that technology shifts create control work around the seams of the technology — is well established (databases created database administrators; cloud created DevOps; Sarbanes-Oxley created compliance officers). We are confident the functions will exist. We are not confident about timing or headcount.

The winners will not be the firms with "fewer people everywhere." They will be the firms with fewer people rebuilding the same packet, more people governing faster decisions, and a deliberate system for producing future judgment.

Decision leverage

Your first workflow depends on what kind of company you run.

Eight company types. Eight different entry points. The shale E&P starts in treasury. The utility starts in regulatory. The PE fund starts in portfolio monitoring. Start in the wrong place and you burn six months proving something the business didn't prioritize. Each one is a DRI-shaped problem: a single owner, a recurring cycle, a budget line, and company-controlled prep work. No regulator, lender, or board permission needed to start the analytical pilot. Click any row.

Company type · Highest-torque application · Decision scope · Why it's asymmetric
Exhibit 6 · Benefit by company type

Most companies save 15%. One type of company gains 10x.

Most consultants sell "cost savings from AI." The savings are real and nearly identical across company types: maybe 10-20% on analytical labor. The strategic value (better decisions, competitive moats, new capabilities) varies by 10x. It concentrates in the company types your consultant isn't modeling.

A PE fund managing $4B in energy assets runs the same portfolio-monitoring cycle every quarter. Three analysts spend two weeks pulling data from nine portfolio companies, reconciling it, building the IC deck. The partners look at it for forty-five minutes. The analysts could have run sensitivity analysis on the two assets actually at risk. Instead they spent those two weeks making a deck pretty. That's not a labor cost problem. That's a decision quality problem.
Pattern observed across energy-focused PE funds

Gray = efficiency gains (cost savings, time compression). Olive = strategic value (decision quality, competitive moat, new capabilities). The gray bars are roughly equal. The olive bars aren't. Scores are modeled estimates — the ranking is the claim, not the precision.

If you're building an AI platform for energy, cost savings aren't your market. Decision leverage is.

The next three exhibits shift the denominator from headcount to token demand, recurring revenue, and capture sequence.

Exhibit 7 · Workflow demand map · 304 roles

Stop counting jobs. Start counting tokens.

Score and spend don't track. The role with the highest AI exposure isn't the role that burns the most tokens. Rank by annual reasoning demand (fresh inference, not cached lookups) and utility rate-case teams and PE fund IC cycles climb. Field operations barely registers. This is the ranking an AI platform would underwrite against.

X = annual token demand. Y = decision value per 1M tokens. Size = addressable companies. Olive = company-controlled; amber = semi-controlled (external party governs outcome); gray = externally governed. Upper-right quadrant is where recurring model spend compounds.

Exhibit 8 · Top 20 recurring workflows

Twenty workflows where AI demand keeps coming back.

Not roles. Workflows. Each has a budget owner, a case frequency, a fresh-vs-cached token split, and an expansion path. The ones at the top aren't the flashiest — they're where reasoning demand recurs, compounds across operating units, and can support serious model spend because the decision value justifies it. "Fresh %" is the model-tier signal: high means frontier reasoning (Opus 4.6, GPT-5.4 Pro); low means most tokens route to fast, cheap inference (Haiku, Mini).

Workflow · Company type · Budget owner · Cases/yr · Annual tokens (M) · Fresh % · Value at stake · Expansion path
Exhibit 9 · Platform economics · 20 workflows

$1 in. $20 out. That's the ratio. [Derivation: Median leverage ratio across 20 workflows ≈ 23:1 (range ~7:1 to ~50:1). Leverage = decision value ÷ API cost = valPerM ÷ blended rate. At $8/M blended, a workflow with $180K decision value per million tokens yields 22.5:1. Highest: NRC licensing (~50:1). Lowest: production accounting (~7:1). Panel A plots every workflow; reference lines at 10:1, 100:1, and 1000:1 show scale context.]

The API revenue floor is real but small. The capture opportunity is the gap between what the model costs and what the model decides. Three panels: value leverage, capture sequence, price sensitivity. Token prices compress. Outcome-based pricing doesn't.

Panel A: API cost vs. decision value (log-log), with 10:1 / 100:1 / 1000:1 leverage lines. Panel B: cumulative revenue waterfall by capture priority. Panel C: price sensitivity at $4, $8, $12/M blended token rates. All figures modeled. Color scheme matches Exhibit 7.
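A sketch of the leverage arithmetic, normalized per case. The 60M-token case size and the ~$11K of attributable decision value are hypothetical, chosen to land near the exhibit's ~23:1 median rather than drawn from the workflow table.

```python
# Leverage sketch for Exhibit 9: dollars of decision value per dollar of
# API spend, normalized per case. Both per-case figures are hypothetical.

blended_rate = 8 / 1e6                 # $ per token ($8/M blended)

def leverage(tokens_per_case, value_per_case):
    api_cost = tokens_per_case * blended_rate
    return value_per_case / api_cost

# A case burning 60M tokens costs ~$480 of inference; if ~$11K of decision
# value is attributable to it, leverage sits near the exhibit's median:
print(f"{leverage(60e6, 11_000):.0f}:1")   # ~23:1
```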

The loop

Energy is the bottleneck on AI's next wave.

AI compute demand needs chips. Chips need data centers. Data centers need power. Power needs permitting: scenario analysis, regulatory filings, engineering studies, contract negotiation. Twenty workflows stand between a signed lease and a spinning turbine.

A major near-term bottleneck on the AI revolution is the interconnection, permitting, and contracting stack — much of it document-heavy. As of end-2024, nearly 2,300 GW sat in U.S. interconnection queues [Measured: LBNL "Queued Up" 2025 Edition, Characteristics of Power Plants Seeking Transmission Interconnection as of the End of 2024.]. Only ~13% of capacity entering queues from 2000–2019 had reached commercial operation by end-2024 [Measured: LBNL "Queued Up" 2025 Edition; 19% of projects by count.]. An estimated 60–70% of queue time is document work [Modeled: Sunya decomposition of FERC Order No. 2023 interconnection timelines; not observed from project data. See steelman section.] — a Sunya modeled decomposition, not an observed measurement — that frontier models can already accelerate. NVIDIA uses AI to design chips. Google uses AI to train AI. Anthropic just signed a multi-year partnership with the U.S. Department of Energy and committed $50B to U.S. computing infrastructure [Reported: Anthropic, "Anthropic invests $50 billion in American AI infrastructure," Nov 12, 2025; includes data centers in Texas and New York via Fluidstack.] — suggesting the frontier labs view physical infrastructure and energy access as a binding constraint on their growth. Energy — the sector that determines how much AI the world gets — is still assembling rate-case exhibits in Word.

The economic loop is visible but undeveloped: faster permitting expands compute supply, which accelerates AI deployment.

Accelerate permitting by one quarter per project. Multiply across 2,300 GW of queued capacity. Freed-up TWh become data center build-outs. Data centers become AI compute. The flywheel spins. But only if someone closes the loop at the energy layer. OpenAI's latest round names compute as the strategic advantage. Compute needs power. Power needs permitting. Permitting is document work. The most advanced technology on earth is waiting on a filing cabinet.

Energy companies sit on some of the most honest signal in any industry. Every transaction, every filing, every interconnection study, every redet cycle — these aren't survey responses or ad clicks. They're facts about how capital moves through the physical world. The quality of any intelligence layer is only as good as the signal feeding it. Energy has the signal. It just doesn't have the model yet.

A dollar spent on AI for energy workflows doesn't just improve energy decisions. It expands the compute supply AI runs on. Energy isn't another vertical. It's the physical constraint that sizes every other AI market — including one where two frontier labs already claim a combined $39B+ run rate, company-reported, not audited.

What could break this

Faster analysis doesn't guarantee faster approvals. Lender committees, regulator calendars, and board decisions operate on their own timelines. The 60–70% document-work estimate is a modeled decomposition, not observed from project-level data. Physical bottlenecks — site execution, procurement, interconnection queue position — remain binding regardless of analytical speed. The loop is real, but the compression rate depends on which phase is actually the bottleneck for a given project.

See it happen

Watch AI work a rate-case request.

Illustrative mockup — not a benchmarked model run. A real PUC data request — the kind that takes a rate case analyst three days. Walkthrough shows plausible sequence, outputs, and speed. No prompt, model, corpus, or evaluation trace is published. Treat it as a storyboard, not evidence.

Docket No. 2024-00187-EL
Staff Data Request Set 3, Item 14
PUC-TX

Request: Provide the Company's actual and projected plant additions, retirements, and transfers for each functional category for the test year and each of the five preceding calendar years. Include explanations for any year-over-year variance exceeding 10%.

Supplemental: For each variance explanation, identify the specific capital project(s), their FERC account classification(s), the date placed in service, and whether the investment was included in the Company's most recent depreciation study filed in Docket No. 2021-00042-EL.

Format: Provide in Excel format with supporting workpapers. Cross-reference to the Company's response to Staff DR Set 1, Item 7 (rate base roll-forward) and OPC DR Set 2, Item 22 (depreciation schedules).

Response deadline: 10 business days from date of service. Objections due within 5 business days.

AI Analysis
Processing document...
Structured output · 4.2 seconds
What the analyst would do next
Your company

Pick one workflow. See what you've been leaving on the table.

This is a scenario model, not a forecast [Methodology: Decision surplus = (capex × type base multiplier × scenario gap) + (FTE × 40% compression × $100K avg cost). Type base multipliers: E&P 3.0%, Utility 2.0%, PE 2.5%, Inv bank 4.0%, Midstream 2.0%, LNG 3.0%, Developer 2.5%, Major 1.5%. Scenario gap = max(0, (30 − current scenarios) ÷ 90), which normalizes the distance to a 30-scenario target. Annual model cost = type token base × (FTE/40) × (max(scenarios, 4)/4). All values overridable. The point is the structure, not the precision.]. It estimates the gap between the analysis you run and the analysis you could run. The assumptions are visible — override anything that doesn't fit your world.

Inputs: $5.0B capex · 4 scenarios · 40 FTEs
Estimated annual decision surplus: $45M · Scenario illustration, not forecast · Assumptions below
Prep-stack cost savings: $1.6M
Decision quality improvement: $43M
Estimated time to first value: 6 mo
Implied annual model spend: $240K
Decision value / model cost: 187x
Assumptions visible: Decision value = capex × type multiplier × scenario gap (target 30 scenarios). Prep savings = FTEs × 40% compression × $100K avg cost. Model spend = type token base × FTE scaling × scenario scaling. These are scenario estimates — not audited, not forecast. Override with your own numbers.
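For readers who want to rebuild the default case, here is a minimal Python sketch of the surplus model as described in the methodology note above. The 3.0% multiplier is the E&P default; the $240K token base for a 40-FTE, 4-scenario profile is inferred from the displayed outputs and should be treated as an assumption.

def decision_surplus(capex, scenarios, ftes, type_mult=0.03, token_base=240_000):
    # Scenario gap: normalized distance to the 30-scenario target, floored at zero
    gap = max(0.0, (30 - scenarios) / 90)
    decision_value = capex * type_mult * gap       # capital-allocation improvement
    prep_savings = ftes * 0.40 * 100_000           # 40% compression x $100K avg cost
    model_spend = token_base * (ftes / 40) * (max(scenarios, 4) / 4)
    return decision_value + prep_savings, model_spend

surplus, spend = decision_surplus(5.0e9, 4, 40)    # the default E&P profile
# surplus ~= $44.9M (displayed as $45M); spend = $240K; leverage ~= 187x

Swapping in the other type multipliers from the note (1.5% for majors up to 4.0% for investment banks) reproduces the other company profiles.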
Where you stand

Most companies are earlier than the board thinks.

Most energy companies sit at Level 1 or 2. They know it. The board doesn't. Dot distribution below is a Sunya estimate, not a survey — if it's wrong, it flatters. The company that reaches Level 5 first gets a compounding advantage that widens every quarter.

Governance

The question is no longer whether to use AI. It is where to trust it.

Most energy companies apply Tier 4 — full human oversight — to everything. Payroll gets the same governance as reserve analysis. A more useful approach: match the governance tier to the actual risk profile of each workflow. The general counsel who figures out which tier each workflow belongs in will be the most important person in the building. The question they'll have to answer — who's liable when AI contributes to a safety-critical decision — doesn't have a settled answer yet. That's a feature, not a bug: it means the governance framework is worth building now, before the regulators build it for you.

The head of IT at a large IOU told us: "We have one governance policy for everything — the same approval chain for generating a board slide deck and for running a load-flow study. So nothing gets approved because the safety people won't sign off on the slide deck tool, and the business people won't wait eighteen months to find out." One tier for everything means nothing ships.
Pattern observed across utility IT and innovation teams
Tier 1
Autonomous
Payroll · Trade confirms · AP / AR · Scheduling
Model tier (as of Mar 2026): Haiku 4.5 / GPT-5.4 Mini · ~$1–3/M tokens · High volume, low stakes
Tier 2
Draft & review
Lender packages · CIMs · Board packs · Contract summaries
Model tier (as of Mar 2026): Sonnet 4.6 / GPT-5.4 · ~$3–9/M tokens · Human reviews every output
Tier 3
Assist & decide
Reserve analysis · Scenario modeling · Portfolio optimization · IRP planning
Model tier (as of Mar 2026): Opus 4.6 / GPT-5.4 Pro · ~$15–30/M tokens · Frontier reasoning, low volume
Tier 4
Human only
Safety-critical ops · Legal opinions · Auditor attestation · Regulatory testimony
No model. Human judgment, human liability, human signature.

Model tier mapping is directional, not prescriptive. Current frontier pricing (March 2026): Opus 4.6 at $5/$25 per million tokens, Sonnet 4.6 at $3/$15, Haiku 4.5 at $1/$5. OpenAI GPT-5.4 at $2.50/$15, Mini at ~$0.75/$4.50, Pro at $30/$180. Prompt caching and batch APIs can reduce effective cost by up to 95%. The $8/M blended rate used in Exhibit 9 assumes a weighted mix across tiers — most token volume flows through Tier 1–2 (cheap), most decision value flows through Tier 3 (expensive). The governance tier determines the model tier. Get that mapping right and the unit economics follow.
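One way the ~$8/M blend can arise, as a sanity check: weight per-tier effective prices by volume shares. Both the shares and the effective prices below are illustrative assumptions, not published figures.

tiers = {                                   # (assumed volume share, assumed $/M tokens)
    "Tier 1 - Haiku 4.5 / Mini": (0.50, 3.0),
    "Tier 2 - Sonnet 4.6 / GPT": (0.35, 9.0),
    "Tier 3 - Opus 4.6 / Pro":   (0.15, 22.5),
}
blended = sum(share * price for share, price in tiers.values())
print(f"${blended:.2f}/M")                  # ~$8.03/M, in line with the Exhibit 9 blend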

Everything above: what to deploy now
Everything below: where the industry goes
Implementation

Most energy AI projects die in the same place.

Gartner reported that at least 30% of generative AI projects would be abandoned after proof of concept by end-2025MeasuredGartner Press Release, July 29, 2024. Revised to 50% in Gartner article "Why 50% of GenAI Projects Fail," Jan 2026., later revising to at least 50% overall — and that's cross-industry, not energy-specific. In energy, the pattern is consistent: the pilot works, the team gets excited, then data integration hits and the whole thing stalls. The exec who championed it gets a new role. The team quietly shelves it and no one brings it up at the next offsite.

If you've been through a failed "digital transformation" — and most energy executives have — here's why this is different. The last wave tried to change the workflow. New platform, new data architecture, new operating model. Eighteen months of IT integration before anyone saw a result. This doesn't change the workflow. The borrowing-base cycle still exists. The treasury team still runs it. The lender still asks the same fifteen questions. The only thing that changes is that the assembly takes two days instead of three weeks. Nobody adopts a new system. They review AI output instead of building from scratch. Correction, not creation.

That's the structural difference: digital transformation was a platform play — replace the system, retrain the team, migrate the data, hope it works. This is a workflow play — same system, same team, same data, different starting point. Instead of a blank screen, you start with a draft. The analyst who spent three weeks building the packet now spends two days reviewing it. The skills don't change. The starting point does.

The companies that survive this still have to do three things the others don't.

Survival tactic 01

Start with the workflow the team already hates. Borrowing-base assembly, not reservoir analysis. Nobody will fight to keep a hated process. The treasury team that dreads the three-week redet cycle is your first adopter — they'll champion anything that ends it.

Survival tactic 02

Let the AI draft. Let the human edit and take credit. Measure time saved, not people replaced. The metric is "hours back" not "heads out." The analyst who used to spend three weeks building the borrowing-base packet now spends two days reviewing the AI's draft and a week running scenarios the team never had time for.

Survival tactic 03

Send drafts, not requests. Instead of waiting for the reservoir engineer's spreadsheet, send them an AI-drafted variance table built from production data. They'll spend twenty minutes correcting it instead of three days building it. Correction is faster than creation. Use that everywhere.

Scenario · The number not widely reported
$1.0T
in capital reallocation over a decade
Range: $330B (1% improvement) to $1.65T (5%) — we use 3% as the base case

The math is not complicated. Global energy investment reached $3.3 trillion in 2025MeasuredIEA World Energy Investment 2025. Includes clean energy, fossil fuels, and grid infrastructure.. A 3% improvement in capital allocationScenarioSunya base case assumption. Range: 1% ($330B/decade) to 5% ($1.65T/decade). Not a forecast. — the difference between 4 scenarios and 40 — redirects ~$100 billion per year to higher-return projects. Over a decade: ~$1 trillion. Not saved. Reallocated. From worse wells to better ones, from overbuilt projects to ones that should have been built instead. Every year you run 4 scenarios instead of 40, you're not losing that money to a competitor. You're losing it to a version of yourself that bothered to do the analysis.
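The arithmetic across the published range, in a few lines of Python:

annual_investment = 3.3e12                       # IEA 2025 global energy investment
for improvement in (0.01, 0.03, 0.05):           # downside / base / upside
    per_decade = annual_investment * improvement * 10
    print(f"{improvement:.0%}: ${per_decade / 1e12:.2f}T per decade")
# 1%: $0.33T · 3%: $0.99T (the ~$1T base case) · 5%: $1.65T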

The native-AI energy company · Speculation

Now imagine the company built after reasoning got cheap.

Snowflake didn't put Oracle in the cloud. Uber didn't put a taxi dispatcher on a phone. The native-AI energy company doesn't make the filing cabinet faster. It doesn't have a filing cabinet. It wouldn't need one.

01 · The decision clock

Development plans become continuous, not annual.

The dev plan is annual because it takes six months to build. The borrowing base is semi-annual because each cycle takes three weeks. When reasoning is cheap, these become continuous. New KPIs follow: not "how many wells did we drill?" but "how many scenarios did we run before choosing?"

02 · Information asymmetry flattens

The buyer assembles its own reserve estimate before entering the data room.

The seller traditionally knew the asset better than the buyer. AI flattens that. A buyer can now assemble its own reserve estimate from public data. Your information moat gets thinner. Same logic applies to lenders, regulators, and counterparties.

03 · Speed risk

Fast analysis without quality architecture is dangerous.

When the analysis cycle was three weeks, errors got caught in review. At three hours, one bad assumption propagates through 40 scenarios before anyone looks. The companies that skip quality architecture will move fast and break expensive things. Governance is the constraint, not speed.

What remains human-controlled: Field safety. Regulatory judgment. Relationship capital. Physical execution. Liability. The native-AI company removes humans from the assembly line, not from the decisions. Each carries external accountability or physical consequence that no model absorbs. The deeper question: the prep stack exists because humans need information in narrative form. A decision system that operates directly on the data needs no translation layer — no memo, no package, no slide deck. The bottleneck was never the decision. It was the preparation. That distinction reshapes where the ROI model points.

The question underneath everything

What does your company understand that is genuinely hard to understand — and is that understanding getting deeper every day?

If the answer is nothing, AI is cost optimization. Cut headcount, improve margins for a few quarters, get absorbed. If the answer is deep — interconnection queue dynamics, reservoir behavior across basins, regulatory filing patterns, lender decision logic — then AI doesn't just augment your company. It reveals what your company actually is.

This benchmark is either a one-time report or the seed of a world model for energy decision-making. The 404 positions are capabilities. The compressibility scores are a first-pass intelligence layer. The deal flow, the filing history, the cycle data — that's honest signal waiting for a model. The question is whether anyone builds it.

Exhibit · The power bottleneck

2,300 GW are waiting in line. Most will never make it through.

As of end-2024, the U.S. interconnection queue holds more capacity than the entire installed fleet (LBNL Queued Up 2025MeasuredLBNL "Queued Up" 2025 Edition. Lawrence Berkeley National Laboratory, April 2025.). The median project waits 4+ yearsMeasuredLBNL "Queued Up" 2025. Median time from queue entry to commercial operation for completed projects.. A major share of that wait is document-heavy — interconnection studies, environmental review, permitting, contract negotiation. That's the bottleneck AI was built to compress.

Data: LBNL Queued Up 2025 Edition. Completion rate (13% of capacity) from projects entering queues 2000–2019 as of end-2024. Funnel widths are approximate. Document share is a Sunya estimate — modeled, not observed.

A 1 GW AI data center throws off roughly $1B a month. Every month it sits in queue, that revenue waits with it. The paperwork is the holdup. And the paperwork is exactly what AI compresses fastest.

$10–12B
Annual revenue per GW of AI data center

A 1 GW AI data center throws off $10–12B a yearReported estimateSemiAnalysis, "How AI Labs Are Solving the Power Crisis: The Onsite Gas Deep Dive," Dec 2025. Proprietary estimate, not independently measured.. Every month it sits in queue, that's roughly $1B waiting. Compress the paperwork and the project comes online a year earlier. Multiply that across 100+ GW of planned capacity and you're looking at tens of billions in revenue sitting in a queue, waiting on documents.

Revenue per GW: SemiAnalysis estimate

2 paths
Grid interconnection and onsite generation

The hyperscalers aren't waiting for the grid. Oracle/OCI collaborated with VoltaGrid for 2.3 GW of onsite generation in Texas. xAI reportedly ordered 1.9 GW from Doosan (per industry press — not confirmed by primary source). Both grid-connected and behind-the-meter paths have document bottlenecks — interconnection studies on one side, air quality permits and gas supply contracts on the other. Different paperwork, same compression opportunity.

3–5 yr
NRC licensing for new reactor designs

Constellation's 20-year PPA with Microsoft for the Crane Clean Energy Center restartMeasuredConstellation Energy press release, September 2024. Three Mile Island Unit 1 restart for Microsoft data center power. (announced Sept 2024). Google's agreement with Kairos Power for SMRsMeasuredGoogle/Kairos Power announcement, October 2024. Agreement for small modular reactor deployment. (announced Oct 2024). Amazon acquiring nuclear-adjacent capacity across multiple sites. NRC licensing is 30–40% document-intensive — safety analysis reports, environmental review, inspection documentation. Compress that by 12 months and you've accelerated the nuclear buildout that every hyperscaler is betting on for baseload.

What the next version models

What hyperscaler power teams actually need to know.

State-level breakdown

In 2023, data centers consumed about 26% of Virginia's electricity supplyMeasuredEPRI data cited by Pew Research Center, 2024. Virginia-specific estimate.. Texas fields tens of GW of new interconnection requests every month. The queue, the regulatory stack, and the power mix are completely different in each market.

Efficiency crossover

Blackwell is ~4x more efficient per token than HopperMeasuredNVIDIA Technical Blog, "Inside Blackwell Ultra," 2025. 4.2x on Llama 3.1 405B; up to 35x with specific optimizations.. If chip efficiency improves 4x every 2 years but demand grows 2x, when does efficiency stop outrunning demand? That crossover point is the most important number in the industry.

Bottleneck migration

2023: CoWoS packaging. 2024–25: data center power. 2026+: semiconductor fabs. Bottlenecks move. The analysis needs to show the sequence, not just the current constraint.

Turbine supply chain

If every hyperscaler goes onsite gas, who makes the turbines? GE Vernova, Siemens, Doosan, Wärtsilä, Bloom, Caterpillar. The turbine order book is the leading indicator of data center capacity.

If superintelligence is around the corner

Energy isn't a sector AI impacts. It's the sector that determines how much AI the world gets.

This report covers wave one — the filing cabinet, the prep stack, the back office. But if AGI is close, the binding constraint shifts fast. Away from analyst-hours and toward megawatts, interconnection time, and political permission. The question flips from "which function compresses?" to "which region delivers power fastest?" Here are the four waves, from now to the horizon.

Wave 1
Workflow compression

Filings, packages, contracts, reports. What this page scores. Deployable now.

Wave 2
Continuous capital allocation

Dev plans refresh weekly. Every asset ranked against every alternative. The ~$1T reallocation prize.

Wave 3
Energy-system control

AI as the operating system for the grid. The IEA (2025) estimates AI could unlock up to 175 GW of additional transmission capacity from existing lines and save up to $110B/yr in the electricity sectorScenarioIEA Energy and AI report, 2025. Estimates for grid optimization potential if AI is widely deployed. if widely adopted.

Wave 4
Physics

Materials science, fusion, novel reactor design, storage chemistry. What changes when intelligence is no longer the bottleneck — and the physical world still is.

Global data center electricity demand

How much power will AI actually need?

A data center goes up in two to three years. The power plant it needs takes longer. The transmission line takes longer still. The permit to build the transmission line takes longest of all. Energy becomes a speed problem before it becomes a cost problem.

IEA published scenarios: base = 945 TWh by 2030, lift-off = 1,700 TWh by 2035. "Base ext." extrapolates the IEA 2030 base case to ~1,200 TWh by 2035 using internal trend analysis — not an IEA-published 2035 figure. "Headwinds" is a Sunya downside scenario. Year-by-year values are interpolations, not exact data points. US share from Berkeley Lab 2025: 325–580 TWh by 2028 (7–12% of national electricity).

Compresses

Analysts, packagers, coordinators, memo-builders. The assembly layer inside every role.

Amplifies

Approvers, negotiators, operators, field decision-makers. As analysis gets cheap, judgment gets expensive.

Emerges

35 new roles. Builders, bridgers, orchestrators. See Exhibit 10 ↓

You can automate the training ground out of existence. If entry-level analytical work disappears, where do future operators, traders, engineers, and executives learn judgment?

In nuclear roles, 1.7 workers nearing retirement for every young worker enteringReportedIEA World Energy Employment 2025 report. Ratio is 1.7× for nuclear, 1.4× for grid-related roles. Reported by IEA; not independently measured by Sunya.. For grid roles: 1.4× (IEA 2025).

Exhibit 10 · Modeled · New — 35 emerging roles

35 roles that didn't exist two years ago. The talent war is already on.

Previous exhibits map where existing value compresses. This one maps where new value gets created, and who captures it. Big Tech energy hiring jumped 34% year-over-yearReportedCNBC, Jan 2026. Amazon: 605 energy hires. Microsoft: 570+ since 2022. Google: 340+ since 2022. Energy-related hiring remained 30% above pre-AI (2022) levels.. Amazon alone added 605 energy hires. AI-energy hybrid roles stay open 90+ days with 30–40% salary premiumsReportedMSH / Talent market surveys, 2025–2026. Roles requiring both energy domain expertise and AI skills. Premium measured vs. standard market rates for equivalent seniority.. And AI-skilled workers in energy are 40% less concentrated than in tech or financeMeasuredPwC AI Jobs Barometer 2025, LinkedIn data. Concentration of AI-skilled workers in utilities, oil & gas, and mining vs. education, technology, finance, and media sectors.. The talent gap is structural.

Three origin categories. Two scoring axes. The roles that matter sit where leverage is highest and talent is scarcest.

X = leverage (decision value flowing through the role). Y = scarcity (time-to-fill, salary premium, talent pool size). Bubble size = estimated industry-wide headcount demand. Color: Builders (AI-for-Energy) · Bridgers (Energy-for-AI & AI-lab GTM) · Orchestrators (hybrid mutations of existing roles). Upper-right quadrant = the talent war zone: high leverage, high scarcity, where every PE firm, utility, and hyperscaler is competing for the same 200 people. All scores are modeled estimates — the clustering pattern is the claim, not individual precision.

Talent war zone · Upper-right quadrant

Seven roles cluster where leverage exceeds 7 and scarcity exceeds 7: Data Center Energy Lead, Energy Strategy Director (AI Lab), Power Procurement Specialist, Interconnection Queue Manager, AI Safety Engineer (Grid), Enterprise Sales — Energy (AI Lab), and AI-Augmented Reservoir Engineer. Microsoft poached GE's former CFO to run energy strategy. Google hired Duke's Tyler Norris for energy market innovation. Amazon staffed 605 energy positions in a single year. They're not hiring AI people. They're hiring energy people who understand AI. There are maybe 200 of them on earth.

What could break this

These roles assume sustained AI investment and continued energy-AI convergence. If frontier model costs collapse faster than expected, some builder roles commoditize. If hyperscaler power strategies shift to long-term utility contracts (removing the custom procurement layer), bridger demand softens. If AI tools become genuinely autonomous, orchestrator roles shrink rather than grow. The scarcity scores also assume current training pipelines — if universities and bootcamps spin up energy-AI programs at scale, premiums compress within 3–5 years. These are 2026 scores, not permanent conditions.

This report is wave one. The binding constraints shift from analyst-hours to megawatts, interconnection time, and political permission. The beachheads work whether the destination is 3x better or 300x different. Here's what would prove us wrong.

What would prove this wrong

We ran the strongest case against our own thesis. Here's what survived.

Most objections to this analysis make one of four moves. Naming them is not a dismissal — it's an invitation to check whether your own objection clears the bar.

Question drift
We claim "these tasks compress." The objection answers "AI won't replace people." Different question. We agree — the roles persist, the prep work inside them shrinks.
Perfect-proxy fallacy
Compressibility scores are directional proxies, not deployment blueprints. Attacking the proxy for not being a perfect predictor misses the purpose — it ranks where to look first, not what to buy.
Autonomy strawman
We never argue for autonomous AI in critical workflows. The thesis is augmented prep: AI drafts, humans verify and decide. Attacking "unsupervised AI" attacks a claim we don't make.
Edge-case universalizing
Finding one workflow where AI fails and generalizing it to the whole stack. Some workflows won't compress. That's already in the data — the constraint chips section maps exactly where.

The single strongest objection that survives all four filters: compressibility as a single axis is too blunt. A high score tells you the task is compressible — it doesn't tell you whether the organization can adopt it, whether the data exists to train on, or whether the regulatory environment allows it.

We agree. That's why 304 of 404 positions carry three scores, not one: compressibility (can the prep work shrink?), criticality (what breaks if AI gets it wrong?), and reasoning demand (does this need frontier-class models or commodity inference?). If you only look at compressibility, you'll automate the wrong things. The three-axis view is how you avoid that.

What we know, what we believe, and what we're guessing

The scores are hypotheses. Scored by one analyst (Raj Mistry) whose background spans energy treasury, regulatory filings, and deal structuring — the same workflows being scored. No inter-rater reliability, no blinded scoring, no adjudication log. Employment counts are formulaic (layer base × tier multiplier), not survey-sourced. Compensation uses six layer-level proxies, not role-specific wages. The directional layer pattern is the claim; individual role numbers are structural estimates. Multi-rater validation is a v33 requirement. CONFIDENCE: medium. Layer rankings are high confidence; individual scores are low.

The ~$1T number is a scenario, not a forecast. $3.3T annual energy investment × 3% decision-quality improvement × 10 years. At 1%: ~$330B. At 5%: ~$1.65T. The 3% approximates the difference between running 4 scenarios and 40 on a typical development program. The entire gap between "nice efficiency play" and "industry transformation" rests on whether cheaper analysis leads to more analysis (Jevons) or just faster versions of the same 4 scenarios. We believe Jevons is directionally true. The magnitude is speculative. CONFIDENCE: low. The range ($330B–$1.65T) is the honest answer.

60–70% of queue time as document work is a modeled estimate. Derived from FERC Order 2023 procedure timelines — system impact study, facilities study, environmental review, permitting, PPA negotiation — vs. physical construction. A developer or interconnection consultant could confirm or refute it with project-level data. We haven't found one willing to share yet. CONFIDENCE: medium. Procedure-derived, not project-validated.

The native-AI energy company is a hypothesis, not a finding. We've identified no venture-backed company that has demonstrated a fully AI-native operating model in energy workflows as of March 2026. The analogies (Snowflake, Uber) are retrospective — we know they worked. Whether the pattern transfers to energy is an open question. CONFIDENCE: speculative.

Known gaps. No competitive landscape mapping (Microsoft Copilot, Palantir, C3.ai, and dozens of startups target overlapping workflows). US-centric regulatory framing (FERC, PUC, NRC, RRC) — international frameworks deserve equal depth. Uneven coverage depth: upstream E&P and utilities are granular, refining and petrochemicals less so, the trading floor alone could be 30+ workflows. And the IOCs are already spending hundreds of millions on AI — this research is most relevant to the companies that haven't started. CONFIDENCE: N/A. Scope disclosures.

Speculation · No precedent exists
A venture-backed, AI-native energy operator is plausible — built not to sell tools to energy companies, but to be one. Capital allocation powered by continuous scenario analysis instead of quarterly human assembly. No venture-backed company has demonstrated this yet. But cheap reasoning, expensive expertise, and recurring decisions are three ingredients that have built category-defining companies in every other industry they've appeared in.
This report is an illustration of its own thesis.

If you're reading this and thinking "my company should be producing this kind of analysis internally" — that thought is the thesis. You're living in it.

~$75
Model cost to produce this page MeasuredAggregate API billing across Claude Opus 4.6 and ChatGPT Pro 5.4. Token logs and model-mix breakdown to be published with v33.
~$450K
Equivalent research team cost ModeledBenchmark: 3 sector analysts + 1 data scientist + 1 designer + 1 editor × 6 months at market rates. Illustrative, not audited.
1
Analyst with domain expertise AttributionRaj Mistry — energy operations, financial structuring, applied AI. Scoring required direct experience with the workflows being scored: borrowing-base packets, rate-case filings, title chains, deal structuring, and regulatory testimony.
90–900×
Model cost leverage Derivation$450K equivalent team cost ÷ $500–$5,000 total model spend (including iteration, discards, and research sessions). The $75 production-session figure yields 6,000× but excludes iteration — the honest range using full spend is 90–900×.

373 roles, 24 workflows, 7 artifacts. 10 interactive exhibits. 25 linked supporting documents (32 in the full bundle). Built with Claude Opus 4.6 and ChatGPT Pro 5.4 — millions of tokens of research, drafting, and iteration. Directed, scored, and verified by Raj Mistry. The models assembled; Raj made the calls. A six-person team working six months would have produced comparable breadth at far higher cost. That's the thesis illustrated.

Methodology note: the $75 figure reflects direct API billing for the final production sessions only (Claude Opus 4.6 and ChatGPT Pro 5.4); earlier research, iteration, and discarded drafts are not included. Total model spend including iteration is likely $500–$5,000, giving the 90–900× range shown above. Labor comparator is a benchmark estimate for a comparable industry research team at market rates — not a quote. Neither figure is audited. Token logs and model-mix breakdown will be published with v33.

We didn't set out to prove the thesis while writing about it. But one analyst directing two frontier models just produced what would have taken six people six months. The models did the prep work. The analyst made every judgment call. That's the thesis — illustrated, not argued.

v33 adds what v32 can't: real before/after data from actual energy deployments.
First observed workflow traces, measured cycle-time compression, and error rates from live pilots. One email when it ships.

What would this save your team?

Pick your workflow, tell us about your team, and see a modeled compression estimate. This uses the same scoring methodology behind the 404-position benchmark.

Interactive calculator inputs: 1–50 · 1–60 days · $75K–$300K
Evidence

Evidence room — full dataset, claims ledger, and scoring method.

Full dataset, claims ledger, composite traces, scoring method. Every major claim tagged by confidence level. If you think we're wrong, the data is right here.

A note on the data. 60 of 404 positions have BLS-sourced employment counts. The remaining 344 use a formulaic estimate: layer base × tier multiplier (three bands: 0.75×, 1.0×, 1.25×). These are structural scaffolding for directional wage-pool analysis, not census-quality headcounts. Compensation uses six layer-level proxies ($58K–$180K), not role-specific wages. The directional finding — that roughly nine of every ten AI-exposed wage dollars sit above the field — holds across a wide range of employment assumptions. Individual role numbers should not be cited as precise. Full methodology: methodology FAQ.

How to reproduce every number in this benchmark

The core benchmark — scores, employment, compensation, and the ~90% headline — can be reconstructed from the published CSV and the three formulas below. Some interactive exhibits (role explorer, workflow economics, demand scenarios) use additional embedded data structures visible in the page source.

Formula 1 — Exposure-weighted wage bill

EWB = estimated_employment × layer_median_comp_proxy × compressibility_score / 10

This is the single metric that drives ranking, the slope chart, the heatmap area, and the ~90% headline. The core exhibits trace back to this formula applied row-by-row across the 404-position CSV.
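A minimal sketch of that computation over the published CSV. The column names (estimated_employment, layer, compressibility_score) and layer labels are assumptions; check the CSV header before running.

import csv

COMP_PROXY = {"Physical": 58_000, "Technical": 95_000, "Corporate": 72_000,
              "Advisory": 105_000, "Capital Mkts": 130_000, "Governance": 180_000}

def above_field_share(csv_path):
    total = above = 0.0
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            ewb = (float(row["estimated_employment"])         # Formula 1, row by row
                   * COMP_PROXY[row["layer"]]
                   * float(row["compressibility_score"]) / 10)
            total += ewb
            if row["layer"] != "Physical":                    # "above the field"
                above += ewb
    return above / total                                      # baseline: ~0.9025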

Formula 2 — Employment estimation (344 formulaic rows)

employment = layer_base × tier_multiplier

Where:

layer_base: Physical 22,000 · Technical 8,000 · Corporate 10,000 · Advisory 7,000 · Capital mkts 5,000 · Governance 3,000
tier_multiplier: Three explicit bands: Low (0.75×), Mid (1.0×), High (1.25×). Each role is assigned to a tier by a stable hash of its name: h=0; for each char c: h=((h<<5)-h+charCode(c))|0; tier=|h|%3. Tier 0→0.75×, 1→1.0×, 2→1.25×. This produces three discrete employment levels per layer rather than pseudo-random integers that would imply false precision. The tier assignment is deterministic and reproducible from role name alone.

Methodology note (v32.1): Employment estimation is decoupled from compressibility scoring. An earlier version used max(0.15, 1.3 − score/8) as a score factor, which gave low-scoring roles more people by construction. The continuous jitter 0.7 + (hash mod 600)/1000 was also replaced with three explicit tier bands to eliminate false precision. All employment figures for formulaic rows are displayed with rounding and a "~" prefix to signal estimation uncertainty.

The 60 anchor rows override this formula entirely — their employment figures come from BLS Occupational Employment and Wage Statistics or industry-derived analogs, flagged in the employment_source column. Note: two of these 60 rows are not people-roles (Board pack production, A&D diligence support) — they are workflow/artifact rows with industry-estimated volume, not BLS occupational matches.
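For reproducibility, the tier hash and Formula 2 translate to Python as below. The masking emulates JavaScript's signed 32-bit |0 truncation, so tier assignment depends only on the role name; the 60 anchor rows bypass this function entirely.

LAYER_BASE = {"Physical": 22_000, "Technical": 8_000, "Corporate": 10_000,
              "Advisory": 7_000, "Capital mkts": 5_000, "Governance": 3_000}
TIER_MULT = (0.75, 1.0, 1.25)                     # tiers 0 / 1 / 2

def tier(role_name):
    h = 0
    for c in role_name:
        h = ((h << 5) - h + ord(c)) & 0xFFFFFFFF  # 32-bit overflow, as in JS
        if h >= 0x80000000:
            h -= 0x100000000                      # reinterpret as signed (the |0)
    return abs(h) % 3

def estimated_employment(role_name, layer):
    return LAYER_BASE[layer] * TIER_MULT[tier(role_name)]    # 344 formulaic rows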

Formula 3 — Compensation proxies

Physical: $58K
Technical: $95K
Corporate: $72K
Advisory: $105K
Capital Mkts: $130K
Governance: $180K

These are BLS-derived median fully loaded annual compensation for the layer's representative occupations, not role-specific wages. Every row in the same layer uses the same proxy.

Compressibility score — what it measures

Each position's compressibility score (1–10) represents how plausibly current AI can compress the preparation stack around that role or workflow. It is built from five sub-dimensions:

Task exposure: what fraction of the role's tasks involve document-heavy, pattern-matching, or analytical work that current models handle well.
Adoption feasibility: how quickly the workflow could realistically adopt AI given regulatory, organizational, and technical constraints.
Economic compression: whether compressing this workflow changes the cost structure enough for a budget owner to notice.
Workflow standardization: how repeatable and templated the work is across cycles and counterparties.
Template density: how much of the output follows a known structure (forms, packets, exhibits, standard reports).

The score is a calibrated composite, not a simple average. Roles with high task exposure but low adoption feasibility (e.g., safety-critical field roles) score lower than roles where both align (e.g., document assembly and reconciliation). The full scoring rationale is published per-row in the CSV's rationale column and in the scoring method note.

Augmented dataset

404 positions (373 roles, 24 workflows, 7 artifacts) with compressibility scores. 60 anchor rows (BLS-sourced or industry-analog), 344 formulaic estimates. 304 have augmented axes (decision criticality, reasoning demand, company control). 30 anchor roles have 3 additional axes.

Open CSV →

Claims ledger

Every major claim tagged by type: modeled, scenario, hypothesis, cited, factual, measured boundary, self-critique, or method requirement.

Open ledger →

Scoring method

Transparent explanation of the new axes and the heuristics used to populate them.

Open method →

Proof room

Composite traces for treasury, non-op, utility, LNG, and EPC workflows.

Open proofs →

Token-demand model

Bottom-up Jevons scenario logic, per-operating-unit reasoning-demand lanes.

Open model →

Scenario table

Base, aggressive, and extreme annual token scenarios across six workflow families.

Open scenarios →

Product storyboards

Workflow-bound product narratives instead of generic AI feature lists.

Open storyboards →

Adoption map

Now, next, later, and never-alone deployment guidance by workflow family.

Open map →

Counterforces

Where the thesis bends, slows, or fails — and how to test each assumption.

Open analysis →

External sources

Validation table for every external data point: URL, date, exact/approximated status, and confirmation flags.

Open sources →

Headline metrics

Registry of key metrics with definitions, denominators, and derivation paths.

Open registry →

Adjacent industries

How energy compares to legal, financial services, healthcare, and other verticals on AI adoption readiness.

Open comparison →

Audience briefs

Tailored summaries for different reader types: investors, operators, advisors, regulators.

Open briefs →

Partnership map

Frontier lab positioning and partnership strategy for vertical AI in energy.

Open map →

Exhibit spec

Design and data specifications for the signature exhibits.

Open spec →

v33 roadmap

Prioritized upgrade plan synthesizing audit findings and reproducibility gaps.

Open roadmap →

Who pays first

Buyer analysis: which energy companies pay first, how to sell it, and unit economics.

Open analysis →

Sensitivity analysis

Monte Carlo stress test: the 90% above-field finding survives ±1 and ±2 point scoring errors across all 404 rows. 1,000 iterations.

Open analysis →

Scoring rubric

Inter-rater validation kit: calibration anchors, 10-role test set, scoring sheet template, and agreement metrics for external raters.

Open rubric →

Distribution package

Email templates, 20-target outreach list, UTM parameters, and timing guidance for launch.

Open package →

How this was built

Transparent account of the AI-assisted production process: tools used, human judgment applied, and where the model helped vs. where it didn't.

Open process →

v33 patch log

Synthesized action plan from five independent audits (Thiel, Altman, Musk, Amodei, Error). All fixes applied, priorities documented.

Open log →

Sensitivity executive summary

One-page summary: baseline 90.25%, Monte Carlo range 87.9%–92.4%, parametric stress 75.5%–96.1%. The finding holds.

Open summary →

DeepMind brief

Audience-specific brief for frontier reasoning labs: why energy decision loops are a natural eval environment for tool-using agents.

Open brief →

Red team audit

April 2026 comprehensive audit across numerical consistency, HTML code, claims sourcing, and docs. 0 breaking errors, 3 critical docs gaps (now fixed).

Open audit →

Every number on this page is published. Every assumption is attackable. If we're wrong, you'll know exactly where.

If we're right, you're running four scenarios right now. Your competitor is about to run forty. That's not a technology gap. It's a calendar problem — and the calendar is already moving.

If you run an energy operator

Name the five decision loops where your team still rations analysis. Start where the cycle recurs quarterly, the team dreads the assembly, and you own the prep — no regulator permission needed.

See the three beachheads

If you invest in or lend to energy

Ask your portfolio companies how many scenarios they ran before their last billion-dollar commitment. If the answer is four, ask why. The gap between 4 and 40 is where the next write-down hides.

Download the dataset

If you build AI tools for energy

The workflows in Exhibits 7–8 are ranked by recurring reasoning demand and decision value per token. That's the revenue signal. Start where the budget owner can feel the gain within one operating cycle.

Open claims ledger

Share the full story

A clean 4-slide deck: the core exhibits, takeaways, methodology, and the reflexive loop. Download and present to your team or board.

Four scenarios on a billion-dollar decision. Or forty. That's the only question this page asks.

Know someone who should see this?

Forward the research to a colleague. The link includes the full dataset, methodology, and proof trace.

Forward to your team →