AM I COOKED?
AI Job Displacement Audit

Methodology

Am I Cooked? is a research-backed tool that shows workers the real economic math behind AI job displacement. It combines peer-reviewed exposure scores, real-world AI usage data from Anthropic, BLS wage statistics, and O*NET task definitions to give you an honest, data-driven picture of where you stand. Every number is sourced, every formula is transparent, and every limitation is stated. The goal is not to alarm — it’s to inform.

⚠ How to interpret the numbers

The AI exposure percentage should not be read as money saved per employee or as a company's actual financial impact. It is a relative measure of how much of an occupation's work AI could plausibly touch, intended to make the scale of potential influence tangible to stakeholders without implying any direct monetary consequence.

Audit Pipeline

When you enter a job title, salary, and industry, the following pipeline executes — each step either looks up empirical data, runs deterministic math, or generates prose via an LLM.

01. Job Title: user input
02. O*NET Match: local CSV search (word-overlap + stemming)
03. β Lookup: Eloundou CSV (SOC code → β score)
04. Observed: Anthropic CSV (real-world AI usage)
05. BLS Wages: median, mean, percentiles, employment counts
06. Task Match: confidence-aware fuzzy matching
07. Gemini Grade: unmatched tasks, batch AI grading
08. Economics: augmentation vs. automation costs
09. Timeline: derived formula (β + observed)
10. Narrative: Gemini prose (cached per prompt)

Step types: empirical data lookup (green), deterministic math (amber), LLM generation, cached (red), system routing

Green steps use published, peer-reviewed data. Amber steps are deterministic formulas — same inputs always produce the same output. Red steps call Gemini, but results are cached — identical prompts return instantly without an API call.
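The routing described above can be sketched as a simple table of steps and their kinds. This is an illustrative labeling inferred from the legend, not the app's actual code; step names and the `StepKind` enum are assumptions.

```python
from enum import Enum

class StepKind(Enum):
    EMPIRICAL = "empirical data lookup"       # green: published data
    DETERMINISTIC = "deterministic math"      # amber: same input, same output
    LLM = "LLM generation (cached)"           # red: Gemini, cached per prompt

# One plausible assignment of pipeline steps to kinds, per the legend.
PIPELINE = [
    ("O*NET match",    StepKind.DETERMINISTIC),
    ("β lookup",       StepKind.EMPIRICAL),
    ("Observed usage", StepKind.EMPIRICAL),
    ("BLS wages",      StepKind.EMPIRICAL),
    ("Task match",     StepKind.DETERMINISTIC),
    ("Gemini grade",   StepKind.LLM),
    ("Economics",      StepKind.DETERMINISTIC),
    ("Timeline",       StepKind.DETERMINISTIC),
    ("Narrative",      StepKind.LLM),
]
```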

Exposure Score (β)

The core metric comes from Eloundou, Manning, Mishkin & Rock (2023), published in Science. They classified every O*NET task into three exposure levels:

Exposure Rubric

E0 (no exposure): using an LLM provides no meaningful reduction in the time to complete the task.
E1 (direct exposure): an LLM alone can cut the time to complete the task by at least half, at equal quality.
E2 (tool exposure): the same time savings require additional software built on top of the LLM.

The β Formula

β = (E1 + 0.5 × E2) / N_tasks

where E1 = number of directly exposed tasks, E2 = number of tool-exposed tasks (weighted 0.5), and N_tasks = total tasks for the occupation.

The 0.5 weight on E2 reflects that tool-mediated exposure requires additional investment to realize. β ranges from 0 (no exposure) to 1 (fully exposed). The US occupation average is approximately 0.30.

Key finding: approximately 80% of US workers belong to an occupation with at least 10% of tasks exposed, and 19% have occupations where more than half of tasks are exposed.
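The β formula is a one-liner. A minimal sketch (function and argument names are illustrative, not the app's actual code):

```python
def beta(e1_count: int, e2_count: int, n_tasks: int) -> float:
    """Eloundou et al. (2023) exposure score: tool-exposed (E2)
    tasks count half as much as directly exposed (E1) tasks."""
    if n_tasks == 0:
        return 0.0
    return (e1_count + 0.5 * e2_count) / n_tasks

# An occupation with 6 directly exposed tasks, 4 tool-exposed
# tasks, and 20 tasks in total:
beta(6, 4, 20)  # → 0.4, a bit above the ~0.30 US average
```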

Observed Exposure

From Massenkoff & McCrory (2026), Anthropic. They measure what fraction of an occupation's tasks AI is actually performing, not just what it could theoretically do.

Observed Exposure Formula

ρ_t = 𝟙{WorkUsage_t ≥ 100} × 𝟙{…_t ≥ 0.5} × α_t

where α_t upweights automated (vs. augmentative) tasks: full automation → α_t = 1, pure augmentation → α_t = 0.5.

R_occupation = Σ_t (w_t × ρ_t) / Σ_t w_t

where w_t = fraction of working time spent on task t.

Key finding: AI is far from reaching its theoretical capability. Computer & Math occupations have 94% theoretical exposure but only 33% observed adoption.
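The occupation-level aggregation is a time-weighted average of per-task scores. A sketch of just that step, taking the task scores ρ_t as given inputs (names are illustrative):

```python
def observed_exposure(task_scores, time_weights):
    """Time-weighted average of per-task AI usage scores:
    R = Σ(w_t × ρ_t) / Σ(w_t), with w_t the fraction of
    working time spent on task t."""
    total_w = sum(time_weights)
    if total_w == 0:
        return 0.0
    return sum(w * r for w, r in zip(time_weights, task_scores)) / total_w

# Two tasks: one fully automated (ρ = 1.0, 30% of time),
# one untouched by AI (ρ = 0.0, 70% of time):
observed_exposure([1.0, 0.0], [0.3, 0.7])  # → 0.3
```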

Task Penetration Matching

O*NET defines ~19,000 tasks; the Anthropic Economic Index measures AI usage per task. Because the two sources often word the same task differently, we use confidence-aware multi-stage matching:

  1. Exact text match → confidence: high
  2. Normalized match (lowercase, strip punctuation) → confidence: high
  3. Keyword Jaccard ≥ 0.35 (significant words overlap) → confidence: medium/high
  4. SequenceMatcher ≥ 0.60 (character-level similarity) → confidence: medium/high
  5. ≥ 4 shared significant keywords → confidence: medium

Tasks with no confident match (< medium confidence) are sent to Gemini for AI penetration estimation on a 0.00–0.90 scale. These appear in the report with a ✦ Gemini badge. Results are cached per-task and per-prompt to minimize API calls.
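The five stages can be sketched as a single fall-through function. This is an illustrative reconstruction under the thresholds stated above; the stopword list and helper names are assumptions, not the app's actual code.

```python
import re
from difflib import SequenceMatcher

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "with"}

def keywords(text: str) -> set:
    """Significant words: lowercase, no stopwords, length > 2."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS and len(w) > 2}

def match_confidence(onet_task: str, anthropic_task: str):
    """Return (matched, confidence) following the staged rules."""
    if onet_task == anthropic_task:
        return True, "high"                          # 1. exact text match
    norm = lambda s: re.sub(r"[^\w\s]", "", s.lower()).strip()
    if norm(onet_task) == norm(anthropic_task):
        return True, "high"                          # 2. normalized match
    a, b = keywords(onet_task), keywords(anthropic_task)
    jaccard = len(a & b) / len(a | b) if a | b else 0.0
    if jaccard >= 0.35:
        return True, "medium/high"                   # 3. keyword Jaccard
    if SequenceMatcher(None, onet_task.lower(),
                       anthropic_task.lower()).ratio() >= 0.60:
        return True, "medium/high"                   # 4. char-level similarity
    if len(a & b) >= 4:
        return True, "medium"                        # 5. shared keywords
    return False, "none"                             # → falls through to Gemini
```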

Economic Model

Every cost is labeled: EMP (empirical), EST (estimated with stated basis), or DER (derived from inputs).

Salary at Risk

S_risk = salary × 1.35 × β

where 1.35 is the employer overhead multiplier (benefits, payroll taxes, equipment). This represents the economic value of the tasks AI could theoretically perform.

Path A — Augmentation

C_augmentation = C_subscriptions

~$720/yr baseline (1 AI assistant + 1 coding copilot + misc. tools). Compute is bundled into subscription pricing.

Path B — Automation

C_automation = C_platform + C_inference + C_verification

Platform (~$12K/yr, EST) + API inference (tokens × β × model rate, DER) + human review (20% of automated task value, EST).

Employer Savings

Savings = S_risk − C_automation

Range: [savings × 0.5, savings × 1.3]. Assumes proportional headcount reduction, which overstates near-term impact.

Users can switch the AI inference model in the report to see how compute costs change across providers (Claude, GPT, Gemini, Grok, DeepSeek, and others).
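The three formulas above compose directly. A sketch under two stated assumptions: "automated task value" for verification is taken to equal the salary at risk, and inference cost is passed in pre-computed (tokens × β × model rate). Constant and function names are illustrative.

```python
OVERHEAD = 1.35        # employer overhead multiplier (EST)
PLATFORM = 12_000.0    # Path B platform cost, $/yr (EST)
REVIEW_SHARE = 0.20    # human verification share (EST)

def salary_at_risk(salary: float, beta: float) -> float:
    """S_risk = salary × 1.35 × β"""
    return salary * OVERHEAD * beta

def automation_cost(salary: float, beta: float, inference_cost: float) -> float:
    """C_automation = platform + inference + verification,
    assuming verification = 20% of the salary at risk."""
    return PLATFORM + inference_cost + REVIEW_SHARE * salary_at_risk(salary, beta)

def savings_range(salary: float, beta: float, inference_cost: float):
    """Employer savings band: [savings × 0.5, savings × 1.3]."""
    s = salary_at_risk(salary, beta) - automation_cost(salary, beta, inference_cost)
    return s * 0.5, s * 1.3
```

For a $100K salary with β = 0.4 and $3K of inference, S_risk is $54K, automation costs $25.8K, and the savings band spans roughly $14.1K to $36.7K.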

Timeline Derivation

The automation timeline is calculated from data, not guessed by an LLM.

eff = max(β, observed)
base = 2 + 11 × (1 − eff)
ratio = min(observed / β, 2.0)
accel = max(1 − ratio × 0.2, 0.4)
years = clamp(round(base × accel), 1, 18)

eff = 1.0 → 2 years; eff = 0.1 with no observed adoption → 12 years. Adoption acceleration shortens the timeline when observed ≫ β.

We use max(β, observed) because Eloundou’s β scores are from early 2023 — before widespread tool-use, code agents, and multimodal capabilities. For some occupations, real-world adoption has already exceeded the theoretical prediction.
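The derivation above, written out as one function (a sketch; the clamp is implemented with `min`/`max`, and β = 0 is guarded to avoid division by zero, an edge case the formulas leave implicit):

```python
def timeline_years(beta: float, observed: float) -> int:
    """Automation timeline derived from β and observed exposure."""
    eff = max(beta, observed)                 # use the stronger signal
    base = 2 + 11 * (1 - eff)                 # full exposure → 2y, none → 13y
    ratio = min(observed / beta, 2.0) if beta > 0 else 0.0
    accel = max(1 - ratio * 0.2, 0.4)         # adoption ahead of theory → faster
    return max(1, min(round(base * accel), 18))

timeline_years(1.0, 1.0)   # → 2  (fully exposed and fully adopted)
timeline_years(0.1, 0.0)   # → 12 (barely exposed, no observed adoption)
```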

Gemini Integration

All Gemini API calls go through a unified client with two layers of caching:

  1. Prompt-level cache: Every call is hashed (system + prompt + temperature). Identical prompts return cached results instantly with zero API calls. Persists to disk between restarts.
  2. Task-level cache: Individual task grading scores are stored separately. A task graded in one audit is never re-graded in another, even if the batch prompt differs.

Gemini is used for: task penetration grading (when task_penetration dataset has no confident match) and personalized narrative generation. It is not used for: β scores, costs, savings, timeline, or any quantitative output.
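The prompt-level cache reduces to hashing the call's inputs and checking disk before calling the API. A minimal sketch, assuming a JSON file store; the path, function names, and `call_api` callback are hypothetical, not the actual client.

```python
import hashlib
import json
import os

CACHE_PATH = "gemini_cache.json"   # hypothetical on-disk cache location

def prompt_key(system: str, prompt: str, temperature: float) -> str:
    """Hash (system + prompt + temperature) into a stable cache key."""
    payload = json.dumps([system, prompt, temperature])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_call(system, prompt, temperature, call_api):
    """Return a cached response when the same prompt was seen before;
    otherwise call the API once and persist the result to disk."""
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
    key = prompt_key(system, prompt, temperature)
    if key not in cache:               # API is hit only on a cache miss
        cache[key] = call_api(system, prompt, temperature)
        with open(CACHE_PATH, "w") as f:
            json.dump(cache, f)
    return cache[key]
```

Because the key covers the full prompt and temperature, any change to either produces a new key, while a byte-identical repeat never reaches the API.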

Datasets

Dataset | Records | Source | Used For
Occupation Level | 798 | Eloundou et al. (2023) | β exposure scores
Job Exposure Dataset | 756 | Massenkoff & McCrory (2026) | Observed AI exposure
Task Penetration Dataset | 17,992 | Anthropic Economic Index | AI usage per task
BLS Wages Dataset | 1,358 | BLS OES May 2024 | Wages, employment
Layoffs Dataset | 2020–2026 | Kaggle/swaptr | Industry layoffs
Occupations Data | ~1,016 | O*NET Center | Job title search
Occupations Tasks Dataset | ~19,000 | O*NET Center | Task definitions
LLM Tier & Tokens Pricing | 4 providers | Pricetoken | Subscription costs

Live API: O*NET Web Services v2 — job outlook and career detail only (changes with BLS projection updates).

Key Research Findings

Eloundou et al. (2023)

Massenkoff & McCrory (2026)

Limitations