The Outcome Leverage Framework
A scoring framework for evaluating what a person can produce end-to-end with current LLM access, how good the result is, and how much value it displaces from prior workflows. 16 dimensions, three scoring layers, one composite score.
Find the full framework on LinkedIn
The Task Compression and Human Advantage Framework asks which work resists automation. This framework asks a different question: what can a person actually produce, end-to-end, right now, using commercially available LLMs?
The unit of analysis is not the task. It is the finished outcome. A signed contract. A deployed website. A proofread manuscript. A working Chrome extension. A marketing email that gets sent. The question is not whether AI can help with parts of the work. The question is whether a person with a $20/month subscription can sit down and produce a complete, usable result.
What Outcome Leverage Means
Outcome leverage is the ratio between what you can produce now and what the same outcome used to cost. A generic services contract used to require an attorney, weeks of coordination, and real money. Now it takes 30 minutes and the output is functionally equivalent for most commercial purposes. That is high outcome leverage.
A novel patent filing can also be generated in 30 minutes, but the output would be shredded by a competent examiner. Same tool, radically different leverage.
The framework measures three things:
- Achievability — Can the LLM produce a finished outcome?
- Output Quality — Is the result actually good enough to use?
- Displacement Value — What did producing this outcome previously cost in time, money, and dependency chains?
These combine into a single Outcome Leverage Score that surfaces where current LLM access creates the most real-world value for an individual operator.
The Three Scoring Layers
Each of the 16 dimensions is scored from 1 to 5. Higher scores always mean more leverage.
Layer 1 — Output Achievability (Dimensions 1–4)
Can you get a finished result in a practical session?
| # | Dimension | Score 1 | Score 5 |
|---|---|---|---|
| 01 | Specification Clarity | Deep domain expertise needed | Anyone can describe |
| 02 | Single-Session Completability | Multi-session / multi-day | One-shot |
| 03 | Tool Chain Simplicity | Complex toolchain | Chat window only |
| 04 | Operator Domain Knowledge | Deep expertise to evaluate | General literacy |
Layer 2 — Output Quality (Dimensions 5–9)
Is the result actually good enough to use, ship, send, or deploy?
| # | Dimension | Score 1 | Score 5 |
|---|---|---|---|
| 05 | Structural Correctness | Malformed | Sound |
| 06 | Substantive Accuracy | Dangerous to trust | Reliably accurate |
| 07 | Professional Parity | Obviously amateur | Indistinguishable |
| 08 | Edge Case Handling | Misses critical cases | Robust |
| 09 | Taste and Polish | Generic / flat | Crafted |
Layer 3 — Displacement Value (Dimensions 10–13)
What did the outcome previously cost? High displacement is where outcome leverage actually lives.
| # | Dimension | Score 1 | Score 5 |
|---|---|---|---|
| 10 | Prior Cost | Trivial | Thousands+ |
| 11 | Prior Time | Minutes | Weeks+ |
| 12 | Prior Dependency Chain | Just you | Multiple specialists |
| 13 | Prior Path Risk | Smooth | High friction |
Decision Metadata (Dimensions 14–16)
These modify interpretation without changing the core score. Stakes and Audience Sophistication act as discount factors on effective leverage.
| # | Dimension | Score 1 | Score 5 |
|---|---|---|---|
| 14 | Frequency | Once ever | Weekly+ |
| 15 | Stakes | Catastrophic if wrong | Low consequence |
| 16 | Audience Sophistication | Expert scrutiny | General audience |
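The framework states that Stakes and Audience Sophistication discount effective leverage but does not give a formula. The sketch below is one plausible reading, not part of the framework: each metadata score is normalized to 0–1 and multiplied in. The function name and the scaling rule are assumptions for illustration only.

```python
def effective_leverage(leverage: float, stakes: int, audience: int) -> float:
    """Hypothetical discount rule (assumed, not specified by the framework):
    scale the Outcome Leverage Score by Stakes and Audience Sophistication,
    each normalized from the 1-5 dimension scale to a 0-1 factor."""
    return leverage * (stakes / 5) * (audience / 5)

# Under this reading, a 180.5-leverage outcome facing expert scrutiny
# (audience = 1) retains only one fifth of its effective leverage.
discounted = effective_leverage(180.5, stakes=5, audience=1)
```

A frequency modifier could be folded in the same way, but the framework treats Frequency as context for prioritization rather than a discount, so it is left out here.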
Scoring
Raw Achievability Score — Sum of dimensions 1 through 9 (max 45).
| Score Range | Band | Meaning |
|---|---|---|
| 38–45 | Reliable one-shot | Sit down and produce it. Light review, then ship. |
| 28–37 | Strong with review | Achievable in one session. Iterate, review, then ship. |
| 19–27 | Viable draft | Gets you a real starting point. Needs expert finishing. |
| 9–18 | Scaffolding only | Produces structure and fragments. Not a finished outcome. |
Displacement Multiplier — Average of dimensions 10 through 13 (range 1.0–5.0).
Outcome Leverage Score = Raw Achievability × Displacement Multiplier. Maximum possible: 225. Practical ceiling for current models: 180–200.
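The full scoring pipeline can be sketched in a few lines of Python. The function name and the per-dimension splits in the example are illustrative assumptions; only the totals (achievability 38/45, multiplier 4.75, leverage 180.5) match the generic services contract worked example.

```python
def outcome_leverage(dims: dict[int, int]) -> tuple[int, float, float, str]:
    """Score an outcome from per-dimension scores (each 1-5).

    dims maps dimension number (1-13) to its score. Dimensions 14-16
    are decision metadata and do not enter the core score.
    """
    raw = sum(dims[d] for d in range(1, 10))              # dims 1-9, max 45
    multiplier = sum(dims[d] for d in range(10, 14)) / 4  # dims 10-13, 1.0-5.0
    leverage = raw * multiplier                           # max 45 x 5.0 = 225

    # Achievability band from the Raw Achievability Score table
    if raw >= 38:
        band = "Reliable one-shot"
    elif raw >= 28:
        band = "Strong with review"
    elif raw >= 19:
        band = "Viable draft"
    else:
        band = "Scaffolding only"
    return raw, multiplier, leverage, band

# Generic services contract: per-dimension splits here are invented for
# illustration; the totals match the worked example (38, 4.75, 180.5).
contract = {1: 5, 2: 5, 3: 5, 4: 3, 5: 4, 6: 4, 7: 4, 8: 4, 9: 4,
            10: 5, 11: 4, 12: 5, 13: 5}
```

Note that the band comes from raw achievability alone, while the composite score folds in displacement: a low-achievability outcome can still post a large leverage number, which is why the bands and the score are read together.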
The Five Bands
A. Full Displacement
The LLM produces a finished, usable outcome that displaces a previously expensive or slow workflow. The operator can ship with confidence after light review. The old path involved specialists, weeks, and real money.
B. High Leverage with Review
The LLM produces a strong output that needs targeted review before deployment. You still capture most of the time and cost savings. The review layer is the new bottleneck, not the production.
C. Viable Draft
The LLM produces a real starting point that accelerates the workflow but does not replace it. You get to the 60–70% mark fast. The remaining 30–40% still requires domain expertise and human judgment.
D. Scaffolding
The LLM produces structure, fragments, and first-pass material. Useful as a starting framework, not as a deliverable. Production still requires traditional workflows and expertise.
E. Out of Reach
The LLM cannot produce a meaningful version of this outcome. These outcomes remain fully human — surgical procedures, live crisis command, physical construction, courtroom advocacy.
Worked Examples
Generic Services Contract — Leverage: 180.5 (Full Displacement)
Achievability 38/45. Displacement multiplier 4.75. The operator needs enough domain knowledge to know what clauses matter, but the output is structurally and substantively strong. Prior path: attorney coordination, weeks, $2,000–$5,000. Current path: 30 minutes.
Marketing Email — Leverage: 80.0 (High Leverage with Review)
Achievability 40/45. But displacement multiplier is only 2.0. Copywriters were already fast and cheap. The leverage is real but modest because the prior workflow was not expensive.
Full-Stack Web App (Next.js + Integrations) — Leverage: 137.8 (High Leverage with Review)
Achievability 29/45. Tool chain scores low because you need a dev environment, package manager, hosting, and DNS. But displacement multiplier is 4.75. Prior path: hire a developer, weeks of coordination, $5,000–$15,000. Current path: 150 minutes.
Chrome Extension — Leverage: 136.0 (High Leverage with Review)
Achievability 34/45. The model handles manifest, content scripts, popup UI, and storage API well. Displacement multiplier 4.0. Prior path: hire a Chrome extension developer or spend days learning the API yourself.
Book Chapter — Leverage: 91.0 (Viable Draft)
Achievability 28/45, displacement multiplier 3.25. Professional parity and taste are weak. The output reads like competent filler, not authored prose. You get a starting point, not a finished chapter.
Synthetic Data Platform — Leverage: 85.0 (Scaffolding Only)
Achievability 17/45, displacement multiplier 5.0. The specification is hard to articulate, the toolchain is complex, and the operator needs deep expertise to evaluate the output. The leverage score is deceptive because achievability is too low to capture it.
How to Use This
For Individuals — List 5 to 10 outcomes you regularly need to produce or currently pay others to produce. Score each one. Start with the highest-leverage outcomes you are not yet producing yourself.
For Founders and Operators — Score the outcomes your business produces for clients. Where leverage is high, your pricing power is under pressure. Where achievability is low but displacement is high, you have a window to offer the outcome at scale before models improve enough for customers to self-serve.
For Teams — Score your team's deliverables. High-leverage outcomes bottlenecked on specialists represent the biggest efficiency gains. Low-achievability outcomes that consume significant team time represent where human expertise remains the binding constraint.
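The "score each one, then start with the highest leverage" advice above reduces to a sort. The sketch below uses the achievability and multiplier figures from the worked examples as placeholder data; your own outcome list and scores would replace them.

```python
# Candidate outcomes as (raw achievability, displacement multiplier) pairs.
# Figures taken from the worked examples; substitute your own scores.
outcomes = {
    "services contract": (38, 4.75),
    "marketing email":   (40, 2.0),
    "chrome extension":  (34, 4.0),
    "book chapter":      (28, 3.25),
}

# Rank by Outcome Leverage Score = achievability x multiplier, descending.
ranked = sorted(outcomes.items(),
                key=lambda kv: kv[1][0] * kv[1][1],
                reverse=True)

for name, (raw, mult) in ranked:
    print(f"{name}: {raw * mult:.1f}")
```

The ordering makes the framework's point concrete: the highly achievable marketing email ranks last because its prior workflow was already cheap, while the contract ranks first on displacement.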
Do not ask whether AI can produce the outcome. Ask whether you can produce the outcome, right now, with the tools available to you. Then ask what that outcome used to cost.
- Learn more: adjective.us
- Get started: Request a consultation