We Don't Know What We Don't Know
Every AI interaction begins with a human request. The quality of that request is bounded by what that human has been exposed to. This is the most important unsolved problem in AI, and almost nobody is framing it correctly.
I want to talk about a problem that's been eating at me for months. It's one of those problems that once you see it, you can't unsee it. It shows up everywhere. In every AI demo, every product launch, every breathless LinkedIn post about agents and autonomy and the future of work. It's hiding in plain sight, and the reason nobody talks about it is because the problem is us.
Here's the setup. You sit down at a computer. You open a chat interface, or a terminal, or an IDE with an AI copilot. You type something. Maybe it's a question. Maybe it's a prompt. Maybe it's a spec for a product you want built. Whatever it is, this is the moment that determines everything that follows. This is Input Zero.
The quality of what the AI produces is bounded by the quality of what you asked for. The quality of what you asked for is bounded by what you know. And what you know is bounded by where you've been, what you've done, who you've talked to, what you've read, what you've failed at, and how many years you've spent accumulating all of it.
That's the problem. The bottleneck on AI is biological.
Part I: The Knowing
Let me tell you about three people. They're composites, drawn from real conversations, real industries, real gaps I've watched play out over the past decade working across defense intelligence, software engineering, and venture capital.
The Metallurgist
Dr. Sarah Chen has spent nineteen years studying the fatigue behavior of titanium alloys under cyclic loading. She can look at a micrograph and tell you the grain boundary orientation, predict where a crack will nucleate, and estimate the remaining service life of a component within a margin that would make most engineers uncomfortable because it's so precise. She has published forty-three papers. She holds six patents. She is, by any reasonable definition, one of the top hundred people on Earth who understand how titanium fails.
Three hundred miles from her lab, a medical device company is recalling a surgical implant. The failure mode is fatigue cracking in a titanium component. The engineering team at the company has been working on the problem for eight months. They've tried five different heat treatments. They've consulted two materials testing firms. They're running out of time before the FDA deadline.
Sarah could solve their problem in a week. Maybe less. She's literally the person who wrote the paper that describes the mechanism causing their failure. She presented it at a conference in 2019. Seven people attended the talk.
Sarah doesn't know about the recall. The company doesn't know about Sarah. The paper exists. The knowledge exists. The connection doesn't, because neither party has been exposed to the other's world.
This happens every day. Everywhere. In every industry. All the time.
The Machinist
Tony Russo runs a CNC shop in western Pennsylvania. Twelve machines. Seven employees. He makes precision components for aerospace subcontractors, small-batch medical parts, and the occasional prototype for a startup that found him on Thomasnet. Tony can hold a tolerance of plus or minus two tenths on a five-axis mill running Inconel. If you don't know what that means, just know that it's the machining equivalent of performing surgery while riding a mechanical bull.
Tony's order book is full right now. Has been for six months. He's quoting eight-week lead times. He turned down three jobs last month because he didn't have the capacity. From Tony's perspective, business is good.
What Tony doesn't see is the trade data. A new tariff on imported precision castings is about to shift demand for domestic machining by 15-20% in his region. Three of his competitors are going to get slammed with orders they can't handle. The overflow is going to land on shops like his. In about four months, Tony's phone is going to ring more than it has in five years. He has no idea. He's going to be unprepared. He's going to turn down work that could have been worth six figures because he didn't hire ahead and didn't invest in the additional fixturing.
The trade data is public. The demand forecast models exist. The connection between tariff policy and regional machining demand has been analyzed by at least three economic research groups. Tony has never encountered any of it. He reads trade magazines and talks to his customers. That's his information environment. It's served him well for twenty years. It's about to be insufficient.
The Architect
Priya Sharma is a senior software architect at a mid-size fintech company. She's rebuilding their transaction processing pipeline. The current system handles about 10,000 transactions per second. The business needs 100,000. She's been designing a distributed architecture using event sourcing and CQRS patterns, with a Kafka backbone and a custom sharding strategy. It's elegant work. She's been at it for three months.
Twelve time zones away, a research group at a university in Zurich published a paper seven months ago describing a novel approach to high-throughput transaction processing using conflict-free replicated data types. The approach eliminates the need for the coordination layer that accounts for about 40% of Priya's architecture's complexity. It would cut her development timeline in half and reduce her infrastructure costs by a third.
Priya subscribes to three engineering blogs, follows forty people on Twitter, attends two conferences a year, and reads the Hacker News front page most mornings. She's well-informed by any standard measure. She still hasn't encountered this paper because it was published in a distributed systems journal she doesn't follow, written by researchers whose names she doesn't recognize, and tagged with keywords that don't overlap with her search patterns.
The paper exists. The solution exists. Priya is going to spend three more months building something she could have built in six weeks with a different foundation. The gap is pure exposure.
Part II: The Four Constraints
Here's my framework. There are four primary constraints on AI right now. The industry talks about three of them constantly. The fourth one is the most interesting and gets almost zero airtime.
1. Energy
This is the most tangible constraint. Training frontier models requires staggering amounts of electricity. Running inference at scale across billions of queries per day compounds the demand. Data centers are being sited next to nuclear plants, natural gas facilities, and hydroelectric dams. Entire national energy strategies are being renegotiated around AI compute demand.
This constraint is real. It's physical. It's being addressed with capital, engineering, and political will. It is a hard problem with known solution vectors. More generation capacity, more efficient chips, better cooling, better power distribution. The timeline is measured in years, and the investment is measured in hundreds of billions of dollars. It will get solved. It might take a decade, and the path will be ugly, and the environmental questions are legitimate and serious. It will get solved.
2. Funded Problems
This one is underappreciated. AI capacity is increasing faster than the world can define problems to apply it to. Every quarter, models get more capable. They reason better, they handle longer contexts, they use tools, they write code, they analyze data. The raw intelligence available through API calls today would have been science fiction five years ago.
The bottleneck is the demand side. The volume of well-scoped, well-funded, clearly articulated problem statements that can absorb this expanding capacity is growing linearly while the capacity itself is growing exponentially. There is more intelligence available than there are needs defined to apply it to.
This shows up in organizations as the "AI strategy" problem. Every company knows they need to be doing something with AI. Most of them are still figuring out what that something is. The tools are ready. The problems are still being formulated. The gap between available intelligence and articulated demand is widening, and it's creating a strange economic dynamic where the most valuable skill in many organizations is the ability to define what AI should be working on.
3. Trust and Governance
Can you verify that an AI-generated analysis is trustworthy? Can you trace a decision back through the chain of reasoning that produced it? Can you prove that a document was generated by a specific model at a specific time with specific inputs? Can you ensure that a human was in the loop at the moments that mattered?
These questions define the trust constraint. Cryptographic provenance, attestation chains, audit trails, alignment research, regulatory frameworks. The infrastructure for making AI outputs verifiable, traceable, and governable is being built. It's slow. It's political. It involves coordinating across governments, industries, and technical communities that don't always agree on definitions, let alone standards.
This constraint is necessary. Without trust infrastructure, AI adoption will hit a ceiling in every domain that matters: healthcare, defense, finance, legal, government. The technology to solve it exists or is being developed. The coordination to deploy it at scale is the hard part.
4. Biological Knowledge
And here's the one that keeps me up at night.
Every AI interaction begins with Input Zero: the human request. The quality of that request determines the ceiling on what the AI can produce. And the quality of that request is a function of what the human knows, which is a function of what they've been exposed to, which is a function of their lived experience, which accumulates slowly, linearly, over years, and is bounded by the environments they happen to move through.
Sarah knows titanium. She doesn't know about the recall. Tony knows machining. He doesn't see the tariff data. Priya knows distributed systems. She hasn't read the Zurich paper.
These aren't failures of intelligence. These are constraints of biology. Human brains are extraordinary at going deep. They're terrible at going broad. Careers reward specialization. Institutions reinforce silos. The person who spent twenty years becoming an expert in one domain has, by definition, spent twenty years not becoming an expert in the adjacent domains where their expertise could be transformative.
AI models have broad knowledge. They've processed papers from every discipline, code from every language, data from every domain. They have the breadth that humans lack. They also wait. They respond to requests. They need someone to know enough to ask the right question, to frame the right problem, to make the right connection between domains.
The model has the answer to the question nobody is asking because nobody knows enough to ask it.
That's the fourth constraint. And it's the only one that can't be solved with more compute, more capital, or more policy.
Part III: Compressing the Gap
So what do you do about it?
If the constraint is biological exposure, the intervention is compression. Take the vast landscape of technical knowledge, score it, structure it, tag it, embed it, and make it queryable in ways that surface connections no individual human would encounter through their own career trajectory.
Let me be specific about what I mean and what I don't mean.
I don't mean search. Search is great when you know what you're looking for. The problem we're describing is the case where you don't know what to look for because the thing that would trigger the search has never entered your awareness. You can't Google what you've never heard of. You can't ask for what you don't know exists.
I don't mean recommendation engines. Recommendation engines optimize for engagement and similarity. They show you more of what you've already seen. The gap we're describing requires showing you something from a domain you've never visited, because your problem maps to a solution in that domain. Recommendations reinforce silos. We need to break them.
I don't mean dashboards. Dashboards display known metrics for known processes. They're rearview mirrors. The exposure gap is forward-looking. It's about possibilities that haven't been articulated yet.
What I mean is structured intelligence. Scored, tagged, composable knowledge primitives. Individual units of technical knowledge that carry enough context, enough metadata, enough evaluative opinion, that both humans and AI agents can reason over them, discover unexpected adjacencies, and compose new specifications from capabilities that were never designed to work together.
Imagine every significant open-source project, every notable research paper, every meaningful model release, evaluated on defensibility, frontier risk, novelty, and composability. Embedded semantically so that a search for "tools that help robots see" returns computer vision, LiDAR perception, and depth sensing projects even when none of those results contain the phrase. Scored so that a query for "defensible middleware with low displacement risk" returns ranked, opinionated results with written reasoning behind every score.
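To make the mechanism concrete, here is a minimal sketch of what a scored, embedded catalog like this might look like. Everything in it is hypothetical: the primitive names, the score values, and the three-dimensional vectors (a real system would get embeddings from a trained model, not hand-assigned numbers). The point is only the shape of the idea: rank by semantic similarity, filter by evaluative score, and never depend on keyword overlap.

```python
import math
from dataclasses import dataclass

@dataclass
class Primitive:
    name: str
    domain: str
    scores: dict   # evaluative opinion, e.g. {"defensibility": 0.7}
    embedding: list  # semantic vector (toy, hand-assigned here)

def cosine(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy catalog spanning unrelated domains; in practice these vectors
# would come from an embedding model run over papers, repos, and releases.
catalog = [
    Primitive("opencv-vision", "computer vision",
              {"defensibility": 0.4}, [0.9, 0.1, 0.0]),
    Primitive("lidar-perception", "robotics",
              {"defensibility": 0.7}, [0.8, 0.3, 0.1]),
    Primitive("tariff-demand-model", "trade economics",
              {"defensibility": 0.6}, [0.0, 0.2, 0.9]),
]

def query(vec, min_defensibility=0.0, top_k=2):
    # Filter on the evaluative score, then rank by semantic closeness.
    ranked = sorted(
        (p for p in catalog
         if p.scores.get("defensibility", 0) >= min_defensibility),
        key=lambda p: cosine(vec, p.embedding),
        reverse=True,
    )
    return [p.name for p in ranked[:top_k]]

# A query embedded near the "perception" axis surfaces vision and LiDAR
# primitives even though neither contains the query's words.
print(query([1.0, 0.2, 0.0]))  # -> ['opencv-vision', 'lidar-perception']
```

The essential property is that the match happens in embedding space, not in vocabulary: "tools that help robots see" lands near perception primitives because of what they mean, not what they say.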
Now imagine combining these primitives across domains. A robotics framework and a quantum computing library. A memory persistence service and a specification-driven development framework and a data lineage engine. Components from fields that would never appear in the same search query, composed into product specifications that represent genuinely new technical territory.
When Sarah's expertise in titanium fatigue can be matched to the medical device company's failure mode through a structured intelligence layer, nineteen years of materials science expertise compresses into a query. When Tony's capacity planning challenge can be informed by trade policy data he's never seen because a scored primitive connects tariff impacts to regional machining demand, the gap between his craft knowledge and the system-level context closes in minutes. When Priya's architecture challenge maps against the Zurich paper because a semantic embedding recognized the adjacency between her problem and their solution, three months of unnecessary engineering evaporates.
This is what data primitives do. They compress decades of cross-domain exposure into queryable adjacencies. They close the gap between what any individual has experienced and what exists in the world's technical landscape.
Part IV: The Compounding Effect
Here's where it gets interesting.
The exposure gap is the only constraint among the four that creates a compounding return when you address it. Energy gets cheaper linearly with investment. Funded problems scale with organizational maturity. Trust infrastructure builds incrementally. The exposure gap compounds.
Every new connection surfaced becomes a data point. Every composition attempted generates feedback on which adjacencies produce valuable outcomes. Every scored primitive used in a novel context creates a new edge in the knowledge graph. The system learns. Over weeks and months and years, the intelligence gets denser, the adjacencies get richer, the quality of the connections improves.
This creates a fascinating dynamic. The more you compress the exposure gap, the more connections emerge. The more connections emerge, the better the inputs become. The better the inputs become, the higher the quality of what AI can produce. Which generates more data about which connections are valuable. Which makes the intelligence denser. Which surfaces more connections.
The flywheel spins.
And here's the part that really gets me: this flywheel is domain-agnostic. The same infrastructure that connects a metallurgist to a medical device company also connects a defense contractor to an academic research group, a fintech architect to a distributed systems paper, and a machinist to a trade policy forecast. The primitives are domain-specific. The infrastructure for scoring, embedding, composing, and querying them is universal.
One system. Every domain. Accumulated intelligence that compounds with every use.
Part V: What This Means
The conversation about AI constraints has been dominated by hardware, money, and regulation. These are real constraints with real consequences. They're also constraints with known solution vectors. We know how to build power plants. We know how to fund research. We know how to write regulations (slowly, badly, and with much arguing, but we know how).
The fourth constraint is harder because it's weirder. It requires admitting that the ceiling on AI might be the richness of what we ask it to do. It requires accepting that the most sophisticated model in the world is only as useful as the input it receives. It requires confronting the fact that the input is shaped by the narrow, path-dependent, biologically-bounded experience of whatever human happens to be sitting at the keyboard.
And it requires building something new. Infrastructure that compresses the exposure gap. Intelligence primitives that carry enough context to bridge domains. Composition engines that surface adjacencies no individual would discover through their own lived experience. Systems that get smarter with every query, every connection, every composition.
We don't know what we don't know. That's always been true. What's new is that we can build systems that make it less true. Systems that close the distance between what any individual has experienced and what exists in the world's accumulated technical knowledge.
The request is the bottleneck. The request has always been the bottleneck. We've just been looking at the wrong side of the interface.
Making the request better is the opportunity of the decade.