April 7, 2026 · AI Infrastructure · Rob Murtha

Girolamo: Technical Intelligence as Infrastructure for the Agentic Era

Introducing Girolamo, a technical intelligence platform that scores the open-source landscape on defensibility, frontier risk, and novelty — and serves the results to both human analysts and AI agents via MCP.

The Landscape is Moving Faster Than Analysts

GitHub hosts over 400 million repositories. arXiv publishes thousands of papers every week. Hugging Face adds hundreds of models daily. The open-source technical landscape is vast, noisy, and accelerating.

Existing tools show you what is popular — stars, citations, downloads. They answer the question "what are people using?" But for CTOs building product strategy, investors evaluating technical moats, and analysts tracking competitive positioning, popularity is the wrong signal.

The questions that matter are harder:

  • Is this project defensible, or will a frontier lab ship a better version in six months?
  • Is this domain consolidating or fragmenting?
  • Who are the key creators, and are they moving to new areas?
  • Where is the gap between "technically defensible" and "not yet discovered by the market"?

Girolamo is a technical intelligence platform built to answer these questions. It continuously monitors the open-source landscape, scores every entity it finds using LLM-powered analysis, and serves the results as a structured, searchable, agent-queryable intelligence corpus.

The name comes from Gerolamo Cardano, the Renaissance polymath who wrote one of the first systematic treatments of probability — the practice of looking at a complex, uncertain landscape and calculating the odds.


From Raw Data to Scored Intelligence

Girolamo operates a six-stage pipeline that transforms raw activity data into opinionated, defensibility-scored intelligence.

1. Universal Ingestion

Platform-specific adapters pull entities from GitHub, arXiv, and Hugging Face. Each source is normalized into a universal schema — Entity, Creator, and TrajectorySnapshot — enabling cross-platform analysis. A researcher who publishes papers on arXiv and maintains implementations on GitHub appears as one creator across both corpora. Every write is idempotent. Overlapping jobs are safe. Re-ingestion updates traction metrics without creating duplicates.
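The idempotent-write guarantee can be pictured with a minimal sketch. The field names and the in-memory store below are illustrative, not Girolamo's actual models; the point is the (source, source_id) key that makes re-ingestion an update rather than a duplicate:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the universal schema; field names are illustrative.
@dataclass
class Entity:
    source: str           # "github" | "arxiv" | "huggingface"
    source_id: str        # platform-native identifier
    name: str
    traction: dict = field(default_factory=dict)

class EntityStore:
    """Idempotent store keyed by (source, source_id)."""
    def __init__(self):
        self._rows = {}

    def upsert(self, entity: Entity) -> Entity:
        key = (entity.source, entity.source_id)
        existing = self._rows.get(key)
        if existing:
            # Re-ingestion updates traction metrics in place, no duplicates.
            existing.traction.update(entity.traction)
            return existing
        self._rows[key] = entity
        return entity

store = EntityStore()
store.upsert(Entity("github", "org/repo", "repo", {"stars": 100}))
store.upsert(Entity("github", "org/repo", "repo", {"stars": 150}))
print(len(store._rows))  # 1: overlapping jobs are safe
```

Because the key is platform-native, the same creator appearing on arXiv and GitHub can be linked at a layer above without the ingestion writes ever colliding.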

2. Enrichment

Post-ingestion enrichment backfills signals that are not available during initial scraping. For arXiv papers, this means citation counts from Semantic Scholar — giving papers a traction signal comparable to GitHub stars. The step is rate-limited with exponential backoff, so failures are retried on the next run without manual intervention.
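The retry discipline looks roughly like the sketch below. The function names (`with_backoff`, `fetch_citations`) are hypothetical stand-ins for the enrichment client, but the doubling-delay shape is the standard pattern:

```python
import time

def with_backoff(fn, retries=5, base_delay=1.0, max_delay=60.0):
    """Retry fn with exponential backoff, re-raising after the final
    attempt so the next scheduled run can pick the work up again."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(min(base_delay * (2 ** attempt), max_delay))

# Demo: a call that is rate-limited twice, then succeeds.
calls = {"n": 0}
def fetch_citations():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return {"citations": 42}

result = with_backoff(fetch_citations, base_delay=0.01)
print(result, calls["n"])  # {'citations': 42} 3
```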

3. AI-Powered Evaluation

Every entity is analyzed by an LLM operating under strict Pydantic schema enforcement. No freeform string parsing. The evaluation produces:

  • Defensibility score (1–10): How hard is this to replicate or displace?
  • Frontier risk (LOW / MEDIUM / HIGH): How likely is a major lab to ship something that makes this obsolete?
  • Threat profile: Platform domination risk, market consolidation risk, displacement horizon.
  • Composability profile: Tech stack, integration surface, capability tags, implementation depth, novelty classification.
  • Reasoning narrative: Multi-paragraph explanation of why a project scored the way it did. This is the most valuable field — it is the analyst's written argument, not just a number.
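Strict schema enforcement means the LLM's output must validate against a typed model before it enters the corpus. A minimal Pydantic sketch of that idea, with illustrative field names rather than Girolamo's actual schema:

```python
from typing import Literal
from pydantic import BaseModel, Field

# Illustrative evaluation schema; field names mirror the bullets above
# but are not the platform's actual models.
class Evaluation(BaseModel):
    defensibility: int = Field(ge=1, le=10)
    frontier_risk: Literal["LOW", "MEDIUM", "HIGH"]
    reasoning: str

raw = '{"defensibility": 8, "frontier_risk": "MEDIUM", "reasoning": "Deep moat."}'
ev = Evaluation.model_validate_json(raw)
print(ev.defensibility)  # 8

# An out-of-range score or an unknown risk level raises a validation
# error instead of silently entering the corpus.
```

No freeform string parsing: a score of 11 or a risk of "MAYBE" is rejected at the boundary, which is what makes the downstream fields safe to filter and aggregate on.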

4. Materialization

All data fuses into denormalized intelligence primitives — single JSONB documents containing entity metadata, AI analysis, traction data, and creator information. Primitives are the queryable unit of the system. Every API call, every search, every agent tool invocation reads from this table. Zero joins at query time.
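Conceptually, materialization is a fuse step: separate records collapse into one document. The record shapes below are hypothetical, but they show why a query never needs a join:

```python
import json

# Hypothetical inputs from earlier pipeline stages.
entity = {"id": "github:org/repo", "name": "repo", "source": "github"}
analysis = {"defensibility": 8, "frontier_risk": "MEDIUM"}
traction = {"stars": 150, "forks": 12}
creator = {"handle": "org", "entity_count": 3}

# One denormalized primitive: everything an API call, search, or agent
# tool invocation needs, in a single JSONB-style document.
primitive = {**entity, "analysis": analysis,
             "traction": traction, "creator": creator}

doc = json.dumps(primitive)
print(json.loads(doc)["analysis"]["defensibility"])  # 8
```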

5. Semantic Embedding

Each primitive is converted into a 512-dimensional vector embedding and stored in pgvector. This enables search by meaning, not keywords. Asking for "tools that help robots see" returns computer vision, LiDAR perception, and depth sensing projects — even when none of those results contain the search phrase.
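Search by meaning reduces to nearest-neighbor ranking over embeddings. In production this is a pgvector distance query; the sketch below uses toy three-dimensional vectors and plain cosine similarity to show the mechanism:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy embeddings; real ones are 512-dimensional and learned.
corpus = {
    "lidar-perception": [0.9, 0.1, 0.0],
    "depth-sensing":    [0.8, 0.2, 0.1],
    "payments-sdk":     [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]  # stands in for "tools that help robots see"

ranked = sorted(corpus, key=lambda k: cosine(query, corpus[k]), reverse=True)
print(ranked[0])  # lidar-perception
```

None of the top results need to contain the query's words; they only need to sit near it in embedding space.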

6. Conspectus

The macro intelligence layer. Aggregates all primitives by topic to produce corpus-level signals: average defensibility, velocity distribution, frontier risk ratios, creator concentration, and an LLM-generated narrative summarizing the state of each domain. Over time, Conspectus entries form a time series — enabling trend detection that is impossible without longitudinal data.
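The aggregation itself is straightforward once primitives are scored. A minimal sketch of a per-topic conspectus entry, using hypothetical field names:

```python
from collections import Counter
from statistics import mean

# Hypothetical scored primitives for one topic.
primitives = [
    {"topic": "robotics", "defensibility": 8, "frontier_risk": "LOW"},
    {"topic": "robotics", "defensibility": 6, "frontier_risk": "HIGH"},
    {"topic": "robotics", "defensibility": 7, "frontier_risk": "LOW"},
]

def conspectus(rows):
    """Corpus-level signals for one topic's primitives."""
    risks = Counter(r["frontier_risk"] for r in rows)
    return {
        "avg_defensibility": mean(r["defensibility"] for r in rows),
        "frontier_risk_ratio": {k: v / len(rows) for k, v in risks.items()},
        "n": len(rows),
    }

print(conspectus(primitives))
```

Snapshot this output on a schedule and you get the time series the article describes: trend detection falls out of comparing entries, not recomputing history.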


Agent-Native by Design

This is where Girolamo connects to the broader thesis of agentic infrastructure.

The platform exposes an MCP (Model Context Protocol) server alongside its REST API. Any MCP-compatible client — Claude Code, Claude Desktop, Cursor, Windsurf, or a custom agent framework — can connect and query the intelligence corpus using structured tools:

  • query_intelligence — Semantic search with filtering by corpus, defensibility threshold, and lens.
  • analyze_competitive_landscape — Topic-level analysis with velocity and defensibility data.
  • search_intelligence — RAG-powered Q&A synthesized from relevant primitives.
  • find_defensible_clusters — Graph analysis of high-defensibility entity groupings.
  • get_creator_network — Creator influence, portfolio, and co-creation mapping.

An AI agent working inside a development workflow can ask Girolamo: "Find me defensible robotics middleware with low frontier risk and production-grade implementation depth." It receives structured, scored results it can reason over — not a list of links.
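That interaction can be pictured as a structured tool call. The payload shapes below are illustrative, not the server's actual MCP schema, but they show the difference between scored results and a list of links:

```python
# Hypothetical MCP tool invocation; argument names mirror the tool list
# above but the exact schema is defined by the server.
request = {
    "tool": "query_intelligence",
    "arguments": {
        "query": "defensible robotics middleware",
        "min_defensibility": 7,
        "frontier_risk": "LOW",
    },
}

# Illustrative structured response the agent can reason over.
response = {
    "results": [
        {"name": "example-middleware", "defensibility": 8,
         "frontier_risk": "LOW",
         "reasoning": "Deep hardware integration raises switching costs."},
        {"name": "thin-wrapper", "defensibility": 4,
         "frontier_risk": "HIGH",
         "reasoning": "Easily replicated by a frontier lab release."},
    ]
}

# Scored fields make filtering a one-liner for the agent.
hits = [r for r in response["results"]
        if r["defensibility"] >= request["arguments"]["min_defensibility"]
        and r["frontier_risk"] == "LOW"]
print([h["name"] for h in hits])  # ['example-middleware']
```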

This is the pattern that makes agentic intelligence compounding rather than ephemeral. When an agent queries Girolamo, it is drawing on months or years of accumulated trajectory data, defensibility assessments, and creator movement tracking. The intelligence gets better over time because the underlying dataset does.


The Intelligence Primitive

The atomic unit of Girolamo's corpus is the intelligence primitive — a single JSON document containing everything known about an entity. Defensibility score, frontier risk, threat profile, composability profile, traction metrics, creator data, and the full reasoning narrative.

This structure enables multiple modes of consumption from the same data:

  • Search: Semantic similarity against the embedded corpus.
  • Filter: By score, risk level, source, type, or novelty.
  • Compare: Side-by-side evaluation of competing technologies.
  • Compose: Workspace-level analysis that synthesizes how multiple artifacts could work together.
  • Trend: Conspectus aggregation over time reveals macro shifts.
  • Agent reasoning: Structured enough for LLMs to parse, score, and incorporate into autonomous decision-making.

A single data structure serving both human analysts and AI agents. Same corpus, same scores, same narratives — different interfaces.


The Repeatable Pattern

Girolamo is not just a product. It is a repeatable intelligence pattern:

  1. Pick a domain — defense, biotech, fintech, robotics, quantum computing.
  2. Define ingestion sources — GitHub, arXiv, patents, regulatory filings, job boards, RFPs.
  3. Configure evaluation lenses — defensibility, investment potential, regulatory risk, security posture.
  4. Run the pipeline — ingest, enrich, evaluate, materialize, embed, conspectus.
  5. Expose via API and MCP — agents and analysts query the same corpus.
  6. Compound over time — longitudinal data creates irreplaceable trend intelligence.

The infrastructure is domain-agnostic. The intelligence is domain-specific. The Lens system allows multiple analytical perspectives on the same underlying data — a defense acquisition lens scores differently than a venture capital lens, but both draw from the same normalized entity corpus.
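One way to picture the Lens system: different weightings over the same normalized entity. The weights and field names below are invented for illustration, not Girolamo's actual lens definitions:

```python
# Hypothetical entity signals, already normalized by the pipeline.
entity = {"defensibility": 8, "traction": 150, "regulatory_fit": 3}

# Two illustrative lenses over the same data; weights are made up.
lenses = {
    "venture_capital":     {"defensibility": 0.5, "traction": 0.5,
                            "regulatory_fit": 0.0},
    "defense_acquisition": {"defensibility": 0.4, "traction": 0.1,
                            "regulatory_fit": 0.5},
}

def score(entity, weights):
    """Weighted score of one entity under one lens."""
    return sum(entity[k] * w for k, w in weights.items())

vc = score(entity, lenses["venture_capital"])
defense = score(entity, lenses["defense_acquisition"])
print(vc, defense)  # same entity, different verdicts
```

Same corpus underneath; the lens only changes which signals dominate the verdict.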

This pattern maps directly to the broader Adjective capability stack. Prelude provides machine-readable context for codebases. GLITCHLAB provides deterministic agentic execution. Zephyr provides cryptographic provenance. Girolamo provides scored technical intelligence. Each system is independently useful. Together, they form a sovereign infrastructure for teams operating in the agentic era.


Why This Matters Now

The transition to agentic workflows changes the nature of technical intelligence. When AI agents participate in engineering decisions — selecting dependencies, evaluating architectures, recommending tools — they need access to the same quality of intelligence that human analysts use.

Feeding an agent a list of GitHub trending repositories is not intelligence. It is noise ranked by popularity. Feeding an agent a scored, threat-assessed, composability-profiled intelligence primitive with a reasoning narrative is something it can actually reason over.

Girolamo exists because the future of agentic intelligence requires structured, opinionated, longitudinal data as a first-class input. Not search results. Not dashboards. Scored intelligence that compounds.

Own your intelligence. Build with certainty.