
Another open-source PoC generator? Cute. Now let’s talk about the hard part.

Ofek Haviv

November 6, 2025

3 minutes read

Everyone’s releasing shiny open-source tools that churn out CVE PoCs like fast food. Press releases love it: “PoC in one click!” Great headline. Cute trick. But here’s the thing - that’s the easy part. Generating a proof-of-concept is noisy theater. The real work - the part that actually keeps your systems safe and your ops team sleeping at night - is far messier, far more contextual, and far more human.

Why “we made a PoC” doesn’t mean anything for your stack

A generic PoC proves an idea in a vacuum. It does not prove exploitability in your environment. Two reasons:

Every customer implements things differently. The same package, same version, same vulnerability - yet every deployment has different paths, configs, wrappers, mitigations. One-size-fits-all PoC? Mostly marketing.
Context matters. Where your app stores files, how your auth flows work, what proxies and filters are in front - none of that shows up in a raw PoC. Without context, you get noise: false alarms, wasted patch cycles, and panic.

That’s what most “PoC factories” ignore. Terra doesn’t.

How a process like this actually looks

Step 1 - The Great Vulnerability Hunt: gather everything, keep what matters

We ingest broadly and aggressively: NVD, GitHub Advisories, OSV, Exploit-DB, mailing lists, vendor advisories, community posts - all of it. Then we clean and structure:

Merge duplicates (because the internet loves giving the same bug three IDs).
Tag by product, version ranges, and ecosystem (npm, PyPI, Maven, etc.).
Attach confidence, provenance, and quick heuristics.

You end up with a canonical vulnerability catalog per customer - catch ’em all, but keep only the ones that can actually walk into your app and bite.
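To make the "merge duplicates" step concrete, here is a minimal sketch of one way to do it: treat every advisory as a bag of identifiers and merge any records that share one. The record shape (`ids`, `source`) and the placeholder identifiers are assumptions for illustration, not a description of Terra's actual pipeline.

```python
from collections import defaultdict


def merge_advisories(records):
    """Group advisory records that share at least one identifier.

    Each record is assumed to look like {"source": str, "ids": [str, ...]}.
    This shape is hypothetical, chosen only for the example.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Link every identifier in a record to that record's first identifier.
    for rec in records:
        first = rec["ids"][0]
        find(first)
        for other in rec["ids"][1:]:
            union(first, other)

    # Collect all identifiers and sources under each group's root.
    groups = defaultdict(lambda: {"ids": set(), "sources": set()})
    for rec in records:
        root = find(rec["ids"][0])
        groups[root]["ids"].update(rec["ids"])
        groups[root]["sources"].add(rec["source"])

    merged = [
        {"ids": sorted(g["ids"]), "sources": sorted(g["sources"])}
        for g in groups.values()
    ]
    return sorted(merged, key=lambda g: g["ids"])


# Placeholder identifiers, not real advisories.
records = [
    {"source": "nvd", "ids": ["CVE-0000-0001"]},
    {"source": "ghsa", "ids": ["GHSA-aaaa-0001", "CVE-0000-0001"]},
    {"source": "osv", "ids": ["PYSEC-0000-1", "GHSA-aaaa-0001"]},
    {"source": "nvd", "ids": ["CVE-0000-0002"]},
]
catalog = merge_advisories(records)
```

Three of the four records collapse into one entry because they are chained together through shared aliases, which is exactly the "same bug, three IDs" case.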

Step 2 - Hunt or build a PoC (collection + basic sanity checks, no adaptation)

When a CVE looks relevant, the first move is simple and fast: find an existing PoC or create a safe one.

Search: scour public repos, advisories, Exploit-DB, GitHub, mailing lists, and community posts for usable PoC artifacts.
Synthesize when needed: if no reliable PoC exists, construct a non-destructive test harness that simulates the vulnerable behavior in a tightly controlled way (scoped inputs, no destructive side effects).
Initial sanity vetting (AI + expert rules):
- Coherence: is the artifact intelligible and reproducible, or just a fragment of copy-paste noise?
- Technical plausibility: does the code look like it actually implements the vulnerability described, or is it superficially similar?
- Safety triage: does it try to do obviously destructive things (arbitrary system writes, credential exfiltration)? If so, flag and quarantine for human review.
- Provenance & version hinting: any metadata that suggests which versions or environments the PoC targeted.

Outcome of Step 2: a candidate PoC that is technically reasonable and safe to inspect, or a PoC that’s quarantined/marked for human review - but no environment-specific adaptation yet. We don’t map paths, endpoints, or runtime details here; that comes next.
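The safety triage part of that vetting can start as something very plain. Below is a rough sketch, assuming a PoC arrives as text and that a handful of regular expressions are a reasonable first filter. The pattern list and the `triage` function are illustrative assumptions only; a real rule set would be far larger and would not replace human review.

```python
import re

# Illustrative indicators only; these names and patterns are assumptions.
DESTRUCTIVE_PATTERNS = {
    "recursive_delete": re.compile(r"\brm\s+-rf\b|shutil\.rmtree"),
    "credential_access": re.compile(r"/etc/shadow|\.aws/credentials|id_rsa"),
    "raw_device_write": re.compile(r"\bdd\s+if=\S+\s+of=/dev/"),
}


def triage(poc_text):
    """Return a verdict plus the names of every indicator that matched."""
    reasons = [
        name
        for name, pattern in DESTRUCTIVE_PATTERNS.items()
        if pattern.search(poc_text)
    ]
    verdict = "quarantine" if reasons else "inspect"
    return {"verdict": verdict, "reasons": reasons}


benign = triage("import requests\nrequests.get(target_url)")
risky = triage("os.system('rm -rf /tmp/x')")
```

A match does not prove the artifact is malicious, and a clean result does not prove it is safe. The point is only to route anything suspicious to a person before anyone runs it.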

Step 3 - Discovery + Adaptation: find where a PoC touches your stack and turn it into a faithful test

With PoCs collected, we run a single focused flow that both discovers where a PoC could matter and immediately adapts it into a runnable, non-destructive test when appropriate.

What we do:

SBOM & dependency matching: match PoCs to the actual deployed packages - not just manifests.
Code & call-graph mapping: locate where the vulnerable API/pattern is invoked, wrapped, or sanitized in your codebase.
Runtime footprint: identify which services/processes load the component, file paths, endpoints, auth tiers and other compensating controls.
Adapt for execution: for PoCs that map to reachable targets, tweak payloads, endpoints, headers, file paths and timing to reflect your runtime, and produce a minimal, non-destructive test plan (telemetry to collect, abort/rollback conditions, isolation strategy).
Safety vet: apply sanity checks (plausibility, version match, obvious destructive actions). Quarantine anything unsafe or ambiguous for human review.

Decision outcomes:

Prepare & execute (human-approved): mapped PoCs that look plausible are adapted into a scoped execution plan and sent for approval.
Discard: PoCs that don’t map or fail safety/plausibility checks are recorded and deprioritized, with rationale.

Result: a focused set of adapted, auditable test plans - or a documented reason why a PoC doesn’t apply - all produced in a single, efficient step so we move from hypothesis to safe verification quickly.
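As a small illustration of the SBOM matching idea, the sketch below checks deployed components against affected version ranges. It assumes purely numeric dotted versions and a simplified `introduced`/`fixed` range; the field names, the package name, and the vulnerability ID are all hypothetical. Real ecosystems have messier version schemes, so treat this as a toy.

```python
def parse_version(text, width=4):
    """Turn '1.4.2' into (1, 4, 2, 0) so '2.0' and '2.0.0' compare equal."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def is_affected(installed, introduced, fixed):
    """True when introduced <= installed < fixed."""
    return parse_version(introduced) <= parse_version(installed) < parse_version(fixed)


def match_sbom(sbom, catalog):
    """Return (vuln id, package, version) for every deployed component in range."""
    hits = []
    for comp in sbom:
        for vuln in catalog:
            same_package = (comp["ecosystem"], comp["name"]) == (
                vuln["ecosystem"],
                vuln["name"],
            )
            if same_package and is_affected(
                comp["version"], vuln["introduced"], vuln["fixed"]
            ):
                hits.append((vuln["id"], comp["name"], comp["version"]))
    return hits


# Hypothetical data: what is actually deployed, not what a manifest claims.
sbom = [
    {"ecosystem": "PyPI", "name": "examplelib", "version": "1.4.2"},
    {"ecosystem": "npm", "name": "examplelib", "version": "1.4.2"},
    {"ecosystem": "PyPI", "name": "examplelib", "version": "2.0"},
]
vulns = [
    {
        "id": "VULN-EXAMPLE-1",
        "ecosystem": "PyPI",
        "name": "examplelib",
        "introduced": "1.0",
        "fixed": "2.0.0",
    }
]
hits = match_sbom(sbom, vulns)
```

Only the PyPI 1.4.2 component matches. The npm package shares a name but not an ecosystem, and 2.0 is already the fixed version. A version match like this is only the first filter; reachability and runtime context still decide whether the PoC matters.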

Step 4 - Execution plan: AI proposes. Humans approve (always)

AI drafts a precise execution plan for any test. No plan runs without human sign-off - no exceptions. The plan is auditable and includes purpose, scope, isolation strategy, safety guards, telemetry, abort conditions, and rollback steps.

Human approval is recorded in the audit trail. AI proposes, humans take responsibility.
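One way to picture the "no plan runs without sign-off" rule is as a hard gate in code. The following is a hedged sketch, not Terra's implementation: the `ExecutionPlan` class, its fields, and the approver address are assumptions chosen to mirror the plan contents listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ExecutionPlan:
    # Field names are hypothetical, modeled on the plan contents in the text.
    purpose: str
    scope: str
    isolation: str
    abort_conditions: list
    rollback_steps: list
    approved_by: Optional[str] = None
    audit_trail: list = field(default_factory=list)

    def _log(self, event):
        # Every event is timestamped so the trail can be audited later.
        self.audit_trail.append((datetime.now(timezone.utc).isoformat(), event))

    def approve(self, approver):
        self.approved_by = approver
        self._log(f"approved by {approver}")

    def execute(self, test_fn):
        # Refuse to run anything that lacks a recorded human approval.
        if self.approved_by is None:
            self._log("execution refused: no approval")
            raise PermissionError("plan has no human sign-off")
        self._log("execution started")
        result = test_fn()
        self._log("execution finished")
        return result


plan = ExecutionPlan(
    purpose="verify reachability of a vulnerable parser",
    scope="staging replica only",
    isolation="dedicated network namespace",
    abort_conditions=["error rate rises"],
    rollback_steps=["delete test upload"],
)

try:
    plan.execute(lambda: "ran")
    refused = False
except PermissionError:
    refused = True

plan.approve("reviewer@example.com")
result = plan.execute(lambda: "ran")
```

The refused attempt is logged too, so the audit trail shows both what ran and what was blocked.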

Step 5 - Testing safely (production-savvy, not reckless)

Tests (when authorized) are executed under strict constraints:

Non-destructive PoC-level checks only.
Strong isolation or scoped interaction when production touch is required (with explicit approvals).
Full logging and telemetry capture for reproducibility.
Automatic cleanup and rollback procedures.

The single question we answer: can this CVE be exploited in your real environment? If yes, we capture reproducible evidence. If not, we capture the negative evidence and rationale.
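The constraints above (telemetry capture, abort conditions, automatic cleanup) map naturally onto a guarded loop. Here is a minimal sketch under stated assumptions: `check`, `cleanup`, and `should_abort` are hypothetical callables supplied by the approved plan, and the outcome labels are invented for the example.

```python
def run_guarded(check, cleanup, should_abort, max_steps=10):
    """Run a non-destructive check step by step.

    Telemetry is collected on every step, the abort condition is evaluated
    before each step, and cleanup runs no matter how the loop ends.
    """
    telemetry = []
    outcome = "not_demonstrated"
    try:
        for step in range(max_steps):
            if should_abort(telemetry):
                outcome = "aborted"
                break
            observation = check(step)
            telemetry.append(observation)
            if observation.get("exploited"):
                outcome = "exploitable"
                break
    finally:
        # Runs on success, abort, and unexpected exceptions alike.
        cleanup()
    return outcome, telemetry


cleanups = []
outcome, telemetry = run_guarded(
    check=lambda step: {"step": step, "exploited": step == 2},
    cleanup=lambda: cleanups.append("done"),
    should_abort=lambda seen: False,
)
```

Putting cleanup in `finally` is the design choice that matters here: a crashing check still leaves the environment restored.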

Step 6 - Findings: one clear, actionable artifact

After validation we create a single finding - the canonical record you act on. Each finding contains:

Title & provenance (CVE, source).
Short finding summary: what was proven under which conditions.
Scope & conditions: versions, paths, inputs (redacted where necessary).
Evidence snapshot: sanitized logs, hashes, telemetry summaries.
Exploitability & impact assessment: concise score and why.
Tactical remediation: prioritized steps you can apply now.
Tracking metadata + audit trail: owner, severity, approvals, test logs.

If no exploitability was demonstrated, the finding documents that negative result with the same level of evidence and reasoning. No ambiguity and no sprawling reports - one focused, auditable item per validated CVE.
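For a sense of what a single finding could look like as data, here is an illustrative record that serializes to JSON. The `Finding` class and every field value are assumptions that follow the list above; they are not Terra's schema. Note that a negative result uses the same structure as a positive one.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Finding:
    # Hypothetical field names, one per item in the list above.
    title: str
    provenance: list
    summary: str
    scope: dict
    evidence: list
    exploitable: bool
    rationale: str
    remediation: list = field(default_factory=list)
    tracking: dict = field(default_factory=dict)

    def to_json(self):
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values only; no real CVE or customer data.
finding = Finding(
    title="Path traversal in examplelib upload handler",
    provenance=["CVE-0000-0001", "vendor advisory"],
    summary="Traversal payload rejected by upstream proxy filter.",
    scope={"versions": "1.4.2", "paths": ["/upload"]},
    evidence=["sanitized proxy log excerpt", "sha256 of test payload"],
    exploitable=False,
    rationale="Vulnerable code is present but unreachable through the proxy.",
    tracking={"owner": "platform-team", "approved_by": "reviewer@example.com"},
)
record = json.loads(finding.to_json())
```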

What sets Terra apart - the real edge

A lot of vendors stop at “we found a PoC.” That’s marketing, not security. Terra’s differentiators:

Discovery-first approach. We don’t try to brute-force PoCs across an unknown surface. We map where a CVE would actually interact with your app and target verification there.
Code-level adaptation. We map CVEs to your code and runtime behavior, not generic signatures.
Platform-level verification. Pentest-grade reasoning and controlled checks produce reproducible evidence - not just matches.
Noise reduction + human judgment. AI triages; humans validate; you get fewer false positives and fewer wasted patches.
Safe-by-default testing & auditability. Non-destructive tests, explicit approvals, and complete logs.

Automation finds; discovery focuses; AI reasons; humans own the execution. That combination is the competitive edge: it produces fewer alarms, faster resolution, and real, auditable evidence.

Why this approach actually scales

Automation handles ingestion and triage.
Discovery narrows the problem to what matters.
AI speeds technical reasoning and generates safe plans.
Humans provide judgment, accountability, and approvals.

Together, these four pieces move fast without being reckless. They turn “you might be vulnerable” into “here’s exactly what, where, and how - and here’s the fix.”

Closing - a small manifesto

Generating PoCs is the appetizer. The real value is discovery, validation, and proving whether a CVE is material to your system - safely and auditably. That’s part of what we’re building at Terra Security: tools that find and reason, processes that enforce safety, and humans who take responsibility.

We’ll still generate PoCs when useful. But we’re not selling demos - we’re delivering answers.

Stay skeptical of pretty demos. Demand evidence. Want to see this mapped to your stack? We’ll bring the discovery, the tests, and the receipts.
