Context
Dethernety started as a side project. Several things lined up: a chance to sharpen my development, graph, security, cloud, and AI work in the same place; something I could use directly in client engagements; and a genuine product underneath if it landed. It became my full focus when I closed my last consulting engagement in mid-2025.
The existing tooling sat on the wrong foundation. Security is a graph problem — components, trust boundaries, attack paths, and controls all relate as a graph — and graph-native threat modeling did not exist. It still does not, beyond what I have built.
The status quo is the part you might recognize. Threat modeling as most organizations practice it produces diagrams that sit on shelves: models that are not executable, not versionable in any real sense, and not connected to the code they describe. Security architects draw them once and move on. Engineers never see them again. The gap between "we did threat modeling" and "our threat model reflects what we actually ship" is where most of the risk lives.
What I set out to build was a graph-native threat modeling platform that treats models as code and lives in the engineer's editor.
Mandate
A self-set one. Build Dethernety as a graph-native threat modeling platform that engineers can use day to day, not only security architects. Design it commercially from the start, with multiple plausible revenue paths in mind: SaaS in tiers, on-prem deployment, a module marketplace, supporting tools like Studio, and services around the product. Ship the SaaS first, with an open core. The open-source layer has to be genuinely useful on its own; the proprietary layer handles the infrastructure and provisioning nobody wants to solve themselves.
Role
Solo builder. Product, architecture, code, ops, documentation, every decision. No team, no cofounder, no external sponsor pushing for a specific direction. Every technical call is mine; every misjudgement is mine to recover from.
The shape of solo building is not what it looks like from the outside. A material share of the work is deciding what not to do. Solo time is the most finite resource on the project, and there is always more work visible than can fit into it.
Approach
Platform components
Backend. The backend is a NestJS service exposing a GraphQL API with queries, mutations, and subscriptions. The domain model is a graph, so the storage is a graph: Neo4j or Memgraph holds the live model, with components, trust boundaries, data flows, attack paths, and countermeasures as first-class graph entities. GraphQL query definitions are shared across the platform's consumers: the web UI, the CLI, the Claude Code plugin, and the MCP server all pull from one source of truth.
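To make the shape concrete, here is a minimal sketch, not the platform's actual schema: a code-first NestJS GraphQL type over the kind of Cypher traversal a graph store makes natural. Every entity name, field name, and relationship type below is invented for the example.

```typescript
// Illustrative only: a code-first NestJS GraphQL type plus the kind of Cypher
// query a graph-backed resolver can run. Entity names, field names, and the
// relationship types are invented for the example.
import { Args, Field, ID, ObjectType, Query, Resolver } from '@nestjs/graphql';
import neo4j from 'neo4j-driver';

@ObjectType()
class Component {
  @Field(() => ID) id!: string;
  @Field() name!: string;
}

const driver = neo4j.driver(
  'bolt://localhost:7687',
  neo4j.auth.basic('neo4j', 'password'),
);

@Resolver(() => Component)
class ComponentResolver {
  // "What can reach this component across a trust boundary?" is a
  // variable-length traversal, the query shape relational stores
  // struggle to express directly.
  @Query(() => [Component])
  async attackSurfaceOf(@Args('id') id: string): Promise<Component[]> {
    const session = driver.session();
    try {
      const res = await session.run(
        `MATCH p = (src:Component)-[:FLOWS_TO*1..5]->(dst:Component {id: $id})
         WHERE any(r IN relationships(p) WHERE r.crossesBoundary)
         RETURN DISTINCT src`,
        { id },
      );
      return res.records.map((r) => r.get('src').properties as Component);
    } finally {
      await session.close();
    }
  }
}
```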
Module ecosystem. A module provides the classes of the system: design classes for the things being modelled, analysis classes for the lenses applied to them, and issue classes for the issues raised and their integration with external trackers like GitHub or Jira. Modules are JavaScript libraries, so the integration surface is extensible: a new tracker means a new module, not a platform change. The MITRE ATT&CK and D3FEND frameworks are loaded as graph data; how a model's components link to specific techniques and countermeasures is decided by the module's logic on the relevant attributes, not by a platform default.
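A hypothetical sketch of that export surface, with every name invented; the real module API differs, but the three class kinds living in one JavaScript package is the point.

```typescript
// Hypothetical shape of a module's export surface; every name here is
// invented. The point is the three class kinds living in one JavaScript
// package, so extending the platform never means forking it.
import type { JSONSchema7 } from 'json-schema';

interface Finding {
  title: string;
  severity: 'low' | 'medium' | 'high';
  mitreTechniques?: string[]; // linked by module logic, e.g. ['T1190']
}

interface DesignClass {
  id: string;          // e.g. 'message-queue'
  schema: JSONSchema7; // attributes; the web UI renders this via JSONForms
}

interface AnalysisClass {
  id: string;
  appliesTo: string[]; // which design classes this lens covers
  evaluate(component: Record<string, unknown>): Promise<Finding[]>;
}

interface IssueClass {
  id: string;
  // Publishing to an external tracker is module logic: a new tracker
  // means a new module, not a platform change.
  publish(finding: Finding): Promise<{ url: string }>;
}

export interface ThreatModelingModule {
  name: string;
  designClasses: DesignClass[];
  analysisClasses: AnalysisClass[];
  issueClasses: IssueClass[];
}
```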
Analysis runs at two levels. Component-level analyses evaluate one element at a time, and the engine is swappable per module: a generic module can use OPA/Rego, another can use static graph queries, others can do something different again. Model-level analyses operate across the whole graph, and that is where the graph-native shape matters most; an integrated LangGraph service is one of the paths a module can take for AI-assisted analyses. The first-party modules cover the core domain and the MITRE frameworks; custom rules ship as new modules, not platform forks.
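Sketched in types, with the interfaces and the OPA-over-HTTP call as assumptions for the example; a module could equally embed a WASM-compiled policy or run static graph queries.

```typescript
// Sketch of the two analysis levels. The interfaces and the OPA-over-HTTP
// call are assumptions for the example; a module could equally embed a
// WASM-compiled policy or run static graph queries.
interface Finding { title: string; severity: string; }

// Component-level: one element at a time, engine swappable per module.
interface ComponentAnalysisEngine {
  evaluate(component: Record<string, unknown>): Promise<Finding[]>;
}

// One possible engine: delegate to an OPA server via its standard Data API
// (POST /v1/data/<policy path> with an "input" document).
class OpaEngine implements ComponentAnalysisEngine {
  constructor(private opaUrl: string, private policyPath: string) {}

  async evaluate(component: Record<string, unknown>): Promise<Finding[]> {
    const res = await fetch(`${this.opaUrl}/v1/data/${this.policyPath}`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ input: component }),
    });
    const { result } = await res.json();
    return (result?.findings ?? []) as Finding[];
  }
}

// Model-level: the whole graph at once, which is where graph-native pays
// off; an AI-assisted module might hand a subgraph to a LangGraph service
// here instead of a rules engine.
interface ModelAnalysis {
  run(modelGraph: unknown): Promise<Finding[]>;
}
```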
Web frontend. The web UI is a Vue 3 single-page application built around an interactive diagram, and the module system extends into it. Property panels are generated from module-defined JSON Schemas via JSONForms, so when a module ships new classes the form UI for those classes appears without a frontend release. Modules can also register custom Vue components at runtime, with the host application exposing the Vue runtime and composables so modules do not bundle their own. Vue Flow drives the data-flow editor, with hierarchical trust boundaries and direct assignment of MITRE techniques on diagram elements. Authentication is OIDC with PKCE against the usual identity providers, Cognito and Keycloak among them.
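A hypothetical registration surface for that, with the function names invented; what matters is that a JSON Schema alone yields a working property panel, and a custom Vue component is an opt-in.

```typescript
// Hypothetical UI registration surface; the function names are invented.
// What matters: a JSON Schema alone yields a working property panel, and
// a custom Vue component is an opt-in, not a frontend release.
import type { JsonSchema } from '@jsonforms/core';
import type { Component as VueComponent } from 'vue';

interface ModuleUiRegistry {
  // Default path: the host renders the schema with JSONForms.
  registerClassSchema(classId: string, schema: JsonSchema): void;
  // Escape hatch: the host exposes the Vue runtime and composables, so
  // module components do not bundle their own copy of Vue.
  registerPanelComponent(classId: string, component: VueComponent): void;
}

declare const registry: ModuleUiRegistry;

registry.registerClassSchema('message-queue', {
  type: 'object',
  properties: {
    encryptionAtRest: { type: 'boolean' },
    retentionDays: { type: 'integer', minimum: 0 },
  },
});
```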
Dethereal plugin. The platform's second frontend is a Claude Code plugin for building threat models, sitting in the engineer's editor. It replaces the blank-page problem of traditional threat modeling tools with a fixed eleven-step staged-delegation workflow, where each step is a specialist agent proposing changes the user adjudicates before anything persists. The assumption underneath: novice modellers cannot articulate what a threat model needs up front, but they can recognize good answers. A staged workflow with agent proposals moves the work from articulation to recognition, and that shift is the innovation. Models persist as disk files, resumable across sessions and committable to git. I wrote the plugin design up separately in Eleven Steps You Don't Type.
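The adjudication loop is simple to sketch. Everything below is invented for illustration (the real plugin is built from Claude Code agents and MCP tools, not a TypeScript interface); the shape is what matters: the agent drafts, the user rules on the draft, and nothing persists without an accept.

```typescript
// Invented sketch of the staged-delegation loop; not the plugin's code.
interface Proposal { summary: string; changes: unknown[]; }

interface SpecialistAgent {
  propose(modelSoFar: unknown): Promise<Proposal>;
}

type Verdict = 'accept' | 'revise' | 'skip';

async function runWorkflow(
  steps: { name: string; agent: SpecialistAgent }[], // fixed, eleven of them
  adjudicate: (p: Proposal) => Promise<Verdict>,
  persist: (changes: unknown[]) => Promise<void>,    // disk file, git-committable
) {
  const model: Record<string, unknown> = {};
  for (const step of steps) {
    let verdict: Verdict;
    let proposal: Proposal;
    do {
      proposal = await step.agent.propose(model); // agent drafts, never writes
      verdict = await adjudicate(proposal);       // recognition, not articulation
    } while (verdict === 'revise');
    if (verdict === 'accept') {
      await persist(proposal.changes);            // only now does state change
    }
  }
}
```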
Studio. Authoring modules has its own surface. Studio is a standalone application for designing, testing, and packaging modules: AI-assisted class generation through LangGraph pipelines, a form editor with live preview that renders classes the way end users will see them, Rego authoring with sample-input validation, and module packaging for deployment. Dethereal builds threat models out of existing modules; Studio builds the modules those threat models use.
Deployment
Multi-tenant SaaS on AWS, designed for compromise. The SaaS side is built on the assumption that any multi-tenant system will eventually be partially compromised, and that the right question is what a compromise can reach. The answer in Dethernety is: not much. Each customer gets their own network segment, their own identity pool, their own compute (single-instance on the entry tier, K3s on the higher tiers), their own CloudFront distribution over a VPC Origin, and their own IAM role scoped by hardcoded resource ARNs. Terraform state is per-customer. There is no shared runtime data plane between tenants. The entry tier runs on Fedora CoreOS with an immutable read-only root, so a compromised node cannot persist changes that survive a reboot; the higher tiers move to K3s with the same isolation posture. I wrote the architecture up across a five-part series, starting with Architecture Overview.
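The IAM leg of that is worth making concrete. Here is the "hardcoded resource ARNs" idea as a plain policy document, with the tenant name, bucket, and actions invented; the real roles are provisioned per customer by Terraform.

```typescript
// The "hardcoded resource ARNs" idea as a plain IAM policy document.
// Tenant name, bucket, and actions are invented; the real roles are
// provisioned per customer by Terraform. IAM default-denies everything
// not listed, so a stolen credential reaches this tenant's bucket and
// nothing adjacent.
const tenantRolePolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Action: ['s3:GetObject', 's3:PutObject'],
      // One concrete tenant ARN, never a wildcard across tenants.
      Resource: ['arn:aws:s3:::dethernety-tenant-acme/*'],
    },
  ],
} as const;
```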
Development methodology
AI-native, spec-first, agent-reviewed. The methodology shifted partway through. Dethernety started as a normal development project: specs as prose, implementation as a series of commits, tests written against features. As the generation of tooling around Claude Code matured, I moved the project to a spec-driven, AI-native workflow that now carries most of the platform's development.
The architecture stays mine. AI generates implementation; I own the system shape, the data model, the API surface, the analysis subsystem boundaries. Code review depends on the surface: the backend gets read line by line; the web frontend and Studio ride the workflow more directly, with review at the gate rather than at every line.
The workflow has five phases, with an explicit human-in-the-loop at each:
- Intent by exploration: I describe what I want to build, and a specialist agent drafts a spec by exploring the existing code, asking clarifying questions, and proposing the shape.
- Multi-agent review: the spec is reviewed by a set of agents with distinct specialties — security, architecture, graph theory, operations — each producing findings in its own voice rather than a merged consensus.
- Sprint plan: once the spec clears blocking issues, it becomes a plan with user stories, definitions of done, references to the relevant code and docs, and test and evaluation strategies per story.
- AI-driven implementation: the plan is executed with specialist agents where the work calls for it.
- Comprehensive testing: unit, integration, and evaluation suites, with the eval layer specifically for agent-mediated work where traditional assertions fall short.
Every phase gates on my judgement before the next one starts. The goal is to put the human where adjudication and direction actually matter, not where the human is a bottleneck on typing. All of this is encoded in the project's .claude/ configuration and .github/ workflows, with slash commands gating PRs on boundary checks, security review, and documentation-staleness detection before anything ships.
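Of those gates, documentation-staleness is the easiest to make concrete. A hedged sketch of what such a check can reduce to, assuming each doc declares the source paths it covers; the project's real checks live in .claude/ and .github/ and are not this script.

```typescript
// Invented sketch of a documentation-staleness gate: fail when any covered
// source path changed more recently than the doc that describes it.
import { execSync } from 'node:child_process';

const lastCommitEpoch = (path: string): number =>
  Number(execSync(`git log -1 --format=%ct -- ${path}`).toString().trim() || 0);

interface DocCoverage { doc: string; covers: string[]; }

export function staleDocs(coverage: DocCoverage[]): string[] {
  return coverage
    .filter(({ doc, covers }) => {
      const docTime = lastCommitEpoch(doc);
      return covers.some((src) => lastCommitEpoch(src) > docTime);
    })
    .map(({ doc }) => doc);
}

// e.g. block the PR when the module API changed but its doc did not:
// staleDocs([{ doc: 'docs/modules.md', covers: ['packages/module-api/src'] }])
```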
Where it stands
- Graph-native threat modeling platform with a multi-tier SaaS deployment on AWS, per-customer infrastructure isolation, and an open-core split (the OSS monorepo sits as a subtree of the private monorepo).
- Dethereal Claude Code plugin: eleven-step staged-delegation workflow, four specialist agents, a set of MCP tools, with permissions enforced at the tool layer rather than in prompts.
- Module system covering the core threat modeling domain and the MITRE ATT&CK / D3FEND frameworks, with OPA/Rego policy evaluation and an extensibility boundary that avoids platform forks.
- AI-native development toolchain: specialist agents, slash commands, and workflow gates that operationalize the spec-first, multi-agent-reviewed, sprint-planned, agent-executed methodology across the monorepo.
- Six published essays on the underlying architecture and plugin design, with more in progress.
What made it hard
Solo breadth is the first constraint. Threat modeling, graph databases, SaaS infrastructure, immutable compute, AI-native tooling, and Claude Code plugin design are six different disciplines, each with depth I had to either reach into myself or delegate to a specialist agent. The scope of the work is not a decision I get to revisit. It is the shape of the product.
The methodology pivot was expensive. Moving an in-flight project onto a spec-driven AI-native workflow is not a matter of configuring tools. It changes what "done" means, what a review looks like, and where the cost of a bad decision shows up. I lost time before I gained it. The gain came later and is now structural, but the transition was a cost I paid over several months with eyes open.
Positioning is harder than the technology. A graph-native, AI-native, shift-left threat modeling platform is easy to describe technically and harder to place in a market used to document-first threat modeling tools on one side and chat-first AI copilots on the other. The product is neither of those, and naming that clearly without sounding like yet another "we reinvented threat modeling" pitch is a genuine writing problem.
Solo pacing is its own discipline. Nobody else is going to notice that test coverage drifted, that a module interface is generating more coupling than it should, or that a dependency upgrade has sat on a branch for a week. The internal review function has to be real. The specialist agents help, and catch things a solo builder would miss, but the ultimate review is mine and I have to budget for it explicitly.
What I took from it
Three things stuck.
One: AI-native development, run with discipline, changes what you can build alone. Specialist agents do the bulk of the implementation; architecture stays human, and so does review on the parts that warrant it. The multi-year, multi-team work I have scoped for clients in the past is a different shape under that combination. The work is not easier. The ceiling of what one person can carry end to end has shifted, and I am still calibrating where the new one sits.
Two: staged delegation beats free-form prompting in any domain where users cannot articulate what they want. The novice threat modeler does not know what a good threat model contains, and no amount of open prompting fixes that. A fixed workflow with specialist proposals at each step meets the user where they actually are. That pattern generalizes past threat modeling, and I am watching for the other domains it applies to.
Three: treating infrastructure isolation as a design principle, not a configuration task, produces a posture you cannot retrofit. Designing Dethernety from the first line for per-customer isolation was more work up front than a shared-everything SaaS would have been, and it is now the part of the architecture I have to defend the least. The right default, chosen early, pays back every month.
And the residue. Building solo with AI-native methods changed how I think about what consulting can deliver. A design I wrote as a consultant assumed the team on the other side could carry it. A system I build as Dethernety carries itself, with me doing the adjudication a team would otherwise do collectively. Those are not the same craft, and knowing where they converge is a live question I expect to be answering for a while.
Sources:
- dether.net — project site
- dethernety-oss on GitHub
- Architecture Overview — entry point for a five-part series on the AWS infrastructure (the four follow-up essays are linked at the end of the overview)
- Eleven Steps You Don't Type