Trust and Verify
42% of new code is now written by AI[1]. We are trusting AI to review it. Archway is what's missing — a verification layer for the AI era.
What Archway is
A compiler turns source code into something a machine can execute: source → intermediate representations → executable target. Archway runs a pipeline of the same shape — but the target is mathematical analysis, not execution.
Three side-by-side views of one Python function, sign(x): the Python source, the Python bytecode the interpreter runs, and Archway's categorical translation showing the function's compositional shape.
Python
What you write
def sign(x):
if x > 0:
return 1
elif x < 0:
return -1
else:
return 0 Python bytecode
Sequenced for execution
3 RESUME 0
4 LOAD_FAST 0 (x)
LOAD_CONST 1 (0)
COMPARE_OP 148 (bool(>))
POP_JUMP_IF_FALSE 1 (to L1)
5 RETURN_CONST 2 (1)
6 L1: LOAD_FAST 0 (x)
LOAD_CONST 1 (0)
COMPARE_OP 18 (bool(<))
⋯ Archway translation
Compositional shape
abstract(sign): () → (sign) sign(x) → ([~1: int|~2]): (;) ├─ id(x) ⊗ const(0): () → (~3: int) ├─ gt: (x, ~3: int) → (~4) ├─ distribute: (~4, x) → ([x|x]) ├─ Copairing: ([x|x]) → ([~1|~2|~5]) │ ├─ [x] return(x, 1) │ └─ [x] ⋯ └─ codiagonal: (…) → (…)
archway-labs.ai
The same function in three forms. Python is what you write. The bytecode is what Python's interpreter runs, sequenced for execution. Archway represents your code as a faithful mathematical object that exposes its compositional shape.
Built on that translation, our analysis engine pairs AI agents with formal-methods techniques, producing results you can check rather than text that sounds right.
You wouldn't trust your AI to do arithmetic without a calculator. You shouldn't trust it to review your code without Archway.
How it works
Your source code goes into the translation engine, which produces the categorical translation. The analysis engine then runs on top, combining deterministic algorithms with AI agents that supply domain context.
- The translation engine. Turning real, dynamic, messy code into a precise mathematical object that can be reasoned about locally. This is the bulk of what we've built so far, and the key differentiator: everyone else needs code written to be proved — annotated or generated. We start from code that never was.
- The analysis engine. Once code is in the translation, the techniques we layer on — type-level reasoning, abstract interpretation, deductive verification — are established mathematics from formal methods. We didn't invent them. What's new is that the structure of the translation lets them compose cleanly across modules and scale to whole codebases — a place existing tools have struggled to reach.
Much of the interesting work ahead lives there, where formal methods can finally meet codebases at scale.
Why we chose Python to start
We wanted to meet developer communities where they are, and to start on hard mode to validate the approach.
Python is dynamically typed. Modules can reach into and modify each other at runtime. What a program actually does often isn't fully determined until it runs. The existing Python tools — mypy, Pyright, Pyre, Pysa — do excellent shallow type checking and pattern-based analysis, but the deeper formal-methods stack (abstract interpretation, deductive verification) has largely sidestepped Python for exactly these reasons. Most verification research targets statically-typed languages with simpler semantics.
We went the other way on purpose. Solving Python first means statically-typed languages — like C++, which we're eyeing next — give the translation engine more to lean on at compile time.
What it enables
The translation is grounded in category theory — a branch of mathematics built for reasoning rigorously about structure and composition. The math is the means, not the point. Here's what the layer actually enables:
Better analysis.
Because the translation is mathematically precise, we can run deterministic analyses on it — catching classes of errors that probabilistic tools miss, with proofs rather than guesses.
Cheaper code review.
Deterministic analysis does the heavy lifting; AI gets called only where it genuinely adds value. As the era of subsidized AI compute ends, a layer in your engineering stack that doesn't charge by the token becomes a real advantage.
Provable software.
The long-term goal: a world where you can verify that your code does what it claims — across every framework, every integration point, and every possible input, not just the cases you happened to test. The goal: drive runtime errors toward zero — provably, not hopefully.
To make it concrete: a C++ compiler will catch you passing the wrong type into a function. Archway aims higher — letting you declare invariants like "nothing touches the database without first acquiring a lock" and checking them across every code path, before the race condition that loses your data ever ships.
The layer is meant to be invisible. No special decorators, no annotations. After lightweight setup, it's just you, your coding agent, and Archway underneath.
Where we are today
We're early, but moving fast. One concrete by-product already gives a feel for where this is heading: code visualization. The math we use has its own visual language, so once code is in the translation, rendering it is essentially free. For internal use we've built a visual codebase map (think Google Maps for your code) that layers the categorical translation with control flow and analysis. It's not what we're ultimately targeting — but it's a real demonstration of what the translation layer can offer. We're partnering with friends at Nvidia to open-source it soon.
Screenshot of Archway's codebase map — a high-level visual view of a Python codebase showing modules, dependencies, and overall structure, with a control-flow panel and analysis sidebar.
Screenshot of Archway's per-function view: a single Python function rendered as its categorical translation alongside its control-flow graph and a properties/analysis panel.
archway-labs.ai
A by-product of the layer: the same translation that powers verification also gives developers structured intermediate tooling — code mapped down to the method level, navigable, analyzable.
Next up: posting results on published static-analysis benchmarks, and extending language coverage toward TypeScript, C++, and Rust.
In the future: The same categorical translation layer has applications beyond the code that agents write — modeling and translating the workflows and behaviors of the agents themselves. As we succeed in formal verification for agentic coding, agent workflows are the natural next step.
Who's building this
We've been writing software our whole careers — coming from pure mathematics and enterprise ML/AI. Over the last two years we've watched the same thing happen everywhere: teams ship more and more AI-generated code.
We started Archway because we kept seeing the same gap from opposite directions: rigorous systems shipping without proof on one side, enterprises killing AI projects they couldn't trust on the other. The rising traction of Lean, the breakneck pace of AI-generated code, and the impact of category-theoretic approaches in languages like Rust all pointed the same direction. The pace and depth of our categorical translation of Python has been the confirming evidence: the natural next layer for computing is a provable one. Archway is bringing it in.
We're looking to collaborate
We've been gathering mathematicians, formal-methods researchers, and cracked engineers from our networks — and we're widening the door.
If you want to contribute, discuss applications of this framework, or just want to put Archway to work on a hard codebase you have, here's how.
— The Archway team
Work on formal methods, PLT, or category theory?
Joining us, partnering with us, or telling us why we're wrong — we want to talk.
Get in touchGot a gnarly Python codebase?
Send it our way. The messier the better — we want real-world code to test our translation engine against.
Submit a codebaseThinking about verification in the agent era?
We're running a small Discord for people working on formal methods, deterministic analysis, and the future of code review.
Join the DiscordWant to break our translation engine?
We're putting together a small test kit — a handful of benchmark codebases and a CLI you can run against your own. Join the waitlist for early access.
Join the waitlistThe State of Provable Software
A field guide to the verification stack — where each tool lands, and where the gaps are.
Read post 2 →