Where chaos
becomes craft.

Transmuting code into confidence — a structured pipeline that orchestrates AI agents across any provider, scores every output, and heals itself when things go sideways.

pipeline.yaml

phases:
  - spec          # define what to build
  - behavioral    # write contracts
  - adversary     # challenge them
  - tests         # prove it works before it exists
  - implement     # write the code
  - review        # score it
  - test          # verify it

transitions:
  review.fail: implement  # loop back, not forward

THE PROBLEM

AI is probabilistic.
Your pipeline shouldn't be.

AI agents are coin flips wrapped in confidence. Every run produces different output. Every output needs verification. Most teams either trust blindly or review manually. Neither scales.

The industry built orchestrators for routing — no one ships trust as the default.

THE VISION

From probabilistic
to verifiable.

Orchemist is not another orchestrator. It is a trust engine.

Specs before code

Define what success looks like before writing a single line.

Adversarial review

Challenge assumptions before you build on them.

Tests before implementation

Prove it works before it exists.

Scoring and routing

Quantify confidence. Gate on thresholds. Route on results.

Bounded self-healing

Retry with feedback, not blindly. Surface blockers, don't hide them.
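
As a rough illustration of how threshold gating and bounded self-healing can fit together, here is a minimal sketch in Python. It is not Orchemist's actual API: `run_with_healing`, `Review`, `Outcome`, the `0.8` threshold, and the three-attempt bound are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Review:
    score: float   # confidence in [0.0, 1.0]
    feedback: str  # what the reviewer wants changed


@dataclass
class Outcome:
    status: str            # "passed" or "blocked"
    output: str            # last output produced
    attempts: int          # how many attempts were used
    history: List[Review]  # every review, kept for the audit trail


def run_with_healing(
    implement: Callable[[Optional[str]], str],
    review: Callable[[str], Review],
    threshold: float = 0.8,  # hypothetical gate value
    max_attempts: int = 3,   # hypothetical retry bound
) -> Outcome:
    """Retry implement until review clears the threshold or attempts run out."""
    feedback: Optional[str] = None
    history: List[Review] = []
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = implement(feedback)   # feedback from the previous review, if any
        result = review(output)
        history.append(result)
        if result.score >= threshold:  # gate on the threshold
            return Outcome("passed", output, attempt, history)
        feedback = result.feedback     # retry with feedback, not blindly
    # Bound reached: surface the blocker instead of looping forever.
    return Outcome("blocked", output, max_attempts, history)


# Stub agents for demonstration only.
def fake_implement(feedback: Optional[str]) -> str:
    return "v2" if feedback else "v1"


def fake_review(output: str) -> Review:
    if output == "v2":
        return Review(0.9, "")
    return Review(0.4, "handle empty input")


outcome = run_with_healing(fake_implement, fake_review)
print(outcome.status, outcome.attempts)  # passed 2
```

The retry count is capped and the full review history is returned, so a run that never clears the gate ends as an explicit `blocked` result rather than a silent pass.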

ARCHITECTURE

Any workload. Any sequence.
One engine.

A pipeline is a YAML state machine. You define phases, transitions, and quality gates. The engine follows the graph. Swap phases, change models, add gates — without touching code.
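
To make the state-machine idea concrete, the sketch below walks the phases and the `review.fail: implement` transition from the pipeline.yaml excerpt using plain Python data. It is an assumption-laden illustration, not the engine's implementation: `run_pipeline`, the `"pass"`/`"fail"` result strings, the fall-through-to-next-phase default, and the `max_steps` guard are all hypothetical.

```python
from typing import Callable, Dict, List

# Phase order and transition taken from the pipeline.yaml excerpt.
PHASES = ["spec", "behavioral", "adversary", "tests", "implement", "review", "test"]
TRANSITIONS = {"review.fail": "implement"}


def run_pipeline(
    handlers: Dict[str, Callable[[], str]],
    phases: List[str] = PHASES,
    transitions: Dict[str, str] = TRANSITIONS,
    max_steps: int = 50,  # hypothetical guard against endless loops
) -> List[str]:
    """Walk the phase graph and return the phases visited, in order."""
    visited: List[str] = []
    current = phases[0]
    for _ in range(max_steps):
        visited.append(current)
        result = handlers[current]()  # each handler returns "pass" or "fail"
        key = f"{current}.{result}"
        if key in transitions:        # an explicit edge wins
            current = transitions[key]
            continue
        if result == "fail":
            raise RuntimeError(f"{current} failed with no transition defined")
        index = phases.index(current)
        if index + 1 == len(phases):  # last phase passed: done
            return visited
        current = phases[index + 1]   # default edge: next phase in order
    raise RuntimeError("step budget exhausted")


# Demo: review fails once, then passes on the second visit.
review_results = iter(["fail", "pass"])
handlers = {name: (lambda: "pass") for name in PHASES}
handlers["review"] = lambda: next(review_results)

visited = run_pipeline(handlers)
print(visited)
```

Because the graph lives in data rather than code, reordering `PHASES` or adding an entry to `TRANSITIONS` changes the route without touching `run_pipeline`.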

SHOWCASES

Pipelines for real work.

CODING

Issue to merged PR in 11 phases

Spec, adversary, acceptance tests, implement, review, score. Every phase gated. Every output graded.

11 phases · cross-model adversary

CONTENT

Research, write, fact-check, red-team

Built-in anti-hallucination. Every claim gets a source. Red-team review before publish.

0 claims without an attached source

RESEARCH

Web search, market scan, synthesis

Competitive analysis, structured intelligence gathering. From raw data to actionable insight.

structured intelligence output

EDITORIAL

Rewrite, review, consistency, polish

Sequential rewrite with flow review and consistency checks. Voice preservation across revisions.

4 passes per piece, minimum

THE MARKET

They orchestrate agents.
We orchestrate trust.

Seven frameworks compete on how elegantly they route agents. None competes on whether you can trust what those agents produced. LangGraph, CrewAI, Pydantic AI, Google ADK, and the rest all solve communication. Orchemist solves verification.

WHAT THEY DO           WHAT WE ADD
Agent routing          Behavioral acceptance tests
Prompt chaining        Confidence scoring with routing
Multi-model support    Adversarial spec review
Workflow graphs        Bounded self-healing
-                      Full audit trail

TRACTION

Numbers, not promises.

7,000+ tests passing

Pipeline phases

Execution modes

MIT licensed

GET STARTED

Two ways to run.
Pick your on-ramp.

The Skills Pack rides on top of Claude Code — pure markdown, no Python. The Engine gives you the web UI, queue, daemon, and history dashboards. Same 11-phase pipeline underneath.

Path A · Skills Pack ~1 min · no Python

terminal · Claude Code already installed

$ git clone https://github.com/ToscanAI/orchemist-skills.git
$ cd orchemist-skills && ./install.sh
$ claude
> /orchemist:run examples/example-issue.md

Path B · Engine ~5 min · web UI included

terminal

$ pip install orchemist
$ orch new my-pipeline
$ orch run my-pipeline.yaml --mode openrouter

Skills Pack on GitHub · Engine on GitHub
