Developers — REST API, SDKs, OpenAI & Anthropic Compatible

Quickstart

Five minutes from API key to first call.

Get an API key

For local dev, use vorana-dev-key against the gateway on localhost:5057. Production keys are minted from the admin portal per tenant.

Install the SDK (optional)

The .NET SDK adds typed requests, options, and Entra OIDC auth. Or skip it — the API is plain HTTP and works from any language.

Make your first call

Pick a pipeline, send your inputs, get a response with confidence, citations, and a run_id you can audit later.

Examples, in order: cURL, .NET, TypeScript, Python.

POST /v1/generate

# Local dev — gateway runs on :5057, default key is `vorana-dev-key`
curl -X POST http://localhost:5057/v1/generate \
  -H "X-Api-Key: vorana-dev-key" \
  -H "Content-Type: application/json" \
  -d '{
    "pipeline_id": "pipeline.faq_assistant.v1",
    "inputs":      { "user_query": "What is our refund policy?" },
    "options":     { "mode": "balanced" }
  }'

using Vorana.Client;
using Vorana.Shared;

// In Program.cs / Startup
services.AddVoranaClient(opts =>
{
    opts.BaseAddress       = new Uri("http://localhost:5057");
    opts.ApiKey            = "vorana-dev-key";
    opts.DefaultPipelineId = "pipeline.faq_assistant.v1";
});

// Anywhere IVoranaClient is injected
var resp = await client.GenerateAsync(new GenerateRequest
{
    Inputs = new() { ["user_query"] = "What is our refund policy?" }
});

Console.WriteLine(resp.Output["content"]);
Console.WriteLine($"run_id={resp.RunId}, confidence={resp.Confidence}");

const res = await fetch('https://gateway.vorana.ai/v1/generate', {
  method:  'POST',
  headers: {
    'X-Api-Key':    process.env.VORANA_API_KEY!,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    pipeline_id: 'pipeline.faq_assistant.v1',
    inputs:      { user_query: "What is our refund policy?" },
  }),
});

const data = await res.json();
console.log(data.output.content, data.confidence, data.run_id);

import httpx, os

resp = httpx.post(
    "https://gateway.vorana.ai/v1/generate",
    headers={"X-Api-Key": os.environ["VORANA_API_KEY"]},
    json={
        "pipeline_id": "pipeline.faq_assistant.v1",
        "inputs":      {"user_query": "What is our refund policy?"},
    },
).json()

print(resp["output"]["content"], resp["confidence"], resp["run_id"])

Authentication

API key for local. Entra OIDC for production.

API key

Send X-Api-Key: <key> on every request. The key maps to a tenant_id + app_id. Optionally override the tenant with X-Tenant-Id.

headers

X-Api-Key:    vorana-dev-key
X-Tenant-Id:  tenant_acme   // optional
Content-Type: application/json
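For callers that are not on the .NET SDK, the headers above can be assembled with a few lines of Python. This is a minimal sketch: `build_headers` is a hypothetical helper name used only for illustration, not part of any Vorana SDK.

```python
from typing import Optional


def build_headers(api_key: str, tenant_id: Optional[str] = None) -> dict:
    """Return request headers for API-key auth.

    Illustrative helper only; the name is an assumption, not a Vorana API.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    headers = {
        "X-Api-Key": api_key,
        "Content-Type": "application/json",
    }
    # X-Tenant-Id is optional: send it only to override the tenant the key maps to.
    if tenant_id is not None:
        headers["X-Tenant-Id"] = tenant_id
    return headers


print(build_headers("vorana-dev-key", tenant_id="tenant_acme"))
```

Leaving `tenant_id` unset omits the override header, so the gateway falls back to the tenant mapped to the key.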

Entra OIDC (production)

When Auth:Entra:Authority is set on the gateway, send a bearer token instead. The .NET SDK fetches and refreshes tokens automatically via TokenCredential.

.NET

services.AddVoranaClient(opts =>
{
    opts.BaseAddress = new Uri("https://gateway.vorana.ai");
    opts.Credential  = new DefaultAzureCredential();
    opts.Scope       = "api://vorana/.default";
});
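Outside .NET, a caller has to handle token refresh itself. The sketch below shows one way to cache a bearer token and refetch it shortly before expiry. `BearerTokenCache` and the fetcher signature are assumptions made for this example; only the `Authorization: Bearer <token>` header form is standard.

```python
import time
from typing import Callable, Optional, Tuple

# A fetcher returns (token, expires_on), where expires_on is a Unix timestamp.
TokenFetcher = Callable[[], Tuple[str, float]]


class BearerTokenCache:
    """Caches a bearer token and refetches it shortly before it expires."""

    def __init__(self, fetch: TokenFetcher, skew_seconds: float = 60.0):
        self._fetch = fetch
        self._skew = skew_seconds
        self._token: Optional[str] = None
        self._expires_on = 0.0

    def headers(self, now: Optional[float] = None) -> dict:
        """Return auth headers, refreshing the token if it is missing or near expiry."""
        now = time.time() if now is None else now
        if self._token is None or now >= self._expires_on - self._skew:
            self._token, self._expires_on = self._fetch()
        return {
            "Authorization": f"Bearer {self._token}",
            "Content-Type": "application/json",
        }
```

With the `azure-identity` package, the fetcher could wrap `DefaultAzureCredential().get_token("api://vorana/.default")`, whose result exposes `token` and `expires_on`. The scope string is taken from the .NET example above.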

Generate

One endpoint. Everything Vorana does, in one shot.

POST /v1/generate takes a pipeline_id and your inputs. Vorana runs validation, retrieval, generation, scoring, and policy — then returns the answer along with the decisions, citations, metrics, and a run_id you can replay.

Request, then response:

POST /v1/generate

{
  "pipeline_id": "pipeline.faq_assistant.v1",
  "policy_id":   null,
  "inputs": {
    "user_query": "What's our refund window for purchases over $500?"
  },
  "options": {
    "mode":      "balanced",    // fast | balanced | strict
    "max_tokens": 800,
    "trace":     true
  }
}

{
  "run_id":    "r_8f3a2c…",
  "trace_id":  "01J5HQ…",
  "status":    "ok",                          // ok | low_confidence | denied | failed
  "output": {
    "content":    "Refunds within 30 days, full…",
    "confidence": 0.92
  },
  "confidence": 0.92,
  "decisions": [
    { "step": "validation.input", "rule_or_policy": "schema",    "result": "pass" },
    { "step": "llm.consensus",    "rule_or_policy": ">= 0.85",   "result": "pass", "reason": "0.92" },
    { "step": "scoring.composite","rule_or_policy": "weighted",  "result": "pass" }
  ],
  "citations": [
    { "source": "policy/refunds.pdf", "section": "§3.2" }
  ],
  "metrics": {
    "latency_ms":        1500,
    "prompt_tokens":     280,
    "completion_tokens": 120,
    "cost_usd":          0.0142
  },
  "obligations": []
}
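Because `status` can be `ok`, `low_confidence`, `denied`, or `failed`, callers will usually branch on it before showing the output. The sketch below is one possible client-side handler. It assumes that low-confidence runs still carry an `output`, and that any decision whose `result` is not `"pass"` explains a denied or failed run; neither is confirmed by the response shown above.

```python
def handle_response(resp: dict) -> str:
    """Map a /v1/generate response to a user-facing string, by status."""
    status = resp.get("status")
    run_id = resp.get("run_id", "unknown")

    if status == "ok":
        return resp["output"]["content"]

    if status == "low_confidence":
        # Assumption: a low-confidence run still includes an answer in `output`.
        content = resp.get("output", {}).get("content", "")
        return f"(unverified, confidence {resp.get('confidence')}) {content}"

    if status in ("denied", "failed"):
        # Assumption: non-"pass" decisions identify the step that blocked the run.
        blocking = [
            d["step"] for d in resp.get("decisions", []) if d.get("result") != "pass"
        ]
        return f"Request {status}; run_id={run_id}; steps={blocking}"

    raise ValueError(f"unexpected status: {status!r}")
```

Keeping the `run_id` in every non-ok message makes it easy to look the run up later in the audit endpoints.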

Streaming

Tokens as they arrive — with a metadata chunk at the end.

POST /v1/generate:stream emits server-sent events. Each chunk is a JSON object: token deltas first, then a final done chunk with run_id, decisions, citations, and confidence.

Examples, in order: cURL, Browser (EventSource), Python (httpx).

POST /v1/generate:stream

curl -N -X POST http://localhost:5057/v1/generate:stream \
  -H "X-Api-Key: vorana-dev-key" \
  -H "Content-Type: application/json" \
  -d '{ "pipeline_id": "pipeline.faq_assistant.v1",
        "inputs": { "user_query": "Refund window over $500?" } }'

# Output (server-sent events)
data: {"event":"token","content":"Refunds "}
data: {"event":"token","content":"within "}
data: {"event":"token","content":"30 days…"}
data: {"event":"done","run_id":"r_8f3a2c…","confidence":0.92}

// Browser SSE — Vorana also accepts query-string auth for EventSource
const es = new EventSource(
  '/v1/generate:stream?api_key=vorana-dev-key' +
  '&pipeline_id=pipeline.faq_assistant.v1' +
  '&inputs.user_query=Refund%20window%3F'
);

es.onmessage = (e) => {
  const chunk = JSON.parse(e.data);
  if (chunk.event === 'token')  appendToUi(chunk.content);
  if (chunk.event === 'done')   { saveRunId(chunk.run_id); es.close(); }
};

import httpx, json, os

with httpx.stream(
    "POST", "https://gateway.vorana.ai/v1/generate:stream",
    headers={"X-Api-Key": os.environ["VORANA_API_KEY"]},
    json={"pipeline_id": "pipeline.faq_assistant.v1",
          "inputs":      {"user_query": "Refund window?"}},
) as r:
    for line in r.iter_lines():
        if not line.startswith("data:"): continue
        chunk = json.loads(line[5:])
        if chunk["event"] == "token":
            print(chunk["content"], end="", flush=True)
        elif chunk["event"] == "done":
            print(f"\nrun_id={chunk['run_id']}")
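The parsing step in the Python example above can be pulled into a small function that works on any iterable of SSE lines, which makes it easy to unit-test without a live gateway. `collect_stream` is a hypothetical name; the `token` and `done` event shapes are the ones shown in the cURL output.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join token deltas into one string and return the final `done` chunk.

    Returns (text, None) if the stream ended before a `done` chunk arrived.
    """
    parts = []
    done = None
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        chunk = json.loads(line[len("data:"):])
        if chunk.get("event") == "token":
            parts.append(chunk["content"])
        elif chunk.get("event") == "done":
            done = chunk
            break  # metadata chunk is last; nothing follows it
    return "".join(parts), done


sample = [
    'data: {"event":"token","content":"Refunds "}',
    "",
    'data: {"event":"token","content":"within "}',
    'data: {"event":"done","run_id":"r_example","confidence":0.92}',
]
text, meta = collect_stream(sample)
print(text, meta["run_id"] if meta else None)
```

With httpx, the same function can be fed `r.iter_lines()` directly. A `None` second element signals a truncated stream, in which case there is no `run_id` to store.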

OpenAI-compat

Already on OpenAI? Change one line.

POST /v1/chat/completions matches OpenAI's wire format. Set the model to vorana:<pipeline_id> (or send X-Vorana-Pipeline), point your base_url at the gateway, and you're done. Streaming is already wired — SSE chunks come back in OpenAI's shape, plus a final vorana.completion.metadata chunk with the trust-layer fields.

Examples, in order: Python (openai SDK), TypeScript (openai SDK), cURL.

POST /v1/chat/completions

import os
from openai import OpenAI

client = OpenAI(
    base_url = "https://gateway.vorana.ai/v1",   # <-- point at the gateway
    api_key  = os.environ["VORANA_API_KEY"],
)

resp = client.chat.completions.create(
    model    = "vorana:pipeline.faq_assistant.v1",  # <-- pick a pipeline
    messages = [{"role": "user", "content": "What's our refund policy?"}],
    stream   = True,
)

for chunk in resp:
    # The final metadata chunk may carry no choices, so guard before indexing
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://gateway.vorana.ai/v1',
  apiKey:  process.env.VORANA_API_KEY!,
});

const stream = await client.chat.completions.create({
  model:    'vorana:pipeline.faq_assistant.v1',
  messages: [{ role: 'user', content: "What's our refund policy?" }],
  stream:   true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}

curl -N https://gateway.vorana.ai/v1/chat/completions \
  -H "Authorization: Bearer $VORANA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vorana:pipeline.faq_assistant.v1",
    "messages": [{ "role": "user", "content": "Refund policy?" }],
    "stream": true
  }'
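To pick up the trust-layer fields from an OpenAI-shaped stream, a client has to separate the ordinary content deltas from the final metadata chunk. The sketch below works on already-parsed chunk dictionaries. It assumes the metadata chunk is identified by an `object` field equal to `vorana.completion.metadata`; the exact wire shape is not specified in this section, so treat that check as a placeholder.

```python
from typing import Iterable, Optional, Tuple

# Assumption: the metadata chunk carries this value in its `object` field.
METADATA_OBJECT = "vorana.completion.metadata"


def split_openai_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[dict]]:
    """Return (joined content deltas, metadata chunk or None)."""
    text = []
    metadata = None
    for chunk in chunks:
        if chunk.get("object") == METADATA_OBJECT:
            metadata = chunk
            continue
        # Standard OpenAI chunks hold deltas under choices[i].delta.content.
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            text.append(delta.get("content") or "")
    return "".join(text), metadata
```

Chunks with an empty `choices` list or a `null` content delta contribute nothing to the text, so the function is safe to run over the whole stream.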

Pipelines

Pipelines are YAML. Every step is reviewable.

Each pipeline is a DAG of steps with a typed config block, optional when: guards, retries, and fallbacks. Drop a YAML file into src/Vorana.Gateway/pipelines/ (or load from blob storage in production), and reference it by id from your app.

pipelines/faq_assistant.yaml

id:       pipeline.faq_assistant.v1
version:  1
steps:
  - id:   validate
    type: validation.composite
    with:
      rules:
        - { field: inputs.user_query, required: true, max_length: 2000 }

  - id:   retrieve
    type: retrieval.hybrid
    with:
      index:    refund_kb
      top_k:    5
      min_score: 0.4

  - id:   answer
    type: llm.consensus
    with:
      providers: [azure_openai, anthropic]
      threshold: 0.85
    on_error: fallback
    fallback: answer_strong

  - id:   answer_strong               # only runs if `answer` failed
    type: llm.execute
    with: { provider: azure_openai, model: gpt-4o }
    when: "exists intermediate.answer.error"

  - id:   score
    type: scoring.composite
    with:
      consensus_weight: 0.4
      judge_weight:     0.4
      cache_weight:     0.2
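The three weights in the `score` step sum to 1.0, which suggests a weighted average of per-signal scores. The sketch below works through that arithmetic under that assumption. How Vorana actually combines the signals, and what the cache signal measures, are not stated here, so the function and the sample values are illustrative only.

```python
import math

# Weights copied from the `score` step in the pipeline above.
WEIGHTS = {"consensus": 0.4, "judge": 0.4, "cache": 0.2}


def composite_score(signals: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of signals in [0, 1]; assumes weights sum to 1.0."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(signals)
    if missing:
        raise KeyError(f"missing signals: {sorted(missing)}")
    return sum(weights[name] * signals[name] for name in weights)


# Hypothetical signal values, chosen only to show the arithmetic:
# 0.4 * 0.92 + 0.4 * 0.90 + 0.2 * 1.0
score = composite_score({"consensus": 0.92, "judge": 0.90, "cache": 1.0})
print(round(score, 3))
```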

Skills

Build a skill once. Every project gets it.

A Skill is a versioned, signed package of capability — a step plugin, a sub-pipeline, a prompt template, or all three. Publish to your org registry once; every team's pipelines reference it by skill.<org>.<name>.<ver>. Update it; the next deploy picks up the fix everywhere.

Skills follow semver. Pinned references (v2.1.0) never change without a redeploy. Floating references (~v2.1) auto-pick the latest patch. The registry tracks a used-by graph so removing or breaking a skill flags every dependent pipeline.
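The difference between pinned and floating references can be sketched as a small resolver over the list of published versions. This is an illustration of the rule described above, not the registry's implementation; `resolve` and `parse_version` are hypothetical names, and pre-release tags are not handled.

```python
from typing import Iterable, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.1.0' or 'v2.1.0' into (2, 1, 0)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def resolve(reference: str, published: Iterable[str]) -> str:
    """Resolve a pinned ('2.1.0') or floating ('~2.1') reference."""
    versions = sorted(published, key=parse_version)
    if reference.startswith("~"):
        # Floating: match major.minor, then take the highest patch.
        prefix = parse_version(reference[1:])
        matches = [v for v in versions if parse_version(v)[: len(prefix)] == prefix]
        if not matches:
            raise LookupError(f"no published version matches {reference}")
        return matches[-1]
    # Pinned: exact match only, never a newer version.
    wanted = parse_version(reference)
    for v in versions:
        if parse_version(v) == wanted:
            return v
    raise LookupError(f"version {reference} is not published")


published = ["2.0.9", "2.1.0", "2.1.3", "2.1.10", "2.2.0"]
print(resolve("v2.1.0", published), resolve("~v2.1", published))
```

Versions are compared as integer tuples rather than strings, so `2.1.10` correctly sorts above `2.1.3`.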

Examples, in order: skill.yaml, publish, use in pipeline, list / inspect.

vorana skill

# skills/redact_customer_pii/skill.yaml — the manifest committed to your repo
id:          skill.acme.security.redact_customer_pii
version:     2.1.0
owner:       "@acme/security"
description: Mask names, SSN, email, card. EU-GDPR + US-PII profiles.

entrypoint:
  type:   step.composite        # or step.grpc, step.wasm, pipeline.fragment
  steps:
    - { id: detect, type: validation.composite, with: { profile: pii_us_eu } }
    - { id: mask,   type: redaction.tokenize,    with: { strategy: replace_with_tag } }

inputs:   { prompt: string }
outputs:  { prompt: string, masked_count: integer }

signing:
  required:    true            # reject unsigned skills at deploy
  trust_chain: acme-root

# From your skill repo — package, sign, push to the org registry
$ vorana skill build   ./redact_customer_pii
→ bundle: dist/redact_customer_pii-2.1.0.tgz   (12 KiB)

$ vorana skill sign    ./dist/redact_customer_pii-2.1.0.tgz \
    --key acme-root.pem
→ signature OK

$ vorana skill publish ./dist/redact_customer_pii-2.1.0.tgz \
    --org acme
→ published skill.acme.security.redact_customer_pii@2.1.0
→ downstream pipelines: 14  (use --check to preview impact)

# Reference a published skill from any pipeline in your org
id:       pipeline.support_copilot.v3
steps:
  - id:   redact
    type: skill.acme.security.redact_customer_pii  # <-- here
    version: 2.1.0                                # pinned (recommended for prod)

  - id:   answer
    type: llm.execute
    with: { provider: azure_openai, model: gpt-4o }

  - id:   judge
    type: skill.acme.legal.judge_compliance         # another shared skill
    version: "~1.4"                              # floating: auto-track patches

# Inspect what's available in your org and where each skill is used
$ vorana skill list --org acme
NAME                                  VERSION  OWNER             USED BY
redact_customer_pii                   2.1.0    @acme/security    14 pipelines
judge_compliance                      1.4.2    @acme/legal       8  pipelines
retrieve_policy_kb                    3.0.1    @acme/platform    22 pipelines
summarize_call_note                   0.9.0    @acme/cx          6  pipelines

$ vorana skill describe redact_customer_pii
→ v2.1.0   signed by @acme/security   published 2026-04-02
→ used by pipelines:
    customer-chat-prod, support-copilot, claims-intake, advisor-agent,
    fraud-triage, broker-assistant, ... (10 more)

Audit & replay

Every run is signed, indexed, and replayable.

Look up a run by run_id, replay it against the current pipeline + policy bundle, or search across all runs in a tenant. Audit blobs are CMK-encrypted in production and respect retention policies you set per tenant.

Examples, in order: fetch a run, replay, search.

GET /v1/runs · POST /v1/replay · GET /v1/audit/search

# Pull the full run record — inputs, decisions, citations, output, metrics
curl -H "X-Api-Key: vorana-dev-key" \
  http://localhost:5057/v1/runs/r_8f3a2c…

# Response shape mirrors GenerateResponse, plus original `inputs` and timestamps

# Re-execute against today's pipeline + policy. Useful for regression tests,
# A/B'ing prompt changes, or verifying a fix to a low-confidence run.
curl -X POST -H "X-Api-Key: vorana-dev-key" \
  http://localhost:5057/v1/replay/r_8f3a2c…

# Optional body: { "pipeline_id": "...override...", "policy_id": "..." }
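For regression tests, the useful part of a replay is usually the difference from the original run. The sketch below compares two run records. It assumes the replay response has the same `status` and `confidence` fields as a generate response, which this section does not state explicitly; `diff_runs` and the tolerance value are illustrative.

```python
def diff_runs(original: dict, replayed: dict, tolerance: float = 0.05) -> dict:
    """Return the fields that changed between an original run and its replay."""
    changes = {}
    if original.get("status") != replayed.get("status"):
        changes["status"] = (original.get("status"), replayed.get("status"))
    before, after = original.get("confidence"), replayed.get("confidence")
    # Ignore small confidence drift; report only moves larger than `tolerance`.
    if before is not None and after is not None and abs(before - after) > tolerance:
        changes["confidence"] = (before, after)
    return changes
```

An empty result means the replay matched the original within tolerance, so a test can simply assert on `diff_runs(a, b) == {}`.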

# Search by tenant, status, score, or text. Indexed via Postgres in production.
curl -H "X-Api-Key: vorana-dev-key" \
  "http://localhost:5057/v1/audit/search?tenant=tenant_acme&status=low_conf&from=2026-04-01&q=refund"

# →  { "runs": [ { "run_id": "...", "status": "low_conf", "confidence": 0.74, ... } ], "next": "..." }
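Building the search URL by hand is error-prone once the text query contains spaces or punctuation. A minimal sketch using the standard library's `urlencode` follows. `audit_search_url` is a hypothetical helper, and the filter names are the ones used in the cURL example above.

```python
from urllib.parse import urlencode


def audit_search_url(base_url: str, filters: dict) -> str:
    """Build a /v1/audit/search URL, dropping filters whose value is None."""
    params = {key: value for key, value in filters.items() if value is not None}
    return f"{base_url.rstrip('/')}/v1/audit/search?{urlencode(params)}"


# A dict is used instead of keyword arguments because `from` is a Python keyword.
url = audit_search_url(
    "http://localhost:5057",
    {"tenant": "tenant_acme", "status": None, "from": "2026-04-01", "q": "refund policy"},
)
print(url)
```

`urlencode` percent-encodes reserved characters and turns spaces into `+`, so free-text queries can be passed through unchanged.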

Drop-in trust layer.
The same code you already write.

Build it. Ship it. Audit it.
