Prompts & A/B testing

Manage prompts as versioned objects in the dashboard, fetch them at runtime, inject variables, and split traffic across versions with weighted A/B experiments.

Stop hard-coding prompts in your app. Currai stores each prompt as a versioned object you edit in the dashboard and fetch at runtime — inject {{variables}}, roll a new wording forward without a deploy, and split traffic across versions with A/B experiments. Link the served version into your traces and compare cost and quality side by side.

Anatomy of a prompt

A prompt is identified by its name and carries:

  • type: text (a single string body) or chat (an array of { role, content } messages).
  • version: auto-incrementing integer. Every save mints a new version; old versions stay fetchable forever.
  • labels: movable pointers like production or staging. A label lives on exactly one version at a time; re-applying it elsewhere moves it.
  • config: arbitrary JSON shipped alongside the body (model name, temperature, etc.).
  • commitMessage: a required note describing what changed in the version, shown in the version history.
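
As a rough mental model, the fields above could be typed as follows. The type names (ChatMessage, PromptVersionBase, PromptVersion) are hypothetical and are not SDK exports. The field names follow the list above, and the body is called prompt here on the assumption that it matches the HTTP API field of the same name.

```typescript
// Hypothetical types for illustration only; not exported by the Currai SDK.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface PromptVersionBase {
  name: string;                    // identifies the prompt
  version: number;                 // auto-incrementing integer
  labels: string[];                // e.g. ["production"]
  config: Record<string, unknown>; // arbitrary JSON shipped with the body
  commitMessage: string;           // note describing what changed
}

// A text prompt carries one string body; a chat prompt carries messages.
type PromptVersion =
  | (PromptVersionBase & { type: "text"; prompt: string })
  | (PromptVersionBase & { type: "chat"; prompt: ChatMessage[] });

const example: PromptVersion = {
  name: "bmi-intake",
  type: "text",
  version: 3,
  prompt: "What is your {{weight}} and {{height}}?",
  labels: ["production"],
  config: { model: "gpt-4o-mini", temperature: 0.2 },
  commitMessage: "Tighten the intake question",
};
```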

Variables use mustache-style {{ name }} syntax (a name starts with a letter or underscore; whitespace inside the braces is fine). They're detected automatically from the body — you never declare them.
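
To make the detection rule concrete, here is a minimal sketch of how variable names could be extracted from a body. It is not Currai's implementation. The function name detectVariables is hypothetical, and the pattern assumes that after the first letter or underscore a name may contain letters, digits, and underscores, which the rule above does not spell out.

```typescript
// Hypothetical helper; approximates the documented {{ name }} rule.
function detectVariables(body: string): string[] {
  // Optional whitespace inside the braces; name starts with a letter or underscore.
  const pattern = /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(body)) !== null) {
    const name = match[1];
    // Keep first-seen order and skip repeats.
    if (names.indexOf(name) === -1) names.push(name);
  }
  return names;
}

const detected = detectVariables("What is your {{weight}} and {{ height }}?");
// detected is ["weight", "height"]
```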

Create a prompt in the dashboard

Open Prompts from the sidebar. The list shows every prompt with its latest version, labels, and version count.

  1. Click New prompt (top right).
  2. Enter a unique name — lowercase-with-dashes works well, e.g. bmi-intake. This is the name your code will pass to getPrompt(...).
  3. Click Continue. You land on the prompt's page, ready to author the first version.

Author a version

On the prompt page, open the New version tab. This is the editor:

  1. Type — toggle between text (one body) and chat (a list of role + content messages). For chat, use Add message and pick each message's role (system / user / assistant); remove a message with the trash icon.
  2. Prompt body — type your prompt. Any {{variable}} you write is highlighted as you go.
  3. Variables & preview (right panel) — every detected variable gets a sample-value input, and you see the compiled result live. Sample values are local to the editor and never saved — real values are injected at runtime via the SDK.
  4. Labels (optional) — comma-separated, e.g. production. Applying production here makes this the version getPrompt serves by default (see Resolution order).
  5. Commit message (required) — a short note like "Tighten the intake question", shown in the version history.
  6. Click Save version. The version number auto-increments (v1, v2, …) and the old versions stay available.

Versions & promoting with labels

The Versions tab lists every saved version, grouped by day (newest first) like a commit history — each row shows its commit message, author (or "via API key"), labels, and an Active badge on the version getPrompt currently resolves to. Search filters by message, version number, or author.

Click a version to open its detail page. There you see the full body, metadata, a collapsible variables preview, and a Manage panel where you can:

  • Edit labels — the comma-separated field is how you promote a version. To ship a new wording, set production on the version you want live; it moves off whichever version had it before. To roll back, put production back on an older version.
  • Delete the version.
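
Promotion and rollback are the same operation: moving one label to a different version. The sketch below models that locally, assuming only the rule that a label lives on exactly one version at a time. VersionRow and moveLabel are hypothetical names, not part of the SDK or the dashboard.

```typescript
// Hypothetical local model of label movement; not a Currai API.
interface VersionRow {
  version: number;
  labels: string[];
}

function moveLabel(rows: VersionRow[], label: string, target: number): VersionRow[] {
  if (!rows.some((row) => row.version === target)) {
    throw new Error("Unknown version: " + target);
  }
  return rows.map((row) => {
    // Strip the label everywhere, then re-attach it to the target only.
    const without = row.labels.filter((l) => l !== label);
    return {
      version: row.version,
      labels: row.version === target ? without.concat(label) : without,
    };
  });
}

const history: VersionRow[] = [
  { version: 1, labels: [] },
  { version: 2, labels: ["production"] },
  { version: 3, labels: ["staging"] },
];

const promoted = moveLabel(history, "production", 3);   // ship v3
const rolledBack = moveLabel(promoted, "production", 2); // roll back to v2
```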

A/B testing in the dashboard

An experiment splits getPrompt traffic across two or more versions of the same prompt by weight. Open A/B Tests from the sidebar (under Prompts).

  1. Click New A/B test.
  2. Give the test a name (e.g. "Intake question wording") and pick the prompt it runs on.
  3. Add variants — each has a label (e.g. control, variant-a), a version picked from that prompt's versions, and a weight. The dashboard shows the resulting traffic % for each variant live as you adjust weights. Use Add variant for more than two; remove extras with the trash icon.
  4. Click Save & activate to start splitting traffic immediately, or Save as paused to set it up for later.

In the experiments list, each test shows its status, the variants table with live traffic percentages, and controls to Pause / Activate, Edit, or Delete it.

Rules enforced when you save:

  • A test needs at least two variants.
  • Weights are relative, not percentages: control: 3, variant-a: 1 sends ~75% / ~25%. At least one weight must be positive.
  • Only one experiment per prompt can be active at a time.
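
The arithmetic behind relative weights can be sketched as below: each variant's share is its weight divided by the sum of all weights. Variant and trafficShares are hypothetical names, and rejecting negative weights is an assumption, since the rules above only say that at least one weight must be positive.

```typescript
// Hypothetical helper mirroring the save-time rules listed above.
interface Variant {
  label: string;
  version: number;
  weight: number;
}

function trafficShares(variants: Variant[]): { label: string; percent: number }[] {
  if (variants.length < 2) {
    throw new Error("A test needs at least two variants");
  }
  // Assumption: negative weights are invalid (not stated in the rules).
  if (variants.some((v) => v.weight < 0)) {
    throw new Error("Weights must not be negative");
  }
  const total = variants.reduce((sum, v) => sum + v.weight, 0);
  if (total <= 0) {
    throw new Error("At least one weight must be positive");
  }
  return variants.map((v) => ({ label: v.label, percent: (v.weight / total) * 100 }));
}

const shares = trafficShares([
  { label: "control", version: 2, weight: 3 },
  { label: "variant-a", version: 3, weight: 1 },
]);
// shares: control at 75, variant-a at 25
```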

Use the prompt in your app

Fetch a managed prompt with currai.getPrompt(name) and inject values with .compile():

const prompt = await currai.getPrompt("bmi-intake");
// text prompt body: "What is your {{weight}} and {{height}}?"

const text = prompt.compile({ weight: "70kg", height: "180cm" });
// → "What is your 70kg and 180cm?"

prompt.variables; // ["weight", "height"]  (detected from {{…}})
prompt.version;   // the resolved version, e.g. 3

chat prompts compile each message's content and return the messages, ready to pass to your provider:

const chatPrompt = await currai.getPrompt("support-agent");
const messages = chatPrompt.compile({ customer: "Sara" });
// → [{ role: "system", content: "You help Sara…" }, …]

Compilation is non-destructive: any {{variable}} you don't supply is left in place rather than blanked out, so partial compilation is safe.
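
The non-destructive behavior can be illustrated with a small local stand-in for .compile(). This is a sketch of the described behavior, not the SDK's code. compileBody and compileChat are hypothetical names, and the name pattern assumes letters, digits, and underscores after the first character.

```typescript
// Hypothetical stand-ins for .compile(); illustration only.
type Values = Record<string, string>;

function compileBody(body: string, values: Values): string {
  const pattern = /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;
  return body.replace(pattern, (placeholder: string, name: string): string =>
    // Substitute only supplied variables; leave the rest untouched.
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : placeholder
  );
}

function compileChat(messages: { role: string; content: string }[], values: Values) {
  // Compile each message's content and keep its role.
  return messages.map((m) => ({ role: m.role, content: compileBody(m.content, values) }));
}

const partial = compileBody("What is your {{weight}} and {{height}}?", { weight: "70kg" });
// partial is "What is your 70kg and {{height}}?"
const full = compileBody(partial, { height: "180cm" });
// full is "What is your 70kg and 180cm?"
```

Because the unfilled placeholder survives the first pass, values can be supplied in stages, for example static values at startup and per-request values later.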

Resolution order

When you call getPrompt(name) without a version or label, Currai resolves which version to serve in this order:

  1. An active A/B experiment for that name → weighted random pick across its variants.
  2. The version carrying the production label.
  3. The latest version.
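
The order above can be modeled locally as in the sketch below. It illustrates the documented precedence and one common way to do a weighted random pick. It is not Currai's server code, and PromptState, pickWeighted, and resolveVersion are hypothetical names. The random source is injectable so the behavior can be checked deterministically.

```typescript
// Hypothetical local model of the resolution order; illustration only.
interface Variant {
  label: string;
  version: number;
  weight: number;
}

interface PromptState {
  latestVersion: number;
  productionVersion: number | null;  // version carrying the production label
  activeExperiment: Variant[] | null; // variants of the active A/B test, if any
}

function pickWeighted(variants: Variant[], random: () => number): Variant {
  const total = variants.reduce((sum, v) => sum + v.weight, 0);
  let threshold = random() * total;
  let fallback: Variant | null = null;
  for (const v of variants) {
    fallback = v;
    threshold -= v.weight;
    if (threshold < 0) return v;
  }
  if (fallback === null) throw new Error("No variants to pick from");
  return fallback; // guards against floating-point leftovers
}

function resolveVersion(state: PromptState, random: () => number = Math.random) {
  if (state.activeExperiment !== null) {
    const v = pickWeighted(state.activeExperiment, random);
    return { version: v.version, selectedVariant: { label: v.label, weight: v.weight } };
  }
  if (state.productionVersion !== null) {
    return { version: state.productionVersion, selectedVariant: null };
  }
  return { version: state.latestVersion, selectedVariant: null };
}
```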

Pin a specific version or label explicitly to skip resolution:

await currai.getPrompt("bmi-intake", { version: 2 });
await currai.getPrompt("bmi-intake", { label: "staging" });

You can also require a type: getPrompt(name, { type: "chat" }) returns nothing if the resolved version isn't a chat prompt.

When a variant served the prompt, prompt.selectedVariant is { label, weight }; otherwise it's null:

const prompt = await currai.getPrompt("bmi-intake");
if (prompt.selectedVariant) {
  // served by the A/B experiment — e.g. { label: "concise", weight: 1 }
}

To see which version (or A/B variant) produced each response, pass promptName and promptVersion onto the generation. The split then shows up directly in your traces and cost rollups:

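const prompt = await currai.getPrompt("bmi-intake");

// `trace` is assumed to be a trace object created earlier in your request.
const gen = trace.generation({
  name: "openai.chat.completions",
  model: "gpt-4o-mini",
  input: prompt.compile({ weight: "70kg", height: "180cm" }),
  promptName: prompt.name,       // links the generation to the managed prompt
  promptVersion: prompt.version, // the version (or A/B variant) that was served
});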

Create versions from code

You don't have to use the dashboard — mint a new version from your app or a script with currai.createPrompt. The version auto-increments; pass labels to move a pointer (for example, promote straight to production):

await currai.createPrompt({
  name: "bmi-intake",
  type: "text",
  prompt: "What is your {{weight}} and {{height}}? Be concise.",
  config: { model: "gpt-4o-mini", temperature: 0.2 },
  labels: ["production"],
  commitMessage: "Tighten the intake question",
});

HTTP API

If you're not on the TypeScript SDK, hit the endpoints directly with your API key (HTTP Basic auth — see Authentication):

  • GET /api/public/prompts?name=…&version=…&label=…&type=… — resolve a prompt. Returns { id, name, version, type, prompt, config, labels, tags, commitMessage, createdAt, selectedVariant }.
  • POST /api/public/prompts — create a new version. Body: { name, type?, prompt, config?, tags?, labels?, commitMessage? }.
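
As a sketch of calling the GET endpoint without the SDK, the helper below builds the request URL from the documented query parameters. buildResolveUrl and the base URL https://currai.example are hypothetical placeholders; substitute your own host. The Authorization header is left out because the credential format is covered on the Authentication page rather than here.

```typescript
// Hypothetical helper; only the path and query parameter names come from the docs.
interface ResolveQuery {
  name: string;
  version?: number;
  label?: string;
  type?: "text" | "chat";
}

function buildResolveUrl(baseUrl: string, query: ResolveQuery): string {
  const parts: string[] = ["name=" + encodeURIComponent(query.name)];
  if (query.version !== undefined) parts.push("version=" + query.version);
  if (query.label !== undefined) parts.push("label=" + encodeURIComponent(query.label));
  if (query.type !== undefined) parts.push("type=" + query.type);
  // Trim trailing slashes so the path is not doubled up.
  return baseUrl.replace(/\/+$/, "") + "/api/public/prompts?" + parts.join("&");
}

const stagingUrl = buildResolveUrl("https://currai.example/", {
  name: "bmi-intake",
  label: "staging",
});
// stagingUrl is "https://currai.example/api/public/prompts?name=bmi-intake&label=staging"
```

Pass the resulting URL to your HTTP client of choice together with the Basic auth header built from your API key.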
