May 20, 2026

Why we built Currai

LLM apps fail in ways your APM never sees. We built Currai so you can watch every prompt, token, and tool call the way you already watch latency and errors.

COMPANY4 min readThe Currai team / Founders

Currai

Why we built Currai

Every team shipping an LLM feature hits the same wall. The demo works, you ship it, and a week later someone forwards a screenshot of the model saying something strange. You open your logs and find a request id, a 200 status, and a latency number — and absolutely nothing about what the model actually saw or said.

Traditional observability was built for deterministic systems. A request comes in, code runs, a response goes out, and if it breaks you get a stack trace. LLM apps are not that. The same input can produce different output, the failure is rarely an exception, and the thing you need to debug — the prompt, the retrieved context, the tool calls, the completion — lives inside a payload your APM throws away.

What we kept reaching for

Before Currai we instrumented our own apps by hand: logging prompts to one place, token counts to another, and stitching sessions together with grep. It worked until it didn't. The things we wanted were always the same:

The exact prompt and completion for any request, replayable months later.
The full tree of a request — retrievers, tools, and nested model calls — not just the top-level call.
Token usage rolled into real cost, per trace, per model, per day.
Multi-turn conversations stitched into one timeline, sliceable by end user.

Drop-in, not a rewrite

We also refused to make you re-instrument. Currai speaks the Langfuse SDK and OpenTelemetry wire formats, so if you are already sending spans you migrate by changing a host. Your existing trace code keeps working.

from currai import Currai

currai = Currai(public_key="pk-lf-...", secret_key="sk-lf-...")

# wrap any LLM call — Currai ships traces in the background
trace = currai.trace(name="chat-turn", user_id="user-1")
gen = trace.generation(name="openai.chat", model="gpt-4o-mini", input=messages)
gen.end(output=reply, usage={"input": 312, "output": 88})

The rest of this blog is how we build it — and how you can get the most out of it.

Back to blog

Why we built Currai

What we kept reaching for

Drop-in, not a rewrite

Human-in-the-loop AI agent evaluation: a complete guide

The best LLM evaluation tools in 2026

Best AI observability tools in 2026

Why we built Currai

What we kept reaching for

Drop-in, not a rewrite

Related articles

Human-in-the-loop AI agent evaluation: a complete guide

The best LLM evaluation tools in 2026

Best AI observability tools in 2026