Why we built Currai
The Currai team, Founders — May 20, 2026
Every team shipping an LLM feature hits the same wall. The demo works, you ship it, and a week later someone forwards a screenshot of the model saying something strange. You open your logs and find a request id, a 200 status, and a latency number — and absolutely nothing about what the model actually saw or said.
Traditional observability was built for deterministic systems. A request comes in, code runs, a response goes out, and if it breaks you get a stack trace. LLM apps are not that. The same input can produce different output, the failure is rarely an exception, and the thing you need to debug — the prompt, the retrieved context, the tool calls, the completion — lives inside a payload your APM throws away.
What we kept reaching for
Before Currai we instrumented our own apps by hand: logging prompts to one
place, token counts to another, and stitching sessions together with grep. It
worked until it didn't. The things we wanted were always the same:
- The exact prompt and completion for any request, replayable months later.
- The full tree of a request — retrievers, tools, and nested model calls — not just the top-level call.
- Token usage rolled into real cost, per trace, per model, per day.
- Multi-turn conversations stitched into one timeline, sliceable by end user.
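To make the second and third items concrete, here is a minimal sketch of a request tree whose token usage rolls up into cost. It is an illustration only, not Currai's actual data model: the `Span` class, the model names, and the prices in `PRICE_PER_1K` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical per-1K-token prices in USD; real prices depend on provider and model.
PRICE_PER_1K = {
    "model-a": {"input": 0.50, "output": 1.50},
    "model-b": {"input": 0.10, "output": 0.40},
}


@dataclass
class Span:
    """One node in a request tree: a model call, retriever, tool, or wrapper."""
    name: str
    kind: str
    model: Optional[str] = None  # set only for model calls
    input_tokens: int = 0
    output_tokens: int = 0
    children: List["Span"] = field(default_factory=list)

    def own_cost(self) -> float:
        # Spans without a model (retrievers, tools, wrappers) cost nothing themselves.
        if self.model is None:
            return 0.0
        price = PRICE_PER_1K[self.model]
        return (self.input_tokens * price["input"]
                + self.output_tokens * price["output"]) / 1000

    def total_cost(self) -> float:
        # Roll cost up the tree so nested model calls count toward the trace.
        return self.own_cost() + sum(c.total_cost() for c in self.children)


trace = Span("answer_question", "chain", children=[
    Span("search_docs", "retriever"),
    Span("draft", "model", model="model-a", input_tokens=1000, output_tokens=200),
    Span("lookup", "tool", children=[
        Span("summarize", "model", model="model-b",
             input_tokens=2000, output_tokens=500),
    ]),
])

print(f"trace cost: ${trace.total_cost():.2f}")
```

The nested `summarize` call is the part a top-level-only log would miss: it sits under a tool span, yet it still contributes to the cost of the trace.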
Drop-in, not a rewrite
We also refused to make you re-instrument. Currai speaks the Langfuse SDK and OpenTelemetry wire formats, so if you are already sending spans, you migrate by pointing them at a new host. Your existing trace code keeps working.
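As a rough sketch of what "changing a host" can look like, the helper below redirects the two environment variables these clients commonly read: `LANGFUSE_HOST` for the Langfuse SDK and `OTEL_EXPORTER_OTLP_ENDPOINT` for OpenTelemetry OTLP exporters. The URL and the `point_at_currai` helper are hypothetical, and whether both protocols share one endpoint is an assumption here; use the values from your own project settings.

```python
import os

# Hypothetical endpoint for illustration; substitute your real Currai host.
CURRAI_HOST = "https://currai.example.com"


def point_at_currai(env, host=CURRAI_HOST):
    """Return a copy of env with the tracing host variables redirected.

    API keys and every other setting are left untouched.
    """
    updated = dict(env)
    # Read by the Langfuse SDK as its base URL.
    updated["LANGFUSE_HOST"] = host
    # Standard OpenTelemetry variable for the OTLP exporter endpoint.
    # Assumption: the same host serves OTLP; the real path may differ.
    updated["OTEL_EXPORTER_OTLP_ENDPOINT"] = host
    return updated


if __name__ == "__main__":
    # Apply before the SDK or exporter is initialised so it picks up the new host.
    os.environ.update(point_at_currai(os.environ))
```

In practice you would more likely set these two variables in your deployment config than in code; the point is that nothing else in the instrumentation has to move.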
The rest of this blog covers how we build it and how you can get the most out of it.