continuum
Unified execution runtime for LLM and ML programs
One runtime for inference, retrieval, and pipeline orchestration.
Problem
LLM pipelines stitch together Python scripts, inference servers, vector stores, and orchestration layers, each with its own runtime and process boundary. The glue code between them is fragile, and every hop across a process boundary adds overhead.
What it is
Continuum is a C++ runtime that executes LLM and ML programs as compiled execution graphs. You describe a pipeline once; the runtime schedules and dispatches kernels with no interpreter in the loop.
- embed, retrieve, and generate as first-class operations
- static graph compilation catches errors before execution
- single binary, no orchestration daemons
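As a rough illustration of checking a pipeline before anything runs, the sketch below validates a chain of steps by comparing each step's declared input with the previous step's output. `Kind`, `Step`, and `validate` are hypothetical names made up for this example; they are not Continuum's API or its `.ct` format.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical kinds of values that flow between pipeline steps.
enum class Kind { Text, Embedding, Documents };

// Hypothetical step description: a name plus declared input/output kinds.
struct Step {
    std::string name;
    Kind input;
    Kind output;
};

// Walk the chain once, before anything executes. Returns an error message
// for the first step whose input does not match what the previous step
// produces, or std::nullopt if the whole chain is consistent.
std::optional<std::string> validate(const std::vector<Step>& steps, Kind source) {
    Kind current = source;
    for (const Step& step : steps) {
        if (step.input != current) {
            return "step '" + step.name + "' does not accept the preceding output";
        }
        current = step.output;
    }
    return std::nullopt;
}
```

Validating an embed, retrieve, generate chain that starts from `Kind::Text` returns no error, while a chain that starts with a retrieve step is rejected with a message naming that step, and no step has run at that point.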
Features
Unified Runtime
LLM inference and ML ops in one execution environment.
Pipeline Composition
Chain embedding, retrieval, and generation as typed steps.
Native Performance
C++ core — no interpreter overhead on the critical path.
Typed Programs
Static execution graphs catch shape errors before execution.
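To make "typed steps" and "shape errors" concrete, here is a minimal sketch in plain C++ that puts the embedding width into the type, so a mismatched chain fails to compile. All names (`Query`, `Embedding<Dim>`, `embed`, `retrieve`, `generate`, `then`) are hypothetical stand-ins for this example and are not taken from Continuum.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical value types; the embedding width is part of the type.
struct Query { std::string text; };
template <std::size_t Dim>
struct Embedding { std::array<float, Dim> values{}; };
struct Documents { std::vector<std::string> items; };
struct Answer { std::string text; };

// Toy stand-ins for real steps, just enough to show the wiring.
inline Embedding<4> embed(const Query& q) {
    Embedding<4> e;
    e.values[0] = static_cast<float>(q.text.size());
    return e;
}

// This toy index only accepts 4-dimensional embeddings.
inline Documents retrieve(const Embedding<4>& e) {
    Documents d;
    d.items.push_back("doc-" + std::to_string(static_cast<int>(e.values[0])));
    return d;
}

inline Answer generate(const Documents& d) {
    assert(!d.items.empty());
    return Answer{"generated from " + d.items.front()};
}

// Chain two steps. If g cannot accept f's result (wrong type or wrong
// embedding width), the compiler rejects the call when the chained
// pipeline is invoked, so the error surfaces before any step runs.
template <typename F, typename G>
auto then(F f, G g) {
    return [f, g](const auto& input) { return g(f(input)); };
}
```

With these definitions, `then(then(embed, retrieve), generate)` builds a callable that maps a `Query` to an `Answer`. Changing `embed` to return `Embedding<8>` would turn the mismatch with `retrieve` into a compile error rather than a failure at execution time.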
Execution model
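One way to picture scheduling over an execution graph is a topological ordering: a node runs only after every node it depends on has finished. The sketch below uses Kahn's algorithm. The adjacency-list representation and the `schedule` function are assumptions for illustration and say nothing about how Continuum's scheduler is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <queue>
#include <vector>

// edges[i] lists the nodes that consume the output of node i.
// Returns a valid execution order, or std::nullopt if the graph has a cycle.
std::optional<std::vector<std::size_t>> schedule(
    const std::vector<std::vector<std::size_t>>& edges) {
    const std::size_t n = edges.size();

    // Count the unmet dependencies of every node.
    std::vector<std::size_t> pending(n, 0);
    for (const auto& consumers : edges) {
        for (std::size_t to : consumers) {
            assert(to < n);  // edges must point at existing nodes
            ++pending[to];
        }
    }

    // Nodes with no dependencies can run immediately.
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < n; ++i) {
        if (pending[i] == 0) ready.push(i);
    }

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        const std::size_t node = ready.front();
        ready.pop();
        order.push_back(node);
        // Finishing this node may unblock its consumers.
        for (std::size_t to : edges[node]) {
            if (--pending[to] == 0) ready.push(to);
        }
    }

    // Any node left unscheduled is part of a cycle.
    if (order.size() != n) return std::nullopt;
    return order;
}
```

For a three-node chain where node 0 feeds node 1 and node 1 feeds node 2 (think embed, retrieve, generate), the order comes out as 0, 1, 2. A graph whose nodes depend on each other in a loop yields `std::nullopt`, which a compiler for such graphs could report as an error before execution.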
Example program
continuum run pipeline.ct --input query.txt
Why I Built This
Every LLM pipeline I built ended up as a mess of subprocess calls and HTTP clients. The model was fast; the glue was slow.
Continuum compiles the whole pipeline into a single execution graph — one runtime, no boundaries.