Invocation

Invocations are the per-row units inside a test — each one represents a single AI model call made during a benchmark or batch run, with its own configuration, status, trace, and grading.

Invocation is a sub-namespace on the test artifact. Every operation is named invocation_<op> and lives at POST /test/invocation_<op>: invocation_get, invocation_run, invocation_complete, etc. The CLI surfaces them as glow tests invocation <op>.

What is an Invocation?

An invocation captures everything needed to make and grade one AI model call inside a test:

  • invocation_id — unique identifier
  • test_id — the parent test
  • model / agent / provider — what’s being called
  • modalities — text / audio / image / video supported
  • status — queued / running / completed / failed / terminated
  • trace — full execution trace (prompts, intermediate calls, responses)
  • scores — per-rubric-standard scores when graded

Invocations are created when a test fans out (e.g., one invocation per scenario × model combination). They run independently and report back into the parent test for aggregation in the Benchmark view.

The invocation sub-op surface

| Sub-op | Endpoint | Purpose |
| --- | --- | --- |
| invocation_get | POST /test/invocation_get | hydrate one invocation by id |
| invocation_create | POST /test/invocation_create | seed a new invocation (rare; tests fan out automatically) |
| invocation_run | POST /test/invocation_run | start / re-fire an invocation |
| invocation_complete | POST /test/invocation_complete | mark complete |
| invocation_terminate | POST /test/invocation_terminate | terminate an in-flight invocation |
| invocation_trace | POST /test/invocation_trace | fetch the full execution trace |
| invocation_draft | POST /test/invocation_draft | edit draft |
| invocations | POST /test/invocations | list invocations |

The list endpoint is plural (POST /test/invocations) since it returns a row collection, while the single-target ops use the singular concatenated form.
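
As a minimal sketch of the naming rule, the helper below maps a sub-op name to its endpoint path. The function name `invocation_endpoint` and the `list` alias for the plural collection endpoint are hypothetical and exist only for illustration; they are not part of the glow CLI or API.

```shell
# Map a sub-op name to its endpoint path, following the naming rule above.
# "list" is a made-up alias used only in this sketch for the plural endpoint.
invocation_endpoint() {
  case "$1" in
    list) printf '/test/invocations\n' ;;
    *)    printf '/test/invocation_%s\n' "$1" ;;
  esac
}

invocation_endpoint get        # prints /test/invocation_get
invocation_endpoint terminate  # prints /test/invocation_terminate
invocation_endpoint list       # prints /test/invocations
```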

Quick Start

CLI

The calls below use $GLOW_INSTANCE_URL and $GLOW_TOKEN; see Authentication to export them once.

    # Hydrate a single invocation
    glow tests invocation get --body '{"invocation_id": "invocation-uuid"}'

    # Fan out / re-fire an invocation
    glow tests invocation run --body '{"invocation_id": "invocation-uuid"}'

    # Pull the full trace
    glow tests invocation trace --body '{"invocation_id": "invocation-uuid"}'

    # List invocations for a test
    glow tests invocations --body '{"test_id": "test-uuid"}'

API

    # Get invocation detail
    curl -X POST $GLOW_INSTANCE_URL/test/invocation_get \
      -H "Authorization: Bearer $GLOW_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"invocation_id": "invocation-uuid"}'

    # List invocations (paginated)
    curl -X POST $GLOW_INSTANCE_URL/test/invocations \
      -H "Authorization: Bearer $GLOW_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"test_id": "test-uuid", "page_size": 25}'
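
Every API call above repeats the same method and headers, so a small wrapper can cut the duplication. This is a sketch only: `glow_post` is a hypothetical helper, not something the glow tooling provides, and it assumes both environment variables are already exported.

```shell
# Hypothetical helper: POST a JSON body to a path on the glow instance.
# Usage: glow_post <path> <json-body>
glow_post() {
  # ${VAR:?} aborts with an error if the variable is unset or empty.
  curl -sS -X POST "${GLOW_INSTANCE_URL:?}$1" \
    -H "Authorization: Bearer ${GLOW_TOKEN:?}" \
    -H "Content-Type: application/json" \
    -d "$2"
}

# Example (needs a reachable instance, so it is left commented out):
# glow_post /test/invocation_get '{"invocation_id": "invocation-uuid"}'
```

Quoting the URL and token keeps the call intact if either value contains characters the shell would otherwise split on.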

Invocations in the test flow

When a test runs, the server fans out one invocation per benchmark cell (typically: scenario × model × agent). Each invocation:

  1. Resolves the model + agent configuration from the test setup
  2. Issues the model call(s) — recorded in trace
  3. Grades the resulting transcript against the test’s rubric
  4. Writes per-standard scores back onto the row
  5. Reports status = completed (or failed / terminated)

The parent test’s response aggregates invocations into AggregatedResults for benchmark UIs.
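
To make the fan-out at the start of this section concrete, the sketch below enumerates benchmark cells as a plain cross product. It is illustrative only: the real fan-out happens server-side, and the function name and the scenario, model, and agent names are made up.

```shell
# Illustrative only: enumerate one cell per scenario x model x agent.
# Each argument is a space-separated list, deliberately left unquoted in
# the loops so the shell splits it into items.
fan_out_cells() {
  for scenario in $1; do
    for model in $2; do
      for agent in $3; do
        printf '%s x %s x %s\n' "$scenario" "$model" "$agent"
      done
    done
  done
}

# 2 scenarios x 2 models x 1 agent = 4 cells, so 4 invocations.
fan_out_cells "refund-flow greeting" "model-a model-b" "agent-1"
```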

Status lifecycle

| Status | Meaning |
| --- | --- |
| queued | created, not yet started |
| running | model call in flight |
| completed | model call done, scores written |
| failed | model call errored; see trace for details |
| terminated | cancelled via invocation_terminate |

Use invocation_run to (re-)kick a queued or failed invocation; use invocation_terminate to stop a running one safely.
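
The rule above can be sketched as a lookup from status to the applicable sub-op. The function name is hypothetical, and returning `none` for completed and terminated is an assumption of this sketch: the page does not say whether those rows can be re-fired. For a running invocation, terminate applies only if you actually want to stop it.

```shell
# Hypothetical helper: which sub-op applies to an invocation in this status.
next_op_for_status() {
  case "$1" in
    queued|failed)        echo invocation_run ;;        # (re-)kick
    running)              echo invocation_terminate ;;  # only to stop it
    completed|terminated) echo none ;;                  # assumption: no follow-up op
    *) echo "unknown status: $1" >&2; return 1 ;;
  esac
}

next_op_for_status failed   # prints invocation_run
```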

Trace

    curl -X POST $GLOW_INSTANCE_URL/test/invocation_trace \
      -H "Authorization: Bearer $GLOW_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"test_id": "test-uuid", "test_invocation_id": "test-invocation-uuid"}'

Returns a test_invocation_trace_id, which identifies the trace entry that binds the invocation to its bundle config and recorded run. Every replay or audit of this cell resolves against that entry. Pass run_id to bind a specific recorded run.
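
A minimal sketch of building the trace request body with and without run_id. The helper name `trace_body` is hypothetical, and placing run_id as a top-level JSON field alongside the other two is an assumption, since the page does not show a request that includes it. The values are interpolated without JSON escaping, which is adequate for UUID-style ids only.

```shell
# Hypothetical helper: build the invocation_trace request body.
# Usage: trace_body <test_id> <test_invocation_id> [run_id]
trace_body() {
  if [ -n "${3:-}" ]; then
    # Assumption: run_id is a top-level field next to the other ids.
    printf '{"test_id": "%s", "test_invocation_id": "%s", "run_id": "%s"}\n' "$1" "$2" "$3"
  else
    printf '{"test_id": "%s", "test_invocation_id": "%s"}\n' "$1" "$2"
  fi
}

trace_body test-uuid test-invocation-uuid
# prints {"test_id": "test-uuid", "test_invocation_id": "test-invocation-uuid"}
```

The output can be passed to curl with `-d "$(trace_body ...)"`.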

Listing and paginating
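
The only list parameters that appear on this page are test_id and page_size, both taken from the Quick Start request to POST /test/invocations. The sketch below builds that body and rejects a non-numeric page size. The helper name `list_body` and the default of 25 (borrowed from the Quick Start example) are assumptions, and how a following page is requested is not shown here, so it is not sketched.

```shell
# Hypothetical helper: build the body for POST /test/invocations.
# Usage: list_body <test_id> [page_size]
list_body() {
  page_size="${2:-25}"   # assumed default, borrowed from the Quick Start example
  case "$page_size" in
    *[!0-9]*) echo "page_size must be a non-negative integer" >&2; return 1 ;;
  esac
  printf '{"test_id": "%s", "page_size": %s}\n' "$1" "$page_size"
}

list_body test-uuid 50   # prints {"test_id": "test-uuid", "page_size": 50}
```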

Common Operations

| Task | CLI | API |
| --- | --- | --- |
| Get one invocation | glow tests invocation get | POST /test/invocation_get |
| List invocations | glow tests invocations | POST /test/invocations |
| Run / re-fire | glow tests invocation run | POST /test/invocation_run |
| Terminate | glow tests invocation terminate | POST /test/invocation_terminate |
| Fetch trace | glow tests invocation trace | POST /test/invocation_trace |
| Mark complete | glow tests invocation complete | POST /test/invocation_complete |
| Save draft | glow tests invocation draft | POST /test/invocation_draft |
| List drafts | glow tests invocation drafts | POST /test/invocation_drafts |

  • Test API Reference — every invocation_* endpoint with full schemas
  • Tests CLI Reference — every glow tests ... command
  • Benchmark — the aggregated view across all invocations in a test
  • Pricing — cost tracking for invocation runs
  • Group — generation group rows that capture invocation cost