Streaming AI Responses with WebSockets

Build realtime AI experiences with websocket endpoints and streaming responses.

When building conversational AI, generating a response can take several seconds, especially if the workflow relies on deep reasoning or multiple tool executions. To provide a snappy, real-time experience, Pipecat exposes a native WebSocket API.

Instead of waiting for the entire workflow to complete before sending a single HTTP response, the WebSocket API allows your client application to receive streaming events—node status updates, tool execution traces, and the final output—as they happen.

In this tutorial, we will explore the exact WebSocket route, connection lifecycle, and the schema of the messages exchanged.


The WebSocket Route

To connect to a Pipecat workflow, open a WebSocket connection pointing to your backend endpoint:

wss://api.pipecat.in/ws/run/{workflow_id}

Replace {workflow_id} with the actual UUID of the workflow you wish to execute.
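
As a minimal sketch of the substitution described above, the URL can be built by a small helper. The names `RUN_BASE_URL`, `UUID_PATTERN`, and `buildRunUrl` are hypothetical, and the UUID shape check is an assumption: the tutorial only says the ID is a UUID, so loosen or drop the check if your IDs look different.

```javascript
// Route taken from the tutorial; the helper names below are illustrative only.
const RUN_BASE_URL = "wss://api.pipecat.in/ws/run";

// Assumption: workflow IDs use the common 8-4-4-4-12 hexadecimal UUID layout.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns the WebSocket URL for a workflow, or throws if the ID is not a UUID.
function buildRunUrl(workflowId) {
  if (typeof workflowId !== "string" || !UUID_PATTERN.test(workflowId)) {
    throw new TypeError(`Expected a workflow UUID, got: ${String(workflowId)}`);
  }
  return `${RUN_BASE_URL}/${workflowId}`;
}
```

Failing early on a leftover `{workflow_id}` placeholder tends to be easier to debug than a rejected connection.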


Step 1: Authentication and Initialization

Unlike standard HTTP requests, where you can attach an authorization header, the browser WebSocket API does not let you set custom headers on the connection. Pipecat works around this by authenticating in the first message instead.

As soon as the connection opens, the server waits up to 10 seconds for an initial JSON payload containing your authentication token (this can be your Pipecat API key) and the user's input text.

Client Payload (Sent by you):

{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "input": "Can you check the weather in Tokyo and tell me a joke?"
}
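
One way to produce this payload is a small builder that validates its arguments before serializing. This is a sketch only: `buildInitMessage` is a hypothetical name, and rejecting empty strings is a client-side assumption, not a documented server rule.

```javascript
// Builds the first message the server expects: a JSON object with the
// `token` and `input` fields shown in the tutorial.
function buildInitMessage(token, input) {
  // Assumption: an empty token or blank input is never useful, so fail early.
  if (typeof token !== "string" || token.length === 0) {
    throw new TypeError("token must be a non-empty string");
  }
  if (typeof input !== "string" || input.trim().length === 0) {
    throw new TypeError("input must be a non-empty string");
  }
  return JSON.stringify({ token, input });
}
```

Calling `ws.send(buildInitMessage(token, text))` inside `ws.onopen` sends the payload as soon as the socket opens, well within the 10-second window.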

Common Connection Errors (Code 1008)

If initialization fails, the server will close the connection with a 1008 Policy Violation status code and one of the following reasons:

| Reason | Cause |
| --- | --- |
| Expected JSON auth+input message | The payload was not sent within 10 seconds or was not valid JSON. |
| Invalid token | The provided API key or access token is expired or invalid. |
| Workflow not found | The workflow_id does not exist or doesn't belong to the authenticated user. |
| Monthly run limit of {X} reached | The user has exceeded their active plan's execution limits. |
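
The reasons above can be mapped to stable identifiers so the UI can react to each case. The sketch below assumes the reason strings arrive exactly as listed; `classifyClose` and its `kind` values are hypothetical names, and since the format of `{X}` is not documented it is captured as a raw string.

```javascript
// Maps a WebSocket close code and reason to a coarse error category.
function classifyClose(code, reason) {
  if (code !== 1008) {
    return { kind: "not_policy_violation" };
  }
  if (reason === "Expected JSON auth+input message") {
    return { kind: "bad_init_message" };
  }
  if (reason === "Invalid token") {
    return { kind: "invalid_token" };
  }
  if (reason === "Workflow not found") {
    return { kind: "workflow_not_found" };
  }
  // "{X}" is whatever the server puts between "of" and "reached".
  const match = /^Monthly run limit of (.+) reached$/.exec(reason);
  if (match) {
    return { kind: "run_limit_reached", limit: match[1] };
  }
  return { kind: "unknown", reason };
}
```

In a browser this would typically be called as `classifyClose(event.code, event.reason)` from `ws.onclose`.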

Step 2: Receiving Execution Events

Once authenticated, the server begins executing the Directed Acyclic Graph (DAG) for your workflow. It will stream JSON messages tracking the lifecycle of nodes and tools.

There are three primary types of events you should listen for.

1. Node Lifecycle Events

As each node starts, finishes, or errors out, you will receive an event tracking its execution.

{
  "run_id": "a1b2c3d4-...",
  "node_id": "llm-node-1",
  "status": "running", // or "success" / "error" / "tool_call"
  "output": "Hello there! Let me check the weather.",
  "duration_ms": 450,
  "ts": "2026-05-10T12:00:00.000Z"
}

2. Tool Result Events

When an LLM node invokes a custom HTTP tool, a dedicated tool_result event fires once the tool returns its payload. Together with the preceding tool_call status, this lets your UI show a "Calling Weather API..." state while the tool runs and clear it as soon as the result arrives.

{
  "run_id": "a1b2c3d4-...",
  "node_id": "llm-node-1",
  "status": "tool_result",
  "output": "{\"temperature\": 22, \"condition\": \"Sunny\"}",
  "tool_name": "get_weather",
  "duration_ms": 1200,
  "ts": "2026-05-10T12:00:01.200Z"
}
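
In the example above, `output` is itself a JSON-encoded string, so it needs a second parse before you can read its fields. The helper below is a sketch (`parseToolOutput` is a hypothetical name) and assumes tools may also return plain text, in which case the raw value is passed through unchanged.

```javascript
// Returns the decoded tool output for a tool_result event,
// or undefined for any other kind of event.
function parseToolOutput(event) {
  if (event.status !== "tool_result") {
    return undefined;
  }
  try {
    return JSON.parse(event.output);
  } catch {
    // Assumption: non-JSON output is plain text, so return it as-is.
    return event.output;
  }
}
```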

3. Final Workflow Completion

When all nodes finish executing, a final event is sent indicating that the entire workflow is done.

{
  "run_id": "a1b2c3d4-...",
  "node_id": null,
  "status": "success", // or "failed"
  "output": null,
  "duration_ms": 3450,
  "ts": "2026-05-10T12:00:03.450Z"
}

[!TIP] The output payload in the final success message is null because the actual response text is delivered earlier by the workflow's Output Node (status: "success").
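
Because the final message carries no output, a client has to remember the response text from an earlier event. The reducer below is one possible sketch: `initialRunState` and `reduceRun` are hypothetical names, and it assumes the last node-level "success" event with a non-null output is the Output Node's response, since the tutorial does not say how to identify that node.

```javascript
// Starting state for one workflow run.
function initialRunState() {
  return { nodes: {}, toolResults: [], finalOutput: null, done: false, failed: false };
}

// Folds one streamed event into the run state without mutating the old state.
function reduceRun(state, event) {
  const next = {
    ...state,
    nodes: { ...state.nodes },
    toolResults: [...state.toolResults],
  };

  // Workflow-level events have a null node_id.
  if (event.node_id == null) {
    if (event.status === "success" || event.status === "failed") {
      next.done = true;
      next.failed = event.status === "failed";
    }
    return next;
  }

  // Tool results are recorded separately from node status.
  if (event.status === "tool_result") {
    next.toolResults.push({
      tool: event.tool_name,
      output: event.output,
      durationMs: event.duration_ms,
    });
    return next;
  }

  // Node lifecycle: remember the latest status per node.
  next.nodes[event.node_id] = event.status;
  if (event.status === "success" && event.output != null) {
    // Assumption: the most recent successful node output is the response text.
    next.finalOutput = event.output;
  }
  return next;
}
```

Feeding every parsed message through `reduceRun` leaves `finalOutput` populated by the time `done` becomes true.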


Implementing it in the Browser

Here's a quick vanilla JavaScript example to tie it all together:

const workflowId = "your-workflow-uuid";
const token = "your-pipecat-api-key"; // Or access token

const ws = new WebSocket(`wss://api.pipecat.in/ws/run/${workflowId}`);

ws.onopen = () => {
  // Send Auth + Input immediately
  ws.send(JSON.stringify({
    token: token,
    input: "Fetch my latest sales data."
  }));
};

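ws.onmessage = (event) => {
  const data = JSON.parse(event.data);

  if (data.status === "running") {
    console.log(`Node ${data.node_id} is running...`);
  } else if (data.status === "tool_call") {
    console.log("LLM is calling a tool...");
  } else if (data.status === "tool_result") {
    console.log(`Tool ${data.tool_name} returned in ${data.duration_ms}ms`);
  } else if (data.status === "error") {
    console.error(`Node ${data.node_id} failed`);
  } else if (data.status === "success" && !data.node_id) {
    // Workflow-level event: node_id is null and output is null
    console.log("Workflow completely finished!");
  } else if (data.status === "success") {
    // Node-level success: the Output Node's event carries the response text
    console.log(`Node ${data.node_id} output:`, data.output);
  } else if (data.status === "failed") {
    console.error("Workflow failed.");
  }
};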

ws.onclose = (event) => {
  if (event.code === 1008) {
    console.error("Connection rejected:", event.reason);
  }
};

With these events, you can build rich, interactive loading states exactly like the Pipecat dashboard itself!
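
As one possible starting point for such loading states, each event can be turned into a short label. This is a sketch: `statusLabel` is a hypothetical helper, and the wording of the labels is purely illustrative.

```javascript
// Converts a streamed event into a short, human-readable status line.
function statusLabel(event) {
  const isWorkflowLevel = event.node_id == null;
  switch (event.status) {
    case "running":
      return `Running ${event.node_id}...`;
    case "tool_call":
      return "Calling a tool...";
    case "tool_result":
      return `${event.tool_name} returned in ${event.duration_ms} ms`;
    case "error":
      return `Node ${event.node_id} failed`;
    case "failed":
      return "Workflow failed";
    case "success":
      return isWorkflowLevel ? "Done" : `Finished ${event.node_id}`;
    default:
      // Unknown statuses fall back to a generic label.
      return "Working...";
  }
}
```

Inside `ws.onmessage`, the returned string can be written straight into a status element.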

