BroomVA

Chat API

Send messages and receive AI responses through the chat API.

The chat API is the primary interface for sending messages to AI models and receiving responses. It supports both streaming and non-streaming modes, multi-model selection, and tool use.

Send a message

POST /api/chat -- Send a chat message and receive an AI response.

Request body

{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain event sourcing in 3 sentences."
    }
  ],
  "stream": true,
  "chatId": "optional-conversation-id"
}
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model identifier (see model list below) |
| messages | array | Yes | Array of message objects with role and content |
| stream | boolean | No | Enable SSE streaming (default: true) |
| chatId | string | No | Conversation ID for persistence. Generated if omitted. |
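
As a sketch, a client request can be assembled like this in TypeScript. The bearer-token `Authorization` header is an assumption (the auth scheme is not specified in this section), and `buildChatRequest` is an illustrative helper, not part of any platform SDK:

```typescript
// Hypothetical helper that builds fetch options for POST /api/chat.
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
  chatId?: string;
}

interface FetchOptions {
  method: string;
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(body: ChatRequestBody, token: string): FetchOptions {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Assumption: bearer-token auth; adjust to your deployment's scheme.
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(body),
  };
}

// Usage: await fetch("/api/chat", buildChatRequest({ model, messages }, token))
```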

Message roles

| Role | Description |
| --- | --- |
| system | System instructions (optional, one per request) |
| user | User message (text, or multipart with attachments) |
| assistant | Previous assistant response (for multi-turn context) |
| tool | Tool result (follows a tool_use assistant message) |

Available models

Model availability depends on the user's plan tier. The models are routed through the AI SDK v6 customProvider abstraction.

| Provider | Model ID | Plan |
| --- | --- | --- |
| Anthropic | claude-opus-4-20250514 | Pro+ |
| Anthropic | claude-sonnet-4-20250514 | Pro+ |
| Anthropic | claude-haiku-3-5-20241022 | Pro+ |
| OpenAI | gpt-4o | Pro+ |
| OpenAI | gpt-4o-mini | Pro+ |
| OpenAI | o1 | Pro+ |
| OpenAI | o3 | Pro+ |
| Google | gemini-2.5-pro | Pro+ |
| Google | gemini-2.5-flash | Pro+ |
| OpenRouter | Various (100+) | Free+ |
| Ollama | Any local model | Free+ |

Use GET /api/chat-model to retrieve the exact list of models available for your plan. Model IDs may change as providers release new versions.

Streaming response

When stream: true (the default), the response is an SSE stream using the Vercel AI SDK v6 data stream format:

data: {"type":"text-delta","textDelta":"Event"}
data: {"type":"text-delta","textDelta":" sourcing"}
data: {"type":"text-delta","textDelta":" is a pattern..."}
data: {"type":"tool-call","toolCallId":"call_1","toolName":"search","args":{"query":"..."}}
data: {"type":"tool-result","toolCallId":"call_1","result":{"...":"..."}}
data: {"type":"finish","finishReason":"stop","usage":{"promptTokens":42,"completionTokens":87}}

The stream emits UiPart objects with the following event types:

| Event type | Fields | Description |
| --- | --- | --- |
| text-delta | textDelta | Incremental text token |
| tool-call | toolCallId, toolName, args | Tool invocation request |
| tool-result | toolCallId, result | Tool execution result |
| finish | finishReason, usage | Completion signal |
| error | error | Error during generation |
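
As a sketch, the events above can be folded into a final answer once the stream has been read. `reduceStream` is a hypothetical helper that operates on the raw `data:` lines; in a real client you would feed it lines as they arrive from the SSE connection:

```typescript
// Hypothetical reducer over the SSE data lines of one completed stream.
// Event shapes follow the table above.
interface Usage {
  promptTokens: number;
  completionTokens: number;
}

type StreamEvent =
  | { type: "text-delta"; textDelta: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; args: unknown }
  | { type: "tool-result"; toolCallId: string; result: unknown }
  | { type: "finish"; finishReason: string; usage: Usage }
  | { type: "error"; error: string };

function reduceStream(lines: string[]): { text: string; usage?: Usage } {
  let text = "";
  let usage: Usage | undefined;
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue; // skip blanks and keep-alives
    const event = JSON.parse(line.slice("data: ".length)) as StreamEvent;
    if (event.type === "text-delta") text += event.textDelta;
    else if (event.type === "finish") usage = event.usage;
    else if (event.type === "error") throw new Error(event.error);
  }
  return { text, usage };
}
```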

The finish event includes token usage in the usage field:

{
  "type": "finish",
  "finishReason": "stop",
  "usage": {
    "promptTokens": 42,
    "completionTokens": 87
  }
}

The billing system uses these token counts to calculate credit consumption for the request.

Non-streaming response

When stream: false, the response is a complete JSON object:

{
  "id": "msg_abc123",
  "model": "claude-sonnet-4-20250514",
  "content": "Event sourcing is a pattern where...",
  "usage": {
    "promptTokens": 42,
    "completionTokens": 87
  },
  "finishReason": "stop"
}
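
A typed sketch of consuming this shape in TypeScript; the `ChatCompletion` interface and `parseCompletion` helper are illustrative, not exported by the platform:

```typescript
// Hypothetical typed view of the non-streaming response body,
// with a light runtime check before use.
interface ChatCompletion {
  id: string;
  model: string;
  content: string;
  usage: { promptTokens: number; completionTokens: number };
  finishReason: string;
}

function parseCompletion(json: string): ChatCompletion {
  const c = JSON.parse(json) as ChatCompletion;
  if (typeof c.id !== "string" || typeof c.content !== "string") {
    throw new Error("unexpected response shape");
  }
  return c;
}
```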

List available models

GET /api/chat-model -- List all available models for the authenticated user's plan.

Response

{
  "models": [
    {
      "id": "claude-sonnet-4-20250514",
      "name": "Claude Sonnet 4",
      "provider": "anthropic",
      "available": true
    },
    {
      "id": "gpt-4o",
      "name": "GPT-4o",
      "provider": "openai",
      "available": true
    },
    {
      "id": "gemini-2.5-pro",
      "name": "Gemini 2.5 Pro",
      "provider": "google",
      "available": true
    }
  ]
}

As noted above, availability depends on the user's plan tier: free-tier users have access to community models through OpenRouter and self-hosted models through Ollama, while Pro and above unlock all premium providers.

Conversation persistence

When a chatId is provided, messages are persisted to the platform database. Subsequent requests with the same chatId automatically include the conversation history, so you do not need to resend previous messages.

If chatId is omitted, the platform generates one and returns it in the response headers:

X-Chat-Id: chat_abc123

Use this ID in subsequent requests to continue the conversation.
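
The header round-trip can be sketched as a small helper. `nextChatId` is hypothetical; it simply prefers an already-known conversation ID over the one generated in the response headers:

```typescript
// Minimal interface satisfied by the fetch Headers class.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical helper: on the first request, capture X-Chat-Id from the
// response; on later requests, keep reusing the existing ID.
function nextChatId(headers: HeaderReader, current?: string): string | undefined {
  return current ?? headers.get("X-Chat-Id") ?? undefined;
}
```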

Memory integration

If the user has the memory vault enabled, the chat endpoint automatically:

  1. Queries the Lago knowledge index for memories relevant to the current conversation
  2. Injects retrieved memories as system context before the model call
  3. After the response, extracts new facts/preferences and stores them as memory events

This is transparent to the API caller -- memory augmentation happens server-side.

Tool use

The chat API supports tool calling for models that implement function calling (Claude, GPT-4o, Gemini). Tools are defined in the request and executed server-side through the MCP bridge:

{
  "model": "claude-sonnet-4-20250514",
  "messages": [{"role": "user", "content": "What's the weather in SF?"}],
  "tools": [
    {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string", "description": "City name"}
        },
        "required": ["location"]
      }
    }
  ]
}

When the model invokes a tool, the stream emits a tool-call event followed by a tool-result event after execution. The model then continues generating text that incorporates the tool result.

MCP server tools

In addition to user-defined tools, the chat endpoint can bridge tools from configured MCP servers. Organization-level MCP connections are resolved at request time and their tools are merged into the available tool set. The @ai-sdk/mcp adapter handles protocol translation between MCP and the AI SDK's tool interface.

Error responses

| Status | Code | Description |
| --- | --- | --- |
| 400 | validation_error | Missing required fields or invalid model |
| 401 | unauthorized | Missing or invalid token |
| 402 | credits_exhausted | Credit limit reached for this billing period |
| 429 | rate_limited | Too many requests |
| 500 | internal_error | Model provider error or internal failure |
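
One way a client might react to these statuses, sketched in TypeScript. The mapping (e.g. treating 429 as "back off and retry") is an assumption about client policy, not behavior the API prescribes:

```typescript
// Hypothetical client-side dispatch on the error statuses above.
type ErrorAction =
  | "fix-request"     // 400: correct the payload before retrying
  | "reauthenticate"  // 401: refresh or re-obtain the token
  | "add-credits"     // 402: credits exhausted for the billing period
  | "retry-later"     // 429: back off, then retry
  | "report";         // 500: provider or internal failure

function actionForStatus(status: number): ErrorAction {
  switch (status) {
    case 400: return "fix-request";
    case 401: return "reauthenticate";
    case 402: return "add-credits";
    case 429: return "retry-later";
    default: return "report";
  }
}
```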
