Execution & Tool Mocking
The Lab allows you to simulate your entire production environment, including complex tool-use scenarios, without needing to run your backend code.
Running Prompts
Click the Run button (or Cmd+Enter) to execute the current configuration.
Streaming Process
The Lab uses Server-Sent Events (SSE) to stream the response token-by-token. This allows you to verify:
- Latency: Time to First Token (TTFT).
- Streaming format: Ensuring downstream apps handle the stream correctly.
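To check both, a downstream client can time the first SSE frame and print deltas as they arrive. The following is a minimal sketch, not the Lab's own client: the endpoint URL, request payload, and OpenAI-style chunk shape are assumptions to adapt to your setup.

```python
import json
import time

import requests

# Hypothetical endpoint and payload; replace with your Lab/gateway URL and schema.
URL = "https://api.example.com/v1/chat/completions"
payload = {
    "model": "gpt-4o-mini",
    "stream": True,
    "messages": [{"role": "user", "content": "Hello"}],
}

start = time.monotonic()
ttft = None

with requests.post(URL, json=payload, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        # SSE data frames are prefixed with "data: "; "[DONE]" marks the end of the stream.
        if not line or not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        if ttft is None:
            ttft = time.monotonic() - start  # time to first token
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if choices:
            # Each chunk carries a partial delta; print tokens as they arrive.
            print(choices[0].get("delta", {}).get("content") or "", end="", flush=True)

if ttft is not None:
    print(f"\nTTFT: {ttft:.3f}s")
```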
Tool Mocking
One of the Lab’s most powerful features is its ability to intercept and mock tool calls. This lets you test how your prompt handles function-calling scenarios without implementing the actual functions; the sketch after the steps below shows the same flow in code.
How it Works
- Define a Mock Tool in the “Tools” tab.
- Assign a `name` (e.g., `get_weather`) and a `mock_return` JSON.
- When the model decides to call this tool, the Lab intercepts the request.
- It strictly validates the call against the tool definition.
- It injects your `mock_return` JSON back into the conversation as the tool’s result.
- The model continues generating based on that mock data.
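You can reproduce the same interception loop outside the Lab. The sketch below uses the OpenAI Python SDK with a hypothetical `get_weather` mock; it illustrates the pattern rather than the Lab’s internal implementation.

```python
import json

from openai import OpenAI

client = OpenAI()

# Illustrative mock registry, mirroring what the "Tools" tab captures.
MOCK_TOOLS = {
    "get_weather": {
        "definition": {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        },
        "mock_return": {"city": "Paris", "temp_c": 18, "conditions": "cloudy"},
    }
}

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
tools = [t["definition"] for t in MOCK_TOOLS.values()]

while True:
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
    msg = resp.choices[0].message
    if not msg.tool_calls:
        print(msg.content)  # final answer, grounded in the mock data
        break
    messages.append(msg)
    for call in msg.tool_calls:
        mock = MOCK_TOOLS[call.function.name]
        args = json.loads(call.function.arguments)  # the Lab would validate these against the schema
        # Inject the canned mock_return as the tool result instead of running real code.
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(mock["mock_return"]),
        })
```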
Example Configuration
Scenario: You want to test a customer support chatbot that needs to look up order status.
- Tool Name: `lookup_order`
- Description: `Get order details by ID`
- Mock Return: `{ "order_id": "12345", "status": "shipped", "delivery_date": "2024-12-25" }`
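Written out in full (an OpenAI-style tool schema is assumed here; adapt the shape to your provider), the same configuration might look like this:

```python
# Hypothetical full definition for the lookup_order mock tool.
lookup_order_tool = {
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Get order details by ID",
        "parameters": {
            "type": "object",
            "properties": {"id": {"type": "string", "description": "The order ID"}},
            "required": ["id"],
        },
    },
}

# The canned result the Lab injects whenever the model calls lookup_order.
mock_return = {
    "order_id": "12345",
    "status": "shipped",
    "delivery_date": "2024-12-25",
}
```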
Execution Flow:
- User: “Where is my order 12345?”
- Model: Calls `lookup_order(id="12345")`
- Lab: Intercepts call -> Returns Mock JSON.
- Model: “Your order #12345 has been shipped and is expected to arrive on December 25th.”
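Behind the scenes, the conversation the Lab assembles would resemble the following message trace (field names follow the OpenAI chat format and are illustrative):

```python
trace = [
    {"role": "user", "content": "Where is my order 12345?"},
    # The model requests the tool instead of answering directly.
    {"role": "assistant", "content": None, "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "lookup_order", "arguments": '{"id": "12345"}'},
    }]},
    # The Lab injects the mock_return as if the real function had run.
    {"role": "tool", "tool_call_id": "call_1",
     "content": '{"order_id": "12345", "status": "shipped", "delivery_date": "2024-12-25"}'},
    # The model answers using the mock data.
    {"role": "assistant",
     "content": "Your order #12345 has been shipped and is expected to arrive on December 25th."},
]
```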
Error Handling
The Lab exposes raw provider errors to help you debug:
- 400 Bad Request: Often caused by invalid parameters or an exceeded context window.
- 429 Rate Limit: You are hitting the provider’s limits.
- 500 Provider Error: Upstream issue with OpenAI/Anthropic/Google.
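When you call the provider (or a gateway) yourself outside the Lab, a small helper can make these statuses easier to handle. This is a generic sketch with a hypothetical endpoint, not part of the Lab:

```python
import time

import requests

def call_with_retries(url: str, payload: dict, max_retries: int = 3) -> dict:
    """Surface 400s immediately, back off on 429, and retry 5xx as upstream issues."""
    for attempt in range(max_retries + 1):
        resp = requests.post(url, json=payload, timeout=60)
        if resp.status_code == 400:
            # Invalid params or context window exceeded: not retryable, show the body.
            raise ValueError(f"Bad request: {resp.text}")
        if resp.status_code in (429, 500, 502, 503) and attempt < max_retries:
            time.sleep(2 ** attempt)  # exponential backoff before retrying
            continue
        resp.raise_for_status()
        return resp.json()
```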