Execution & Tool Mocking
The Lab allows you to simulate your entire production environment, including complex tool-use scenarios, without needing to run your backend code.
Running Prompts
Click the Run button (or Cmd+Enter) to execute the current configuration.
Streaming Process
The Lab uses Server-Sent Events (SSE) to stream the response token-by-token. This allows you to verify:
- Latency: Time to First Token (TTFT).
- Streaming format: Ensuring downstream apps handle the stream correctly.
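To check both, a downstream client can time the first SSE frame and print deltas as they arrive. The following is a minimal sketch, not the Lab's own client: the endpoint URL, request payload, and OpenAI-style chunk shape are assumptions to adapt to your setup.

```python
import json
import time

import requests

# Hypothetical endpoint and payload; replace with your Lab/gateway URL and schema.
URL = "https://api.example.com/v1/chat/completions"
payload = {
    "model": "gpt-4o-mini",
    "stream": True,
    "messages": [{"role": "user", "content": "Hello"}],
}

start = time.monotonic()
ttft = None

with requests.post(URL, json=payload, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        # SSE data frames are prefixed with "data: "; "[DONE]" marks the end of the stream.
        if not line or not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        if ttft is None:
            ttft = time.monotonic() - start  # time to first token
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if choices:
            # Each chunk carries a partial delta; print tokens as they arrive.
            print(choices[0].get("delta", {}).get("content") or "", end="", flush=True)

if ttft is not None:
    print(f"\nTTFT: {ttft:.3f}s")
```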
Tool Mocking
One of the Lab’s most powerful features is its ability to intercept and mock tool calls. This lets you test how your prompt handles function-calling scenarios without implementing the actual functions; the sketch after the steps below shows the same flow in code.
How it Works
- Define a Mock Tool in the “Tools” tab.
- Assign a `name` (e.g., `get_weather`) and a `mock_return` JSON.
- When the model decides to call this tool, the Lab intercepts the request.
- It strictly validates the call against the tool definition.
- It injects your `mock_return` JSON back into the conversation as the tool’s result.
- The model continues generating based on that mock data.
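You can reproduce the same interception loop outside the Lab. The sketch below uses the OpenAI Python SDK with a hypothetical `get_weather` mock; it illustrates the pattern rather than the Lab’s internal implementation.

```python
import json

from openai import OpenAI

client = OpenAI()

# Illustrative mock registry, mirroring what the "Tools" tab captures.
MOCK_TOOLS = {
    "get_weather": {
        "definition": {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        },
        "mock_return": {"city": "Paris", "temp_c": 18, "conditions": "cloudy"},
    }
}

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
tools = [t["definition"] for t in MOCK_TOOLS.values()]

while True:
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
    msg = resp.choices[0].message
    if not msg.tool_calls:
        print(msg.content)  # final answer, grounded in the mock data
        break
    messages.append(msg)
    for call in msg.tool_calls:
        mock = MOCK_TOOLS[call.function.name]
        args = json.loads(call.function.arguments)  # the Lab would validate these against the schema
        # Inject the canned mock_return as the tool result instead of running real code.
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(mock["mock_return"]),
        })
```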
Example Configuration
Scenario: You want to test a customer support chatbot that needs to look up order status.
- Tool Name: `lookup_order`
- Description: `Get order details by ID`
- Mock Return: `{ "order_id": "12345", "status": "shipped", "delivery_date": "2024-12-25" }`
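Written out in full (an OpenAI-style tool schema is assumed here; adapt the shape to your provider), the same configuration might look like this:

```python
# Hypothetical full definition for the lookup_order mock tool.
lookup_order_tool = {
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Get order details by ID",
        "parameters": {
            "type": "object",
            "properties": {"id": {"type": "string", "description": "The order ID"}},
            "required": ["id"],
        },
    },
}

# The canned result the Lab injects whenever the model calls lookup_order.
mock_return = {
    "order_id": "12345",
    "status": "shipped",
    "delivery_date": "2024-12-25",
}
```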
Execution Flow:
- User: “Where is my order 12345?”
- Model: Calls `lookup_order(id="12345")`
- Lab: Intercepts call -> Returns Mock JSON.
- Model: “Your order #12345 has been shipped and is expected to arrive on December 25th.”
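Behind the scenes, the conversation the Lab assembles would resemble the following message trace (field names follow the OpenAI chat format and are illustrative):

```python
trace = [
    {"role": "user", "content": "Where is my order 12345?"},
    # The model requests the tool instead of answering directly.
    {"role": "assistant", "content": None, "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "lookup_order", "arguments": '{"id": "12345"}'},
    }]},
    # The Lab injects the mock_return as if the real function had run.
    {"role": "tool", "tool_call_id": "call_1",
     "content": '{"order_id": "12345", "status": "shipped", "delivery_date": "2024-12-25"}'},
    # The model answers using the mock data.
    {"role": "assistant",
     "content": "Your order #12345 has been shipped and is expected to arrive on December 25th."},
]
```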
Error Handling
The Lab exposes raw provider errors to help you debug:
- 400 Bad Request: Often caused by invalid parameters or an exceeded context window.
- 429 Rate Limit: You are hitting the provider’s limits.
- 500 Provider Error: Upstream issue with OpenAI/Anthropic/Google.
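When you call the provider (or a gateway) yourself outside the Lab, a small helper can make these statuses easier to handle. This is a generic sketch with a hypothetical endpoint, not part of the Lab:

```python
import time

import requests

def call_with_retries(url: str, payload: dict, max_retries: int = 3) -> dict:
    """Surface 400s immediately, back off on 429, and retry 5xx as upstream issues."""
    for attempt in range(max_retries + 1):
        resp = requests.post(url, json=payload, timeout=60)
        if resp.status_code == 400:
            # Invalid params or context window exceeded: not retryable, show the body.
            raise ValueError(f"Bad request: {resp.text}")
        if resp.status_code in (429, 500, 502, 503) and attempt < max_retries:
            time.sleep(2 ** attempt)  # exponential backoff before retrying
            continue
        resp.raise_for_status()
        return resp.json()
```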