Write a Behavioral Simulation
Unit tests check whether an individual function works, but they don’t tell you whether your agent behaves correctly as a whole. The Simulator addresses this by running an agent through a scripted conversation and asserting that it acts as expected.
This guide shows how to create a simulation to verify that our agent correctly uses the CalculatorTool for a math question.
The Core Concept
A simulation consists of two parts:
A Scenario: Defines the user’s messages and a list of conditions (Assertions) that must be true at the end.
The Simulator: The engine that runs the agent through the scenario and generates a report.
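To make the split between the two parts concrete, here is a minimal, self-contained sketch of the same pattern in plain Python. It is not how cognicoreai is implemented: Transcript, MiniScenario, run_scenario, and fake_agent are hypothetical names invented for this illustration, and the agent is reduced to a function that returns a canned result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names below are illustrative and are NOT part of cognicoreai.


@dataclass
class Transcript:
    """What the agent did on a turn: tools it used and what it said."""
    tools_used: List[str]
    final_response: str


@dataclass
class MiniScenario:
    """User messages plus named conditions checked at the end."""
    name: str
    steps: List[str]
    assertions: Dict[str, Callable[[Transcript], bool]]


def run_scenario(agent: Callable[[str], Transcript],
                 scenario: MiniScenario) -> Dict[str, bool]:
    """Send each step to the agent, then check every assertion
    against the transcript of the last step."""
    transcript = Transcript(tools_used=[], final_response="")
    for message in scenario.steps:
        transcript = agent(message)
    return {label: check(transcript)
            for label, check in scenario.assertions.items()}


def fake_agent(message: str) -> Transcript:
    # Stand-in agent: always "uses" the calculator and answers 15.
    return Transcript(tools_used=["calculator"],
                      final_response="The answer is 15.")


scenario = MiniScenario(
    name="division",
    steps=["What is 150 / 10?"],
    assertions={
        "used calculator": lambda t: "calculator" in t.tools_used,
        "mentions 15": lambda t: "15" in t.final_response,
    },
)

report = run_scenario(fake_agent, scenario)
print(report)
```

The scenario holds only data (messages and conditions), while the runner holds the control flow, which is why one runner can execute many scenarios against the same agent.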
Step 1: Define the Scenario
A scenario is just a Python object. We want to test if the agent, when asked “What is 150 / 10?”, does two things:
It uses the calculator tool.
Its final response includes the number 15.
from cognicoreai import Scenario, ToolUsedAssertion, ResponseContainsAssertion

# Define the assertions that must pass for the scenario to be successful
assertions = [
    # Did the agent use the 'calculator' tool at any point?
    ToolUsedAssertion(tool_name="calculator"),
    # Does the agent's final answer contain the substring "15"?
    ResponseContainsAssertion(expected_text="15"),
]

# Define the scenario with a name, conversational steps, and assertions
math_scenario = Scenario(
    name="Agent correctly uses calculator for division",
    steps=["Hey Cogni, what is 150 divided by 10?"],
    assertions=assertions,
)
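One thing to watch with a text check like this: if it is a plain substring match, as the comment in the listing suggests, a wrong answer such as "150" also contains "15" and would pass. The sketch below shows the difference using only the standard library. Both helper functions are hypothetical and are not part of cognicoreai; they only illustrate how a stricter check might behave.

```python
import re


def contains_substring(response: str, expected: str) -> bool:
    """Plain substring check."""
    return expected in response


def contains_number(response: str, expected: str) -> bool:
    """Match `expected` only when it is not part of a longer run of digits."""
    pattern = rf"(?<!\d){re.escape(expected)}(?!\d)"
    return re.search(pattern, response) is not None


wrong = "The answer is 150."
right = "The answer is 15."

print(contains_substring(wrong, "15"))  # substring check accepts the wrong answer
print(contains_number(wrong, "15"))     # digit-boundary check rejects it
print(contains_number(right, "15"))     # and still accepts the right one
```

If the scenario depends on an exact number, choosing an expected text that is harder to match by accident (for example "is 15") is a simple way to reduce false passes.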
Step 2: Set Up and Run the Simulator
Next, create your agent as you normally would. Then, instantiate the Simulator and call its run method.
For a reproducible test, we’ll use a mocked LLM, but you could also run this against the real OpenAI_LLM.
from unittest.mock import MagicMock
from cognicoreai import Agent, VolatileMemory, CalculatorTool, Simulator
# We also import the scenario we just defined
# from your_test_file import math_scenario

# 1. Set up the agent with a mock LLM and real tools
mock_llm = MagicMock()
agent = Agent(
    llm=mock_llm,
    memory=VolatileMemory(),
    tools=[CalculatorTool()],
)

# ... (code to configure mock_llm responses would go here) ...

# 2. Instantiate the simulator
simulator = Simulator()

# 3. Run the simulation
results = simulator.run(agent, [math_scenario])

# 4. Print the results
for result in results:
    print(f"Scenario '{result.scenario_name}' Passed: {result.passed}")
    for assertion, passed in result.assertion_results.items():
        print(f"  - Assertion {assertion}: {'PASS' if passed else 'FAIL'}")
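The listing leaves the mock_llm configuration as a placeholder, and the exact setup depends on which method the Agent calls on its LLM and what reply format it expects, neither of which is covered here. As a general illustration of how a MagicMock can be scripted, the sketch below uses side_effect to return a fixed sequence of replies. The method name generate and both reply strings are assumptions made up for this example, not the cognicoreai interface.

```python
from unittest.mock import MagicMock

mock_llm = MagicMock()

# `generate` is a HYPOTHETICAL method name; replace it with whatever
# method your agent actually calls on its LLM object.
# The reply strings are also invented placeholders.
mock_llm.generate.side_effect = [
    "TOOL_CALL calculator 150/10",  # first call: pretend the model asks for the tool
    "150 divided by 10 is 15.",     # second call: pretend the model gives the answer
]

# Each call consumes the next item from the side_effect list.
first = mock_llm.generate("Hey Cogni, what is 150 divided by 10?")
second = mock_llm.generate("tool result: 15")

# The mock records its calls, so a test can also check how it was used.
call_count = mock_llm.generate.call_count
print(first, second, call_count, sep="\n")
```

Because the replies are fixed, the same conversation plays out on every run, which is what makes the simulation reproducible.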
Running the simulation prints a pass/fail line for each scenario, followed by one line per assertion. You can use this report to build a suite of behavioral tests that keeps your agent reliable as you add more complexity.
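To use the report inside an automated test suite rather than reading printed output, the failures can be collected into a list and asserted on. The helper below is a sketch, not a cognicoreai API: collect_failures is a hypothetical function that relies only on the scenario_name and assertion_results attributes used in the loop above, and the sample results are stand-in objects built with SimpleNamespace so the example runs without the library.

```python
from types import SimpleNamespace
from typing import Iterable, List


def collect_failures(results: Iterable) -> List[str]:
    """Return one 'scenario: assertion' entry per failed assertion."""
    failures = []
    for result in results:
        for assertion, passed in result.assertion_results.items():
            if not passed:
                failures.append(f"{result.scenario_name}: {assertion}")
    return failures


# Stand-in result objects that mimic the attributes used in this guide.
sample_results = [
    SimpleNamespace(
        scenario_name="division",
        passed=False,
        assertion_results={"tool used": True, "response contains 15": False},
    ),
]

failures = collect_failures(sample_results)
print(failures)
```

In a test function, a line such as `assert not failures, failures` would then fail the run and show which assertions did not hold.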