Agents That Think: Introducing the ReAct Pattern
In our previous tutorials, we established the basic building blocks of an LLM agent: we implemented the observe-think-act loop, then added tool use and memory.
Our agent can now read code, suggest improvements, and remember past interactions.
However, what happens when the agent makes a mistake, or its suggested action doesn’t achieve the desired outcome? This is where ReAct comes in. In this tutorial, we’ll introduce the ReAct pattern and show how to enable our agent to integrate reasoning with tool use, interpret the results of its actions, and adapt its next steps accordingly, making it a more reliable, flexible, and effective assistant.
The ReAct Pattern: Think, Act, Observe
The ReAct (Reasoning and Acting) pattern combines reasoning steps with action steps performed via tools. Each turn of the loop has three parts:
- Thought (Reasoning): The LLM reasons about the current situation and determines the next action to take.
- Action: The agent executes the chosen action (e.g. calling a tool like read_file("sample.py")).
- Observation: The result of the action (the tool output) is returned to the agent, serving as the observation for the next turn.
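The three steps above can be sketched as a small loop. This is a minimal illustration rather than the agent built later in this tutorial: `make_scripted_llm` is a hypothetical stand-in that replays canned replies instead of calling a real model, and `fake_read_file` is an assumed stub tool.

```python
import json
from typing import Callable, Dict, List


def make_scripted_llm(responses: List[dict]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays canned JSON replies in order."""
    replies = iter(json.dumps(r) for r in responses)
    return lambda prompt: next(replies)


def fake_read_file(filepath: str) -> str:
    """Stub tool (assumption): returns fixed content instead of reading from disk."""
    return "def add(a, b): return a - b"


def react_loop(query: str, llm: Callable[[str], str],
               tools: Dict[str, Callable], max_steps: int = 3) -> str:
    prompt = query
    for _ in range(max_steps):
        reply = json.loads(llm(prompt))  # Thought, plus an action or an answer
        if "answer" in reply:
            return reply["answer"]
        action = reply["action"]
        observation = tools[action["tool"]](*action["args"])  # Action
        prompt = f"Observation: {observation}"  # Observation drives the next turn
    return "Max steps reached without a final answer"


llm = make_scripted_llm([
    {"thought": "I should read the file first.",
     "action": {"tool": "read_file", "args": ["sample.py"]}},
    {"thought": "The function subtracts, so the task can be answered.",
     "answer": "add() subtracts instead of adding."},
])
result = react_loop("Review sample.py", llm, {"read_file": fake_read_file})
print(result)
```

The only state carried between turns is the observation string, which is what lets the model react to what actually happened rather than to what it expected.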
Why Reasoning is Essential for Agent Success
- Self-correction: Reflection allows the agent to recognize when a tool call failed or when the output didn’t align with the goal.
- Plan Adjustment: The agent can assess the progress of the plan and modify its approach dynamically. For instance, if it suggests a fix, the reasoning step can verify that the suggested fix actually addresses the original problem.
- Increased Robustness: By incorporating a dedicated step to reason about its outputs, the agent becomes less prone to “hallucination”, and its responses are more grounded in real-world tool outputs.
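One way to ground the verification step in a real tool output is to give the agent a tool that checks its own patch. The sketch below is an assumption, not part of the agent built in this tutorial: `check_syntax` is a hypothetical tool that uses Python's standard `ast` module to report whether some code parses, so the next thought can respond to a concrete observation instead of guessing.

```python
import ast


def check_syntax(code: str) -> str:
    """Hypothetical tool: report whether `code` parses as valid Python."""
    try:
        ast.parse(code)
    except SyntaxError as e:
        return f"Syntax error on line {e.lineno}: {e.msg}"
    return "Syntax OK"


# A patch that is still broken, and one that parses.
broken_report = check_syntax("def check(u): return db.user = u")
fixed_report = check_syntax("def check(u): return db.user == u")
print(broken_report)
print(fixed_report)
```

A syntax check only shows that the patch parses, not that it fixes the original problem, so it complements the reasoning step rather than replacing it.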
Implementing the ReAct agent
The Agent code so far
We will modify the agent from the previous tutorial, CodeReviewAgentWithContext, to demonstrate a simple implementation of the ReAct pattern.
Set up the tools
import os
from typing import Callable, Dict


def read_file(filepath: str) -> str:
    """Read contents of a Python file"""
    if not os.path.exists(filepath):
        return f"File not found: {filepath}"
    with open(filepath, "r") as f:
        return f.read()


def patch_file(filepath: str, content: str) -> str:
    """Writes the given content to a file, completely replacing its current content."""
    try:
        with open(filepath, "w") as f:
            f.write(content)
        return f"File successfully updated: {filepath}. New content written."
    except Exception as e:
        return f"Error writing to file {filepath}: {e}"


def print_review(review: str) -> str:
    print(f"Review: {review}")
    return f"Printed review: {review}"


class ToolRegistry:
    """Holds available tools and dispatches them by name."""

    def __init__(self):
        self.tools: Dict[str, Callable] = {}

    def register(self, name: str, func: Callable):
        self.tools[name] = func

    def call(self, name: str, *args, **kwargs):
        if name not in self.tools:
            return f"Unknown tool: {name}"
        return self.tools[name](*args, **kwargs)
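A short usage sketch may help show why the registry returns a message for unknown tools instead of raising: in a ReAct loop, that message becomes the observation the LLM sees, giving it a chance to pick a valid tool on the next turn. The registry is repeated here so the snippet runs on its own, and `word_count` is a hypothetical demo tool that is not part of this tutorial's tool set.

```python
from typing import Callable, Dict


class ToolRegistry:
    """Same registry as above, repeated so this snippet is self-contained."""

    def __init__(self):
        self.tools: Dict[str, Callable] = {}

    def register(self, name: str, func: Callable):
        self.tools[name] = func

    def call(self, name: str, *args, **kwargs):
        if name not in self.tools:
            return f"Unknown tool: {name}"
        return self.tools[name](*args, **kwargs)


def word_count(text: str) -> str:
    """Hypothetical demo tool: count whitespace-separated words."""
    return f"{len(text.split())} words"


registry = ToolRegistry()
registry.register("word_count", word_count)

known = registry.call("word_count", "think act observe")
unknown = registry.call("delete_everything")  # never registered
print(known)    # 3 words
print(unknown)  # Unknown tool: delete_everything
```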
Implement the ReAct Agent
- Update the run() method to manage the observe, reason, act loop.
- Add build_system_prompt() to generate a ReAct system prompt.
- Update the `system_message_context` prompt in `think()` to implement the ReAct pattern.
- The ReAct pattern is implemented through prompt engineering: we give the LLM a carefully crafted prompt that directs it to reason about past actions and their results, then respond with the next action.
Note that we give the LLM a specific output format. We specify JSON so that the response is easy to parse and validate in code.
**NOTE:** As noted earlier in the tools tutorial, most modern LLMs have built-in tool-calling and structured-output features that produce more predictable structured output.
In this example, we keep things simple by telling the LLM how to format its response in the prompt, so the approach works with most LLMs.
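Because the format is enforced only by the prompt, the code that consumes the response should check it before acting on it. The following is a minimal sketch of such a check, assuming the two JSON shapes described in the system prompt (a "thought" plus either an "action" or an "answer"). `parse_react_response` is a hypothetical helper and is not part of the agent class that follows.

```python
import json
from typing import Any, Tuple


def parse_react_response(raw: str) -> Tuple[str, Any]:
    """Classify a raw LLM reply as ("action", dict), ("answer", str) or ("error", message)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return "error", f"Your response was not valid JSON: {e}"
    if not isinstance(data, dict):
        return "error", "Your response must be a JSON object."
    has_action, has_answer = "action" in data, "answer" in data
    if has_action and has_answer:
        return "error", "Include either 'action' or 'answer', never both."
    if has_answer:
        return "answer", data["answer"]
    if has_action:
        action = data["action"]
        if not isinstance(action, dict) or "tool" not in action:
            return "error", "'action' must be an object with a 'tool' field."
        return "action", {"tool": action["tool"], "args": action.get("args", [])}
    return "error", "Your response must include either 'action' or 'answer'."


kind, payload = parse_react_response(
    '{"thought": "Read it first.", "action": {"tool": "read_file", "args": ["sample.py"]}}'
)
bad_kind, bad_message = parse_react_response("Sure! Here is my plan...")
print(kind, payload)
print(bad_kind, bad_message)
```

Returning the error as a message rather than raising means it can be sent back to the LLM as the next input, so the model gets a chance to correct its own formatting.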
```python
import json
import os

import openai
import tiktoken


class CodeReviewAgentReAct:
    def __init__(self, tools_registry: ToolRegistry, model="gpt-4.1", memory_file="agent_memory.json", summarize_after=10, max_context_tokens=6000):
        self.tools = tools_registry
        self.model = model
        self.conversation_history = []  # Short-term memory
        self.memory_file = memory_file
        self.load_long_term_memory()  # Long-term memory (key-value store)
        self.conversation_summary = ""  # Summarized conversation history
        self.summarize_after = summarize_after
        self.turns_since_summary = 0
        self.max_context_tokens = max_context_tokens
        # Initialize tokenizer for the model
        try:
            self.tokenizer = tiktoken.encoding_for_model(model)
        except KeyError:
            # Model name unknown to tiktoken: fall back to a general-purpose encoding
            self.tokenizer = tiktoken.get_encoding("cl100k_base")

    def count_tokens(self, text: str) -> int:
        """Count tokens in a string"""
        return len(self.tokenizer.encode(text))

    def trim_history_to_fit(self, system_message: str):
        """Remove old messages until we fit within the token budget"""
        # Count tokens in system message
        fixed_tokens = self.count_tokens(system_message)
        # Count tokens in conversation history
        history_tokens = sum(self.count_tokens(msg["content"]) for msg in self.conversation_history)
        total_tokens = fixed_tokens + history_tokens
        # Drop the oldest messages first, but always keep the two most recent
        while total_tokens > self.max_context_tokens and len(self.conversation_history) > 2:
            removed_msg = self.conversation_history.pop(0)
            total_tokens -= self.count_tokens(removed_msg["content"])
        return total_tokens

    def summarize_history(self):
        """Use LLM to summarize the conversation so far."""
        if len(self.conversation_history) < 3:
            return
        # Single quotes inside the f-string keep this valid on Python versions before 3.12
        history_text = "\n".join(f"{msg['role']}:{msg['content']}" for msg in self.conversation_history)
        summary_prompt = f"""Summarize this conversation in 3-4 sentences,
preserving key facts, decisions, and actions taken:
{history_text}
Previous Summary: {self.conversation_summary or 'None'}
"""
        response = openai.responses.create(model=self.model, input=[{"role": "user", "content": summary_prompt}])
        self.conversation_summary = response.output_text
        # Keep only the last few turns + the summary
        recent_turns = self.conversation_history[-4:]  # Keep the last 4 messages (2 user/assistant exchanges)
        self.conversation_history = recent_turns
        self.turns_since_summary = 0

    def remember(self, key: str, value: str):
        """Store information in long term memory."""
        self.long_term_memory[key] = value
        self.save_long_term_memory()

    def recall(self, key: str) -> str:
        """Retrieve information from long term memory"""
        return self.long_term_memory.get(key, "No memory found for this key.")

    def get_relevant_memories(self) -> str:
        """Format long term memories for inclusion in prompts."""
        if not self.long_term_memory:
            return "No stored memories"
        memories = "\n".join(f"- {k}:{v}" for k, v in self.long_term_memory.items())
        return f"Relevant memories:\n{memories}"

    def save_long_term_memory(self):
        """Persist long term memory to JSON file"""
        try:
            with open(self.memory_file, "w") as f:
                json.dump(self.long_term_memory, f, indent=2)
        except Exception as e:
            print(f"Warning: Could not save memory to {self.memory_file}: {e}")

    def load_long_term_memory(self):
        """Load long term memory from JSON file"""
        # Start empty so the attribute exists even if the file is missing or unreadable
        self.long_term_memory = {}
        if os.path.exists(self.memory_file):
            try:
                with open(self.memory_file, "r") as f:
                    self.long_term_memory = json.load(f)
                print(f"Loaded {len(self.long_term_memory)} memories from {self.memory_file}")
            except Exception as e:
                print(f"Warning: Could not load memory from {self.memory_file}: {e}")
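
    def build_system_prompt(self) -> str:
        """Construct the ReAct system prompt with current context."""
        # Literal JSON braces are doubled ({{ and }}) because this is an f-string
        return f"""You are a code review assistant using the ReAct pattern.

## Available Tools
- read_file(filepath): Read contents of a file
- print_review(review): Print a code review for the user
- patch_file(filepath, content): Replace file contents entirely

## Context
{self.get_relevant_memories()}
Conversation summary: {self.conversation_summary or 'This is the start of the conversation.'}

## Response Format
You MUST respond with valid JSON in one of these two formats:

### Format 1: When you need to use a tool
{{"thought": "<your reasoning>", "action": {{"tool": "<tool name>", "args": ["<arg1>", "<arg2>"]}}}}

### Format 2: When the task is complete
{{"thought": "<your reasoning>", "answer": "<your final response>"}}

## Rules
1. Always include "thought" explaining your reasoning
2. Include "action" when you need to call a tool
3. Include "answer" only when the task is fully complete
4. Never include both "action" and "answer"
5. Respond with ONLY valid JSON: no markdown, no extra text

## Example
User: Review auth.py and fix any bugs
Response 1:
{{"thought": "I need to read auth.py before I can review it.", "action": {{"tool": "read_file", "args": ["auth.py"]}}}}
Observation: def check(u): return db.user = u
Response 2:
{{"thought": "The return statement uses = (assignment) where == (comparison) is needed. I will patch the file.", "action": {{"tool": "patch_file", "args": ["auth.py", "def check(u): return db.user == u"]}}}}
Observation: File successfully updated: auth.py
Response 3:
{{"thought": "The file was updated successfully, so the task is complete.", "answer": "Fixed auth.py: replaced = with == in check()."}}
"""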

    def think(self, user_input: str) -> str:
        """LLM decides which tool to use with both short term and long term context."""
        # Add user message to history
        self.conversation_history.append({"role": "user", "content": user_input})
        self.turns_since_summary += 1
        # Check if we should summarize
        if self.turns_since_summary >= self.summarize_after:
            self.summarize_history()
        # Include long term memory & summary in system context
        system_message_context = self.build_system_prompt()
        self.trim_history_to_fit(system_message_context)
        # Build prompt with system instructions
        messages = [
            {"role": "system", "content": system_message_context}
        ] + self.conversation_history
        response = openai.responses.create(model=self.model, input=messages)
        decision = response.output_text
        # Add assistant's decision to conversation history
        self.conversation_history.append({"role": "assistant", "content": decision})
        return decision

    def act(self, action: dict) -> str:
        """Execute the chosen tool and return the result as the observation."""
        try:
            tool_name = action.get("tool")
            args = action.get("args", [])
            # run() passes the observation to think(), which adds it to the
            # conversation history, so it is not appended a second time here.
            return str(self.tools.call(tool_name, *args))
        except Exception as e:
            return f"Error executing tool: {e}"

    def run(self, user_query: str, max_iterations=3):
        """
        Main execution loop with reflection.

        Args:
            user_query: The user's request
            max_iterations: Maximum number of think-act-reflect cycles. This keeps the agent from getting stuck in a loop.

        Returns:
            Final response string
        """
        current_input = user_query
        for step in range(max_iterations):
            print(f"\n{'-'*60}")
            print(f"\nStep {step+1} of {max_iterations}")
            print(f"\n{'-'*60}")
            llm_response = self.think(current_input)
            print(f"Agent's LLM Response:\n{llm_response}")
            try:
                parsed_response = json.loads(llm_response)
            except json.JSONDecodeError as e:
                current_input = (
                    f"Your response was not valid JSON. Error: {e}\n"
                    "Respond with ONLY valid JSON matching the required format."
                )
                continue  # Nothing to parse on this turn; ask the LLM to try again
            if "thought" in parsed_response:
                print(f"\nThought: {parsed_response['thought']}")
            if "answer" in parsed_response:
                print(f"\n Answer: {parsed_response['answer']}")
                return parsed_response["answer"]
            if "action" in parsed_response:
                action = parsed_response["action"]
                tool_name = action.get("tool", "unknown")
                args = action.get("args", [])
                observation = self.act(action)
                print(f"Action: {tool_name}({','.join(repr(a) for a in args)})")
                current_input = f"Observation: {observation}"
            else:
                # Neither action nor answer
                print("\nResponse missing both 'action' and 'answer'")
                current_input = (
                    "Your response must include either 'action' (to use a tool) "
                    "or 'answer' (if the task is complete). Please try again."
                )
        return "Max steps reached without a final answer"
```
Run the agent
Let’s run the agent and give it a more complex task to complete.
tool_registry = ToolRegistry()
# Register the tools we defined above
tool_registry.register("read_file", read_file)
tool_registry.register("print_review",print_review)
tool_registry.register("patch_file",patch_file)
agent = CodeReviewAgentReAct(tools_registry=tool_registry)
agent.run(user_query="Review the code in sample.py and fix any issues you find")
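To try this end to end, the agent needs a sample.py to work on, and the openai library needs an API key (it reads the OPENAI_API_KEY environment variable by default). The snippet below writes a small file with a deliberate bug into a temporary directory. The file contents are an illustrative assumption, not the file used in the original notebook, and `write_sample` is a hypothetical helper.

```python
import os
import tempfile

# Hypothetical file contents with a deliberate bug:
# average() raises ZeroDivisionError when given an empty list.
SAMPLE_CODE = '''def average(numbers):
    return sum(numbers) / len(numbers)
'''


def write_sample(directory: str) -> str:
    """Write the sample file into `directory` and return its path."""
    path = os.path.join(directory, "sample.py")
    with open(path, "w") as f:
        f.write(SAMPLE_CODE)
    return path


sample_path = write_sample(tempfile.mkdtemp())
print(f"Wrote {sample_path}")
```

Passing the full path in the query, for example `agent.run(user_query=f"Review the code in {sample_path} and fix any issues you find")`, lets the agent find the file regardless of the current working directory.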
What’s next
In this tutorial we have implemented a simple ReAct agent that ‘thinks’ about what actions to take based on the input that it’s been given.
In the next part of the series, we will look at more advanced patterns such as routing, planning, and multi-agent workflows.
Full Source Code Here: ReAct Agent Jupyter Notebook