Reasoning Patterns in AI Agents: Chain-of-Thought, ReAct, and Beyond
Reviewed: June 4, 2026
How an agent thinks matters more than how much it knows. The reasoning pattern — the internal process by which an agent decomposes problems, makes decisions, and arrives at conclusions — is the primary determinant of agent reliability. This post covers the major reasoning patterns, when to use each, and how to combine them for complex tasks.
Chain-of-Thought (CoT): The Foundation
Chain-of-Thought prompting asks the model to reason step by step before answering. It sounds simple, but it can improve performance on complex reasoning tasks substantially, by roughly 20-60% in some settings, though the size of the gain depends on the model and the benchmark.
# Standard prompt
"What is the profit margin if revenue is $1M and costs are $750K?"

# Chain-of-Thought prompt
"What is the profit margin if revenue is $1M and costs are $750K?
Let's think step by step."

# Typical model response to the Chain-of-Thought prompt
1. Profit = Revenue - Costs = $1,000,000 - $750,000 = $250,000
2. Profit Margin = Profit / Revenue = $250,000 / $1,000,000 = 0.25 = 25%
The answer is 25%.
When CoT works well: math problems, logic puzzles, multi-step planning, code generation.
When CoT fails: tasks requiring external knowledge (the model might "reason" its way to a hallucination) and tasks where the step-by-step decomposition isn’t obvious.
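As a minimal sketch of how the prompt pattern above might be wired into code, the helpers below append the step-by-step cue and pull out the final answer. The names `build_cot_prompt` and `extract_final_answer` are hypothetical, the "The answer is" convention is an assumption about how the model ends its response, and no model is called: the response text is the one from the example above.

```python
import re
from typing import Optional

COT_SUFFIX = "Let's think step by step."

def build_cot_prompt(question: str) -> str:
    """Append a step-by-step cue to the question (zero-shot CoT)."""
    return f"{question.strip()}\n{COT_SUFFIX}"

def extract_final_answer(response: str) -> Optional[str]:
    """Return the text after 'The answer is', or None if the cue is missing."""
    match = re.search(r"The answer is\s*(.+?)\.?\s*$", response.strip())
    return match.group(1) if match else None

# Hard-coded response (assumption: a real agent would get this from the model).
response = (
    "1. Profit = Revenue - Costs = $1,000,000 - $750,000 = $250,000\n"
    "2. Profit Margin = Profit / Revenue = $250,000 / $1,000,000 = 0.25 = 25%\n"
    "The answer is 25%."
)
answer = extract_final_answer(response)
print(answer)  # 25%
```

Returning `None` when the cue is missing lets the caller retry or fall back instead of silently parsing the wrong text.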
ReAct: Reasoning + Acting
CoT is purely internal. ReAct (Reasoning + Acting) interleaves reasoning with external actions. The agent thinks, acts, and observes in a loop:
# ReAct loop
Thought 1: I need to find the current population of Tokyo
Action 1: search("Tokyo population 2027")
Observation 1: According to World Population Review, Tokyo's population is 37.4 million
Thought 2: Now let me compare with New York
Action 2: search("New York population 2027")
Observation 2: New York metro population is 20.1 million
Thought 3: Tokyo is larger. The ratio is 37.4/20.1 ≈ 1.86x
Answer: Tokyo's population is approximately 1.86 times that of New York.
Each cycle: Thought → Action → Observation → repeat until answer
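The cycle above can be sketched as a small loop. This is an illustration under stated assumptions, not a reference implementation: `fake_search` and `scripted_llm` are hypothetical stand-ins for a real search tool and a real model, the `Action: name("arg")` text format is an assumed convention, and the population figures are simply the ones from the trace above.

```python
import re
from typing import Callable, Dict

def fake_search(query: str) -> str:
    """Hypothetical search tool that replays the observations from the trace."""
    if "Tokyo" in query:
        return "Tokyo's population is 37.4 million"
    if "New York" in query:
        return "New York metro population is 20.1 million"
    return "No results"

TOOLS: Dict[str, Callable[[str], str]] = {"search": fake_search}
ACTION_RE = re.compile(r'^Action: (\w+)\("(.*)"\)$', re.MULTILINE)
ANSWER_RE = re.compile(r"^Answer: (.*)$", re.MULTILINE)

def react(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    """Run Thought -> Action -> Observation cycles until the model answers."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # a Thought plus either an Action or an Answer
        transcript += step + "\n"
        answer = ANSWER_RE.search(step)
        if answer:
            return answer.group(1)
        action = ACTION_RE.search(step)
        if action is None:
            observation = "Could not parse an action."
        else:
            name, arg = action.groups()
            tool = TOOLS.get(name)
            observation = tool(arg) if tool else f"Unknown tool: {name}"
        # Feed the tool result back so the next Thought can use it.
        transcript += f"Observation: {observation}\n"
    return "No answer within the step budget."

def scripted_llm(transcript: str) -> str:
    """Hypothetical model: replays a fixed script, one step per observation."""
    steps = [
        'Thought: I need the population of Tokyo.\nAction: search("Tokyo population")',
        'Thought: Now I need New York.\nAction: search("New York population")',
        "Thought: 37.4 / 20.1 is about 1.86.\nAnswer: Tokyo is about 1.86 times as large.",
    ]
    return steps[transcript.count("Observation:")]

result = react("How does Tokyo's population compare with New York's?", scripted_llm)
print(result)  # Tokyo is about 1.86 times as large.
```

The `max_steps` budget matters in practice: without it, a model that never emits an answer would loop indefinitely.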
Tree of Thoughts (ToT): Exploring Multiple Paths
CoT follows a single reasoning path. ToT explores multiple paths simultaneously and evaluates them:
class TreeOfThought:
    def solve(self, problem, breadth=3, depth=4):
        # Each candidate is a (thought, score) pair; the root starts unscored.
        candidates = [(problem, 0.0)]
        for level in range(depth):
            next_candidates = []
            for state, _ in candidates:
                # Generate `breadth` possible next thoughts from this state
                branches = self.generate_candidates(state, n=breadth)
                # Evaluate each branch
                for branch in branches:
                    score = self.evaluate(branch)
                    next_candidates.append((branch, score))
            if not next_candidates:
                break  # Nothing left to expand
            # Keep the top `breadth` candidates for the next level
            candidates = sorted(next_candidates, key=lambda x: -x[1])[:breadth]
        # Return the best (thought, score) pair
        return max(candidates, key=lambda x: x[1])
When to use ToT: games and puzzles (chess, Game of 24), creative writing with multiple plot options, complex planning with branching possibilities.
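To make the generate, evaluate, and prune cycle concrete, here is a toy run that needs no model. It assumes a made-up numeric puzzle (reach a target number from a start value using `+3`, `*2`, or `-1`), and a simple distance heuristic stands in for the LLM-based evaluator. All names (`expand`, `score`, `tree_search`) are hypothetical.

```python
from typing import List, Tuple

# A state is (current value, operations applied so far).
State = Tuple[int, Tuple[str, ...]]

OPS = {"+3": lambda v: v + 3, "*2": lambda v: v * 2, "-1": lambda v: v - 1}

def expand(state: State) -> List[State]:
    """Generate every possible next step from a state."""
    value, path = state
    return [(op(value), path + (name,)) for name, op in OPS.items()]

def score(state: State, target: int) -> float:
    """Heuristic evaluator: closer to the target is better (0 is a hit)."""
    return -abs(target - state[0])

def tree_search(start: int, target: int, breadth: int = 3, depth: int = 4) -> State:
    """Level-by-level search that keeps only the `breadth` best states."""
    frontier: List[State] = [(start, ())]
    best = frontier[0]
    for _ in range(depth):
        children = [child for state in frontier for child in expand(state)]
        children.sort(key=lambda s: score(s, target), reverse=True)
        frontier = children[:breadth]  # prune to the beam width
        if score(frontier[0], target) > score(best, target):
            best = frontier[0]
        if best[0] == target:
            break  # exact solution found
    return best

value, path = tree_search(start=1, target=14)
print(value, path)  # 14 ('+3', '+3', '*2')
```

Pruning to `breadth` states per level keeps the cost bounded at roughly `breadth * len(OPS)` evaluations per level, at the price of possibly discarding a path that only looks good later.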
Reflexion: Self-Critique and Improvement
After completing a task, the agent evaluates its own output and stores lessons for future attempts:
class ReflexionAgent:
    def attempt(self, task, max_attempts=3):
        # Lessons saved from earlier runs on this task type (assumed to be a string)
        memory = self.memory_store.get_recent_lessons(task.type)
        result = None
        for _ in range(max_attempts):
            # Include past lessons
            prompt = f"Task: {task.description}\n\nPast lessons:\n{memory}\n\nAttempt:"
            result = self.llm.complete(prompt)
            # Self-evaluate
            critique = self.evaluate(task, result)
            if critique.is_correct:
                return result
            # Store lesson for future attempts
            lesson = self.extract_lesson(result, critique)
            memory += f"\n- {lesson}"
            self.memory_store.save(lesson, task.type)
        return result  # Last attempt; none passed self-evaluation
Program-Aided Language Models (PAL)
For tasks requiring precise computation, have the agent write and execute code instead of reasoning in natural language:
Example: Instead of reasoning about a complex math problem in text, the agent writes a Python function and executes it. The code serves as a verifiable reasoning trace that produces exact results.
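A minimal sketch of that idea, reusing the profit-margin question from earlier in this post: the `generated_code` string is a hypothetical stand-in for what a model might write, and `run_generated` is an assumed helper name. Note that `exec` with emptied builtins is not a real sandbox; a production agent would need proper isolation.

```python
# Hypothetical model output; a real agent would receive this string from the LLM.
generated_code = '''
def solve():
    revenue = 1_000_000
    costs = 750_000
    profit = revenue - costs
    return profit / revenue * 100
'''

def run_generated(code: str) -> float:
    """Execute model-written code and return the result of its solve() function.

    Emptying __builtins__ only limits accidental misuse. It is NOT a security
    boundary, so untrusted code still needs a real sandbox.
    """
    namespace: dict = {"__builtins__": {}}
    exec(code, namespace)
    return namespace["solve"]()

margin = run_generated(generated_code)
print(f"{margin}%")  # 25.0%
```

The arithmetic is done by the interpreter rather than by the model, so the only thing left to verify is whether the generated code encodes the problem correctly.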
Combining Patterns in Production
Advanced agents combine multiple reasoning patterns:
class HybridReasoningAgent:
    def solve(self, task):
        # Step 1: Decompose with CoT
        subproblems = self.chain_of_thought.decompose(task)
        results = []
        for subproblem in subproblems:
            if subproblem.requires_tools:
                # Step 2: Gather information with ReAct
                facts = self.react_loop(subproblem)
            else:
                facts = None
            if subproblem.has_multiple_solutions:
                # Step 3: Explore with Tree of Thoughts
                solution = self.tree_of_thoughts.explore(subproblem, facts)
            else:
                solution = self.direct_answer(subproblem, facts)
            results.append(solution)
        # Step 4: Self-critique with Reflexion
        combined = self.synthesize(results)
        critique = self.evaluate(task, combined)
        if not critique.is_correct:
            return self.reflect_and_retry(task, critique)
        return combined
Debugging Reasoning Failures
When an agent reasons incorrectly, the failure usually falls into one of these patterns:
- Premature commitment: The agent locks into a wrong path early and doesn’t recover
- Confirmation bias: Subsequent steps support the initial (wrong) answer
- Error propagation: A mistake in step 1 cascades through all subsequent steps
- Overthinking: Too many unnecessary steps introduce noise
- Underthinking: Not enough reasoning steps for the complexity
Mitigation: Use self-consistency (generate multiple CoT paths and majority vote), add verification steps, and implement Reflexion for iterative improvement.
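Self-consistency is the easiest of these to sketch: sample several independent reasoning paths, keep only each path's final answer, and return the most common one. In the sketch below, `self_consistency` is a hypothetical helper, and the scripted sampler stands in for repeated model calls at a nonzero temperature. The sampled answers are made up for illustration.

```python
from collections import Counter
from typing import Callable, Tuple

def self_consistency(sample: Callable[[], str], n: int = 5) -> Tuple[str, float]:
    """Draw n final answers and return (majority answer, share of the votes)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    answers = [sample() for _ in range(n)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n

# Scripted answers (illustrative only); a real sampler would call the model
# with a CoT prompt and extract the final answer from each response.
scripted = iter(["25%", "25%", "33%", "25%", "20%"])
answer, agreement = self_consistency(lambda: next(scripted), n=5)
print(answer, agreement)  # 25% 0.6
```

The agreement ratio doubles as a cheap confidence signal: a low share of votes suggests the task may need a verification step or another reasoning pattern.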
Conclusion
The reasoning pattern is the cognitive architecture of your agent. Start with ReAct for tool-using agents, add ToT for problems with multiple solutions, and layer Reflexion on top for self-improvement. The best agents don’t just think: they think about how they think, learn from mistakes, and adapt their reasoning strategy to the task at hand.
Part of the Agent Memory & Knowledge Systems series on DataGate.ch
