Reasoning Patterns in AI Agents: Chain-of-Thought, ReAct, and Beyond

Reviewed: June 4, 2026

How an agent thinks matters more than how much it knows. The reasoning pattern — the internal process by which an agent decomposes problems, makes decisions, and arrives at conclusions — is the primary determinant of agent reliability. This post covers the major reasoning patterns, when to use each, and how to combine them for complex tasks.

Chain-of-Thought (CoT): The Foundation

Chain-of-Thought prompting asks the model to reason step by step before answering. It sounds simple, but it often improves performance on complex reasoning tasks; gains of 20-60% are cited, though the size of the improvement depends on the model and the benchmark.

# Standard prompt
"What is the profit margin if revenue is $1M and costs are $750K?"

# Chain-of-Thought prompt
"What is the profit margin if revenue is $1M and costs are $750K?
Let's think step by step:
1. Profit = Revenue - Costs = $1,000,000 - $750,000 = $250,000
2. Profit Margin = Profit / Revenue = $250,000 / $1,000,000 = 0.25 = 25%
The answer is 25%."

When CoT works well: Math problems, logic puzzles, multi-step planning, code generation

When CoT fails: Tasks requiring external knowledge (the model might "reason" its way to a hallucination), tasks where the step-by-step decomposition isn’t obvious
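As a minimal sketch, the prompt above could be built and its final line parsed in code as follows. The helper names `build_cot_prompt`, `extract_final_answer`, and `profit_margin` are hypothetical, and the "The answer is ..." convention is an assumption carried over from the example prompt, not a fixed standard.

```python
import re
from typing import Optional

COT_SUFFIX = "Let's think step by step:"


def build_cot_prompt(question: str) -> str:
    """Append the step-by-step cue to a plain question."""
    return f"{question}\n{COT_SUFFIX}"


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'The answer is', or None if absent.

    Assumes the model ends its reasoning with a line of that form.
    """
    matches = re.findall(
        r"The answer is\s*(.+?)\.?\s*$", completion, flags=re.MULTILINE
    )
    return matches[-1] if matches else None


def profit_margin(revenue: float, costs: float) -> float:
    """Same arithmetic as steps 1 and 2 of the example prompt."""
    return (revenue - costs) / revenue
```

Computing the margin directly gives a reference value to check the model's parsed answer against, which is one cheap way to catch a reasoning chain that drifted.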

ReAct: Reasoning + Acting

CoT is purely internal. ReAct (Reasoning + Acting) interleaves thought with external action. The agent thinks, acts, and observes the result in a loop:

# ReAct loop
Thought 1: I need to find the current population of Tokyo
Action 1: search("Tokyo population 2027")
Observation 1: According to World Population Review, Tokyo's population is 37.4 million
Thought 2: Now let me compare with New York
Action 2: search("New York population 2027")
Observation 2: New York metro population is 20.1 million
Thought 3: Tokyo is larger. The ratio is 37.4/20.1 ≈ 1.86x
Answer 3: Tokyo's population is approximately 1.86 times New York's.

Each cycle: Thought → Action → Observation → repeat until answer
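The cycle above can be sketched as a small driver loop. This is an illustration under stated assumptions, not a reference implementation: `react_loop` and `scripted_llm` are hypothetical names, the model is any callable that maps the transcript so far to its next step, and the `Action: tool("argument")` / `Answer: ...` line formats are conventions chosen for this sketch.

```python
import re
from typing import Callable, Dict, Iterable

# Assumed output conventions for this sketch (not a standard):
#   Action: tool_name("argument")
#   Answer: final text
ACTION_RE = re.compile(r'^Action:\s*(\w+)\("(.*)"\)\s*$', re.MULTILINE)
ANSWER_RE = re.compile(r"^Answer:\s*(.+)$", re.MULTILINE)


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Run Thought -> Action -> Observation cycles until an Answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # model emits a Thought plus an Action or an Answer
        transcript += step + "\n"

        answer = ANSWER_RE.search(step)
        if answer:
            return answer.group(1).strip()

        action = ACTION_RE.search(step)
        if action is None:
            # Feed the parse failure back so the model can correct itself
            transcript += "Observation: could not parse an action\n"
            continue

        name, arg = action.groups()
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name!r}"
        transcript += f"Observation: {observation}\n"

    raise RuntimeError("no answer within max_steps")


def scripted_llm(replies: Iterable[str]) -> Callable[[str], str]:
    """Stand-in for a real model: replays canned steps in order."""
    it = iter(replies)
    return lambda transcript: next(it)
```

The `max_steps` cap matters in practice: without it, a model that never emits an answer would keep calling tools indefinitely.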

Tree of Thoughts (ToT): Exploring Multiple Paths

CoT follows a single reasoning path. ToT explores multiple paths simultaneously and evaluates them:

class TreeOfThought:
    def solve(self, problem, breadth=3, depth=4):
        # Each candidate is a (thought, score) pair; the root has no score yet
        candidates = [(problem, 0.0)]
        
        for level in range(depth):
            next_candidates = []
            for thought, _ in candidates:
                # Generate `breadth` possible next thoughts
                branches = self.generate_candidates(thought, n=breadth)
                
                # Evaluate each branch
                for branch in branches:
                    score = self.evaluate(branch)
                    next_candidates.append((branch, score))
            
            if not next_candidates:
                break  # Nothing left to expand
            
            # Keep the top `breadth` candidates (beam search)
            candidates = sorted(next_candidates, key=lambda x: -x[1])[:breadth]
        
        # Return the best (thought, score) pair
        return max(candidates, key=lambda x: x[1])

When to use ToT: Games (Chess, 24-puzzle), creative writing with multiple plot options, complex planning with branching possibilities
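To make the search concrete, here is a toy instance under stated assumptions: the "thoughts" are arithmetic states rather than model outputs, `expand` stands in for thought generation, and `score` stands in for the evaluator. All names (`OPS`, `expand`, `score`, `beam_search`) are hypothetical and chosen for this sketch only.

```python
from typing import Callable, Dict, List, Tuple

# A state is (current value, operations applied so far)
State = Tuple[int, Tuple[str, ...]]

# Toy move set standing in for "possible next thoughts"
OPS: Dict[str, Callable[[int], int]] = {
    "+3": lambda v: v + 3,
    "*2": lambda v: v * 2,
    "-1": lambda v: v - 1,
}


def expand(state: State) -> List[State]:
    """Generate every child state reachable with one operation."""
    value, path = state
    return [(op(value), path + (name,)) for name, op in OPS.items()]


def score(state: State, target: int) -> int:
    """Higher is better: negative distance from the target."""
    return -abs(target - state[0])


def beam_search(start: int, target: int, breadth: int = 3, depth: int = 4) -> State:
    """Keep the `breadth` best states at each of `depth` levels."""
    beam: List[State] = [(start, ())]
    for _ in range(depth):
        children = [child for state in beam for child in expand(state)]
        children.sort(key=lambda s: score(s, target), reverse=True)
        beam = children[:breadth]
    return max(beam, key=lambda s: score(s, target))
```

Calling `beam_search(1, 24)` returns the closest value the beam reaches in four steps along with the operations that produced it. Because low-scoring states are pruned at every level, a wider `breadth` trades more evaluations for a lower chance of discarding a path that would have paid off later.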

Reflexion: Self-Critique and Improvement

After completing a task, the agent evaluates its own output and stores lessons for future attempts:

class ReflexionAgent:
    def attempt(self, task, max_attempts=3):
        memory = self.memory_store.get_recent_lessons(task.type)
        
        for attempt in range(max_attempts):
            # Include past lessons
            prompt = f"Task: {task.description}\n\nPast lessons:\n{memory}\n\nAttempt:"
            result = self.llm.complete(prompt)
            
            # Self-evaluate
            critique = self.evaluate(task, result)
            
            if critique.is_correct:
                return result
            else:
                # Store lesson for future attempts
                lesson = self.extract_lesson(result, critique)
                memory += f"\n- {lesson}"
                self.memory_store.save(lesson, task.type)
        
        return result  # No attempt passed self-evaluation; return the last one

Program-Aided Language Models (PAL)

For tasks requiring precise computation, have the agent write and execute code instead of reasoning in natural language:

Example: Instead of reasoning about a complex math problem in text, the agent writes a Python function and executes it. The code serves as a verifiable reasoning trace that produces exact results.
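A minimal sketch of that idea, reusing the profit-margin arithmetic from the CoT section: the model is assumed to return Python source that defines a `solve()` function, and the host executes it. `GENERATED_PROGRAM` is a hand-written stand-in for model output, and `run_generated_program` and `SAFE_BUILTINS` are hypothetical names. Note that restricting builtins as shown is not a real sandbox; production systems would need process-level isolation.

```python
# Stand-in for code a model might write for the profit-margin question
GENERATED_PROGRAM = """
def solve():
    revenue = 1_000_000
    costs = 750_000
    profit = revenue - costs
    return profit / revenue
"""

# Small allow-list of builtins; omitting __import__ blocks `import` statements.
# This limits accidents only and is NOT a security boundary.
SAFE_BUILTINS = {
    "abs": abs, "min": min, "max": max, "sum": sum,
    "round": round, "range": range, "len": len,
}


def run_generated_program(source: str, entry_point: str = "solve"):
    """Execute generated source and return the result of its entry point."""
    namespace = {"__builtins__": SAFE_BUILTINS}
    exec(source, namespace)  # defines solve() inside the isolated namespace
    return namespace[entry_point]()


margin = run_generated_program(GENERATED_PROGRAM)
print(f"Profit margin: {margin:.0%}")
```

The generated source doubles as the reasoning trace: each intermediate variable can be inspected or unit-tested, which is harder to do with free-form text.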

Combining Patterns in Production

Advanced agents combine multiple reasoning patterns:

class HybridReasoningAgent:
    def solve(self, task):
        # Step 1: Decompose with CoT
        subproblems = self.chain_of_thought.decompose(task)
        
        results = []
        for subproblem in subproblems:
            if subproblem.requires_tools:
                # Step 2: Gather information with ReAct
                facts = self.react_loop(subproblem)
            else:
                facts = None
            
            if subproblem.has_multiple_solutions:
                # Step 3: Explore with Tree of Thoughts
                solution = self.tree_of_thoughts.explore(subproblem, facts)
            else:
                solution = self.direct_answer(subproblem, facts)
            
            results.append(solution)
        
        # Step 4: Self-critique with Reflexion
        combined = self.synthesize(results)
        critique = self.evaluate(task, combined)
        
        if not critique.is_correct:
            return self.reflect_and_retry(task, critique)
        
        return combined

Debugging Reasoning Failures

When an agent reasons incorrectly, the failure usually falls into one of these patterns:

Premature commitment: The agent locks into a wrong path early and doesn’t recover
Confirmation bias: Subsequent steps support the initial (wrong) answer
Error propagation: A mistake in step 1 cascades through all subsequent steps
Overthinking: Too many unnecessary steps introduce noise
Underthinking: Not enough reasoning steps for the complexity

Mitigation: Use self-consistency (generate multiple CoT paths and majority vote), add verification steps, and implement Reflexion for iterative improvement.
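Self-consistency, the first mitigation above, can be sketched in a few lines. The assumptions here: `sample_answer` is a hypothetical callable that runs one CoT pass (with sampling enabled) and returns only the final answer string, and answers are compared by exact string match.

```python
from collections import Counter
from typing import Callable, Tuple


def self_consistency(
    sample_answer: Callable[[], str], n_samples: int = 5
) -> Tuple[str, float]:
    """Sample several reasoning paths and majority-vote on the final answer.

    Returns the winning answer and the fraction of samples that agreed with it.
    """
    answers = [sample_answer() for _ in range(n_samples)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n_samples
```

The agreement fraction is a rough confidence signal: a low value suggests the paths disagree and the task may warrant a verification step or a Reflexion retry. In practice answers usually need normalizing (for example "25%" versus "0.25") before voting.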

Conclusion

The reasoning pattern is the cognitive architecture of your agent. Start with ReAct for tool-using agents, add ToT for problems with multiple solutions, and layer Reflexion on top for self-improvement. The best agents in 2027 don’t just think — they think about how they think, learn from mistakes, and adapt their reasoning strategy to the task at hand.

Part of the Agent Memory & Knowledge Systems series on DataGate.ch
