7 Planning and reflection for complex tasks
This chapter explains why purely reactive agents struggle with complex tasks and how planning and reflection give agents “time to think.” A ReAct-style loop is adaptable because it repeatedly observes, chooses a tool, and acts, but it can lose direction, repeat work, answer from partial information, or fail to recover from tool errors. The chapter contrasts this with how human experts work: they first break a problem into parts, proceed step by step, pause to evaluate progress, and adjust when new information or failures appear.
Planning is presented as a way to set direction by decomposing a complex request into manageable tasks and recording that task list in the agent’s context. The chapter implements a simple planning tool that stores tasks with statuses such as pending, in progress, and completed, allowing the LLM to reference the plan as it decides what to do next. It emphasizes that planning should be used selectively: it is valuable for multi-step research or tasks requiring information from multiple sources, but wasteful for simple questions or obvious procedures. The implementation stays intentionally simple, relying on the LLM’s language abilities rather than heavy code logic.
Reflection is introduced as a complementary mechanism for checking and correcting progress during execution. A reflection tool records the agent’s assessment of what it has learned, whether it is on track, whether errors require a new approach, and whether a final answer is ready. Reflection is especially useful after meaningful steps, when tools fail, when combining conflicting information, or before giving a final response. Together, planning and reflection form a cycle: plan, act, reflect, and replan when needed. This cycle helps agents avoid common reactive failures by maintaining direction, verifying completeness, preserving progress, and recovering from mistakes.
Figure: The roles of planning and reflection in AI agents.
Figure: The Planning-Reflection cycle in agent execution.
Summary
- Planning and reflection give agents "time to think." Instead of reacting moment by moment, agents plan before acting and check their work after acting. This gives them a form of metacognition: the ability to examine their own process.
- Planning decomposes complex problems into clear, manageable units. When "1. Research Kipchoge's record, 2. Research Moon distance, 3. Calculate time" is recorded in the context, the LLM references this plan to maintain direction across multiple steps.
- Reflection is most valuable when problems arise, less so when everything goes smoothly. When tools fail or results are unexpected, reflection lets the agent analyze the cause and try an alternative strategy instead of repeating the same failure.
- Planning and reflection form a complementary cycle. Planning provides direction, reflection checks that direction, and, when necessary, reflection triggers replanning. Neither works well in isolation.
- From a context engineering perspective, both are generation strategies: the LLM produces text that is written back into its own context. Planning adds "what to do next" to the context, while reflection adds "evaluation and direction so far." These texts influence the LLM's subsequent decisions.
FAQ
Why do reactive ReAct agents struggle with complex tasks?
ReAct agents decide what to do based only on the current observation, which makes them adaptable but short-sighted. On complex multi-step tasks, they can lose direction, repeat searches, fail to use information they already found, answer with partial information, or keep retrying failed tool calls until reaching max_steps.
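The retry failure mode described above can be illustrated with a toy loop. This is a minimal sketch, not the chapter's agent: `flaky_search` and `reactive_loop` are hypothetical names, and the "policy" is hard-coded to repeat the same tool call so the effect of having no plan and no reflection is visible.

```python
def flaky_search(query: str) -> str:
    """Hypothetical tool that always fails, standing in for a broken API."""
    raise RuntimeError("search backend unavailable")


def reactive_loop(question: str, max_steps: int = 5) -> list[str]:
    """Purely reactive loop: act, observe, repeat, with no plan or reflection."""
    context = [f"user: {question}"]
    for step in range(1, max_steps + 1):
        # The stand-in policy only reacts to the latest observation,
        # so it issues the same failing call on every step.
        try:
            observation = flaky_search(question)
        except RuntimeError as err:
            observation = f"error: {err}"
        context.append(f"step {step}: search -> {observation}")
    context.append("stopped: reached max_steps without an answer")
    return context


trace = reactive_loop("How long would Kipchoge take to run to the Moon?", max_steps=3)
print("\n".join(trace))
```

Every step records the same error, and the loop ends only because the step budget runs out.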
What do planning and reflection add to an AI agent?
Planning and reflection give agents “time to think.” Planning happens before acting and decomposes a complex problem into manageable tasks. Reflection happens during or after execution and checks whether the agent is making progress, whether the approach is still valid, and whether the plan needs to change.
How is planning similar to how human experts work?
Human experts usually do not start complex work immediately. A researcher first breaks a question into subproblems, and an experienced developer often writes a specification or implementation plan before coding. Planning gives the agent a similar structure: it identifies what must be done, in what order, and how to know when each part is complete.
When should an agent use a planning tool?
An agent should use planning for complex questions that require multiple research steps or combining information from different sources. For example, calculating how long it would take Eliud Kipchoge to reach the Moon requires finding his marathon record, calculating his pace, finding the Moon’s distance, and then calculating the travel time.
Planning is usually unnecessary for simple single-search questions, such as “What’s the weather in Seoul today?”, or tasks with obvious procedures, such as translating a short text.
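The four steps of the Kipchoge example can be checked with a short calculation. The numbers below are assumptions supplied for illustration, not values taken from the chapter: a marathon time of 2:01:09 (Kipchoge's official best, Berlin 2022), the standard marathon distance of 42.195 km, and an average Earth-Moon distance of 384,400 km.

```python
# Assumed inputs (not from the chapter); in the agent these come from searches.
MARATHON_KM = 42.195
RECORD_SECONDS = 2 * 3600 + 1 * 60 + 9  # 2:01:09
MOON_DISTANCE_KM = 384_400              # average Earth-Moon distance

# Step 2: convert the marathon record into an average speed.
speed_kmh = MARATHON_KM / (RECORD_SECONDS / 3600)

# Step 4: travel time at that constant speed, ignoring rest and physics.
travel_hours = MOON_DISTANCE_KM / speed_kmh
travel_days = travel_hours / 24

print(f"pace: {speed_kmh:.2f} km/h")
print(f"travel time: {travel_hours:,.0f} hours (about {travel_days:,.0f} days)")
```

Under these assumptions the answer comes out at roughly two years of nonstop running. The calculation depends on two independent lookups, which is what makes the question a reasonable candidate for planning.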
How does the chapter implement a simple planning tool?
The chapter implements a create_tasks tool that receives a list of tasks and returns them as formatted text. Each task has a content field and a status field. The statuses are pending, in_progress, and completed.
The returned task list is recorded in the agent’s context as a tool result, so the LLM can refer to it in later steps like a to-do list.
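A tool of this shape might look like the following. This is a minimal sketch under stated assumptions: the tool name, the two fields, and the three statuses come from the description above, while the `Task` type, the `MARKERS` mapping, and the exact output format are hypothetical choices made for illustration.

```python
from typing import Literal, TypedDict

Status = Literal["pending", "in_progress", "completed"]


class Task(TypedDict):
    content: str
    status: Status


# Hypothetical display markers; any readable format would work.
MARKERS = {"pending": "[ ]", "in_progress": "[~]", "completed": "[x]"}


def create_tasks(tasks: list[Task]) -> str:
    """Format the task list as text so it can be stored as a tool result."""
    lines = []
    for number, task in enumerate(tasks, start=1):
        if task["status"] not in MARKERS:
            raise ValueError(f"unknown status: {task['status']!r}")
        lines.append(f"{number}. {MARKERS[task['status']]} {task['content']}")
    return "\n".join(lines)


plan_text = create_tasks([
    {"content": "Research Kipchoge's marathon record", "status": "in_progress"},
    {"content": "Research the distance to the Moon", "status": "pending"},
    {"content": "Calculate the travel time", "status": "pending"},
])
print(plan_text)
```

The function holds no state of its own. The plan exists only as text in the context, which is what lets the LLM read it back like a to-do list.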
Why does the planning tool regenerate the entire task list instead of editing one task at a time?
Regenerating the whole task list keeps the implementation simple. Partial updates, such as “mark task 3 as completed,” require extra logic to track indices and synchronize state. For typical plans with 5–10 tasks, rewriting the full plan costs few extra tokens and avoids many state-management bugs.
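The trade-off can be seen in a small sketch. It assumes a stateless formatter named `create_tasks` (the output format here is hypothetical): an update is just a second call carrying the full list with new statuses, so there are no indices to track and no stored state to synchronize.

```python
def create_tasks(tasks: list[dict]) -> str:
    """Stateless formatter: the complete plan is passed in on every call."""
    return "\n".join(
        f"{i}. ({t['status']}) {t['content']}" for i, t in enumerate(tasks, start=1)
    )


first_plan = create_tasks([
    {"content": "Research Kipchoge's record", "status": "in_progress"},
    {"content": "Research Moon distance", "status": "pending"},
    {"content": "Calculate time", "status": "pending"},
])

# Later, the LLM resends the whole list with updated statuses
# instead of issuing a partial edit such as "mark task 1 as completed".
updated_plan = create_tasks([
    {"content": "Research Kipchoge's record", "status": "completed"},
    {"content": "Research Moon distance", "status": "in_progress"},
    {"content": "Calculate time", "status": "pending"},
])

print(updated_plan)
```

Both versions remain in the context as separate tool results, and the most recent one serves as the current plan.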
What is reflection in an AI agent?
Reflection is the act of pausing during execution to evaluate progress. It helps the agent ask questions such as: “What have I learned so far?”, “Am I close to the original goal?”, “Is this approach working?”, and “Do I need to change direction?”
Reflection records an evaluation in the context, which then influences the agent’s next decision.
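One way to picture such a tool is as a function that turns the agent's self-assessment into text for the context. This is a sketch under assumptions: the chapter mentions a `need_replan` flag, but the function name `reflect`, the `analysis` and `ready_to_answer` parameters, and the wording of the output are hypothetical.

```python
def reflect(analysis: str, need_replan: bool = False, ready_to_answer: bool = False) -> str:
    """Record the agent's assessment as text; the tool itself makes no decisions."""
    lines = [f"Reflection: {analysis}"]
    if need_replan:
        lines.append("Next: the current plan is no longer valid; call create_tasks with a revised plan.")
    elif ready_to_answer:
        lines.append("Next: all required information is gathered; give the final answer.")
    else:
        lines.append("Next: continue with the next task in the plan.")
    return "\n".join(lines)


note = reflect(
    "The search for the Moon's distance failed twice with the same query.",
    need_replan=True,
)
print(note)
```

As with planning, the code does almost nothing. The value comes from the LLM writing the assessment and then reading it back on the next step.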
When should an agent use reflection?
Reflection is useful after completing a meaningful step, when a tool fails, when multiple pieces of information need to be synthesized, or before giving a final answer. These checkpoints help the agent avoid drifting, recover from errors, resolve conflicting information, and verify that all required information has been gathered.
Reflection should not be used after every single tool call, because that creates unnecessary overhead.
How is reflection different from summarization?
Summarization compresses context, usually when the context becomes too long. Its main purpose is reducing token usage. Reflection is different: it is selectively triggered when the agent needs to check direction, analyze an error, synthesize results, or verify readiness for a final answer. Its purpose is decision support, not compression.
How do planning and reflection work together?
Planning and reflection form a cycle. First, the agent creates a plan and executes tools according to that plan. Then it reflects on the results. If things are going well, it continues to the next task. If the current plan is no longer valid, reflection can set need_replan=True, encouraging the agent to call create_tasks again and revise the plan.
Planning mainly helps the agent look ahead, while reflection helps it look back and correct course.
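The cycle can be traced with a scripted run. This sketch replaces the LLM with a fixed list of tool calls, so it shows only how the context evolves, not how decisions are made. The `search` stub, the `TOOLS` table, and all output formats are hypothetical; `create_tasks` and `need_replan` follow the names used above.

```python
def create_tasks(tasks: list[dict]) -> str:
    return "PLAN\n" + "\n".join(f"- [{t['status']}] {t['content']}" for t in tasks)


def search(query: str) -> str:
    # Stub tool: one query fails to trigger the replan path.
    if "lunar distance api" in query:
        return "ERROR: source unavailable"
    return f"results for {query!r}"


def reflect(analysis: str, need_replan: bool = False) -> str:
    hint = "Revise the plan with create_tasks." if need_replan else "Continue with the plan."
    return f"REFLECTION: {analysis} {hint}"


TOOLS = {"create_tasks": create_tasks, "search": search, "reflect": reflect}

# Scripted calls standing in for the LLM's choices: plan, act, reflect, replan.
scripted_calls = [
    ("create_tasks", {"tasks": [{"content": "Query lunar distance api", "status": "in_progress"}]}),
    ("search", {"query": "lunar distance api"}),
    ("reflect", {"analysis": "The distance source failed.", "need_replan": True}),
    ("create_tasks", {"tasks": [{"content": "Search the web for the average Earth-Moon distance", "status": "in_progress"}]}),
    ("search", {"query": "average Earth-Moon distance"}),
    ("reflect", {"analysis": "Distance found.", "need_replan": False}),
]

context = []
for name, args in scripted_calls:
    result = TOOLS[name](**args)
    context.append(f"{name}: {result}")

print("\n\n".join(context))
```

In a real agent the fourth call is not forced by code. The reflection text in the context is what prompts the LLM to choose `create_tasks` again.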