
Agent Fundamentals

An AI agent is an autonomous system that uses an LLM as its reasoning engine to perceive its environment, reason about the situation, take actions via tools, and adapt based on the results. Unlike a plain chatbot, which only returns a text reply to each message, an agent connects to databases, APIs, and other tools to complete multi-step tasks autonomously.

Key Facts

  • Agent = LLM brain + tools + memory + planning
  • More capable models produce more capable agents (e.g. GPT-4 handles complex agent tasks far better than GPT-3.5)
  • Function calling capability is essential - the model must reliably output structured tool calls
  • Each reasoning step costs tokens - a single request may require 5-20 LLM calls
  • Use workflows (fixed step sequences) when the process is known; use agents only when dynamic decision-making is needed
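
To make the token-cost point concrete, here is a rough back-of-the-envelope sketch. The function name, the token counts, and the per-1k-token prices are all hypothetical placeholders, not real provider pricing; it also assumes every call uses the same number of tokens, which understates cost when the context grows with each step.

```python
def estimate_request_cost(llm_calls: int,
                          input_tokens_per_call: int,
                          output_tokens_per_call: int,
                          price_in_per_1k: float,
                          price_out_per_1k: float) -> float:
    """Estimate the cost of one agent request in dollars.

    All prices are caller-supplied assumptions; check your provider's
    current price list before relying on the result.
    """
    per_call = (input_tokens_per_call / 1000 * price_in_per_1k
                + output_tokens_per_call / 1000 * price_out_per_1k)
    return llm_calls * per_call


# Hypothetical numbers: 10 LLM calls, 2000 input and 300 output tokens each,
# $0.01 per 1k input tokens and $0.03 per 1k output tokens.
cost = estimate_request_cost(10, 2000, 300, 0.01, 0.03)
print(f"${cost:.2f}")  # $0.29
```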

Agent Components

1. LLM Brain (Reasoning Engine)

Core decision-making. Processes context, reasons about next steps, generates tool calls.

2. Tools

External capabilities: search, code execution, file operations, APIs, communication. Any function with a description for the LLM.
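
As a minimal sketch of "any function with a description", a tool can be modeled as a name, a description the LLM reads, and a callable. The `Tool` class, `TOOLS` registry, and the stubbed `get_weather` function are illustrative assumptions, not the API of any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    name: str
    description: str           # text shown to the LLM so it can pick a tool
    func: Callable[..., str]   # the function that actually runs


def get_weather(city: str) -> str:
    # Hypothetical stub; a real tool would call a weather API here.
    return f"Weather for {city}: 15C, rain probability 80%"


TOOLS: Dict[str, Tool] = {
    "weather_api": Tool("weather_api", "Return current weather for a city.", get_weather),
}


def describe_tools(tools: Dict[str, Tool]) -> str:
    """Render the tool list for inclusion in the LLM prompt."""
    return "\n".join(f"- {t.name}: {t.description}" for t in tools.values())


def call_tool(name: str, **kwargs: str) -> str:
    """Dispatch a structured tool call to the registered function."""
    return TOOLS[name].func(**kwargs)
```

For example, `call_tool("weather_api", city="Paris")` runs the stub, and `describe_tools(TOOLS)` produces the text an agent would place in its prompt.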

3. Memory

  • Short-term: current conversation context
  • Long-term: persistent knowledge across sessions (vector stores, databases)
  • Working memory (scratchpad): accumulated thoughts, actions, observations during execution
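
The three memory kinds can be sketched as one small container. This is an illustrative data structure under simplifying assumptions: the `AgentMemory` name is made up, and a plain dict stands in for the vector store or database that long-term memory would normally use.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentMemory:
    conversation: List[str] = field(default_factory=list)    # short-term: current dialogue
    long_term: Dict[str, str] = field(default_factory=dict)  # stand-in for a persistent store
    scratchpad: List[str] = field(default_factory=list)      # working memory for this run

    def record_step(self, thought: str, action: str, observation: str) -> None:
        """Append one thought/action/observation triple to the scratchpad."""
        self.scratchpad += [
            f"Thought: {thought}",
            f"Action: {action}",
            f"Observation: {observation}",
        ]

    def build_context(self) -> str:
        """Combine conversation and scratchpad into the text sent to the LLM."""
        return "\n".join(self.conversation + self.scratchpad)
```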

4. Planning

  • No planning: direct tool call from user request
  • Sequential: step-by-step execution plan
  • Hierarchical: subtasks handled by sub-agents
  • Iterative refinement: plan -> execute -> evaluate -> revise
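
The iterative refinement strategy (plan -> execute -> evaluate -> revise) can be sketched as a small loop. The four callables and the toy "draft/review" plan are hypothetical stand-ins; in a real agent each would typically be backed by an LLM call or a tool.

```python
from typing import Callable, List, Tuple

Plan = List[str]


def iterative_refinement(make_plan: Callable[[], Plan],
                         execute: Callable[[Plan], str],
                         evaluate: Callable[[str], Tuple[bool, str]],
                         revise: Callable[[Plan, str], Plan],
                         max_rounds: int = 3) -> Tuple[Plan, str]:
    """Run plan -> execute -> evaluate -> revise for at most max_rounds revisions."""
    plan = make_plan()
    result = execute(plan)
    for _ in range(max_rounds):
        ok, feedback = evaluate(result)
        if ok:
            break
        plan = revise(plan, feedback)   # adjust the plan using the feedback
        result = execute(plan)          # and try again
    return plan, result


# Toy usage: the evaluator insists that the result includes a review step.
final_plan, final_result = iterative_refinement(
    make_plan=lambda: ["draft"],
    execute=lambda plan: " -> ".join(plan),
    evaluate=lambda result: ("review" in result, "review"),
    revise=lambda plan, missing_step: plan + [missing_step],
)
print(final_result)  # draft -> review
```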

The ReAct Loop

The foundational agent execution pattern (Reasoning + Acting):

1. THOUGHT: Analyze situation, decide next action
2. ACTION: Call a tool with specific inputs
3. OBSERVATION: Receive tool output
4. Repeat steps 1-3 until the task is complete
5. FINAL ANSWER: Synthesize and respond

Example:

User: What's the weather in Paris and should I bring an umbrella?

Thought: I need to check weather in Paris
Action: weather_api(city="Paris")
Observation: Temperature: 15C, Rain probability: 80%

Thought: High rain probability means umbrella needed
Final Answer: Paris is 15C with 80% chance of rain. Bring an umbrella.
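
The trace above can be reproduced with a minimal loop sketch. To keep it runnable without a model, `scripted_llm` is a hard-coded stand-in for a real LLM and `weather_api` is a stub returning the values from the example; the `Action: name(arg="value")` text format and the regex that parses it are assumptions for illustration only.

```python
import re
from typing import Callable, Dict, List


def weather_api(city: str) -> str:
    # Hypothetical stub returning the values used in the example above.
    return "Temperature: 15C, Rain probability: 80%"


TOOLS: Dict[str, Callable[..., str]] = {"weather_api": weather_api}
ACTION_RE = re.compile(r'Action: (\w+)\((\w+)="([^"]*)"\)')


def scripted_llm(scratchpad: List[str]) -> str:
    """Stand-in for a model: act first, answer once an observation exists."""
    if not any(entry.startswith("Observation:") for entry in scratchpad):
        return 'Thought: I need to check weather in Paris\nAction: weather_api(city="Paris")'
    return ("Thought: High rain probability means umbrella needed\n"
            "Final Answer: Paris is 15C with 80% chance of rain. Bring an umbrella.")


def react_loop(question: str, llm: Callable[[List[str]], str], max_steps: int = 10) -> str:
    scratchpad = [f"User: {question}"]
    for _ in range(max_steps):
        reply = llm(scratchpad)                      # THOUGHT (+ ACTION or FINAL ANSWER)
        scratchpad.append(reply)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = ACTION_RE.search(reply)
        if match is None:
            scratchpad.append("Observation: could not parse an action")
            continue
        tool, arg, value = match.groups()
        result = TOOLS[tool](**{arg: value})         # ACTION (unknown tools raise KeyError here)
        scratchpad.append(f"Observation: {result}")  # OBSERVATION
    return "Stopped: step limit reached"
```

Calling `react_loop("What's the weather in Paris and should I bring an umbrella?", scripted_llm)` walks through one action and one observation before returning the final answer text.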

Agent Types

| Type | Description | Use Case |
|---|---|---|
| Tool-Use | LLM decides which tool to call. No complex planning. | Simple API integrations |
| Conversational | Maintains dialogue, asks clarifying questions | Customer support |
| Plan-and-Execute | Creates full plan first, then executes step by step | Complex multi-step tasks |
| Self-Correcting (Reflexion) | Evaluates own output, critiques, retries | Code generation, analysis |

Agent vs Workflow

| Factor | Agent (autonomous) | Workflow (predefined) |
|---|---|---|
| Flexibility | High - adapts to novel situations | Low - follows fixed steps |
| Predictability | Low - may take unexpected actions | High - deterministic path |
| Debugging | Hard - trace through reasoning | Easy - check each step |
| Cost | Higher - more LLM calls | Lower - minimal LLM calls |
| Best for | Open-ended research, dynamic tasks | Known processes, pipelines |
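
For contrast with an agent loop, a workflow can be sketched as a fixed list of steps run in order, with no model deciding what happens next. The step functions below are hypothetical examples; in practice some steps might be single LLM calls.

```python
from typing import Callable, List

Step = Callable[[str], str]


def run_workflow(steps: List[Step], data: str) -> str:
    """Run every step in order; the path never changes between requests."""
    for step in steps:
        data = step(data)
    return data


def normalize(text: str) -> str:
    return text.strip().lower()


def redact_digits(text: str) -> str:
    return "".join("#" if ch.isdigit() else ch for ch in text)


def add_prefix(text: str) -> str:
    return f"ticket: {text}"


result = run_workflow([normalize, redact_digits, add_prefix], "  Order 42 is LATE ")
print(result)  # ticket: order ## is late
```

Because the sequence is fixed, each step can be tested and debugged on its own, which is the predictability advantage listed in the table.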

Agent Architectures

Single Agent

One LLM handles everything. Simple but limited for complex tasks.

Router Pattern

LLM classifier routes to specialized agents:

User Request -> Router (classifies intent)
  -> FAQ Agent
  -> Technical Agent
  -> Billing Agent
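
A minimal sketch of the router pattern follows. A keyword check stands in for the LLM classifier so the example runs on its own; the keywords, intent labels, and agent functions are all hypothetical.

```python
from typing import Callable, Dict


def classify_intent(request: str) -> str:
    # Keyword stand-in for the LLM classifier (assumption for illustration).
    text = request.lower()
    if any(word in text for word in ("invoice", "refund", "charge")):
        return "billing"
    if any(word in text for word in ("error", "crash", "install")):
        return "technical"
    return "faq"


AGENTS: Dict[str, Callable[[str], str]] = {
    "faq": lambda request: f"[FAQ Agent] {request}",
    "technical": lambda request: f"[Technical Agent] {request}",
    "billing": lambda request: f"[Billing Agent] {request}",
}


def route(request: str) -> str:
    """Classify the request, then hand it to the matching specialized agent."""
    return AGENTS[classify_intent(request)](request)
```

For example, `route("I was charged twice and need a refund")` is handled by the billing agent, while a request with no matching keyword falls through to the FAQ agent.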

Supervisor Pattern

Boss agent delegates to specialized workers:

User Request -> Supervisor
  -> Worker 1 (Research)
  -> Worker 2 (Analysis)
  -> Worker 3 (Writing)
-> Supervisor synthesizes
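
The supervisor flow can be sketched with plain functions standing in for worker agents. The worker names, the fixed delegation order, and the string-based "synthesis" are simplifying assumptions; a real supervisor would usually be an LLM that chooses which worker to call next.

```python
from typing import Callable, Dict, Sequence


def research_worker(task: str) -> str:
    return f"findings for '{task}'"


def analysis_worker(findings: str) -> str:
    return f"analysis of {findings}"


def writing_worker(analysis: str) -> str:
    return f"summary of {analysis}"


WORKERS: Dict[str, Callable[[str], str]] = {
    "research": research_worker,
    "analysis": analysis_worker,
    "writing": writing_worker,
}


def supervisor(request: str,
               order: Sequence[str] = ("research", "analysis", "writing")) -> Dict[str, object]:
    """Delegate to each worker in turn, passing results along, then synthesize."""
    results: Dict[str, str] = {}
    context = request
    for name in order:
        results[name] = WORKERS[name](context)  # delegate the subtask
        context = results[name]                 # the next worker builds on this output
    return {"request": request, "results": results, "answer": context}
```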

Error Handling

Agents can fail at multiple points:

  • Malformed tool calls from the LLM
  • Tool execution failures (API error, timeout)
  • Infinite loops
  • Misunderstanding the task and taking the wrong action

Mitigation: max iteration limits (10-20 typical), output validation, fallback to human, structured error recovery.
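
A minimal sketch of these mitigations, assuming tool calls arrive as JSON of the form `{"tool": ..., "args": {...}}`. That format, the `KNOWN_TOOLS` schema, and the `HUMAN_FALLBACK` sentinel are illustrative assumptions rather than any library's convention.

```python
import json
from typing import Callable, Dict, Optional, Set, Tuple

MAX_ITERATIONS = 10                                          # iteration limit guards against loops
KNOWN_TOOLS: Dict[str, Set[str]] = {"weather_api": {"city"}}  # tool name -> required argument names
HUMAN_FALLBACK = "ESCALATED_TO_HUMAN"


def validate_tool_call(raw: str) -> Tuple[Optional[dict], Optional[str]]:
    """Return (call, None) when the call is valid, else (None, error message)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None, "tool call is not valid JSON"
    tool = call.get("tool") if isinstance(call, dict) else None
    if not isinstance(tool, str) or tool not in KNOWN_TOOLS:
        return None, "unknown or missing tool name"
    args = call.get("args")
    if not isinstance(args, dict) or set(args) != KNOWN_TOOLS[tool]:
        return None, f"arguments must be exactly {sorted(KNOWN_TOOLS[tool])}"
    return call, None


def run_guarded(next_call: Callable[[Optional[str]], str],
                execute_tool: Callable[[dict], str]) -> str:
    """Ask for tool calls until one succeeds; feed errors back; escalate at the limit."""
    last_error: Optional[str] = None
    for _ in range(MAX_ITERATIONS):
        raw = next_call(last_error)          # the model sees the previous error, if any
        call, last_error = validate_tool_call(raw)
        if call is None:
            continue                         # malformed call: retry with the error message
        try:
            return execute_tool(call)
        except Exception as exc:             # tool failure (API error, timeout, ...)
            last_error = f"tool failed: {exc}"
    return HUMAN_FALLBACK
```

Here `next_call` stands in for the LLM: it receives the last error message (or `None`) and returns the next raw tool call, so validation errors become feedback instead of crashes.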

Agent Benchmarks

| Benchmark | What It Tests |
|---|---|
| SWE-bench | Real GitHub issues (code understanding + fixing) |
| WebArena | Web browsing agent evaluation |
| GAIA | General AI assistants |
| ToolBench | Tool-use across diverse APIs |

Gotchas

  • Start with workflows, add agency gradually - don't make everything autonomous
  • Small or locally hosted models tend to make more errors in complex agent workflows - use a more capable model for those
  • Agent cost estimate: $0.01-$1.00 per request depending on complexity
  • The scratchpad (accumulated history) grows with each step - must be managed (truncation, summarization)
  • Logging everything (thoughts, actions, observations) is essential for debugging agents
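
One simple way to manage scratchpad growth is truncation: keep the original task plus the most recent entries. The sketch below assumes the scratchpad is a list of strings whose first entry is the task; the function name and the placeholder text are hypothetical, and summarizing the dropped entries with an LLM is a common alternative that is not shown here.

```python
from typing import List


def trim_scratchpad(entries: List[str], keep_recent: int = 4) -> List[str]:
    """Keep the first entry (the task) and the last keep_recent entries."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if len(entries) <= keep_recent + 1:
        return list(entries)                 # short enough: nothing to drop
    omitted = len(entries) - 1 - keep_recent
    marker = f"[{omitted} earlier entries omitted]"
    return [entries[0], marker] + entries[-keep_recent:]


history = ["task"] + [f"step {i}" for i in range(1, 8)]
print(trim_scratchpad(history, keep_recent=3))
# ['task', '[4 earlier entries omitted]', 'step 5', 'step 6', 'step 7']
```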

See Also

  • [[agent-design-patterns]] - ReAct, plan-and-execute, reflexion patterns in detail
  • [[function-calling]] - How agents invoke tools
  • [[multi-agent-systems]] - Multi-agent architectures
  • [[agent-memory]] - Memory management for agents
  • [[agent-security]] - Securing agent systems