What Are AI Agents?

An AI agent is generally an LLM-based system that can plan, act, and use tools in an environment, often with some autonomy. Unlike a one-shot chatbot, an agent can perform multi-step reasoning and decision-making, possibly invoking external APIs or functions during its reasoning. In essence, the LLM serves as the “brain,” but it is wrapped in an agentic architecture that lets it interact with data, software, or other agents in a loop.

For example, consider a coding agent that receives a task like “find and fix the bug in this codebase.” The agent might parse the code, identify potential files to change, search for relevant documentation or code examples, apply modifications, run tests, and iterate. Each of those steps could involve the LLM reasoning (“I think the bug is in this function”) and acting (“run tests” via a tool). The agent architecture orchestrates these steps, checks results, and decides when the task is done or needs human help.
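The plan–act–check cycle above can be sketched in a few lines. In this sketch, `llm_plan` and `run_tool` are stand-in stubs (a real agent would call a model API and an actual test runner); only the control flow is the point.

```python
# Minimal sketch of a plan-act-check agent loop (stub LLM and stub tools).
def llm_plan(task, observations):
    """Stand-in for an LLM call: pick the next action from what we've seen."""
    if not observations:
        return ("read_code", None)
    if observations[-1] == "tests failed":
        return ("apply_fix", "off_by_one in parse()")  # hypothetical diagnosis
    return ("done", None)

def run_tool(action, arg, state):
    """Stand-in tool executor returning canned observations."""
    if action == "read_code":
        return "tests failed"   # pretend the test runner found a failure
    if action == "apply_fix":
        state["patched"] = arg
        return "tests passed"
    return "no-op"

def agent(task, max_steps=5):
    state, observations = {}, []
    for _ in range(max_steps):
        action, arg = llm_plan(task, observations)
        if action == "done":
            return state, observations
        observations.append(run_tool(action, arg, state))
    return state, observations
```

The loop terminates either when the planner declares the task done or when a step budget runs out, mirroring the “done or needs human help” decision described above.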

Architectures and Workflows

Recent research has identified common patterns in agent design. One influential approach is ReAct, which interleaves chain-of-thought reasoning with concrete actions. In ReAct, an LLM might alternate between articulating its reasoning (“I should query the database for X”) and specifying actions (“CALL_DATABASE_X with query …”). This synergy allows the model to break complex tasks into smaller steps and interact with tools (like databases or APIs) as part of its solution.
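As a sketch, ReAct can be expressed as a loop that alternates a thought string with an action string, appending each tool observation to a running transcript that the model sees on the next turn. The `think_act` stub below stands in for the LLM; the thought and action texts echo the example above.

```python
# Sketch of a ReAct-style loop: alternate reasoning ("Thought") and tool
# calls ("Action"), feeding each observation back into the transcript.
def think_act(transcript):
    """Stub LLM: emit a (thought, action) pair conditioned on the transcript."""
    if "Observation: 42 rows" in transcript:
        return ("I have the data I need.", "FINISH")
    return ("I should query the database for X.", "CALL_DATABASE_X")

def call_tool(action):
    """Stub tool dispatcher with a canned database result."""
    if action == "CALL_DATABASE_X":
        return "42 rows"
    return ""

def react(question, max_turns=4):
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        thought, action = think_act(transcript)
        transcript += f"Thought: {thought}\nAction: {action}\n"
        if action == "FINISH":
            break
        transcript += f"Observation: {call_tool(action)}\n"
    return transcript
```

A real implementation would parse the action string out of free-form model output; here the stub returns it directly to keep the interleaving pattern visible.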

Another theme is planning with sub-agents. For complex tasks, an “orchestrator-worker” pattern can be used. Here, one LLM (the orchestrator) breaks the problem into subtasks and delegates each to worker LLM instances (which may themselves use tools). The orchestrator then aggregates the results. Anthropic describes this workflow: “In the orchestrator-workers workflow, a central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results”. This is especially useful in coding, where one agent can, say, plan which files to modify, while other agents generate or review code for each file.
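The orchestrator-worker pattern reduces to three calls: decompose, fan out, synthesize. The sketch below uses stub functions in place of real LLM calls, with hypothetical file names, to show the shape of the workflow.

```python
# Sketch of an orchestrator-worker pattern: one "LLM" decomposes the task,
# worker "LLMs" handle subtasks, and the orchestrator synthesizes results.
def orchestrator_plan(task):
    """Stub orchestrator call: break the task into per-file subtasks."""
    return [f"modify {f}" for f in ("parser.py", "api.py")]  # hypothetical files

def worker(subtask):
    """Stub worker call: 'generate' a patch for one subtask."""
    return f"patch for {subtask.split()[-1]}"

def synthesize(task, results):
    """Stub aggregation step combining worker outputs."""
    return {"task": task, "patches": results}

def run(task):
    subtasks = orchestrator_plan(task)
    results = [worker(s) for s in subtasks]  # workers could run in parallel
    return synthesize(task, results)
```

Because the workers are independent, this is also where parallelism naturally enters: each subtask can be dispatched concurrently before the synthesis step.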

Some frameworks add memory or statefulness: agents remember past queries, use a short-term or long-term memory module, and refine their plans over multiple turns. Others emphasize tool learning: e.g., the Toolformer approach trains an LLM to learn when and how to call external APIs or calculators in a self-supervised way.
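A memory module of the kind described can be sketched as a short-term recency buffer plus a long-term store. The keyword-overlap recall below is a deliberate toy stand-in; production systems typically use embedding similarity instead.

```python
from collections import deque

# Sketch of an agent memory module: a bounded short-term buffer plus a
# naive keyword-indexed long-term store (real systems use embeddings).
class Memory:
    def __init__(self, short_term_size=3):
        self.short_term = deque(maxlen=short_term_size)  # recent turns only
        self.long_term = []                              # everything ever seen

    def remember(self, text):
        self.short_term.append(text)
        self.long_term.append(text)

    def recall(self, query):
        """Return recent turns plus long-term entries sharing a word with the query."""
        words = set(query.lower().split())
        hits = [t for t in self.long_term if words & set(t.lower().split())]
        return list(self.short_term), hits
```

On each turn the agent would prepend the recalled entries to its prompt, letting it refine plans using context that has scrolled out of the short-term window.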

In all cases, the key idea is that an agent is not just a static LLM; it is an active process. It may loop: receiving input, planning actions, executing tools, observing results, updating its plan, and so on. Anthropic summarizes: “Agents can handle sophisticated tasks, but their implementation is often straightforward. They are typically just LLMs using tools based on environmental feedback in a loop”. The “tools” might be as simple as a search API or a calculator, or as elaborate as a code execution environment.

Coding Agents

A special, fast-growing category is coding agents – AI agents designed to write, analyze, or refactor code. These agents combine powerful code-generation models (like OpenAI’s Codex or Google’s Gemini) with tools specific to software development. For example, an agent might use a code search tool to find relevant libraries, a symbol navigation tool to jump to definitions, and a test executor to verify output.
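The dev-specific tools named above (code search, symbol navigation, test execution) can be modeled as a registry the agent selects from by name. This is a minimal sketch with stub implementations over an in-memory file map; the file contents are invented for illustration.

```python
# Sketch of a coding agent's tool registry: each tool is a callable the
# agent picks by name. All implementations here are toy stubs.
def code_search(query, files):
    """Return files whose source mentions the query string."""
    return [f for f, src in files.items() if query in src]

def goto_definition(symbol, files):
    """Return the file defining a Python function with the given name."""
    for f, src in files.items():
        if f"def {symbol}" in src:
            return f
    return None

def run_tests(files):
    """Toy test runner: 'pass' if the test file contains an assertion."""
    return "PASS" if "assert" in files.get("test_app.py", "") else "NO TESTS"

TOOLS = {"code_search": code_search,
         "goto_definition": goto_definition,
         "run_tests": run_tests}

FILES = {  # hypothetical two-file repository
    "app.py": "def handler(req):\n    return req",
    "test_app.py": "assert handler(1) == 1",
}
```

An agent step then amounts to choosing a key in `TOOLS` and feeding the result back into the next round of reasoning.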

Research and industry examples highlight the promise of coding agents. DeepMind’s AlphaCode (2022) was an early system where an LLM generated many code solutions to competition problems, then filtered them to find correct answers. More recent work like AlphaEvolve (2025) uses an evolutionary framework: multiple LLM-generated algorithms are automatically evaluated and combined to improve algorithmic code. In parallel, the CodeAgent framework (2024) showed that wrapping a code LLM with external tools (like a project-aware symbol finder and test-runner) can drastically boost performance on real-world coding tasks, even outperforming off-the-shelf copilots.

In practical terms, a coding agent might take a natural-language task (e.g. “implement this API,” “fix this bug”) and iteratively work on a code repository. It can open files, generate or modify code, compile or run tests, and commit changes – all through an API interface. Anthropic demonstrates a “coding agent” workflow for multi-file edits and iterative feedback (see Figure below). This allows AI to handle project-level development tasks, not just isolated snippets.

How RAG and Agents Intersect

RAG and AI agents are complementary. An agent can treat a retrieval system as one of its tools. For instance, a coding agent might use RAG to look up documentation or code examples relevant to a function it’s writing. When an agent is uncertain, it can query a knowledge base via RAG to ground its decisions. In effect, RAG provides the “memory and world knowledge” that agents need to be accurate.
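Registering retrieval as one tool among others looks roughly like this. The corpus, the word-overlap scoring, and the doc snippets are toy stand-ins (a real deployment would use a vector store and embeddings); the point is that RAG sits behind the same tool interface as everything else.

```python
# Sketch of RAG as an agent tool: retrieval is just another callable in
# the agent's toolbox. Corpus and scoring are toy stand-ins.
DOCS = {  # hypothetical documentation snippets
    "requests.get": "requests.get(url, params=None) sends an HTTP GET.",
    "json.loads": "json.loads(s) parses a JSON string into Python objects.",
}

def retrieve(query, k=1):
    """Rank docs by word overlap with the query (real systems use embeddings)."""
    words = set(query.lower().split())
    scored = sorted(DOCS.items(),
                    key=lambda kv: len(words & set(kv[1].lower().split())),
                    reverse=True)
    return [doc for _, doc in scored[:k]]

TOOLS = {"retrieve_docs": retrieve}

def agent_step(uncertain_about):
    # When unsure, the agent grounds itself by calling the RAG tool.
    context = TOOLS["retrieve_docs"](uncertain_about)
    return f"Grounded on: {context[0]}"
```

The retrieved snippet would normally be injected into the model’s prompt; here `agent_step` just surfaces it to keep the sketch short.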

Conversely, agents enhance RAG. A simple RAG pipeline might return facts, but an agent can plan how to use those facts. For example, an agent could decide which documents to retrieve in the first place (using the LLM’s reasoning to form better queries), or decide how to combine multi-step retrievals. Research on agent memory (e.g. HippoRAG) views memory itself as a retrieval task, blurring the lines between memory-augmented LLMs and RAG. In practice, many RAG systems for QA or chat are implemented as agents under the hood: the model retrieves, then reads, then possibly refines its answer in multiple passes.
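The “agent decides what to retrieve” idea can be sketched as a multi-hop loop: a stub LLM forms an initial query, inspects the evidence, and issues a follow-up query if a gap remains. The corpus, the query strings, and the gap-detection heuristic are all invented for illustration.

```python
# Sketch of agentic RAG: the "LLM" (a stub) plans retrieval queries and
# issues a follow-up hop when the gathered evidence leaves a gap.
CORPUS = {  # hypothetical knowledge base, keyed by topic phrase
    "auth flow": "The API uses OAuth2; tokens expire after 1 hour.",
    "token refresh": "POST /refresh with the refresh_token to get a new token.",
}

def retrieve(query):
    """Toy retriever: return docs whose key phrase appears in the query."""
    return [text for key, text in CORPUS.items() if key in query.lower()]

def plan_queries(question, evidence):
    """Stub LLM: form a better query, then a follow-up if a gap remains."""
    if not evidence:
        return "auth flow overview"
    if (any("expire" in e for e in evidence)
            and not any("refresh_token" in e for e in evidence)):
        return "token refresh procedure"  # fill the gap the first hop exposed
    return None  # enough evidence gathered

def agentic_rag(question, max_hops=3):
    evidence = []
    for _ in range(max_hops):
        query = plan_queries(question, evidence)
        if query is None:
            break
        evidence.extend(retrieve(query))
    return evidence
```

Note how the second query only exists because the first hop’s result mentioned expiry: the model’s reasoning, not a fixed pipeline, drives what gets retrieved next.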

The improved reliability of RAG helps agents avoid errors. Agents often loop, and without grounding, small mistakes compound. By injecting fresh facts at each step, RAG can correct an agent’s course. In CodeAgent, for example, the LLM uses retrieval over the target code repository to avoid writing completely incompatible code. In Anthropic’s terms, each agent “gain[s] ‘ground truth’ from the environment at each step (such as tool call results or code execution) to assess its progress”. RAG can be thought of as giving agents a rich “ground truth” to draw from.

Many modern agent frameworks natively support RAG-like tools. For example, Google’s new Agent Development Kit (ADK) lets agents use built-in tools such as Search and a “Code Exec” tool. It even explicitly integrates with RAG libraries: you can plug in LangChain, LlamaIndex, or Kendra vector search within an ADK agent. OpenAI’s Agents SDK (2025) similarly encourages chaining retrieval calls into agent workflows (e.g. pull company data). In short, the trend is to give AI agents access to the internet, databases, or document corpora – classic RAG sources – so that agents are no longer left to hallucinate or guess.