So you want to build real AI agents, huh?
The AI hype train is roaring, and "agentic AI" is the latest gold rush. But really, let's cut the crap. Most of what people call agents are just glorified scripts with an LLM bolted on; at best, an LLM with tools and a for-loop. If you're serious about building systems that think, adapt, and actually do complex stuff, you need to understand the guts of agent architecture and composition. This isn't about prompt engineering for your chatbot; this is about architecting the future.
Agentic AI: more than just a buzzword
Agentic AI means systems where AI agents get shit done, autonomously. They're not just waiting for your click; they're navigating messy, dynamic environments to hit their targets. The agentic architecture is the skeleton that makes this possible; it shapes the workflow, marshals the AI models, and basically defines the agent's playground.
At its core, an intelligent agent system has a few key parts:
- Perception/profiling: How it sees the world (data acquisition).
- Memory: Where it stashes its knowledge.
- Planning: The brains of the operation; strategy and decision-making.
- Action: How it executes.
- Learning strategies: How it gets less dumb over time.
These bits directly fuel an agent's agency: its intentionality (planning), forethought (thinking ahead), self-reactiveness (course correction), and critically, self-reflectiveness (learning from its screw-ups and successes). That last part is key. We're not just automating tasks; we're aiming for systems that evolve. Big difference.
Single-agent vs. multi-agent systems (MAS)
Sure, single-agent architectures are simple. Predictable. Fast for some baby tasks. But try scaling them or throwing a multi-step workflow their way, and they choke. They're rigid, narrow, and frankly, boring.
The real action, the future, is in multi-agent systems (MAS). Think a crew of specialized AI agents, each a rockstar in its domain, working together on problems that would make a single agent cry. MAS means more accuracy, better adaptability, and the muscle to tackle massive problems. They pool resources, optimize workflows through collaboration, and generally kick single-agent butt in any complex scenario. Demand for distributed intelligence is growing fast. So if a framework can't handle composing and managing multi-agent systems effectively, it's well on its way to obsolete.
Agent orchestration & composition
Agent orchestration is the art and science of making your network of specialized AI agents play nice together to automate complex shit. The orchestrator, whether it's a central AI agent or the framework itself, is the conductor, making sure the right agent hits the right note at the right time.
The lifecycle usually looks like this:
- Assess and plan.
- Pick your specialist agents.
- Slap in an orchestration framework (LangChain, watsonx Orchestrate, Power Automate, etc.).
- The orchestrator dynamically picks and assigns agents.
- Coordinate and execute.
- Manage data and context (the messy part).
- Continuously optimize and learn (hopefully with a human keeping an eye on things).
Common orchestration flavors:
- Centralized: One boss agent. Simple, but that boss better be good.
- Decentralized: Agents figure it out amongst themselves.
- Hierarchical: Layers of bosses. Management, AI style.
- Federated: Independent agents/orgs collaborating without oversharing.
These aren't just fancy terms. They're how you solve the real headaches of MAS: coordination, conflict resolution (because agents have opinions too), and smart task allocation. Agent composition frameworks, then, are not just about building individual agents. They're about defining the spiderweb of their interactions and the rules of engagement. "Composition" is the implementation of badass orchestration.
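Before diving into the frameworks, here's what the centralized flavor boils down to in plain Python. This is a bare-bones sketch with hypothetical Agent and CentralOrchestrator classes; no real framework is involved, and the keyword-based route() stands in for what would normally be an LLM routing call.

# Minimal sketch of centralized orchestration (hypothetical classes, no framework).
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    handle: Callable[[str], str]  # takes a task, returns a result

class CentralOrchestrator:
    def __init__(self, agents: dict[str, Agent]):
        self.agents = agents  # registry of specialist agents

    def route(self, task: str) -> str:
        # In a real system this is an LLM call; here it's a dumb keyword match.
        if "research" in task.lower():
            return "researcher"
        return "writer"

    def run(self, tasks: list[str]) -> list[str]:
        results = []
        for task in tasks:
            agent = self.agents[self.route(task)]
            results.append(agent.handle(task))  # the boss assigns, the specialist executes
        return results

orchestrator = CentralOrchestrator({
    "researcher": Agent("researcher", lambda t: f"[research notes for: {t}]"),
    "writer": Agent("writer", lambda t: f"[draft text for: {t}]"),
})
print(orchestrator.run(["research AI market", "write the intro"]))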
The contenders - a deep dive into key agent composition frameworks
The AI agent framework scene is exploding. Everyone's got a new way to build these critters. Let's rip apart some of the major players: LangGraph, Microsoft Autogen, OpenAI's new toys, MCP and its crew (like fast-agent), Mastra, CrewAI, and the agentic side of LlamaIndex.
LangGraph: For the control freaks (in a good way)
LangGraph, an offshoot of the ever-present LangChain, is for those who like their agent workflows stateful and with clear lines of command. Think complex, multi-actor applications where LLMs need to remember what happened last Tuesday. This is your jam if you're building systems that need to be auditable and debuggable, because its graph-based state machine approach lays everything bare. No black boxes here.
Core Idea: Workflows are directed graphs. Nodes are tasks, functions, or your AI agents. Edges are the connections and conditional jumps. This structure means you can actually see and manage the logic. It's built on layers that handle state, agent wrangling, running the graph (often with a Pregel-inspired model), and talking to the outside world. The big wins? Stateful processing (agents remember stuff), modular agent architecture (specialists do their thing), and dynamic workflow management (decision trees with branching paths).
Playing with multiple agents (LangGraph Style):
- Supervisor: The classic boss pattern. A main agent dishes out tasks to worker bees and pulls the results together. A common twist is the tool-calling supervisor, where sub-agents are just tools the boss LLM decides to use. We use this, and everyone's grandma does it too.
- Swarm: Less formally spelled out in the docs, but it points to more distributed, hive-mind-style collaboration.
- Hierarchical: Bosses managing other bosses or teams. For when your problem is a Russian doll of complexity.
- Network: Any agent can ping any other agent. Total freedom, potential for total chaos if you're not careful.
- Custom Workflows: Roll your own logic. Either pre-defined flows or let the LLMs decide the next step dynamically.
Chatting and remembering: It’s all about a shared state object (often just a list of messages) that gets passed around. Agents can "handoff" control or data, or a supervisor can treat other agents like tools it calls. Smart cookies will use scoped agent memory and distinct state setups for sub-graphs to avoid context pollution; basically, one agent scribbling over another's important notes. You decide if agents share their messy thought process (scratchpad) or just the clean final answer.
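To make the handoff mechanics concrete, here's a minimal sketch using LangGraph's Command type (a real LangGraph feature); the two agent nodes and their canned outputs are made up for illustration.

# Minimal LangGraph handoff sketch: a node returns a Command that both
# updates the shared state and routes control to another agent node.
from typing import Literal
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command

def research_agent(state: MessagesState) -> Command[Literal["writer_agent", "__end__"]]:
    notes = "pretend these are research notes"  # stand-in for an LLM/tool call
    # Handoff: append the notes to the shared message list and jump to writer_agent.
    return Command(goto="writer_agent", update={"messages": [("ai", notes)]})

def writer_agent(state: MessagesState) -> Command[Literal["__end__"]]:
    draft = f"Draft based on: {state['messages'][-1].content}"
    return Command(goto=END, update={"messages": [("ai", draft)]})

builder = StateGraph(MessagesState)
builder.add_node("research_agent", research_agent)
builder.add_node("writer_agent", writer_agent)
builder.add_edge(START, "research_agent")  # no explicit agent-to-agent edges; Command handles routing
graph = builder.compile()
# graph.invoke({"messages": [("user", "Write about quantum computing and crypto")]})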
Quick example: a supervised research team. Imagine building a report:
- SupervisorAgentNode: Gets the query (e.g., "Impact of quantum computing on crypto"). Breaks it into sub-tasks ("find papers," "summarize algorithms," "draft summary"). Decides who does what next.
- LiteratureSearchAgentNode: A tool. Supervisor says "go find papers on X," it uses a search API, returns snippets.
- SummarizationAgentNode: Takes snippets, boils them down.
- DraftingAgentNode: Gets summaries, drafts a section.
The graph's state tracks the query, sub-tasks, snippets, summaries, drafts, and whose turn it is. Conditional edges go from the Supervisor to specialists based on the current sub-task. Specialists report back to the Supervisor, who updates state and picks the next move. Classic top-down control.
# LangGraph Code Snippet: Supervisor & Workers
# Heads up: This is conceptual. You'll need LangChain, LangGraph,
# and an LLM provider (like OpenAI) properly set up.
from typing import Literal

from langchain_openai import ChatOpenAI  # Assuming OpenAI
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command

# model = ChatOpenAI(model="gpt-4-turbo")  # Placeholder: Initialize your LLM

# Define the state for our graph
# MessagesState conveniently stores a list of messages
class SupervisorState(MessagesState):
    next_agent: str  # Who's up next?

# Supervisor Agent: The Brains
def supervisor_node(state: SupervisorState) -> dict:
    print("---SUPERVISOR NODE---")
    # Real supervisor logic: LLM call based on state['messages'] to decide next_agent or END.
    # Simplified for brevity:
    if len(state["messages"]) > 5:  # Arbitrary end condition
        next_action = END
    elif "task for agent 1" in state["messages"][-1].content.lower():
        next_action = "agent_1"
    else:
        next_action = "agent_2"
    return {"next_agent": next_action}

# Worker Agent 1
def agent_1_node(state: SupervisorState) -> dict:
    print("---AGENT 1 NODE---")
    # Agent 1 does its thing. Could be an LLM call, tool use, etc.
    # Simplified: Process the last message
    processed_message = f"Agent 1 reporting: Processed '{state['messages'][-1].content}'"
    return {"messages": [("ai", processed_message)]}  # Add its output to state

# Worker Agent 2
def agent_2_node(state: SupervisorState) -> dict:
    print("---AGENT 2 NODE---")
    processed_message = f"Agent 2 reporting: Handled '{state['messages'][-1].content}'"
    return {"messages": [("ai", processed_message)]}

# LangGraph Code Snippet: Supervisor & Workers (Continued)
# Let's build the graph
builder = StateGraph(SupervisorState)
builder.add_node("supervisor", supervisor_node)
builder.add_node("agent_1", agent_1_node)
builder.add_node("agent_2", agent_2_node)

# Wire it up
builder.add_edge(START, "supervisor")  # Kick things off with the supervisor

# Conditional routing from supervisor to workers or end
def route_to_agent(state: SupervisorState):
    return state["next_agent"]  # The supervisor decided this

builder.add_conditional_edges(
    "supervisor",
    route_to_agent,
    {
        "agent_1": "agent_1",
        "agent_2": "agent_2",
        END: END,
    },
)

# Workers report back to the supervisor
builder.add_edge("agent_1", "supervisor")
builder.add_edge("agent_2", "supervisor")

# Compile the graph and it's ready to go!
graph = builder.compile()

# Example of how you might run this thing:
# initial_input = {"messages": [("user", "Start with a task for agent 1, then agent 2")]}
# for event in graph.stream(initial_input, {"recursion_limit": 10}):
#     for key, value in event.items():
#         print(f"{key}: {value}")
#     print("---")
Microsoft Autogen: for the chatty, collaborative swarms
Microsoft's Autogen is built for crafting AI agents that love to talk to each other. It's strong on multi-agent conversational systems and wrangling complex agentic workflows. If you envision your agents brainstorming, debating, or collaboratively coding, Autogen is worth a hard look.
Architectural breakdown: Autogen's got layers:
- Core: The foundation. Think event-driven agent machinery, asynchronous messaging (key for agents not stepping on each other's toes).
- AgentChat: Sits on Core, gives you a nicer, task-driven API. Group chat management, code execution, pre-built agent types. This is where most people start.
- Extensions: Integrations with the outside world: Azure code executors, OpenAI models, and even MCP workbenches.
The big deal with Autogen v0.4+ is its asynchronous, event-driven architecture. Agents communicate via async messages, supporting both event-driven and request/response patterns. This makes it robust and extensible, ready for proactive, long-running agents. It's modular, so you can plug in custom agents, tools, memory, and models. Plus, built-in metrics, tracing, and OpenTelemetry support mean you can actually see what your agent swarm is up to. This async, event-driven heart is what lets Autogen handle crazy dynamic multi-agent chats: think nested chats, agents jumping in and out of groups. More fluid than strictly sequential graph systems if you need that flexibility.
The multi-agent conversation framework: Autogen revolves around "conversable" agents:
- ConversableAgent: The base class for any agent that can send and receive messages to get tasks done.
- AssistantAgent: Your typical LLM-powered brain. Writes text, code, processes results. Usually doesn't need human hand-holding or execute code by default.
- UserProxyAgent: Your human stand-in. Can ask for input, run code (like Python scripts from an AssistantAgent), and call tools.
This setup lets you automate chats between multiple agents, letting them tackle tasks solo or with human guidance. Autogen supports dynamic conversation patterns like hierarchical chat, dynamic group chat (a manager agent plays traffic cop), FSM-based transitions (control who talks next), and nested chat (chats within chats for modular problem-solving).
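For the dynamic group chat pattern specifically, a sketch with the classic AgentChat-style API looks roughly like this; the agent names, system messages, and llm_config are placeholders, and the GroupChatManager is the traffic cop that picks who speaks next each round.

# Conceptual sketch of Autogen's dynamic group chat (classic autogen API; adapt for v0.4's async stack).
import autogen

llm_config = {"config_list": [{"model": "gpt-4-turbo", "api_key": "YOUR_KEY"}]}  # placeholder config

planner = autogen.AssistantAgent(name="Planner", system_message="Break the task into steps.", llm_config=llm_config)
coder = autogen.AssistantAgent(name="Coder", system_message="Write Python code for the plan.", llm_config=llm_config)
user_proxy = autogen.UserProxyAgent(
    name="UserProxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "groupchat_dir", "use_docker": False},
)

# The manager consults the group chat state and routes the conversation between agents.
groupchat = autogen.GroupChat(agents=[user_proxy, planner, coder], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

# user_proxy.initiate_chat(manager, message="Write and test a Fibonacci function.")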
Quick example: collaborative code gen & debugging. User wants a Python function for Fibonacci numbers, unit tests, and wants it all run.
- UserProxyAgent: Kicks off with the prompt.
- PlannerAgent (custom AssistantAgent): Breaks it down: "Define function, implement logic, write tests, prep execution script."
- CoderAgent (AssistantAgent for code): Writes the Python for the function and tests.
- ExecutorAgent (UserProxyAgent with code execution enabled): Runs the script, captures output/errors.
- DebuggerAgent (another AssistantAgent): If errors, it analyzes them, suggests fixes to CoderAgent.
They all yak it out in a group chat. The Planner plans, Coder codes, Executor runs. Errors? Loop between Coder, Executor, Debugger until tests pass or they give up. The UserProxyAgent watches, clarifies, or gives the thumbs up. This shows Autogen's muscle for conversational agents tackling iterative tasks like coding.
# Microsoft Autogen Code Snippet: Assistant & User Proxy
# Disclaimer: Requires Autogen installed and an LLM configuration.
# Set up your OAI_CONFIG_LIST (json file or env var) or define config_list manually.
import os

import autogen

# LLM Configuration (Example using OpenAI GPT-4 Turbo)
# Option 1: From OAI_CONFIG_LIST
try:
    config_list_gpt4 = autogen.config_list_from_json(
        "OAI_CONFIG_LIST",  # Ensure this points to your config
        filter_dict={
            "model": ["gpt-4", "gpt-4-turbo", "gpt-4-32k"],  # Add your model variants
        },
    )
except FileNotFoundError:
    print("OAI_CONFIG_LIST not found. Attempting manual config.")
    # Option 2: Manual Configuration (if OAI_CONFIG_LIST isn't set)
    # IMPORTANT: Make sure OPENAI_API_KEY environment variable is set if using this.
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        print("OPENAI_API_KEY not found. Cannot configure LLM.")
        config_list_gpt4 = []  # Empty list, will cause issues if not handled
    else:
        config_list_gpt4 = [
            {
                "model": "gpt-4-turbo",
                "api_key": api_key,
            }
        ]

# Create an AssistantAgent (the LLM-powered workhorse)
assistant = autogen.AssistantAgent(
    name="CoderAssistant",
    llm_config={
        "config_list": config_list_gpt4,
        "temperature": 0,  # For more predictable outputs
    },
)

# Microsoft Autogen Code Snippet: Assistant & User Proxy (Continued)
# Create a UserProxyAgent (acts as the user, can execute code)
user_proxy = autogen.UserProxyAgent(
    name="UserProxy",
    human_input_mode="NEVER",  # "ALWAYS" to require human input, "TERMINATE" to stop on human input
    max_consecutive_auto_reply=10,  # Avoid infinite loops
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={  # Configuration for code execution
        "work_dir": "autogen_coding_dir",  # Directory to save and run code
        "use_docker": False,  # Set to True or a Docker image name to use Docker (safer!)
    },
    llm_config={  # Can also have its own LLM for deciding replies/actions
        "config_list": config_list_gpt4,
        "temperature": 0,
    },
)

# Kick off the conversation!
# The user_proxy sends the initial task to the assistant.
# This is commented out to prevent execution without a live API key & full setup.
# if config_list_gpt4:  # Only attempt if LLM config is present
#     user_proxy.initiate_chat(
#         assistant,
#         message="""What date is it today? Then, tell me the year-to-date performance
#         for META and compare it against TESLA. Finally, plot a chart and save it to a file.""",
#     )
# else:
#     print("LLM configuration is missing. Skipping Autogen chat initiation.")
OpenAI Swarm & new agent tools: OpenAI's own agent playbook
OpenAI isn't just sitting back; they're building tools to make agentic apps easier, moving from their experimental Swarm SDK to more solid APIs and a new Agents SDK. They clearly want you building agents on their side.
Swarm ideas & the responses API: OpenAI's early thinking with Swarm was about lightweight, controllable, and testable agent coordination. Core bits:
- Agent: Instructions (goals, behavior) + functions/tools (what it can do).
- Handoffs: How one agent passes the baton to another (usually by a function returning another Agent object).
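In the experimental Swarm SDK, that handoff idea is literally a function that returns another Agent. Here's a rough sketch along the lines of Swarm's own examples; the agent names and instructions are illustrative, and running it needs an OpenAI API key.

# Conceptual sketch of OpenAI's experimental Swarm SDK: handoff = function returning an Agent.
from swarm import Swarm, Agent

spanish_agent = Agent(
    name="Spanish Agent",
    instructions="You only speak Spanish.",
)

def transfer_to_spanish_agent():
    """Hand the conversation off to the Spanish-speaking agent."""
    return spanish_agent

english_agent = Agent(
    name="English Agent",
    instructions="You only speak English. If the user writes Spanish, transfer them.",
    functions=[transfer_to_spanish_agent],
)

client = Swarm()  # requires OPENAI_API_KEY in the environment
# response = client.run(agent=english_agent, messages=[{"role": "user", "content": "Hola, ¿cómo estás?"}])
# print(response.messages[-1]["content"])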
Building on this, they dropped the Responses API. This thing aims to mix the simplicity of Chat Completions with the smarter tool-use of the Assistants API. They do this by handling tasks with multiple tool uses and several model turns in one API call. Comes with built-in tools like web and file search.
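A rough sketch of a Responses API call with a built-in tool is below; treat the exact tool type string and field names as assumptions that may shift between API versions.

# Conceptual sketch: one Responses API call that can use a built-in tool (web search)
# across multiple internal model turns. Tool type names may vary by API version.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],  # built-in web search tool (check current docs for the exact name)
    input="Summarize this week's biggest agentic AI framework announcements.",
)
print(response.output_text)  # convenience accessor for the final text output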
Orchestration via the agents SDK: To really streamline multi-agent workflows, OpenAI launched a new open-source Agents SDK. This gives you more structure:
- Agents: LLM instances you can easily configure with instructions and tools (theirs or yours).
- Handoffs: Formalized mechanism for smart, context-aware control transfer between agents.
- Guardrails: Configurable safety checks for input/output validation. Keeps your agents from going totally rogue.
- Tracing & Observability: Tools to see what your agents are actually doing. Critical for debugging and tuning.
Features like Guardrails and Tracing show they're thinking about production readiness (safety, reliability, maintenance). That’s vital for enterprise adoption because businesses want agents that don’t just work, but work predictably and securely. Raw model power isn't enough; they're building out the whole ecosystem.
Quick example: tiered customer support system. A classic multi-agent play: different support tiers.
- TriageAgent: First contact. Instructions: "Figure out what the user wants. Shopping query? Hand off to ShoppingAgent. Returns/tech support? Hand off to SupportAgent." Has a `handoffs=` capability.
- ShoppingAgent: Instructions: "Help find products, compare, use WebSearchTool."
- SupportAgent: Instructions: "Assist with support, refunds, tech issues. Use the `submit_refund_request` tool (your custom backend interaction)."
User says, "Wanna return an item." TriageAgent gets it, sees "return," hands off to SupportAgent. SupportAgent takes over, gathers details (order number, why it sucked), uses its `submit_refund_request` tool. This nails the handoff concept, core to Swarm and the new Agents SDK, for routing tasks to the right specialist. Standard stuff in real customer service, now doable with these tools.
# OpenAI Agents SDK Code Snippet: Tiered Customer Support (Conceptual)
# NOTE: This is based on described features of the OpenAI Agents SDK.
# Actual implementation details may vary. Ensure you have the SDK installed and configured.
# from openai_agents import Agent, Runner  # Or similar imports based on actual SDK
# from openai_agents.tools import WebSearchTool  # Example built-in tool
# from openai_agents.tools.custom import function_tool  # For custom tools

# Placeholder for actual SDK imports if they differ
class Agent:
    def __init__(self, name, instructions, tools=None, handoffs=None, model=None):
        self.name = name
        self.instructions = instructions
        self.tools = tools or []
        self.handoffs = handoffs or []
        self.model = model
        print(f"Conceptual Agent '{self.name}' initialized.")

class Runner:
    def __init__(self, starting_agent):
        self.starting_agent = starting_agent
        print(f"Conceptual Runner initialized with '{self.starting_agent.name}'.")

    def run_sync(self, input_text):
        print(f"Runner received: {input_text}")
        # Simplified logic: direct to first handoff or a tool based on keywords
        # This would be an LLM call in reality, guided by TriageAgent's instructions
        if "shop" in input_text.lower() and self.starting_agent.handoffs:
            target_agent = next((h for h in self.starting_agent.handoffs if "Shopping" in h.name), None)
            if target_agent:
                print(f"Handing off to {target_agent.name}")
                # Simulate agent execution
                if any("WebSearchTool" in str(t) for t in target_agent.tools):
                    return type('obj', (object,), {'output_text': f'{target_agent.name} used WebSearchTool for: {input_text}'})
            return type('obj', (object,), {'output_text': f'{self.starting_agent.name} processed: {input_text}'})
        elif ("return" in input_text.lower() or "support" in input_text.lower()) and self.starting_agent.handoffs:
            target_agent = next((h for h in self.starting_agent.handoffs if "Support" in h.name), None)
            if target_agent:
                print(f"Handing off to {target_agent.name}")
                # Simulate agent execution with a custom tool
                if any("submit_refund_request" in getattr(t, "tool_name", "") for t in target_agent.tools):
                    return type('obj', (object,), {'output_text': f'{target_agent.name} used submit_refund_request for: {input_text}'})
            return type('obj', (object,), {'output_text': f'{self.starting_agent.name} processed: {input_text}'})
        else:
            return type('obj', (object,), {'output_text': f'{self.starting_agent.name} handled: {input_text}'})

def function_tool(func):
    func.is_tool = True
    func.tool_name = func.__name__
    return func

class WebSearchTool:
    def __str__(self):
        return "WebSearchToolInstance"

# Define a custom tool (conceptually)
@function_tool
def submit_refund_request(item_id: str, reason: str) -> str:
    """Submits a refund request. Returns 'success' or 'failure'."""
    print(f"Conceptual: Refund for item '{item_id}', reason: '{reason}'")
    return "success" if item_id == "123" else "failure: item not found"
# OpenAI Agents SDK Code Snippet: Tiered Customer Support (Conceptual - Continued)
# Define specialized agents (conceptually)
support_agent = Agent(
    name="Support_and_Returns_Agent",
    instructions="You are a support agent. Help with product issues and submit refund requests using the 'submit_refund_request' tool.",
    tools=[submit_refund_request],  # Custom tool
    model="gpt-4o"  # Example model
)

shopping_agent = Agent(
    name="Shopping_Assistant_Agent",
    instructions="You are a shopping assistant. Search the web for product info using 'WebSearchTool'.",
    tools=[WebSearchTool()],  # Built-in tool instance
    model="gpt-4o"
)

# Define a triage agent that hands off
triage_agent = Agent(
    name="Triage_Agent",
    instructions="You are a triage agent. Understand user queries and route to Shopping_Assistant_Agent for shopping or Support_and_Returns_Agent for support/refunds. Ask clarifying questions if unsure.",
    handoffs=[shopping_agent, support_agent],  # Other agents it can hand off to
    model="gpt-4o"
)

# Initialize the Runner with the starting agent
# runner = Runner(starting_agent=triage_agent)

# Example conceptual runs (commented out):
# user_query_shopping = "I'm looking for new running shoes, can you find some top-rated ones?"
# response_shopping = runner.run_sync(input_text=user_query_shopping)
# print(f"User: {user_query_shopping}")
# print(f"Final Conceptual Response: {response_shopping.output_text}")

# user_query_support = "I want to return item 123 because it's faulty."
# response_support = runner.run_sync(input_text=user_query_support)
# print(f"User: {user_query_support}")
# print(f"Final Conceptual Response: {response_support.output_text}")
A side note: Model Context Protocol (MCP), the universal translator for AI tools
The Model Context Protocol (MCP) is an open shot at standardizing how your apps feed context (tools, resources, prompts) to LLMs and AI agents. Think of it as trying to create a common language for AI systems to tap into external powers.
MCP Guts: It’s a client-server dance:
- Hosts: Your LLM apps (IDEs like Cursor, desktop apps, custom agent setups). They run and manage connections.
- Clients: Live in the host app, chat one-on-one with an MCP server. They handle protocol negotiation, message routing, and subscriptions.
- Servers: Lightweight programs exposing specific powers (tools, resources, prompts) via MCP. Can be local or remote, focused on doing one thing well.
Communication is layered: a protocol layer (JSON-RPC 2.0 for message framing, request/response) and a transport layer (stdio for local, HTTP with Server-Sent Events for remote). Messages are Requests, Results, Errors, or Notifications (one-way pings).
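Concretely, an MCP tool call on the wire is plain JSON-RPC 2.0. The shapes below follow the spec's tools/call pattern; the tool name and values are made up for illustration.

# Illustrative JSON-RPC 2.0 messages for an MCP tool call (shapes per the MCP spec; values made up).
mcp_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",  # MCP method for invoking a server-side tool
    "params": {"name": "readFile", "arguments": {"path": "application.log"}},
}

mcp_result = {
    "jsonrpc": "2.0",
    "id": 1,  # matches the request id
    "result": {"content": [{"type": "text", "text": "2025-05-01 12:00:01 ERROR ..."}]},
}

mcp_notification = {
    "jsonrpc": "2.0",
    "method": "notifications/progress",  # notifications carry no id and expect no response
    "params": {"progress": 0.5},
}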
How MCP makes tool & context sharing less of a nightmare: MCP wants to standardize how agents grab and use outside functions. Servers shout out their capabilities (tools, resources, prompt templates) during an initial handshake. Client agents can then find and call these. Key ideas: make servers easy to build and stack like LEGOs, keep servers on a need-to-know basis (they don’t see the whole chat history, host keeps that), and let clients and servers add new tricks over time. The dream? MCP as the "USB-C for AI." If it catches on, you could mix and match LLMs, agent frameworks, and tool providers like a DJ. Less vendor lock-in, more innovation. This is the kind of thinking behind ideas like the "Agent Mesh"; diverse agents and services all talking nice through standard interfaces.
Architectural patterns MCP enables: MCP itself is just the protocol, but it lights up various agent system designs:
- Single-agent MCP: One LLM agent calls tools from MCP servers. Simple, direct, but can bottleneck or overload context if the agent juggles too many tools.
- Multi-agent orchestrated MCP: Many specialized agents collaborate, each using MCP clients to get tools for their specialty. Better for parallel work and resilience.
- Hierarchical MCP agents: Agents in layers, with planners up top and executors below. Each might use MCP tools. Good for breaking down big tasks.
- Event-driven MCP agents: Agents react to events, often asynchronously. MCP tools could be called on events, or MCP servers might fire off events themselves.
Quick example: an agent tapping an MCP server for file system access. The agent needs to read a log file.
- Host app: Your Python script running the AI agent.
- Agent component: The AI agent (could be simple LLM loop) gets a request: "Summarize critical errors in 'application.log' today."
- MCP client component: Part of the agent, talks to MCP servers.
- FileSystemMCPServer: Separate local process. Speaks MCP, offers tools like `readFile(path)`, `writeFile(path, content)`, and `listFiles(directory)`.
Flow:
- User asks agent.
- Agent's LLM decides it needs to read `application.log`.
- Agent tells MCPClient to call `readFile(path='application.log')` on FileSystemMCPServer.
- MCPClient sends JSON-RPC request (e.g., via stdio).
- FileSystemMCPServer gets request, reads file, sends MCP Result with content (or Error).
- Response goes to MCPClient, then to Agent.
- Agent uses LLM to process log, summarize errors for user.
This shows MCP's core value: letting an AI agent securely use external powers (like local file access) that aren't built into the LLM, in a standardized way, by chatting with a specialized server. No custom integration per tool; that's the promise.
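Here's roughly what the client side of that flow looks like with the official MCP Python SDK; the server command, tool name, and file path are assumptions carried over from the example above, so check them against your actual server.

# Conceptual sketch: an agent-side MCP client calling a file-system server's tool over stdio.
# Assumes the official `mcp` Python SDK and a hypothetical server started via `python fs_server.py`.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def summarize_log() -> None:
    server = StdioServerParameters(command="python", args=["fs_server.py"])  # hypothetical server
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()          # protocol/capability handshake
            tools = await session.list_tools()  # discover what the server offers
            print("Server tools:", [t.name for t in tools.tools])
            result = await session.call_tool("readFile", {"path": "application.log"})
            # Hand result.content to your LLM for the actual error-summarization step.
            print(result.content)

# asyncio.run(summarize_log())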
fast-agent: MCP-native agents with a need for speed (and decorators)
fast-agent jumps in as a framework for building and playing with AI agents and workflows, proudly waving the Model Context Protocol (MCP) flag. It's an MCP-native solution, meaning its bones are built around MCP ideas. It supports the full MCP toolkit: Tools, Prompts, Resources, Sampling, and Roots. A cool trick? Multi-modal support, letting agents chew on images and PDFs in prompts, resources, and MCP tool results.
This framework wants to get you building fast. It throws pre-built agent and workflow examples at you (many inspired by Anthropic's agent-building insights) and lets you string agents into complex workflows with a slick set of Python decorators. That decorator-based vibe for defining chains, parallel runs, routers, and orchestrators is a big deal. It massively lowers the bar for cooking up sophisticated multi-agent systems. You focus on what you want (the logic, the agent skills) instead of the gnarly how (managing chats, state, errors). This screams rapid prototyping and quick iteration. And because it's MCP-native, these easy-to-build workflows can plug into a wider, standardized world of MCP tools and services without a fuss.
Key composition patterns (the decorator magic): fast-agent gives you these workflow-building decorators:
- `@fast.agent`: Your basic building block. Defines an agent with instructions, request params, tools, etc.
- `@fast.chain`: Links agents in a sequence. Output of one feeds the next. Simple.
- `@fast.parallel`: Fan-out/fan-in. A message hits multiple agents at once. Optionally, a fan-in agent mops up the combined results.
- `@fast.evaluator_optimizer`: A duo: a generator (makes content/solutions) and an evaluator (judges it, gives feedback). They iterate, the generator refining based on feedback, until a quality bar is met or it hits max tries.
- `@fast.router`: Uses an LLM to look at a message and send it to the right agent from a list, based on their instructions.
- `@fast.orchestrator`: For the big jobs. An LLM-powered orchestrator agent figures out a plan to spread work among available agents. Can plan it all upfront or go step-by-step.
Quick example: multi-lingual content factory with quality control. Let's say you want product descriptions in English, French, and Spanish, and you want them good.
- English content writer agent (`@fast.agent`): Takes product details, writes a compelling English description.
- Translator agents (`@fast.agent` for French, `@fast.agent` for Spanish): Each takes English text, translates accurately.
- Translation aggregator agent (`@fast.agent`): Takes French and Spanish translations, bundles them into a neat JSON object (`{'french_text': '...', 'spanish_text': '...'}`).
- Parallel translation workflow (`@fast.parallel`):
  - Fans out the English description to the French and Spanish translator agents simultaneously.
  - The `translation_aggregator_agent` is the fan-in, collecting and structuring their outputs.
- Quality assurance evaluator agent (`@fast.agent`): Reviews the structured translations. Rates them (e.g., POOR, FAIR, GOOD, EXCELLENT). If not EXCELLENT, it spits out specific feedback.
- Content refinement workflow (`@fast.evaluator_optimizer`):
  - The `generator` is our `parallel_translation_workflow`.
  - The `evaluator` is our `quality_assurance_evaluator_agent`.
  - It aims for a `min_rating` (e.g., "GOOD") with a `max_refinements` cap (e.g., 2 retries).
Flow:
- User gives product details.
- `english_writer_agent` (maybe first in a chain) makes the English version.
- This goes to `content_refinement_workflow`.
- Inside, `parallel_translation_workflow` runs: English text hits the French and Spanish translators at the same time.
- Their outputs go to `translation_aggregator_agent`.
- This structured output is judged by `quality_assurance_evaluator_agent`.
- If quality is meh (below GOOD), the `evaluator_optimizer` logic (conceptually) feeds the feedback and original input back to `parallel_translation_workflow` for another shot. Rinse and repeat until quality is met or retries run out.
The final, polished multi-lingual content (a dictionary) is the result. This shows fast-agent's declarative power: parallel processing for speed, an evaluator loop for quality, all strung together with decorators.
# fast-agent Code Snippet: Multi-Lingual Content Factory (Conceptual)
# Heads-up: This is conceptual. You'll need the fast-agent library
# and an LLM provider set up for this to be more than just fancy text.
# import fast_agent as fast  # Or however the library is actually imported

# --- Agent Definitions ---
# Let's imagine 'fast' is our imported library access point.
# These would be decorated functions if 'fast' was a real imported object.

# English Content Writer
@fast.agent(instruction="You are an expert marketing copywriter. Write a compelling product description for the given product details.")
def english_writer_agent(product_details: str) -> str:
    # Conceptual: LLM call to generate English description
    print(f"[english_writer_agent] ACTION: Generating content for {product_details[:30]}...")
    return f"Amazing English description for {product_details}!"

# French Translator
@fast.agent(instruction="Translate the provided English text accurately into French.")
def french_translator_agent(english_text: str) -> str:
    # Conceptual: LLM call for French translation
    print(f"[french_translator_agent] ACTION: Translating to French: {english_text[:30]}...")
    return f"Incroyable description en Français pour {english_text}"

# Spanish Translator
@fast.agent(instruction="Translate the provided English text accurately into Spanish.")
def spanish_translator_agent(english_text: str) -> str:
    # Conceptual: LLM call for Spanish translation
    print(f"[spanish_translator_agent] ACTION: Translating to Spanish: {english_text[:30]}...")
    return f"Increíble descripción en Español para {english_text}"

# Translation Aggregator
@fast.agent(instruction="Combine the French and Spanish translations into a structured JSON object with 'french_text' and 'spanish_text' keys.")
def translation_aggregator_agent(french_translation: str, spanish_translation: str) -> dict:
    print("[translation_aggregator_agent] ACTION: Aggregating translations.")
    return {"french_text": french_translation, "spanish_text": spanish_translation}

# Quality Assurance Evaluator
@fast.agent(instruction="Review the provided translations. Rate their quality as POOR, FAIR, GOOD, or EXCELLENT. If not EXCELLENT, provide specific feedback for improvement.")
def quality_assurance_evaluator_agent(translations: dict) -> dict:
    # Conceptual: LLM call to evaluate and provide feedback
    print(f"[quality_assurance_evaluator_agent] ACTION: Evaluating translations: {str(translations)[:50]}...")
    # Mocking a good review for simplicity in conceptual flow
    return {'quality_rating': "EXCELLENT", 'feedback': "Looks great!"}
# fast-agent Code Snippet: Multi-Lingual Content Factory (Conceptual - Continued)
# --- Workflow Definitions ---

# Parallel Translation Workflow
@fast.parallel(
    name="multilingual_translation_service",
    fan_out=[french_translator_agent, spanish_translator_agent],  # Would be actual agent objects or names
    fan_in=translation_aggregator_agent  # Actual agent object or name
)
def parallel_translation_workflow(english_description: str) -> dict:
    # Conceptual: fast-agent handles the parallel execution and fan-in.
    # It would call french_translator_agent(english_description)
    # and spanish_translator_agent(english_description) concurrently.
    # Then, their results would be passed to translation_aggregator_agent.
    print(f"[parallel_translation_workflow] STARTING for: {english_description[:30]}...")
    french = french_translator_agent(english_description)
    spanish = spanish_translator_agent(english_description)
    aggregated = translation_aggregator_agent(french_translation=french, spanish_translation=spanish)
    print(f"[parallel_translation_workflow] COMPLETED. Result: {str(aggregated)[:50]}...")
    return aggregated

# Evaluator-Optimizer for Refinement
@fast.evaluator_optimizer(
    name="refined_multilingual_content_generator",
    generator=parallel_translation_workflow,  # The parallel workflow is the generator
    evaluator=quality_assurance_evaluator_agent,
    min_rating="GOOD",  # The rating to achieve from the evaluator agent
    max_refinements=2  # Max number of retries
)
def content_refinement_workflow(english_description: str) -> dict:
    # Conceptual: fast-agent manages the loop.
    # 1. Calls parallel_translation_workflow(english_description).
    # 2. Passes its output to quality_assurance_evaluator_agent.
    # 3. If rating < GOOD and refinements < max_refinements, uses feedback
    #    (conceptually, by re-prompting or guiding the generator) and retries.
    print(f"[content_refinement_workflow] STARTING for: {english_description[:30]}...")
    attempts = 0
    max_refinements = 2  # From decorator
    min_rating_map = {"POOR": 0, "FAIR": 1, "GOOD": 2, "EXCELLENT": 3}  # Example rating scale
    target_rating_score = min_rating_map["GOOD"]  # From decorator
    current_input_for_generator = english_description
    final_result = {}
    while attempts <= max_refinements:
        print(f"  Attempt {attempts + 1}")
        generated_content = parallel_translation_workflow(current_input_for_generator)
        evaluation = quality_assurance_evaluator_agent(generated_content)
        current_rating_score = min_rating_map.get(evaluation['quality_rating'], -1)
        if current_rating_score >= target_rating_score:
            print(f"  Met quality target with rating: {evaluation['quality_rating']}")
            final_result = generated_content
            break
        else:
            print(f"  Quality target not met ({evaluation['quality_rating']}). Feedback: {evaluation['feedback']}")
            # Conceptual: In a real scenario, feedback would be used to modify input for the generator.
            # For this simulation, we'll just re-run with the original input if we haven't met quality.
            # current_input_for_generator = f"{english_description} (Refinement attempt {attempts + 1} based on feedback: {evaluation['feedback']})"
            final_result = generated_content  # Store last attempt if all refinements fail
            attempts += 1
            if attempts > max_refinements:
                print("  Max refinements reached.")
                break
    print(f"[content_refinement_workflow] COMPLETED. Final result: {str(final_result)[:50]}...")
    return final_result

# --- Example Invocation (Conceptual) ---
# product_info = "Our new amazing SuperWidget 3000!"
# initial_english_content = english_writer_agent(product_info)
# final_multilingual_content = content_refinement_workflow(initial_english_content)
# print("\n--- FINAL OUTPUT ---")
# print(final_multilingual_content)
Mastra: TypeScript agents for the web-native world
Mastra rolls in as a "batteries-included" TypeScript framework for building, testing, and deploying your agentic apps. It’s gunning for a smooth ride from local dev to production, all on a single stack to dodge that nasty "glue code" headache. If your world is web and Node.js, Mastra wants to be your agent HMFIC (Head MF In Charge).
Agent and workflow definition in TypeScript:
Mastra's heart is its `Agent` class. You set 'em up with a name, instructions (what to do), the model (OpenAI, Anthropic, take your pick), tools, and workflows they can run. A neat trick: instructions, model, tools, and workflows can be dynamic, set by functions that get a `runtimeContext`. This means your agents can adapt on the fly based on what's happening around them.
Workflows in Mastra are for your multi-step dances and complex agent collabs. Primitives like `createStep` and `createWorkflow` help you define these, with schema validation often handled by Zod (a popular TypeScript validation library). This heavy TypeScript focus and a workflow syntax built to feel natural for JavaScript devs is a big play for the massive web dev community. It could seriously speed up how agentic AI gets jammed into web apps and enterprise systems already on the JS/TS stack, potentially opening the floodgates for devs who aren't Python gurus.
Hierarchical and sequential multi-agent moves: Mastra handles common multi-agent patterns:
- Sequential workflows: `createWorkflow` plus chaining steps with `.then(nextStep).commit()`. If the output schema of one step matches the input of the next, data flows automatically. Clean.
- Hierarchical multi-agent systems: The classic supervisor-sub-agent setup. Sub-agents get wrapped as "tools" that the supervisor agent calls. Mastra's `createTool` function can wrap a whole agent, making it callable by another boss agent.
Quick example: hierarchical blog post factory. This is straight from Mastra's playbook: a Publisher agent bosses around a Copywriter and an Editor by treating them as tools.
- Define sub-agents (`CopywriterAgent`, `EditorAgent`): Each is an `Agent` instance with its own instructions and model.
- Wrap sub-agents as tools (`copywriterTool`, `editorTool`):
  - Use `createTool` for each.
  - Define `id`, `description`, `inputSchema` (e.g., for `copywriterTool`, `{ topic: z.string() }`), and `outputSchema` (e.g., `{ copy: z.string() }`).
  - The `execute` function of the tool calls the respective agent's `.generate()` method with the input.
- Define supervisor agent (`PublisherAgent`):
  - An `Agent` instance.
  - Instructions: "First, call copywriter to write copy on a topic. Then, call editor to edit it. Return final edited copy."
  - Its `tools` property includes `copywriterTool` and `editorTool`.
Flow: PublisherAgent gets a topic ("Future of AI Agents"):
- Its LLM, guided by instructions, knows it needs content. Calls `copywriterTool` with the topic.
- `copywriterTool` runs, calling `copywriterAgent.generate()`. Draft blog post comes back.
- PublisherAgent's LLM takes this draft, calls `editorTool`.
- `editorTool` runs, calling `editorAgent.generate()`. Edited post comes back.
- PublisherAgent returns the final, polished copy.
This clearly shows Mastra building hierarchies where a supervisor runs the show by calling specialized sub-agents as tools. Powerful stuff for breaking down big jobs.
// Mastra Code Snippet: Hierarchical Blog Post Generation (Conceptual TypeScript)
// Based on Mastra's documentation. Requires Mastra, AI SDK (e.g., @ai-sdk/anthropic), and Zod.

// --- File: src/mastra/agents/copywriter.ts (Conceptual) ---
// import { Agent } from "@mastra/core";
// import { anthropic } from "@ai-sdk/anthropic"; // Or your preferred model provider

// Mock Agent class for conceptual clarity if @mastra/core is not available
class Agent {
  // constructor(config: { name: string; instructions: string; model: any; tools?: any }) {
  //   console.log(`Conceptual Mastra Agent '${config.name}' created.`);
  // }
  // async generate(prompt: string): Promise<{ text: string }> {
  //   console.log(`[${this.name}] Generating for prompt: ${prompt.substring(0,30)}...`);
  //   return { text: `Generated content for: ${prompt}` };
  // }

  name: string; // Name property for the conceptual example

  // Allow any config for the conceptual example
  constructor(config: any) {
    this.name = config.name;
    console.log(`Conceptual Mastra Agent '${config.name}' created.`);
  }

  async generate(prompt: string): Promise<{ text: string }> {
    console.log(`[${this.name}] Generating for prompt: ${prompt.substring(0, 30)}...`);
    return { text: `Generated content for: ${prompt}` };
  }
}

const copywriterAgent = new Agent({
  name: "Copywriter",
  instructions: "You are a copywriter agent that writes blog post copy.",
  model: anthropic("claude-3-5-sonnet-20241022") // Example model
});

// --- File: src/mastra/agents/editor.ts (Conceptual) ---
// import { Agent } from "@mastra/core"; // Re-import or ensure scope
// import { openai } from "@ai-sdk/openai";

const editorAgent = new Agent({
  name: "Editor",
  instructions: "You are an editor agent that edits blog post copy.",
  model: openai("gpt-4o-mini") // Example model
});

// Export them for use in tool definitions (conceptual)
// export { copywriterAgent, editorAgent };
// Mastra Code Snippet: Hierarchical Blog Post Generation (Conceptual TypeScript - Continued)

// --- File: src/mastra/tools/copywriterTool.ts (Conceptual) ---
// import { createTool } from "@mastra/core/tools";
// import { z } from "zod";
// import { copywriterAgent } from "../agents/copywriter"; // Actual import

// Mock createTool and Zod for conceptual clarity
const z = {
  object: (schema: any) => ({ describe: (desc: string) => ({ ...schema, _description: desc }) }),
  string: () => ({ _type: "string", describe: (desc: string) => ({ _type: "string", _description: desc }) })
};

const createTool = (config: any) => {
  console.log(`Conceptual Mastra Tool '${config.id}' created.`);
  return {
    ...config,
    // Mock execution for conceptual flow
    async execute({ context }: { context: any }) {
      console.log(`[${config.id}] Tool execute called with context:`, context);
      if (config.id === "copywriter-agent") {
        const agentResult = await copywriterAgent.generate(`Create a blog post about ${context.topic}`);
        return { copy: agentResult.text };
      }
      if (config.id === "editor-agent") {
        const agentResult = await editorAgent.generate(`Edit the following: ${context.copy}`);
        return { copy: agentResult.text }; // outputSchema wants 'copy'
      }
      return {};
    }
  };
};

const copywriterTool = createTool({
  id: "copywriter-agent",
  description: "Calls the copywriter agent to write blog post copy.",
  inputSchema: z.object({ topic: z.string().describe("Blog post topic") }),
  outputSchema: z.object({ copy: z.string().describe("Blog post copy") }),
  // execute: async ({ context }: { context: { topic: string } }) => { // More specific context type
  //   const result = await copywriterAgent.generate(`Create a blog post about ${context.topic}`);
  //   return { copy: result.text };
  // },
});

// --- File: src/mastra/tools/editorTool.ts (Conceptual) ---
// import { createTool } from "@mastra/core/tools"; // Re-import or ensure scope
// import { z } from "zod"; // Re-import or ensure scope
// import { editorAgent } from "../agents/editor"; // Actual import

const editorTool = createTool({
  id: "editor-agent",
  description: "Calls the editor agent to edit blog post copy.",
  inputSchema: z.object({ copy: z.string().describe("Blog post copy to be edited") }),
  outputSchema: z.object({ copy: z.string().describe("Edited blog post copy") }), // Key is 'copy', matching the mock execute
  // execute: async ({ context }: { context: { copy: string } }) => {
  //   const result = await editorAgent.generate(`Edit the following blog post only returning the edited copy: ${context.copy}`);
  //   return { copy: result.text }; // Ensure output key matches schema
  // },
});

// Export tools (conceptual)
// export { copywriterTool, editorTool };
// Mastra Code Snippet: Hierarchical Blog Post Generation (Conceptual TypeScript - Final Part)

// --- File: src/mastra/agents/publisher.ts (Conceptual) ---
// Assume Agent, anthropic, copywriterTool, editorTool are imported or available in scope
// Previous conceptual definitions:
// const copywriterAgent = new Agent({ name: "Copywriter", ... });
// const editorAgent = new Agent({ name: "Editor", ... });
// const copywriterTool = createTool({ id: "copywriter-agent", ... });
// const editorTool = createTool({ id: "editor-agent", ... });

const publisherAgent = new Agent({
  name: "PublisherAgent",
  instructions: "You are a publisher. First, use the copywriterTool to write blog copy. Then, use the editorTool to edit it. Return only the final edited copy.",
  model: anthropic("claude-3-5-sonnet-20241022"),
  tools: { copywriterTool, editorTool } // Assigning the conceptually defined tools
});

// --- Conceptual Invocation ---
async function generateBlogPost(topic: string) {
  console.log(`\n--- Conceptual Mastra Workflow for Topic: ${topic} ---\n`);

  // In a real Mastra app, you'd use the agent's generate method or a workflow runner.
  // This is a simplified simulation of the PublisherAgent's logic based on its instructions and tools.
  console.log(`[PublisherAgent] Instructed to process topic: "${topic}"`);

  // 1. Call copywriterTool (which internally calls copywriterAgent)
  const draft = await copywriterTool.execute({ context: { topic } });
  console.log(`[PublisherAgent] Received draft from copywriterTool: "${draft.copy.substring(0, 50)}..."`);

  // 2. Call editorTool (which internally calls editorAgent)
  const finalEditedCopy = await editorTool.execute({ context: { copy: draft.copy } });
  console.log(`[PublisherAgent] Received final from editorTool: "${finalEditedCopy.copy.substring(0, 50)}..."`);

  console.log(`\n--- Final Edited Blog Post (Conceptual) ---\n${finalEditedCopy.copy}\n-------------------------------------------\n`);
  return finalEditedCopy.copy;
}

// To run the conceptual example (e.g., in a test file or main script):
// generateBlogPost("The Future of AI Agents");
CrewAI: For orchestrating role-playing, autonomous AI agent crews
CrewAI is all about making your AI agents wear different hats and work together like a well-oiled (or sometimes chaotic) team. You define agents with distinct roles, goals, and even backstories, then assemble them into "crews" to get complex tasks done. If you like thinking about agent collaboration like a director casting actors for a play, CrewAI will click with you.
Core bits (agents, tasks, crew) & the role-goal-backstory vibe: CrewAI stands on three legs:
- Agents: Your individual AI workers. Each gets:
  - `role`: Their job title (e.g., 'Market Researcher', 'Code Ninja').
  - `goal`: What they're trying to achieve.
  - `backstory`: A bit of flavor text to shape their behavior and perspective. Surprisingly effective.
  - `llm`: The brain (your chosen LLM).
  - `tools` (optional): What they can use.
  - `allow_delegation` (optional): Can they pass the buck to other agents in the crew?
- Tasks: The actual assignments. Each includes:
  - `description`: Detailed orders.
  - `expected_output`: What success looks like (format, content).
  - `agent`: Who's doing it.
  - `dependencies` (optional): Other tasks that gotta be done first. Builds your sequence.
- Crew: Brings agents and tasks together. Manages the collaboration and workflow.
The role-goal-backstory thing is central to CrewAI. It pushes you to give your agents personas. The idea? Clear roles, motivating goals, and contextual backstories make agents behave more coherently and effectively. It's an interesting way to map AI collaboration to human team dynamics, potentially making it easier for non-AI-experts to design agent systems.
Sequential vs. hierarchical action: CrewAI lets tasks flow in a couple of ways:
- Sequential process: Tasks run one after another, dictated by order or dependencies. Good for linear jobs where output of one feeds the next.
- Hierarchical process: A designated manager agent runs the crew. This boss can delegate, monitor, and even sign off on work before the crew moves on. More complex coordination and oversight.
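A minimal sketch of the hierarchical flavor, assuming current CrewAI (Process.hierarchical plus a manager_llm for the auto-created boss); it reuses the agents and tasks from the sequential example below, so it's left commented out rather than presented as runnable on its own.

# Conceptual sketch of a hierarchical CrewAI crew: a manager LLM delegates to the specialists.
# Assumes the agents/tasks defined in the sequential example further down; keys are placeholders.
from crewai import Crew, Process
# from langchain_openai import ChatOpenAI

# managed_crew = Crew(
#     agents=[research_agent, analysis_agent, report_writer_agent],
#     tasks=[research_task, analysis_task, writing_task],
#     process=Process.hierarchical,                 # manager-led instead of a fixed order
#     manager_llm=ChatOpenAI(model="gpt-4-turbo"),  # the LLM that plays the boss
#     verbose=True,
# )
# result = managed_crew.kickoff()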
Quick example: market research report (sequential grind). A team of specialists building a market report, step-by-step.
- Define Agents:
  - `research_agent`: Role: 'Market Researcher', Goal: 'Find latest AI market trend articles/data for 2025', Backstory: 'Experienced web-surfing analyst'. Has a web search tool.
  - `analysis_agent`: Role: 'Data Analyst', Goal: 'Analyze research data for key insights, growth, challenges', Backstory: 'Quantitative guru'.
  - `report_writer_agent`: Role: 'Business Report Writer', Goal: 'Write concise report from analyzed insights', Backstory: 'Pro writer, makes complex stuff clear'.
- Define Tasks:
  - `research_task`: Description: 'Find 5 recent articles, 3 data points on 2025 AI market trends', Expected Output: 'List of URLs, data summary', Agent: `research_agent`.
  - `analysis_task`: Description: 'Analyze findings for top 3 insights, 2 growth areas, 1 challenge', Expected Output: 'Structured summary of insights/growth/challenges', Agent: `analysis_agent`, Dependencies: `[research_task]`.
  - `writing_task`: Description: 'Write 500-word report (intro, insights, growth, challenges, conclusion)', Expected Output: 'Formatted 500-word Markdown report', Agent: `report_writer_agent`, Dependencies: `[analysis_task]`.
- Assemble the Crew:
  - `market_research_crew`: Agents: `[research_agent, analysis_agent, report_writer_agent]`, Tasks: `[research_task, analysis_task, writing_task]`, Process: `Process.sequential`.
Flow:
When `market_research_crew.kickoff()` is called:
- `research_task` runs (`research_agent` uses web search).
- Its output (articles, data) feeds `analysis_task` (thanks to the dependency).
- `analysis_agent` chews on it, extracts insights.
- Its output (insights summary) feeds `writing_task`.
- `report_writer_agent` drafts the final report. Boom.
A clear pipeline, specialized agents contributing in sequence. That's CrewAI for structured, multi-step teamwork.
# CrewAI Code Snippet: Market Research Report (Conceptual)
# Requires CrewAI, an LLM library (e.g., langchain_openai), and potentially tools like SerperDevTool.
from crewai import Agent, Task, Crew, Process
# from langchain_openai import ChatOpenAI  # Example LLM
# from crewai_tools import SerperDevTool  # Example tool, formerly from crewai.tools

# --- Conceptual Setup (Replace with your actual initializations) ---
# print("Conceptual: Initializing LLM and Tools...")
# llm = ChatOpenAI(model="gpt-4-turbo", api_key="YOUR_OPENAI_API_KEY")  # Replace with your key/setup
# search_tool = SerperDevTool(api_key="YOUR_SERPER_API_KEY")  # Replace with your key/setup

# For conceptual execution without live keys/dependencies:
class MockLLM:
    def __init__(self, name="mock_llm"):
        self.name = name

    def __repr__(self):
        return f"MockLLM({self.name})"

class MockTool:
    def __init__(self, name="mock_tool"):
        self.name = name
        self.description = "A mock tool."

    def __repr__(self):
        return f"MockTool({self.name})"

    def run(self, query):
        return f"Mock tool results for: {query}"

llm = MockLLM()
search_tool = MockTool(name="WebSearchTool")
# --- End Conceptual Setup ---

# Define Agents with Roles, Goals, and Backstories
research_agent = Agent(
    role='Market Researcher',
    goal='Find the latest articles and data points on AI market trends for 2025',
    backstory='An experienced analyst skilled in web research and data gathering. Your insights are crucial.',
    llm=llm,
    tools=[search_tool],
    verbose=True,
    allow_delegation=False  # This agent works alone on its tasks
)

analysis_agent = Agent(
    role='Data Analyst',
    goal='Analyze the provided research data to identify key insights, growth areas, and potential challenges in the AI market',
    backstory='A quantitative analyst specializing in market trend identification and statistical analysis. Precision is key.',
    llm=llm,
    verbose=True,
    allow_delegation=False
)

report_writer_agent = Agent(
    role='Business Report Writer',
    goal='Write a concise and compelling market research report based on the analyzed insights and data',
    backstory='A professional business writer skilled in communicating complex information clearly and effectively to an executive audience. Clarity and impact are paramount.',
    llm=llm,
    verbose=True,
    allow_delegation=False
)
# CrewAI Code Snippet: Market Research Report (Conceptual - Continued)
# Define Tasks for the Agents
research_task = Task(
    description=(
        'Conduct thorough web research to find at least 5 recent (last 6 months) articles'
        ' and 3 significant data points regarding AI market trends projected for 2025.'
        ' Focus on credible sources and quantifiable data.'
    ),
    expected_output=(
        'A list of URLs for the articles and a concise summary of the data points found, including their sources.'
        ' Example: 1. url1 (Source: Forbes), 2. url2 (Source: Gartner)... Data: Metric X grew Y% (Source: Statista).'
    ),
    agent=research_agent,
    # human_input=False  # Not typically set directly on task, managed by agent or execution params
)

analysis_task = Task(
    description=(
        'Analyze the research findings (articles and data points) provided by the Market Researcher.'
        ' Identify the top 3 key insights, 2 major growth areas, and 1 potential challenge for the AI market in 2025.'
        ' Provide brief justifications for each point.'
    ),
    expected_output=(
        'A structured summary in bullet points: \n'
        '- Key Insight 1: [Insight] (Justification: ...) \n'
        '- Growth Area 1: [Area] (Justification: ...) \n'
        '- Potential Challenge 1: [Challenge] (Justification: ...)'
    ),
    agent=analysis_agent,
    dependencies=[research_task]  # Depends on the research_task's output
)

writing_task = Task(
    description=(
        'Write a 500-word market research report. The report should incorporate the analyzed key insights, growth areas, and potential challenges.'
        ' Structure: Introduction, Key Findings (Insights, Growth, Challenges), Conclusion. Maintain a professional tone.'
    ),
    expected_output=(
        'A well-formatted 500-word market research report in Markdown format. Ensure all sections are covered and the tone is appropriate for executives.'
    ),
    agent=report_writer_agent,
    dependencies=[analysis_task]  # Depends on the analysis_task's output
)

# Assemble the Crew with a Sequential Process
market_research_crew = Crew(
    agents=[research_agent, analysis_agent, report_writer_agent],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,  # Tasks will be executed in the defined order based on dependencies
    verbose=2  # 0 for no logging, 1 for basic, 2 for detailed
)

# Kick off the Crew's work (Conceptual Execution)
# print("\n--- Kicking off Conceptual CrewAI Market Research ---")
# result = market_research_crew.kickoff()
# print("\n--- Conceptual CrewAI Market Research Completed ---")
# print("Final Report (Conceptual):")
# print(result)  # The result of the last task in the sequence
LlamaIndex (Agentic Capabilities): For when your agents need to read... a lot
LlamaIndex, born from the fires of Retrieval Augmented Generation (RAG), has beefed up significantly. It's not just for RAG anymore; it's a full-blown framework for data-centric AI apps, and that includes multi-agent systems. Its agent features are laser-focused on workflows that wrestle with data and documents. If your agents need to be paper-pushers (in the digital sense), LlamaIndex is your huckleberry.
AgentWorkflow architecture: the guts
LlamaIndex's multi-agent game revolves around its AgentWorkflow architecture. Two main parts:
- Agent module: Base classes for agents.
  - `FunctionAgent`: For LLMs that do function calling. Has methods like `take_step` (decide next action/tool), `handle_tool_call_results` (process tool output), and `finalize` (end turn).
  - `ReActAgent`: For the ReAct (Reasoning and Acting) pattern, good for LLMs without native function calling. Thinks, acts, observes, repeats.
  - Both come from `BaseWorkflowAgent`.
- AgentWorkflow module: The conductor. Orchestrates agents and task flow.
  - `init_run`: Sets up context and memory.
  - `setup_agent`: Figures out who's on duty, preps their system prompt and chat history.
  - `run_agent_step`: Calls the current agent's `take_step` to see what tools to hit next.
  - `parse_agent_output`: Translates the agent's desires into actions.
  - `call_tool`: Runs the tools.
  - `aggregate_tool_results`: Gathers tool results and decides the next move (continue, handoff, or finish).
A slick move here: handoff between agents is just another tool. Agent wants to pass the baton? It calls the "handoff tool," which tells the workflow who's up next.
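To make the handoff-as-a-tool idea concrete, here's a minimal sketch of the mechanism in plain Python. This is not LlamaIndex's actual internals; the `handoff` function and `WorkflowState` bookkeeping are illustrative assumptions:

```python
# Minimal sketch of "handoff is just another tool" (illustrative, not LlamaIndex internals)
from dataclasses import dataclass, field

@dataclass
class WorkflowState:
    current_agent: str                           # whoever is "on duty" right now
    shared: dict = field(default_factory=dict)   # notes, drafts, reviews, etc.

def handoff(state: WorkflowState, to_agent: str, reason: str) -> str:
    """A 'tool' the current agent can call to pass the baton."""
    print(f"[handoff] {state.current_agent} -> {to_agent}: {reason}")
    state.current_agent = to_agent               # the workflow just swaps who runs next
    return f"Control handed to {to_agent}"

# Usage: an agent that has finished researching hands off to the writer.
state = WorkflowState(current_agent="ResearchAgent")
handoff(state, to_agent="WriteAgent", reason="notes recorded, ready to draft")
```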
LlamaIndex didn't just stumble into agents. It evolved from a RAG powerhouse to a multi-agent framework with a serious focus on "Agentic Document Workflows" (ADW). This carves out a killer niche: automating complex knowledge work that's all about document interaction; legal analysis, medical records, financial audits. By building on its strengths in data indexing, parsing (LlamaParse is a beast), and retrieval (LlamaCloud), LlamaIndex is aiming to own the enterprise AI space where structured and unstructured document smarts are king. If your agents need to really understand and manipulate documents, LlamaIndex might offer more specialized firepower than general-purpose agent frameworks.
Agentic Document Workflows (ADW): beyond basic OCR
ADW isn't just glorified OCR or simple RAG. It's about agents doing complex, multi-step knowledge work on top of your documents. Document agents in an ADW setup typically:
- Extract and structure info from docs (often via LlamaParse).
- Keep track of document context and where they are in a business process (state).
- Fetch and analyze relevant reference material from knowledge bases (like LlamaCloud).
- Generate recommendations or take actions based on business rules and what they've read.
Think contract review: the agent extracts clauses, checks them against regulations, flags risks, and generates compliance reports. Heavy-duty stuff. A minimal sketch of that flow follows.
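Here's a deliberately tiny, framework-free sketch of that contract-review flow. The clause patterns, rule set, and helper names are all invented for illustration; a real ADW setup would lean on LlamaParse/LlamaCloud and an LLM for extraction rather than regexes:

```python
# Illustrative contract-review pipeline (toy rules; real ADW would use LlamaParse + an LLM)
import re

RULES = {  # hypothetical compliance rules
    "termination": "Termination notice must be at least 30 days.",
    "liability": "Liability cap must be stated explicitly.",
}

def extract_clauses(contract_text: str) -> dict:
    """Step 1: extract and structure info from the document (toy regex stand-in)."""
    clauses = {}
    for name in RULES:
        match = re.search(rf"{name}[^.]*\.", contract_text, re.IGNORECASE)
        clauses[name] = match.group(0) if match else None
    return clauses

def check_against_rules(clauses: dict) -> list[str]:
    """Step 2: compare extracted clauses to reference rules and flag risks."""
    return [f"RISK: missing or unclear '{name}' clause ({rule})"
            for name, rule in RULES.items() if not clauses.get(name)]

def generate_report(clauses: dict, risks: list[str]) -> str:
    """Step 3: produce a compliance summary based on what was read."""
    lines = ["# Compliance Report",
             f"Clauses found: {sum(v is not None for v in clauses.values())}/{len(clauses)}"]
    lines += risks or ["No risks flagged."]
    return "\n".join(lines)

contract = "Either party may trigger termination with 30 days written notice."
clauses = extract_clauses(contract)
print(generate_report(clauses, check_against_rules(clauses)))
```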
Quick example: multi-agent research and report generation (research-write-review)
The LlamaIndex docs show this off: an AgentWorkflow managing three specialized `FunctionAgent`s that hand off tasks sequentially while sharing state.
- Define Tools: `search_web` (e.g., via Tavily), `record_notes` (writes to shared state), `write_report` (writes to shared state), `review_report` (writes a review to shared state). These are Python functions wrapped with `FunctionTool`.
- Define Agents:
  - `ResearchAgent`: System prompt: "Search web, record notes. Satisfied? Handoff to WriteAgent." Tools: `search_web`, `record_notes`. Can hand off to `WriteAgent`.
  - `WriteAgent`: System prompt: "Write Markdown report from notes. Grounded in research. Done? Get feedback from ReviewAgent (at least once)." Tools: `write_report`. Can hand off to `ReviewAgent`.
  - `ReviewAgent`: System prompt: "Review report. Approve or request changes for WriteAgent." Tools: `review_report`. Can hand off to `WriteAgent`.
- Setup AgentWorkflow: Provide the list of agents, name the `root_agent_name` (e.g., "ResearchAgent"), and set up an `initial_state` dictionary (e.g., `{"research_notes": {}, "report_content": "", "review": ""}`).
Flow: User wants a report on internet history.
- The workflow starts with `ResearchAgent`.
- `ResearchAgent` searches and records notes into shared state, then hands off to `WriteAgent`.
- The workflow makes `WriteAgent` current. It uses the notes from state and writes the report to state, then hands off to `ReviewAgent`.
- `ReviewAgent` checks the report from state and saves its review to state. It might hand back to `WriteAgent` for fixes or, if happy, the workflow ends.
This nails how AgentWorkflow guides specialized agents through a multi-stage task, managing state and handoffs for complex jobs like research and writing.
# LlamaIndex Code Snippet: Research-Write-Review Workflow (Conceptual)
# Requires LlamaIndex core, and an LLM integration (e.g., llama-index-llms-openai)
import os
from llama_index.core.tools import FunctionTool
from llama_index.core.workflow import Context  # Context gives tools access to the workflow's shared state
# from llama_index.llms.openai import OpenAI # Example LLM
# from llama_index.tools.tavily_research import TavilyToolSpec # Example external tool
# --- Conceptual Setup: Mock LLM and Context for standalone execution ---
class MockLLM_LlamaIndex:
def __init__(self, model="mock_llama_model"): self.model = model
def __repr__(self): return f"MockLLM_LlamaIndex(model='{self.model}')"
# Add predict or chat methods if FunctionAgent internally calls them during init or planning
async def achat(self, messages, tools=None, tool_choice="auto"):
# Simulate LLM deciding to use a tool or respond
# This is highly simplified for conceptual flow
user_query = messages[-1].content.lower()
if "search" in user_query and tools and any(t.metadata.name == "search_web" for t in tools):
return type('obj', (object,), {
'message': type('obj', (object,), {
'tool_calls': [type('obj', (object,), {'id': 'call_123', 'function': type('obj', (object,), {'name': 'search_web', 'arguments': '{"query": "internet history"}'})})]
})
})
# Add more mock logic for other tools/handoffs if needed for deeper simulation
return type('obj', (object,), {'message': type('obj', (object,), {'content': 'LLM fallback response', 'tool_calls': []})})
llm = MockLLM_LlamaIndex() # Replace with actual LLM: e.g., OpenAI(model="gpt-4-turbo")
async def mock_get_state(key):
if key == "state":
# Ensure state is initialized conceptually for tools
if not hasattr(mock_get_state, '_shared_state'):
mock_get_state._shared_state = {"research_notes": {}, "report_content": "", "review": ""}
return mock_get_state._shared_state
return None
async def mock_set_state(key, value):
if key == "state":
mock_get_state._shared_state = value
MockContext = type('MockContext', (object,), {'get': staticmethod(mock_get_state), 'set': staticmethod(mock_set_state)})  # staticmethod so ctx.get("state")/ctx.set(...) aren't passed the instance as the first argument
ctx_instance = MockContext()
# --- End Conceptual Setup ---
# --- Tool Definitions ---
def search_web_func(query: str) -> str:
print(f"[TOOL search_web_func] ACTION: Searching web for: '{query}'. (Mocked)")
return f"Mocked search results for: {query}"
search_web = FunctionTool.from_defaults(fn=search_web_func, name="search_web", description="Searches the web for information on a given topic.")
async def record_notes_func(notes: str, notes_title: str, ctx: Context = ctx_instance) -> str:
print(f"[TOOL record_notes_func] ACTION: Recording notes titled '{notes_title}'.")
current_state = await ctx.get("state") # type: ignore
if "research_notes" not in current_state:
current_state["research_notes"] = {}
current_state["research_notes"][notes_title] = notes
await ctx.set("state", current_state) # type: ignore
return f"Notes '{notes_title}' recorded successfully."
record_notes = FunctionTool.from_defaults(fn=record_notes_func, name="record_notes", description="Useful for recording notes on a given topic into shared state.")
async def write_report_func(report_content: str, ctx: Context = ctx_instance) -> str:
print(f"[TOOL write_report_func] ACTION: Writing report content.")
current_state = await ctx.get("state") # type: ignore
current_state["report_content"] = report_content
await ctx.set("state", current_state) # type: ignore
return "Report content written successfully to shared state."
write_report = FunctionTool.from_defaults(fn=write_report_func, name="write_report", description="Useful for writing a report on a given topic to shared state.")
async def review_report_func(review: str, ctx: Context = ctx_instance) -> str:
print(f"[TOOL review_report_func] ACTION: Submitting review.")
current_state = await ctx.get("state") # type: ignore
current_state["review"] = review
await ctx.set("state", current_state) # type: ignore
return "Report review submitted successfully to shared state."
review_report = FunctionTool.from_defaults(fn=review_report_func, name="review_report", description="Useful for reviewing a report and providing feedback into shared state.")
# LlamaIndex Code Snippet: Research-Write-Review Workflow (Conceptual - Agent Definitions)
# Assumes tools (search_web, record_notes, etc.) and llm (MockLLM_LlamaIndex instance)
# are conceptually available from the previous snippet.
from llama_index.core.agent.workflow import FunctionAgent # Core import
# --- Agent Definitions ---
# llm = MockLLM_LlamaIndex() # Conceptually, llm is already defined.
# Tools like search_web, record_notes, write_report, review_report are also assumed defined.
ResearchAgent = FunctionAgent(
name="ResearchAgent",
description="Useful for searching the web for information on a given topic and recording notes on the topic.",
system_prompt=(
"You are the ResearchAgent. Your job is to search the web for information on a given topic "
"and then use the record_notes tool to save your findings. "
"Once notes are recorded and you are satisfied, you MUST hand off control to the WriteAgent."
),
llm=llm, # Using the conceptually defined mock LLM
tools=[search_web, record_notes],
can_handoff_to=["WriteAgent"],
verbose=True
)
WriteAgent = FunctionAgent(
name="WriteAgent",
description="Useful for writing a report on a given topic from research notes stored in shared state.",
system_prompt=(
"You are the WriteAgent. Your task is to write a report in markdown format using the research notes. "
"The content should be grounded in those notes. Once the report is written using the write_report tool, "
"you MUST hand off to the ReviewAgent to get feedback at least once."
),
llm=llm,
tools=[write_report],
can_handoff_to=["ReviewAgent"],
verbose=True
)
ReviewAgent = FunctionAgent(
name="ReviewAgent",
description="Useful for reviewing a report and providing feedback, then saving the review to shared state.",
system_prompt=(
"You are the ReviewAgent. Review the report provided in shared state. Use the review_report tool to submit your feedback. "
"Your feedback should either approve the current report or request specific changes for the WriteAgent to implement. "
"If changes are needed, hand off back to WriteAgent. If approved, the workflow can conclude."
),
llm=llm,
tools=[review_report],
can_handoff_to=["WriteAgent"], # Can hand back for revisions
verbose=True
)
# print("Conceptual LlamaIndex Agents (ResearchAgent, WriteAgent, ReviewAgent) defined.")
# LlamaIndex Code Snippet: Research-Write-Review Workflow (Conceptual - Workflow Setup)
# Assumes Agents (ResearchAgent, WriteAgent, ReviewAgent) are conceptually defined.
from llama_index.core.agent.workflow import AgentWorkflow  # same module as FunctionAgent
# --- AgentWorkflow Setup ---
# Agents ResearchAgent, WriteAgent, ReviewAgent are assumed to be defined from previous snippet.
# llm and initial_state_dict are also assumed to be conceptually available.
initial_state_dict = {"research_notes": {}, "report_content": "", "review": ""} # As per example
workflow = AgentWorkflow(
agents=[ResearchAgent, WriteAgent, ReviewAgent],
root_agent_name="ResearchAgent", # First agent to run
initial_state=initial_state_dict,
# llm=llm # Workflow might take an LLM for its own decisions if any
)
print("Conceptual LlamaIndex AgentWorkflow defined.")
# --- Conceptual Invocation ---
async def run_llama_workflow():
    print("\n--- Kicking off Conceptual LlamaIndex Workflow ---")
    # In a real scenario, you'd run the workflow asynchronously with the initial input/task, e.g.:
    # response = await workflow.run(user_msg="Write a report on internet history")  # exact signature varies by version
    # print(f"Conceptual workflow response: {response}")
    # The final shared state (research notes, report, review) lives in the workflow's context.
    print("Conceptual LlamaIndex workflow finished (mocked). Inspect mock_get_state._shared_state to see results.")

# To run conceptually (async tools/agents need an event loop):
import asyncio
asyncio.run(run_llama_workflow())
# You don't trigger individual agents (e.g., ResearchAgent) step by step yourself;
# AgentWorkflow handles that orchestration, including the handoffs between agents.
Framework Face-off - The Nitty-Gritty Comparison
Alright, you've seen the contenders. Now, let's throw them in the ring. Picking a framework isn't just about cool features; it's about what fits your project, your stack, and your team's sanity. Each one of these bad boys has its own flavor, its own way of doing things.
Key differentiators: What makes 'em tick
When you squint, a few big differences pop out:
- Programming paradigm & language: This is ground zero.
  - Python heavyweights: LangGraph, Microsoft Autogen, OpenAI's SDKs, CrewAI, LlamaIndex. No surprise, Python owns the AI/ML space.
  - TypeScript/JavaScript corner: Mastra is waving this flag, targeting the web dev massive. Think agents in Node.js or even browsers.
  - Declarative vs. code-first: Some, like fast-agent (with its decorators) or CrewAI (with its YAML configs for agents/tasks), let you declare workflows at a high level. Others, such as LangGraph, Autogen (especially its Core layer), Mastra (core agent setup), and LlamaIndex, are more about getting your hands dirty with code.
- Orchestration style: How do they herd the cats?
  - Explicit graph-based: LangGraph and LlamaIndex AgentWorkflow draw you a map. Clear flow, clear state changes.
  - Event-driven/conversational: Autogen is king here. Async messaging, dynamic group chats. Wild.
  - Handoff-driven: OpenAI Swarm and the new Agents SDK are big on agents explicitly passing the baton.
  - Role-based & process-driven: CrewAI wants you to think in terms of human-like roles and run plays (sequential or hierarchical).
  - Workflow decorators: fast-agent gives you high-level shortcuts for common patterns.
  - Tool-based hierarchy: Mastra likes supervisors using sub-agents as tools.
- State management: How do they remember stuff? (See the short sketch after this list.)
  - Explicit state objects: LangGraph is all about a defined state object that gets passed and tweaked.
  - Message history & context passing: Autogen, CrewAI, and OpenAI Swarm lean on chat history as the main state carrier.
  - Scoped/shared context: LlamaIndex AgentWorkflow has a shared context/state that agents read from and write to.
  - Dedicated memory systems: Mastra talks about more advanced memory systems.
  Don't kid yourself: orchestration style and state management are joined at the hip. Graph-based systems naturally lead to clean state objects that follow the graph. Dynamic, chatty systems? State is usually in the message history. You gotta evaluate these two together.
- Tool integration: How do they use external powers?
  - Standardized via protocol: MCP-based frameworks like fast-agent aim for plug-and-play with any MCP tool server. LangGraph also has MCP adapters. Good sign.
  - Framework-specific tools: Most have their own way. LangChain's huge toolset (for LangGraph), Autogen's function calling, OpenAI's built-in tool use, Mastra's tool definitions, CrewAI's tool assignments, LlamaIndex's tool abstractions. Lots of different wrenches.
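To see why orchestration style and state management are joined at the hip, here's a minimal sketch contrasting an explicit, graph-style state object with chat-history-as-state. Plain Python, not any framework's actual API:

```python
from typing import TypedDict

# Style 1: explicit state object (graph-style frameworks pass and tweak a typed state)
class ResearchState(TypedDict):
    query: str
    notes: list[str]
    report: str

def research_node(state: ResearchState) -> ResearchState:
    state["notes"].append(f"finding about {state['query']}")  # each node updates known fields
    return state

# Style 2: message history as state (conversational frameworks carry context in the transcript)
history: list[dict] = [{"role": "user", "content": "Research AI market trends"}]
history.append({"role": "assistant", "content": "Here are my findings..."})  # state == the chat log

state = research_node({"query": "AI market trends", "notes": [], "report": ""})
print(state["notes"], len(history))
```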
Strengths and weaknesses: picking your weapon for the job
No silver bullets here. Different frameworks for different fights:
- LangGraph: Best for complex, stateful workflows where you need to see every step (enterprise processes, finance). The explicitness can be a climb for simple agents.
- Microsoft Autogen: Shines for dynamic, multi-agent chats and collaboration (research buddies, brainstorming, co-coding). Managing super-strict, deterministic workflows needs careful design.
- OpenAI Swarm & new agent tools: Obvious choice if you're deep in the OpenAI cult. Streamlined integration, clear handoffs. Downside? Potential OpenAI ecosystem lock-in, though the new SDK says it'll play nice with others.
- MCP & fast-agent: Interoperability via MCP is the big win; tap into a standard tool ecosystem. fast-agent's decorators make defining complex workflows quick. Your fate depends on good MCP servers being available.
- Mastra: The go-to for TypeScript/JavaScript shops. Great for baking agents into web apps, Node.js backends. Good for structured workflows and agent-as-tool hierarchies.
- CrewAI: Intuitive if you think in human team structures (Role-Goal-Backstory). Good for users who want to define agent collaboration using familiar roles. Hierarchical process is a plus for managed delegation.
- LlamaIndex (agentic capabilities): King of document-heavy agent workflows. Leverages its RAG, parsing (LlamaParse!), and indexing might. Perfect for automating knowledge work in legal, medical, finance. Overkill if your agents don't care about docs.
See the pattern? No single framework rules them all, at least not yet. It's all about specialization. LlamaIndex is great for document-centric agents. Autogen is the conversational wizard. This means big projects might need a franken-stack of frameworks. Which screams for interoperability standards (like MCP) so these different agent systems can actually talk to each other.
Ease of dev & scalability: how fast can you build, how big can you go?
- Ease of development:
  - Abstraction level: fast-agent (decorators), Autogen Studio, and CrewAI's UI Studio offer higher abstraction. Faster for common patterns or less techy users. Code-first options like LangGraph give more control but a steeper learning curve.
  - Language & ecosystem: Python comfort? Mastra for TypeScript fans? The richness of the surrounding ecosystem (e.g., LangChain's integrations for LangGraph) matters.
  - Docs & community: Good docs and an active community save your ass when you're stuck.
- Scalability: Can it handle more agents, crazier interactions, bigger data, faster throughput?
  - Architectural design: Async architectures (like Autogen) are built for scale; they handle many concurrent operations without choking.
  - State management: Efficient state is vital. Frameworks that can offload state to scalable stores or have smart in-memory management win. Sloppy state handling is a bottleneck.
  - Distributed execution: Spreading agents across processes/machines is key for real scale. Autogen's `GrpcWorkerAgentRuntime` is one example of this thinking.
Bottom line on scalability: frameworks built for async ops, distributed execution, and serious state/memory management (especially with external scalable storage) are your enterprise-grade bets. Scalability isn't just agent count; it's the complexity and length of their chats and the data they chew on.
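As a toy illustration of why async architectures handle many agent operations without choking, compare ten concurrent agent steps against running them one by one (plain asyncio, no framework):

```python
import asyncio, time

async def agent_step(name: str, delay: float = 0.5) -> str:
    await asyncio.sleep(delay)   # stand-in for an LLM or tool call
    return f"{name} done"

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(*(agent_step(f"agent-{i}") for i in range(10)))
    print(f"{len(results)} agent steps in {time.perf_counter() - start:.2f}s (vs ~5s sequentially)")

asyncio.run(main())
```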
Table: Comparative overview of agent composition frameworks
To boil it all down, here's a cheat sheet. Don't take it as gospel; your mileage will vary.
| Feature | LangGraph | Microsoft Autogen | OpenAI Swarm/Agents SDK | fast-agent | Mastra | CrewAI | LlamaIndex (Agentic) |
|---|---|---|---|---|---|---|---|
| Primary Language | Python | Python, .NET | Python (SDK initially) | Python | TypeScript | Python | Python |
| Core Orchestration | Explicit Graph, State Machine | Event-Driven, Conversational | Handoff-driven | Workflow Decorators (MCP-native) | Programmatic, Tool-based Hierarchy | Role-based, Process-driven (Seq/Hier) | Explicit Graph (AgentWorkflow) |
| State Management | Explicit State Object | Message History, Context | Context Variables, Message History | MCP-managed (implicit) | Memory Systems, Context | Context Passing between Tasks | Shared Context/State in Workflow |
| Tool Integration | LangChain Tools, Custom, MCP Adapters | Function Calling, Custom, MCP Workbench | OpenAI Tools, Custom Functions | MCP Servers (native) | Custom Tools, MCP (planned/possible) | Custom Tools | Function Calling, Custom Tools |
| Key Multi-Agent Patterns | Supervisor, Hierarchical, Network, Swarm | Dynamic Group Chat, Nested Chat, Hierarchical | Handoffs, Tiered Delegation | Chain, Parallel, Router, Orchestrator | Sequential, Hierarchical (Agent-as-Tool) | Sequential, Hierarchical | Sequential Handoffs, Supervisor-like |
| Primary Strengths | Stateful, auditable enterprise workflows | Dynamic multi-agent collaboration, code-gen | OpenAI ecosystem, clear handoffs | Interoperability (MCP), rapid workflow dev | TypeScript/Web env, structured workflows | Intuitive role-based design, human-like teams | Document-intensive workflows (ADW) |
This table gives you the high-altitude view. The "best" framework is the one that gets your job done without making you want to flip your desk.
Architectural blueprints - common agent composition patterns
While frameworks bring their own syntactic sugar, the multi-agent systems they build often follow some battle-tested architectural patterns. These are the fundamental ways to organize agents, control their chatter, and divvy up the work. Understanding these patterns helps you see the forest for the trees, regardless of which shiny new framework you're playing with.
Foundational patterns: the building blocks
These keep popping up everywhere:
- Vertical/hierarchical pattern (The Boss Hog):
  - Description: A leader/supervisor agent calls the shots, overseeing sub-agents or sub-tasks. Clear chain of command.
  - Pros: Accountability is clear. Easy to break down tasks. Central control keeps things aligned.
  - Cons: Leader can be a bottleneck. Single point of failure. Sub-agents might feel micromanaged (less autonomy).
  - Seen In: LangGraph (Supervisor, Hierarchical), Autogen (hierarchical chat), OpenAI Agents SDK (TriageAgent example), Mastra (Publisher-Worker), CrewAI (Hierarchical process), LlamaIndex (supervisor-like workflows).
- Horizontal/decentralized/network pattern (The Round Table):
  - Description: Agents are peers. Decisions are often group-driven. Freer interaction.
  - Pros: Fosters dynamic problem-solving and innovation. Good for parallel work.
  - Cons: Coordination can be a beast, leading to slowdowns. Group decisions can take forever.
  - Seen In: LangGraph (Network), Autogen (Dynamic Group Chat).
- Sequential/pipeline pattern (The Assembly Line):
  - Description: Agents in a line. Output of one is input for the next. Each agent owns a stage.
  - Pros: Simple to grasp, build, and debug. Good audit trail because the flow is linear.
  - Cons: Can be rigid. A failure at one stage kills the whole line.
  - Seen In: LangGraph (explicit edges), fast-agent (`@fast.chain`), CrewAI (Sequential process), Mastra (sequential workflow with `.then()`).
- Orchestrator-worker (master-worker) pattern (The General Contractor):
  - Description: A central orchestrator doles out tasks to worker agents and manages execution. Often a specific type of hierarchical setup. (A minimal, framework-agnostic sketch of this blueprint appears after this list.)
  - Event-driven twist: Use something like Apache Kafka. The orchestrator drops command messages on topics; worker agents (as consumer groups) grab tasks. Decouples the orchestrator from direct worker nagging.
  - Pros: Efficient task delegation, central coordination. Workers can specialize. Event-driven adds resilience and scalability.
  - Cons: The orchestrator can be a bottleneck if not built for speed or if it gets too complex.
- Router pattern (The Traffic Cop):
  - Description: A dedicated router agent gets tasks/messages and sends them to specialized agents based on content, rules, or learned logic.
  - Pros: Manages complexity by getting tasks to the right specialist. Organizes the system.
  - Cons: The router can be a bottleneck or single point of failure. The routing logic better be solid.
  - Seen In: fast-agent (`@fast.router`), LangGraph (conditional edges can do this).
- Aggregator pattern (The Synthesizer):
  - Description: Multiple agents do their thing independently. Their outputs are collected and combined by an aggregator agent into one final result.
  - Pros: Blends diverse perspectives or processing results. Good when you need multiple viewpoints or parallel work followed by integration.
  - Cons: Aggregation logic can be tricky, especially with conflicting or varied outputs.
  - Seen In: fast-agent (`@fast.parallel` with a `fan_in` agent).
- Blackboard pattern (The Shared Whiteboard):
  - Description: Agents collaborate indirectly by reading/writing to a shared knowledge base (the "blackboard"). No direct agent-to-agent chat needed.
  - Event-driven twist: The blackboard can be a data streaming topic (Kafka again). Agents produce/consume messages representing shared info.
  - Pros: Loose coupling between agents. Good for problems solved incrementally by diverse specialists.
  - Cons: Keeping the blackboard consistent and avoiding write-wars can be tough.
- Market-based pattern (The Auction House):
  - Description: Models a marketplace. Agents negotiate, bid, or compete for tasks/resources. For dynamic resource allocation and distributed decisions.
  - Event-driven twist: Separate topics for bids and asks. A market maker service matches them and publishes deals.
  - Pros: Highly adaptive; can lead to efficient resource use.
  - Cons: Designing good bidding/negotiation protocols is complex.
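To make these blueprints concrete, here's a minimal, framework-agnostic sketch of the orchestrator-worker pattern. An in-memory queue stands in for a broker like Kafka, and every name here is illustrative:

```python
import queue, threading

# The orchestrator publishes command messages; worker "agents" consume and report results.
tasks: "queue.Queue[dict]" = queue.Queue()
results: "queue.Queue[dict]" = queue.Queue()

def worker(worker_id: int):
    while True:
        task = tasks.get()
        if task is None:                      # poison pill: shut this worker down
            break
        results.put({"worker": worker_id, "task": task["name"], "output": f"processed {task['payload']}"})
        tasks.task_done()

def orchestrator(jobs: list[str], n_workers: int = 3) -> list[dict]:
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for j in jobs:                            # the orchestrator only drops commands on the queue...
        tasks.put({"name": j, "payload": j.upper()})
    tasks.join()                              # ...and waits for the workers to drain it
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return [results.get() for _ in jobs]

print(orchestrator(["summarize", "translate", "classify"]))
```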
How Frameworks Enable These Patterns
The frameworks we’ve dissected give you the lego bricks to build these patterns:
- LangGraph: Its explicit graph (nodes as agents/functions, edges as control flow, conditional edges for routing) directly builds Sequential, Hierarchical (Supervisor), Router, and Network patterns. State management is key for passing info.
- Microsoft Autogen: `ConversableAgent` and dynamic group chats are prime for Horizontal/Decentralized/Network. Nested chats and function calling let you whip up Hierarchical and Orchestrator-Worker structures in a conversational style.
- OpenAI Swarm/Agents SDK: The handoff mechanism is a straight shot to Sequential flows (A to B to C) and Hierarchical/Router patterns (TriageAgent sending to specialists).
- fast-agent: Its decorators (`@fast.chain`, `@fast.parallel`, `@fast.router`, `@fast.orchestrator`, `@fast.evaluator_optimizer`) are high-level maps to Sequential, Aggregator (with fan-in), Router, Orchestrator-Worker, and even iterative refinement loops.
- Mastra: Workflows with `.then()` make Sequential easy. Treating agents as tools enables Hierarchical (supervisor-worker) setups, like its Publisher-Copywriter-Editor example.
- CrewAI: Explicit `process='sequential'` or `process='hierarchical'` for crews directly implements these. Task dependencies also force sequential execution.
- LlamaIndex: `AgentWorkflow` (with its root agent and tool-based handoffs) builds Sequential and Hierarchical pipelines of agents acting on shared state.
Here's the kicker: many "framework-specific" multi-agent patterns are just fancy versions of these fundamental blueprints. LangGraph's "Supervisor"? Clear Hierarchical/Orchestrator-Worker. fast-agent's `@fast.router`? That's the Router pattern. Knowing this helps you cut through the framework jargon, understand the core architectural choices, and maybe even translate designs between frameworks. This is about principled design, not just picking the shiniest tool.
The bleeding edge - challenges and what's next
So, we've got all these cool frameworks and patterns. Are we living in an AI agent utopia yet? Keep dreaming. Building robust, scalable, and reliable multi-agent systems is still a knife fight. And the future? It's a mix of exciting and terrifying.
Current headaches: why this is still hard
Deploying real multi-agent systems means wrestling with some serious demons:
- Standardization & interoperability: Or lack thereof. It's a Tower of Babel. Agents from different frameworks or providers can't easily chat. Kills reusability, cripples scalability of big, diverse agent networks. MCP is a good try for tools, but we need more.
- Scalability: More agents, crazier interactions? Performance and resource management become a nightmare. Think communication overhead, state syncing, bottlenecked central agents.
- Debugging & observability: LLM agents are autonomous, often non-deterministic. Multi-agent setups? Exponentially harder to debug. Tracing interactions, finding root causes, monitoring performance across a distributed mess; it's brutal.
- Evaluation: Current benchmarks are mostly crap. They look at task completion, ignore cost, reproducibility, robustness, safety, real-world practicality. We need holistic, standard, realistic ways to measure these things.
- Reasoning flaws & context amnesia: LLMs still suck at complex logic, causality, math. Agents forget crucial info over long chats or many turns, losing coherence.
- Tool use fumbles: Frameworks offer tool integration, but agents still struggle to pick the right tool, format inputs correctly, interpret outputs reliably, and handle tool failures or changes.
- Security & privacy: Agents touch sensitive data, external systems. Robust security, access control, user privacy are non-negotiable. Preventing misuse, breaches, rogue agent actions is critical.
- Ethics & alignment: Making sure agent goals and actions line up with human values? Yeah, that's a big one. Designing for fairness, transparency, accountability, and actual benefit is an ongoing war.
These problems are all tangled up. No standards? Harder to build debug tools. Reasoning errors in a complex agent web? Good luck finding the source without solid tracing. This means solutions need to be multi-pronged. Fix one area, and it might help others. But it's a slog.
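One cheap, framework-agnostic way to claw back some observability is to wrap every agent or tool call in a tracing decorator that records timing, status, and a span ID. A minimal sketch (no substitute for a real tracing stack, but it shows the shape of the fix):

```python
import functools, time, uuid

TRACE_LOG: list[dict] = []   # in production you'd ship these spans to a tracing backend

def traced(step_name: str):
    """Decorator that records a span for each agent/tool call."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            span = {"id": uuid.uuid4().hex[:8], "step": step_name, "start": time.time()}
            try:
                result = fn(*args, **kwargs)
                span["status"] = "ok"
                return result
            except Exception as exc:
                span["status"] = f"error: {exc}"
                raise
            finally:
                span["duration_s"] = round(time.time() - span["start"], 4)
                TRACE_LOG.append(span)
        return inner
    return wrap

@traced("search_tool")
def search(query: str) -> str:
    return f"results for {query}"

search("agent observability")
print(TRACE_LOG)
```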
Emerging Trends & Future Stargazing: Where this madness is headed
The agent composition field is a blur, but some trends are emerging from the chaos:
- Evolvable & layered protocols: Static, rigid communication rules are out. We need protocols that adapt as agents get smarter. Layered architectures for transport, messaging, semantics.
- Agent mesh / network protocols: Think Agent Network Protocol (ANP), Agent-to-Agent (A2A). Designing for seamless, scalable collaboration between diverse agents from different creators, forming an "agent mesh."
- Privacy-preserving tech: Agents + sensitive data = need for federated learning for agents, secure multi-party computation, differential privacy baked into communication and design.
- Smarter reasoning architectures: Hybrid systems (neural + symbolic), specialized reasoning modules (math, legal), meta-reasoning (agents reflecting on their own thinking) to bust through current LLM limits.
- Long-Term memory & next-level context: Hierarchical memory, episodic memory (like humans), better retrieval-augmented setups so agents can remember what they did last week and why.
- Human-agent collaboration that doesn't suck: Mixed-initiative models (humans and AI fluidly sharing control), explainable AI (XAI for agents so we know why they did that), personalized agent behavior.
- Collective intelligence infrastructures: The big dream. Massive networks of specialized agents tackling systemic problems too big for individual agents or even human teams. Needs robust discovery, coordination, knowledge sharing.
- Real-World evaluation: Standardized, holistic benchmarks that measure agents on accuracy, cost, safety, robustness, and if they're actually useful in the wild.
- Event-driven architectures for MAS: Using event-driven principles (often with Kafka) for resilient, scalable, decoupled multi-agent systems. Agents react to events asynchronously. Cleaner, faster. (A tiny sketch of the idea follows this list.)
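A tiny illustration of that last point: agents subscribe to topics on a bus and react to events as they arrive, with no direct calls between them. The in-memory `EventBus` below stands in for Kafka, and all names are illustrative:

```python
import asyncio
from collections import defaultdict

class EventBus:
    """In-memory stand-in for a broker: topics map to subscriber callbacks."""
    def __init__(self):
        self.subscribers = defaultdict(list)
    def subscribe(self, topic: str, handler):
        self.subscribers[topic].append(handler)
    async def publish(self, topic: str, event: dict):
        await asyncio.gather(*(h(event) for h in self.subscribers[topic]))

bus = EventBus()

async def research_agent(event):
    print(f"[research] handling: {event['query']}")
    await bus.publish("notes.ready", {"notes": f"notes on {event['query']}"})

async def writer_agent(event):
    print(f"[writer] drafting report from: {event['notes']}")

bus.subscribe("task.created", research_agent)   # agents only know about topics, not each other
bus.subscribe("notes.ready", writer_agent)

asyncio.run(bus.publish("task.created", {"query": "internet history"}))
```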
The writing's on the wall: this isn't just about smarter individual agents. It's about building a more robust, interconnected, intelligent ecosystem for them. Standardized communication, better building tools, and ensuring they learn, adapt, and operate safely and ethically. The goal? Moving from collections of agents to actual collective intelligence that can tackle problems we can't even imagine yet. This needs new theories, badass tools, and rock-solid infrastructure.
The bottom line - wrapping this up & getting tactical
We've ripped apart the guts of AI agent architecture frameworks. It's a wild, fast-moving space, crucial for pushing AI beyond simple task-doers into autonomous, collaborative systems that can actually chew on complex, multi-step problems. If you're not thinking about this, you're already behind.
Key findings: what we've learned (or should have)
Dissecting LangGraph, Autogen, OpenAI's new toys, MCP-based systems like fast-agent, Mastra, CrewAI, and LlamaIndex has hammered home a few truths:
- Orchestration is everything: Coordinating specialized agents is job #1. Frameworks differ wildly, from explicit graphs (LangGraph, LlamaIndex) to dynamic chats (Autogen) and role-playing (CrewAI).
- State management is king (or queen): Agents need to remember shit. How frameworks handle context and memory—explicit state objects, message histories, dedicated memory—is make-or-break for coherence and long tasks.
- Tools unleash action: Agents with tools are agents that can do things in the real world (or at least, the digital one). Standardizing tool integration (looking at you, MCP) is a game-changer.
- Modularity & specialization tame complexity: Break down big problems for specialist agents. Hierarchies, teams; it’s how humans solve hard stuff, and it works for AI too. Makes systems easier to build and fix.
- Dev experience & abstraction levels vary massively: Python vs. TypeScript. Code-first deep dives vs. declarative decorators or visual UIs. Pick your poison based on your team and project.
The big trend? Multi-agent systems are the future. Single agents are for hobbyists. Real-world problems demand teams of specialized AI agents. This puts a massive spotlight on solid orchestration, clean communication, and smart state management.
Guidance: how to pick your poison (wisely)
Choosing an agent composition framework is a big architectural bet. It’ll dictate your development speed, system power, and how much you’ll want to tear your hair out later. There's no magic bullet. The right choice comes from brutally honest assessment of your project's needs.
Key factors to chew on before you commit:
- Project demands & complexity level:
  - Task nature: Is it a super-structured, auditable beast (think LangGraph)? A dynamic, chatty free-for-all (Autogen)? A document-heavy slog (LlamaIndex)? Or does it map nicely to a human team (CrewAI)?
  - Statefulness: How vital is remembering complex state over long interactions? Frameworks with explicit, robust state management (LangGraph again) have the edge.
  - Auditability vs. flexibility: Need a clear paper trail (graphs)? Or more dynamic freedom?
- Your team's DNA & existing stack:
  - Language wars: Python is king, but Mastra is there for TypeScript/JavaScript crews.
  - Dev skills: Does your team get graph theory? Event-driven architectures? Specific LLM provider APIs?
  - Existing infrastructure: How well does the framework plug into your current databases, messaging systems, cloud crap?
- Interoperability: can it play nice?
  - If your system needs to talk to a zoo of external tools or agents from different vendors, frameworks built on or supporting standards like MCP (fast-agent, LangGraph with MCP adapters) are a smarter bet.
- Scalability: will it choke?
  - For systems that need to handle tons of agents or high traffic, look at async architectures (Autogen) or those built for distributed execution and smart state handling.
- Lifecycle management & production readiness:
  - Think beyond dev. How's the support (or ecosystem) for deployment, monitoring, debugging, versioning, and ongoing evaluation? Platforms like Orq.ai are trying to solve this. Guardrails and built-in tracing (OpenAI Agents SDK) are good signs of production focus.
Specific gut-check recommendations:
- Highly stateful, auditable enterprise workflows needing tight control: LangGraph.
- Dynamic multi-agent collaboration, research, code-gen in a chatty style: Microsoft Autogen.
- Deep in the OpenAI ecosystem, need streamlined tools & clear handoffs: OpenAI Agents SDK.
- Rapid workflow building & MCP interoperability: fast-agent with its decorators.
- TypeScript-first development (web apps, Node.js): Mastra.
- Tasks that map to human teams, needing role-based agent design: CrewAI.
- Document-heavy agent workflows needing deep interaction with structured/unstructured docs: LlamaIndex & its ADW.
In complex fields like finance, start with hybrid architectures blending pattern strengths. Match patterns to business goals (e.g., sequential for audit trails, network for trading). Build complexity slowly. Prioritize observability for tracing. Design for human-AI collaboration. And balance agent autonomy with strict controls and safeguards; don't let your financial bots go rogue!
Picking an agent framework is a foundational move. Given how fast this field is changing and the different strengths of current frameworks, you might even need a portfolio strategy; different frameworks for different problems. The absolute crucial play? Prioritize frameworks that are leaning into emerging interoperability standards. That’s your ticket to future flexibility and avoiding a siloed, incompatible agent mess.
The road to truly intelligent, collaborative AI agent ecosystems is still being paved. Your choice of composition tools will massively shape how, and if, you get there. Choose wisely. Now, go build something that doesn't suck.