# Install required packages
# !pip install langchain langchain-openai langgraph chromadb
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain.tools import Tool
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferMemory
# Read OPENAI_API_KEY (and any other secrets) from a local .env file
# so credentials are never hard-coded in the notebook.
load_dotenv()
print("✅ Setup complete")
Part 1: Framework Overview¶
Popular Agent Frameworks¶
| Framework | Best For | Learning Curve |
|---|---|---|
| LangChain | General-purpose agents, quick prototypes | Easy |
| LangGraph | Complex workflows, cyclic graphs | Medium |
| CrewAI | Multi-agent teams, role-based agents | Easy |
| AutoGPT | Autonomous task completion | Medium |
| Custom | Full control, specific requirements | Hard |
When to Use Each¶
LangChain:
✅ Quick prototypes
✅ Standard agent patterns
✅ Rich ecosystem of tools
❌ Limited control over agent loop
LangGraph:
✅ Complex state machines
✅ Cyclic workflows
✅ Human-in-the-loop
❌ More boilerplate code
Custom Implementation:
✅ Full control
✅ Optimized for specific use case
❌ More development time
Part 2: LangChain Agents¶
Building Your First LangChain Agent¶
LangChain abstracts the agent loop into two core components: an Agent (the LLM reasoning engine that decides which tool to call) and an AgentExecutor (the runtime that manages the observe-think-act cycle, handles tool dispatch, and enforces iteration limits). The create_openai_functions_agent constructor wires the LLM to OpenAI’s native function-calling API, so the model returns structured JSON tool invocations rather than free-text that must be parsed with brittle regex.
How LangChain Agents Work¶
The executor runs a loop: (1) pass the conversation plus a scratchpad of prior tool calls/results to the LLM, (2) if the LLM returns a function call, execute the matching Tool and append the result to the scratchpad, (3) repeat until the LLM returns a plain text answer or max_iterations is reached. This is functionally equivalent to the ReAct pattern from Notebook 03, but the framework handles prompt formatting, output parsing, and error recovery. The Tool wrapper maps a Python callable to a name and natural-language description that the LLM uses to decide when invocation is appropriate, making tool registration a one-liner rather than a manual JSON schema.
# Initialize LLM
# temperature=0 keeps tool-selection as deterministic as the API allows,
# which makes the agent demos below reproducible.
llm = ChatOpenAI(model="gpt-4", temperature=0)
# Define tools
def get_word_length(word: str) -> int:
    """Return the number of characters in *word*."""
    char_count = len(word)
    return char_count
def multiply_numbers(a: float, b: float) -> float:
    """Return the product of *a* and *b*."""
    product = a * b
    return product
# Create LangChain tools
# Each Tool maps a plain Python callable to a name plus a natural-language
# description; the description is effectively prompt text the LLM reads
# when deciding which tool to call, so it must be precise.
tools = [
    Tool(
        name="get_word_length",
        func=get_word_length,
        description="Get the length of any word. Input should be a single word."
    ),
    Tool(
        name="multiply",
        # Tool funcs receive one string; the lambda parses "a,b" into two
        # floats before delegating to multiply_numbers.
        func=lambda x: multiply_numbers(*map(float, x.split(','))),
        description="Multiply two numbers. Input should be two numbers separated by comma, e.g., '5,3'"
    )
]
print(f"✅ Created {len(tools)} tools")
# Create agent prompt
# agent_scratchpad is where the executor injects prior tool calls and
# their observations on every loop iteration.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to tools."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])
# Create agent (LLM reasoning engine wired to OpenAI function calling)
agent = create_openai_functions_agent(llm, tools, prompt)
# Create agent executor (runtime that manages the observe-think-act loop)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=5  # first line of defense against infinite reasoning loops
)
print("✅ LangChain agent ready")
# Test the agent: requires chaining get_word_length then multiply
response = agent_executor.invoke({
    "input": "What is the length of the word 'LangChain' multiplied by 3?"
})
print(f"\n🤖 Agent Response: {response['output']}")
Real-World Example: Research Agent¶
A research agent demonstrates how multiple tools compose to answer questions that no single tool could handle alone. The agent below has access to Wikipedia for factual retrieval, a calculator for arithmetic, and a date tool for temporal context. When the user asks a compound question like “How many years ago was Python created?”, the agent autonomously chains tool calls: first querying Wikipedia for the creation date, then calling the calculator to subtract from today’s date.
Why This Pattern Matters¶
Production LLM applications rarely need just one capability. By registering heterogeneous tools with clear, descriptive docstrings, you let the model’s function-calling mechanism serve as an implicit router that selects the right tool based on semantic understanding of the query. The quality of tool descriptions directly affects routing accuracy – vague descriptions cause the agent to pick the wrong tool or hallucinate answers instead of calling any tool at all. Each Tool object’s description field is effectively part of the prompt, so treat it with the same care you would give a system message.
import wikipedia
from datetime import datetime
# Wikipedia search tool
def search_wikipedia(query: str) -> str:
    """Search Wikipedia and return a three-sentence summary.

    Returns a fallback message instead of raising so the agent can
    recover (e.g. rephrase the query or pick another tool).
    """
    try:
        return wikipedia.summary(query, sentences=3)
    # Narrowed from a bare `except:`, which would also swallow
    # SystemExit/KeyboardInterrupt and mask notebook interrupts.
    except Exception:
        return f"Could not find information about '{query}'"
# Current date tool
def get_current_date(_tool_input: str = "") -> str:
    """Get the current date, e.g. 'January 05, 2025'.

    Accepts (and ignores) an optional input string: LangChain's Tool
    wrapper passes the tool input as one positional argument even when
    the description says "No input required", so the original zero-arg
    signature raised TypeError whenever the agent actually invoked it.
    The default keeps direct zero-arg calls backward-compatible.
    """
    return datetime.now().strftime("%B %d, %Y")
# Calculator tool
def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression and return it as a string.

    Security: the input comes from LLM output, which is untrusted.
    The original `eval(expression)` allowed arbitrary code execution
    (e.g. "__import__('os')..."). We whitelist arithmetic characters
    and evaluate with builtins stripped. Non-arithmetic input returns
    "Invalid expression", matching the original's error behavior.
    """
    allowed_chars = set("0123456789+-*/(). %")
    if not expression or not set(expression) <= allowed_chars:
        return "Invalid expression"
    try:
        # Empty globals/builtins: only literal arithmetic can run.
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception:  # SyntaxError, ZeroDivisionError, ...
        return "Invalid expression"
# Create research tools
research_tools = [
    Tool(
        name="wikipedia",
        func=search_wikipedia,
        description="Search Wikipedia for information. Input should be a search query."
    ),
    Tool(
        # NOTE(review): Tool passes the tool input as one positional string
        # even for "no input" tools — get_current_date must tolerate one
        # argument or this tool errors when invoked. Verify.
        name="current_date",
        func=get_current_date,
        description="Get today's date. No input required."
    ),
    Tool(
        name="calculator",
        func=calculate,
        description="Calculate mathematical expressions. Input should be a valid math expression."
    )
]
# Create research agent: same pattern as above, different system role
# and tool set; the LLM routes between tools by their descriptions.
research_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a research assistant. Answer questions using available tools."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])
research_agent = create_openai_functions_agent(llm, research_tools, research_prompt)
research_executor = AgentExecutor(
    agent=research_agent,
    tools=research_tools,
    verbose=True  # print each thought / tool call / observation step
)
print("✅ Research agent ready")
# Test research agent with a factual question that should hit Wikipedia
result = research_executor.invoke({
    "input": "Who invented Python programming language and when?"
})
print(f"\n📚 Research Result:\n{result['output']}")
Part 3: LangGraph Workflows¶
From Linear Chains to Stateful Graphs¶
LangGraph extends LangChain by modeling agent logic as a directed graph where nodes are processing steps and edges define transitions. Unlike a simple sequential chain, LangGraph supports cycles (an agent can loop back to re-plan after receiving new information), conditional branching (route to different nodes based on state), and human-in-the-loop checkpoints. The StateGraph class manages a typed state dictionary that flows through the graph, with each node reading and updating shared state via TypedDict annotations.
Why Graph-Based Orchestration Matters¶
Many real-world agent tasks are not linear pipelines. A coding assistant might plan, write code, run tests, discover a bug, and loop back to rewrite – a cyclic workflow that cannot be expressed as a simple chain. LangGraph’s add_conditional_edges method lets you define routing functions that inspect the current state and choose the next node, enabling patterns like retry loops, parallel fan-out/fan-in, and early termination. The compile() step converts the graph definition into an executable Runnable that supports streaming, async execution, and state persistence for long-running workflows.
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
# Define state
class AgentState(TypedDict):
    # Annotated with operator.add: LangGraph concatenates each node's
    # returned messages onto the existing list instead of replacing it.
    messages: Annotated[list, operator.add]
    # Name of the next stage; informational only — edges here are static.
    next_step: str
    # Populated by the synthesis node at the end of the workflow.
    final_answer: str
# Node functions
def planning_node(state: "AgentState") -> dict:
    """Plan the approach.

    Returns only the keys to update; LangGraph merges them into shared
    state, applying the ``operator.add`` reducer to ``messages``.
    The original mutated ``state`` in place AND returned the full state,
    which makes the reducer concatenate the old messages onto
    themselves, duplicating history on every step.
    """
    print("🧠 Planning...")
    return {"messages": ["Created plan"], "next_step": "research"}
def research_node(state: "AgentState") -> dict:
    """Conduct research.

    Returns a partial state update (LangGraph convention); the
    ``operator.add`` reducer appends the new message to the existing
    list. Mutating and returning the full state would duplicate the
    message history.
    """
    print("🔍 Researching...")
    return {"messages": ["Gathered information"], "next_step": "synthesis"}
def synthesis_node(state: "AgentState") -> dict:
    """Synthesize findings into the final answer.

    Returns only the updated keys; ``messages`` is untouched so the
    ``operator.add`` reducer has nothing to append for this step.
    """
    print("✍️ Synthesizing...")
    return {"final_answer": "Research complete with findings", "next_step": "end"}
# Build graph over the typed AgentState schema
workflow = StateGraph(AgentState)
# Add nodes (each node is a plain function taking and updating state)
workflow.add_node("planning", planning_node)
workflow.add_node("research", research_node)
workflow.add_node("synthesis", synthesis_node)
# Add edges — static linear flow: planning -> research -> synthesis -> END
workflow.set_entry_point("planning")
workflow.add_edge("planning", "research")
workflow.add_edge("research", "synthesis")
workflow.add_edge("synthesis", END)
# Compile the graph definition into an executable Runnable
app = workflow.compile()
print("✅ LangGraph workflow created")
# Run the workflow from an initial state matching the AgentState schema
initial_state = {
    "messages": ["Starting research task"],
    "next_step": "planning",
    "final_answer": ""
}
final_state = app.invoke(initial_state)
print("\n📊 Workflow Result:")
print(f"Messages: {final_state['messages']}")
print(f"Final Answer: {final_state['final_answer']}")
Part 4: Memory Integration¶
Giving Agents Persistent Context¶
Without memory, every agent invocation is stateless – the LLM has no knowledge of prior turns. ConversationBufferMemory solves this by storing the full message history and injecting it into the prompt via the chat_history placeholder. This enables multi-turn interactions where the agent can reference earlier context (“What was my name?” or “Use the same format as before”).
Memory Strategies and Trade-offs¶
Buffer memory is the simplest approach but grows linearly with conversation length, eventually exceeding the model’s context window. LangChain provides alternatives: ConversationSummaryMemory compresses older turns into a running summary (trading fidelity for token efficiency), ConversationBufferWindowMemory keeps only the last \(k\) turns, and VectorStoreRetrieverMemory embeds messages for semantic retrieval of relevant history. Choosing the right memory strategy depends on your token budget, conversation length, and whether the agent needs exact recall or just topical awareness of past interactions.
# NOTE(review): redundant re-import — ConversationBufferMemory is already
# imported at the top of the file; harmless but could be removed.
from langchain.memory import ConversationBufferMemory
# Create memory
memory = ConversationBufferMemory(
    memory_key="chat_history",  # must match the placeholder name in the prompt
    return_messages=True  # store Message objects rather than one flat string
)
# Create agent with memory: chat_history placeholder receives prior turns
memory_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Remember previous conversation."),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])
memory_agent = create_openai_functions_agent(llm, research_tools, memory_prompt)
memory_executor = AgentExecutor(
    agent=memory_agent,
    tools=research_tools,
    memory=memory,  # executor loads/saves chat_history automatically per call
    verbose=True
)
print("✅ Agent with memory ready")
# Test memory across two separate invocations
print("First message:")
response1 = memory_executor.invoke({"input": "My name is Alice"})
print(response1['output'])
print("\nSecond message (should remember name):")
response2 = memory_executor.invoke({"input": "What's my name?"})
print(response2['output'])
Part 5: Framework Comparison¶
LangChain vs Custom Implementation¶
The fundamental trade-off in agent frameworks is development speed versus control. LangChain lets you build a functional agent in 5-10 lines by composing pre-built abstractions (Tool, AgentExecutor, Memory), but those abstractions impose opinions about prompt formatting, error handling, and the agent loop that may not suit every use case. A custom implementation requires writing the full observe-think-act loop, tool dispatch, output parsing, and retry logic yourself – easily 50+ lines – but gives you complete visibility into every decision the agent makes.
When Custom Wins¶
Custom implementations become worthwhile when you need fine-grained control over token budgets (e.g., dynamically pruning tool descriptions based on context), non-standard reasoning patterns (e.g., tree-of-thought with backtracking), or tight integration with proprietary infrastructure. In production systems where latency and cost matter, the overhead of framework abstractions – extra prompt tokens from verbose templates, unnecessary serialization steps – can add up. Profile both approaches on your actual workload before committing.
import time
# LangChain approach (5-10 lines)
def langchain_agent():
    """Build and run a minimal LangChain agent; the framework owns the loop.

    Demonstrates how few lines a working agent takes when composing the
    pre-built Tool / agent / AgentExecutor abstractions.
    """
    # NOTE: eval-backed calculator is demo-only; never eval untrusted input.
    calc_tool = Tool(name="calc", func=lambda x: eval(x), description="Calculator")
    demo_tools = [calc_tool]
    demo_agent = create_openai_functions_agent(llm, demo_tools, prompt)
    runner = AgentExecutor(agent=demo_agent, tools=demo_tools)
    return runner.invoke({"input": "What is 15 + 27?"})
# Custom approach (50+ lines)
def custom_agent():
    """Placeholder for a hand-rolled agent loop.

    A real implementation would need prompt engineering, tool execution,
    loop control, error handling, output parsing, and more.
    """
    return None
# Summary of the trade-off the two functions above illustrate.
print("✅ LangChain: Quick to build, less control")
print("✅ Custom: More code, full control")
Decision Matrix¶
| Criteria | LangChain | LangGraph | Custom |
|---|---|---|---|
| Development Speed | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐ |
| Flexibility | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Documentation | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | N/A |
| Learning Curve | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Community Support | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐ |
| Production Ready | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
Part 6: Production Patterns¶
Error Handling and Retries¶
Production agents face failures that never appear in tutorials: API rate limits, malformed tool outputs, infinite reasoning loops, and context window overflows. Wrapping the AgentExecutor in a custom subclass lets you intercept exceptions at the execution boundary, log diagnostic information, and return graceful fallback responses instead of crashing. The pattern below catches any exception during the agent loop and converts it into a structured error response that downstream code can handle.
Why Defensive Agent Design Matters¶
An unhandled exception in an agent loop can leave the system in an inconsistent state – partial tool calls executed, memory corrupted, or user-facing errors exposed. Production agents should implement circuit breaker patterns (stop calling a failing tool after \(n\) consecutive errors), timeout guards (abort if the agent hasn’t converged within a time budget), and graceful degradation (fall back to a simpler model or direct response when tools are unavailable). The max_iterations parameter in AgentExecutor is your first line of defense against infinite loops, but application-level error handling provides the safety net.
from langchain.callbacks import StdOutCallbackHandler
# Add custom error handling
class SafeAgentExecutor(AgentExecutor):
    """AgentExecutor that converts any run-time failure into a structured
    error payload instead of letting the exception propagate to callers."""

    def _call(self, inputs, **kwargs):
        # Boundary catch: downstream code receives a dict either way.
        try:
            return super()._call(inputs, **kwargs)
        except Exception as exc:
            return {
                "error": True,
                "output": f"Error occurred: {exc}",
            }
print("✅ Safe agent executor ready")  # notebook progress marker
Monitoring and Logging¶
Structured logging transforms an opaque agent into an observable system. By subclassing AgentExecutor and logging inputs and outputs at the execution boundary, you create an audit trail that answers critical production questions: which tool was called, what arguments were passed, how long each step took, and whether the agent’s final answer addressed the user’s intent. LangChain’s callback system (StdOutCallbackHandler, LangSmithTracer) provides built-in hooks for tracing every intermediate step – tool invocations, LLM calls, token counts – without modifying agent code. In production, pipe these traces to an observability platform (Datadog, Weights & Biases, LangSmith) to monitor latency distributions, error rates, and cost per query across your agent fleet.
import logging
# Set up logging — INFO level so the executor's audit lines below show up
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)  # module-level logger, stdlib convention
class LoggingAgentExecutor(AgentExecutor):
    """AgentExecutor that writes an audit trail of every invocation's
    inputs and outputs at the execution boundary."""

    def _call(self, inputs, **kwargs):
        # Lazy %-style args: the message is only formatted if INFO is
        # enabled, unlike f-strings which always pay the formatting cost.
        logger.info("Agent input: %s", inputs)
        result = super()._call(inputs, **kwargs)
        logger.info("Agent output: %s", result)
        return result
print("✅ Logging configured")  # notebook progress marker
🎯 Knowledge Check¶
Q1: When should you use LangChain vs custom implementation?
Q2: What’s the main advantage of LangGraph?
Q3: How does memory work in LangChain agents?
Click for answers
A1: LangChain for rapid prototyping and standard patterns; custom for full control and optimization
A2: Graph-based workflows with cycles, branching, and complex state management
A3: Memory stores conversation history and passes it to the agent as context
🚀 Next Steps¶
Complete the Agent Framework Challenge
Read Notebook 5: Multi-Agent Systems
Build a production agent with LangChain
Experiment with LangGraph for complex workflows
Great work! You now know how to leverage frameworks to build agents faster! 🎉