Learn how to build production-ready AI applications with LangChain. Covers agents, chains, RAG, memory systems, and deployment patterns for modern web developers.
LangChain has evolved from a Python library into a comprehensive AI application framework. By 2026, it powers production applications ranging from customer support chatbots to complex document analysis pipelines.
This guide walks you through building AI-powered features using LangChain, from basic chains to advanced agentic workflows.
Table of Contents
- What is LangChain in 2026?
- Core Concepts
- Setting Up Your Environment
- Building Your First Chain
- Memory Systems
- Retrieval-Augmented Generation (RAG)
- Building Agents
- Tool Integration
- React Integration
- Production Deployment
- Best Practices
What is LangChain in 2026?
LangChain is a framework for building applications powered by large language models. It provides:
- Chains: Composable sequences of operations with LLMs
- Agents: Autonomous systems that use tools to complete tasks
- Memory: Persistence layers for conversation context
- Retrievers: Interfaces for connecting to vector databases
- Prompts: Templating and management for LLM interactions
The 2026 ecosystem includes:
- LangChain (Python): Full-featured, best for complex applications
- LangChain.js: JavaScript/TypeScript port, best for web integration
- LangServe: Production deployment framework
- LangSmith: Observability and evaluation platform
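For example, wiring an application into LangSmith is just a matter of setting a few environment variables before anything runs; a minimal sketch (the key and project name below are placeholders):

import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls__..."   # your LangSmith API key
os.environ["LANGCHAIN_PROJECT"] = "my-app"    # optional: group traces by project

# Every chain or agent invoked after this point is traced automatically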
Core Concepts
The LCEL Syntax
LangChain Expression Language (LCEL) is a declarative way to compose chains:
# Basic chain with LCEL
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser

llm = ChatOpenAI(model="gpt-4o")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful {assistant_type}."),
    ("human", "{question}")
])

output_parser = StrOutputParser()

chain = prompt | llm | output_parser

# Invoke
result = chain.invoke({
    "assistant_type": "coding assistant",
    "question": "Explain async/await in Python"
})
Why LCEL Matters
LCEL gives every chain, for free:
- Streaming support (token-by-token output)
- Async support (awaitable chains)
- Parallel execution (for independent steps)
- Fallbacks (graceful degradation)
# Example: streaming a response token by token
# (the prompt above needs both variables)
for chunk in chain.stream({
    "assistant_type": "coding assistant",
    "question": "Hello"
}):
    print(chunk, end="", flush=True)
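Async and fallbacks are just as direct. A minimal sketch reusing the chain from above (the fallback model choice is illustrative):

import asyncio
from langchain_openai import ChatOpenAI

# Async: every LCEL runnable exposes ainvoke/astream
async def main():
    result = await chain.ainvoke({
        "assistant_type": "coding assistant",
        "question": "Hello"
    })
    print(result)

asyncio.run(main())

# Fallbacks: try gpt-4o first, degrade to a cheaper model on error
resilient_chain = chain.with_fallbacks(
    [prompt | ChatOpenAI(model="gpt-4o-mini") | output_parser]
)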
Setting Up Your Environment
Python Environment
# Create virtual environment
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
# Install LangChain and dependencies
pip install langchain langchain-openai langchain-community
pip install chromadb # Vector database
pip install faiss-cpu # Alternative vector DB (CPU-only)
pip install python-dotenv # Environment variables
pip install duckduckgo-search # For agent tools
# Verify installation
python -c "import langchain; print(langchain.__version__)"
Environment Variables
# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-... # Optional, for Claude
ACTIVELOOP_TOKEN=... # Optional, for the Deep Lake vector store
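Load these in your entry point with python-dotenv before any code that reads the environment:

# main.py
from dotenv import load_dotenv

load_dotenv()  # reads .env into os.environ

# langchain_openai picks up OPENAI_API_KEY from the environment automatically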
Project Structure
my-langchain-app/
├── app/
│ ├── __init__.py
│ ├── chains/
│ │ ├── __init__.py
│ │ ├── qa_chain.py
│ │ └── summarization_chain.py
│ ├── agents/
│ │ ├── __init__.py
│ │ └── research_agent.py
│ ├── memory/
│ │ ├── __init__.py
│ │ └── conversation_buffer.py
│ └── utils/
│ ├── __init__.py
│ └── embeddings.py
├── .env
├── requirements.txt
└── main.py
Building Your First Chain
Question Answering Chain
# chains/qa_chain.py
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain_community.vectorstores import Chroma

# Initialize components
llm = ChatOpenAI(model="gpt-4o", temperature=0)
embeddings = OpenAIEmbeddings()

# Create vector store (texts is your list of Document objects)
vectorstore = Chroma.from_documents(
    documents=texts,
    embedding=embeddings,
    persist_directory="chroma_db"
)

# Create retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}
)

# Create QA chain
SYSTEM_PROMPT = """You are a helpful AI assistant. Use the following context to answer the user's question. If you don't know the answer based on the context, say so.

Context: {context}
"""

prompt = ChatPromptTemplate.from_messages([
    ("system", SYSTEM_PROMPT),
    ("human", "{question}")
])

# Assemble chain
qa_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Use it
result = qa_chain.invoke("What is the main topic of the documents?")
Summarization Chain
# chains/summarization_chain.py
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains.summarize import load_summarize_chain
from langchain.schema import Document

llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Map-reduce summarization for long documents
map_prompt = PromptTemplate.from_template(
    "Summarize this text in 3 sentences: {text}"
)

reduce_prompt = PromptTemplate.from_template(
    """Combine these summaries into a cohesive summary:

{text}

Combined summary:"""
)

# Load the map-reduce chain (summarizes each chunk, then combines the results)
summarize_chain = load_summarize_chain(
    llm=llm,
    map_prompt=map_prompt,
    combine_prompt=reduce_prompt,
    chain_type="map_reduce"
)

# For very long documents, use refine: it iteratively improves one running
# summary. The refine prompt must accept {existing_answer} and {text}.
refine_prompt = PromptTemplate.from_template(
    """Existing summary: {existing_answer}

Refine it using this additional context:
{text}

Refined summary:"""
)

refine_chain = load_summarize_chain(
    llm=llm,
    question_prompt=map_prompt,
    refine_prompt=refine_prompt,
    chain_type="refine"
)

# Use (input key is "input_documents"; the summary is under "output_text")
documents = [Document(page_content=long_text)]  # long_text: your source text
summary = summarize_chain.invoke({"input_documents": documents})["output_text"]
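Note that load_summarize_chain maps over the Document list you give it; it does not split raw text for you. A minimal sketch of chunking a long text first (long_text is a placeholder):

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split the raw text into overlapping chunks before summarizing
splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
docs = splitter.create_documents([long_text])

summary = summarize_chain.invoke({"input_documents": docs})["output_text"]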
Memory Systems
Conversation Buffer Memory
# memory/conversation_buffer.py
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")

# ConversationChain expects a prompt with exactly {history} and {input},
# and a memory keyed on "history" (the ConversationBufferMemory default)
template = """You are a helpful customer support assistant.

Current conversation:
{history}
Human: {input}
AI:"""

memory = ConversationBufferMemory()

conversation = ConversationChain(
    llm=llm,
    memory=memory,
    prompt=PromptTemplate.from_template(template)
)

# Chat interactions
response1 = conversation.invoke({"input": "I need help with my order"})
response2 = conversation.invoke({"input": "What's the status?"})
# The second call knows about the first ("your order")

# Access memory
print(memory.chat_memory.messages)
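Recent LangChain releases steer away from the memory classes toward wiring history directly into LCEL with RunnableWithMessageHistory. A minimal sketch (the in-memory session store is an assumption; swap in Redis or a database for production):

from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful customer support assistant."),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

# One history object per session id
store: dict[str, InMemoryChatMessageHistory] = {}

def get_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

chat = RunnableWithMessageHistory(
    prompt | ChatOpenAI(model="gpt-4o"),
    get_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

chat.invoke(
    {"input": "I need help with my order"},
    config={"configurable": {"session_id": "user-42"}},
)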
Vector Store-Backed Memory
For persistent memory across sessions:
# memory/vector_memory.py
from langchain.memory import VectorStoreRetrieverMemory
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Create a vector store for memories and wrap its retriever
vectorstore = Chroma(
    embedding_function=OpenAIEmbeddings(),
    persist_directory="memory_db"
)
retriever = vectorstore.as_retriever(
    search_kwargs={"k": 5}  # Return the 5 most relevant memories
)

memory = VectorStoreRetrieverMemory(
    retriever=retriever,
    memory_key="chat_history"
)

# Add memories explicitly
memory.save_context(
    {"input": "User prefers dark mode interface"},
    {"output": "Noted: dark mode preference saved."}
)

# Later retrieval
relevant_memories = memory.load_memory_variables(
    {"input": "What did the user say about their preferences?"}
)
Summary Memory (for long conversations)
# For conversations that exceed the context window
from langchain.memory import ConversationSummaryMemory
from langchain_openai import ChatOpenAI

memory = ConversationSummaryMemory(
    llm=ChatOpenAI(model="gpt-4o"),
    memory_key="chat_history",
    return_messages=True,
    output_key="response"
)

# Each saved turn updates a running summary, so the prompt stays small.
# (Use ConversationSummaryBufferMemory to also keep recent turns verbatim.)
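A quick sketch of what the memory actually stores, using the configuration above:

# Save turns as they happen; the memory keeps a rolling summary
memory.save_context(
    {"input": "Hi, I'm migrating our API from REST to GraphQL."},
    {"response": "Great - happy to help with the migration."}
)

# What the LLM sees later is the condensed summary, not the full transcript
print(memory.load_memory_variables({})["chat_history"])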
Retrieval-Augmented Generation (RAG)
Complete RAG Pipeline
# rag/pipeline.py
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough


class RAGPipeline:
    def __init__(self, documents: list[str]):
        self.llm = ChatOpenAI(model="gpt-4o", temperature=0)
        self.embeddings = OpenAIEmbeddings()
        self.vectorstore = self._create_vectorstore(documents)
        self.chain = self._create_chain()

    def _create_vectorstore(self, documents: list[str]) -> Chroma:
        # Split documents
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=200,
            length_function=len,
        )
        splits = text_splitter.create_documents(documents)

        # Create and persist vector store
        return Chroma.from_documents(
            documents=splits,
            embedding=self.embeddings,
            persist_directory="rag_db"
        )

    def _create_chain(self):
        retriever = self.vectorstore.as_retriever(
            search_kwargs={"k": 3}
        )

        prompt = ChatPromptTemplate.from_messages([
            ("system", """Answer the question based on the provided context.
If the answer isn't in the context, say "I don't have information about that."

Context: {context}"""),
            ("human", "{question}")
        ])

        return (
            {"context": retriever, "question": RunnablePassthrough()}
            | prompt
            | self.llm
            | StrOutputParser()
        )

    def invoke(self, question: str) -> str:
        return self.chain.invoke(question)

    def add_documents(self, documents: list[str]):
        """Add new documents to the vector store."""
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=200
        )
        splits = text_splitter.create_documents(documents)
        self.vectorstore.add_documents(splits)
# Usage
documents = [
"LangChain is a framework for building LLM applications...",
"RAG combines retrieval with generation...",
]
rag = RAGPipeline(documents)
answer = rag.invoke("What is LangChain?")
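One refinement worth knowing: the dict step above passes the retriever's raw Document list into the {context} slot, which LangChain stringifies. Formatting the documents yourself gives the model cleaner context. A sketch of the idea, written as a drop-in for the return statement in _create_chain:

def format_docs(docs) -> str:
    # Join retrieved chunks into one clean context string
    return "\n\n".join(doc.page_content for doc in docs)

return (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | self.llm
    | StrOutputParser()
)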
Hybrid Search with BM25
# rag/hybrid_search.py
from langchain_community.retrievers import BM25Retriever  # requires rank_bm25
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain.retrievers import EnsembleRetriever
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough

# Create vector store (documents is your list of Document objects)
vectorstore = Chroma.from_documents(documents, OpenAIEmbeddings())
vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

# Create BM25 (keyword) retriever
bm25_retriever = BM25Retriever.from_documents(
    documents,
    preprocess_func=lambda x: x.lower().split()
)

# Combine with Reciprocal Rank Fusion
ensemble_retriever = EnsembleRetriever(
    retrievers=[vector_retriever, bm25_retriever],
    weights=[0.6, 0.4]  # Favor semantic search
)

# Use in a chain (prompt and llm as defined in the RAG section)
chain = (
    {"context": ensemble_retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
Building Agents
Simple Tool-Calling Agent
# agents/research_agent.py
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_openai import ChatOpenAI
from langchain.tools import Tool
from langchain.prompts import ChatPromptTemplate
from langchain_community.tools import DuckDuckGoSearchRun

# Define tools
search = DuckDuckGoSearchRun()

def search_web(query: str) -> str:
    """Search the web for current information."""
    return search.run(query)

def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        # Demo only: eval is unsafe; use a parser like numexpr in production
        result = eval(expression)
        return str(result)
    except Exception as e:
        return f"Error: {e}"

tools = [
    Tool(
        name="web_search",
        func=search_web,
        description="Search the web for current information about any topic."
    ),
    Tool(
        name="calculator",
        func=calculate,
        description="Evaluate mathematical expressions. Input should be a valid Python expression."
    )
]
# Create agent
llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = ChatPromptTemplate.from_messages([
("system", """You are a helpful research assistant. Use tools to answer questions accurately.
Always cite your sources when using web search.
Available tools: web_search, calculator"""),
("human", "{input}"),
("placeholder", "{agent_scratchpad}")
])
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
# Run agent
result = agent_executor.invoke({
"input": "What is the population of Tokyo? Calculate the square root of that number."
})
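As an alternative to wrapping plain functions in Tool, the @tool decorator builds a tool straight from a function's signature and docstring. A small sketch (get_word_length is a hypothetical example tool):

from langchain_core.tools import tool

@tool
def get_word_length(word: str) -> int:
    """Return the number of characters in a word."""
    return len(word)

# Name, description, and argument schema are inferred automatically
agent = create_tool_calling_agent(llm, [get_word_length], prompt)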
ReAct Agent (Reasoning + Acting)
# agents/react_agent.py
from langchain.agents import create_react_agent, AgentExecutor
from langchain_openai import ChatOpenAI
from langchain import hub  # requires the langchainhub package

# Pull the standard ReAct prompt from the LangChain Hub
react_prompt = hub.pull("hwchase17/react")

llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Create with the tools defined earlier
agent = create_react_agent(llm, tools, react_prompt)
executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True
)
# The agent will reason step-by-step:
# Thought: I need to find X
# Action: web_search
# Observation: Found Y
# Thought: Now I need to calculate...
result = executor.invoke({
"input": "Find the latest GPT-5 release date and tell me how many days from today."
})
Custom Agent with Structured Output
# chains/analysis_chain.py
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

# Define expected output schema
response_schemas = [
    ResponseSchema(name="summary", description="A 2-3 sentence summary"),
    ResponseSchema(name="key_points", description="List of 3-5 key points"),
    ResponseSchema(name="sentiment", description="Overall sentiment: positive, negative, or neutral"),
    ResponseSchema(name="confidence_score", description="Confidence score 0-1")
]

parser = StructuredOutputParser.from_response_schemas(response_schemas)

prompt = ChatPromptTemplate.from_messages([
    ("system", """Analyze the following text and provide structured output.

{format_instructions}"""),
    ("human", "{text}")
]).partial(format_instructions=parser.get_format_instructions())

llm = ChatOpenAI(model="gpt-4o", temperature=0)
chain = prompt | llm | parser

result = chain.invoke({
    "text": "Your product is amazing! Best I've ever used."
})
# Returns: {summary: "...", key_points: [...], sentiment: "positive", confidence_score: 0.95}
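With models that support native tool calling, with_structured_output is often a simpler route to the same result: bind a Pydantic model and skip the parser entirely. A sketch under that assumption (the Analysis model is illustrative):

from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

class Analysis(BaseModel):
    summary: str = Field(description="A 2-3 sentence summary")
    key_points: list[str] = Field(description="3-5 key points")
    sentiment: str = Field(description="positive, negative, or neutral")
    confidence_score: float = Field(ge=0, le=1)

llm = ChatOpenAI(model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(Analysis)

# Returns a validated Analysis instance, not a raw string
analysis = structured_llm.invoke("Your product is amazing! Best I've ever used.")
print(analysis.sentiment)  # "positive"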
Tool Integration
Building Custom Tools
# tools/database_tools.py
from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field
from typing import Optional
import sqlite3

class DatabaseQueryInput(BaseModel):
    query: str = Field(description="SQL query to execute")
    limit: Optional[int] = Field(default=10, description="Max rows to return")

def query_database(query: str, limit: int = 10) -> str:
    """Execute a read-only SQL query on the database."""
    # Safety: only SELECT queries
    if not query.strip().upper().startswith("SELECT"):
        return "Error: Only SELECT queries allowed"

    conn = sqlite3.connect("myapp.db")
    cursor = conn.cursor()
    try:
        cursor.execute(f"{query} LIMIT {limit}")
        rows = cursor.fetchall()
        return str(rows)
    except Exception as e:
        return f"Error: {e}"
    finally:
        conn.close()

# StructuredTool supports multi-argument functions via the args_schema
db_tool = StructuredTool.from_function(
    func=query_database,
    name="database_query",
    description="""Query the application database. Only SELECT queries are allowed.
Input should be a valid SQL SELECT statement.""",
    args_schema=DatabaseQueryInput
)
# Use in an agent (search_tool: any other tool you've defined)
agent = create_tool_calling_agent(llm, [db_tool, search_tool], prompt)
Tool Validation and Error Handling
# tools/validated_tools.py
from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field, field_validator

class WikipediaSearchInput(BaseModel):
    query: str = Field(description="Search query for Wikipedia")

    @field_validator("query")
    @classmethod
    def validate_query(cls, v: str) -> str:
        if len(v) < 3:
            raise ValueError("Query must be at least 3 characters")
        if len(v) > 200:
            raise ValueError("Query must be under 200 characters")
        return v

def wikipedia_search(query: str) -> str:
    """Search Wikipedia for information."""
    # Implementation
    ...

wiki_tool = StructuredTool.from_function(
    func=wikipedia_search,
    name="wikipedia",
    description="Search Wikipedia for factual information.",
    args_schema=WikipediaSearchInput
)
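Validation catches bad inputs, but tools also fail at runtime. One pattern is raising ToolException and letting the tool surface the error to the agent instead of crashing the run; a hedged sketch (wikipedia_search_safe is a hypothetical wrapper, and I'm assuming handle_tool_error is passed through from_function):

from langchain_core.tools import StructuredTool, ToolException

def wikipedia_search_safe(query: str) -> str:
    """Search Wikipedia; raise ToolException on upstream failure."""
    try:
        return wikipedia_search(query)
    except Exception as e:
        # ToolException is caught by the tool machinery, not the whole run
        raise ToolException(f"Wikipedia lookup failed: {e}")

wiki_tool = StructuredTool.from_function(
    func=wikipedia_search_safe,
    name="wikipedia",
    description="Search Wikipedia for factual information.",
    args_schema=WikipediaSearchInput,
    handle_tool_error=True,  # return the error message to the agent
)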
React Integration
Using LangChain.js in React
// lib/langchain.ts
import { ChatOpenAI } from "@langchain/openai";
import { ConversationalRetrievalQAChain } from "langchain/chains";
import { BufferMemory } from "langchain/memory";
import type { VectorStore } from "@langchain/core/vectorstores";

// Initialize model
export const llm = new ChatOpenAI({
  modelName: "gpt-4o",
  temperature: 0,
  streaming: true,
});

// Create a retrieval chain with conversational memory
export async function createQAChain(vectorStore: VectorStore) {
  return ConversationalRetrievalQAChain.fromLLM(llm, vectorStore.asRetriever(), {
    memory: new BufferMemory({
      memoryKey: "chat_history",
      returnMessages: true,
    }),
  });
}
React Hook for AI Chat
// hooks/useAIAgent.ts
import { useState, useCallback } from 'react';

interface Message {
  id: string;
  role: 'user' | 'assistant';
  content: string;
}
export function useAIAgent() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [isLoading, setIsLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  const sendMessage = useCallback(async (content: string) => {
    setIsLoading(true);
    setError(null);

    // Add user message
    const userMessage: Message = {
      id: crypto.randomUUID(),
      role: 'user',
      content,
    };
    setMessages((prev) => [...prev, userMessage]);

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          messages: [...messages, userMessage],
        }),
      });

      // fetch does not throw on HTTP errors, so check explicitly
      if (!response.ok) {
        throw new Error(`Request failed with status ${response.status}`);
      }

      const data = await response.json();
      const assistantMessage: Message = {
        id: crypto.randomUUID(),
        role: 'assistant',
        content: data.response,
      };
      setMessages((prev) => [...prev, assistantMessage]);
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Failed to get response');
    } finally {
      setIsLoading(false);
    }
  }, [messages]);

  return { messages, sendMessage, isLoading, error };
}
Production Deployment
LangServe API
# server.py
from fastapi import FastAPI
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from langserve import add_routes

app = FastAPI(title="AI Assistant API")

# Simple LCEL chain (add_routes accepts any Runnable)
llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}")
])
chain = prompt | llm | StrOutputParser()

# Add routes
add_routes(app, chain, path="/chain")

# RAG chain and agent from the earlier sections
add_routes(app, rag_chain, path="/rag")
add_routes(app, agent_executor, path="/agent")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
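On the consuming side, LangServe ships RemoteRunnable, which exposes a deployed chain with the same Runnable interface you use locally. A minimal client sketch (the URL assumes the server above is running locally):

# client.py — call the deployed chain from any Python service
from langserve import RemoteRunnable

chain = RemoteRunnable("http://localhost:8000/chain")

print(chain.invoke({"question": "What is LCEL?"}))

# Streaming works over the wire too
for chunk in chain.stream({"question": "Explain RAG briefly"}):
    print(chunk, end="", flush=True)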
Docker Deployment
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY . .
# Expose port
EXPOSE 8000
# Run
CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]
Environment-Specific Configuration
# config.py
from pydantic_settings import BaseSettings
from langchain_openai import ChatOpenAI

class Settings(BaseSettings):
    openai_api_key: str
    environment: str = "development"

    class Config:
        env_file = ".env"
        env_file_encoding = "utf-8"

settings = Settings()

# Use in chains: a cheaper model outside production
llm = ChatOpenAI(
    model="gpt-4o" if settings.environment == "production" else "gpt-4o-mini",
    api_key=settings.openai_api_key
)
Best Practices
Security
# Never log sensitive information; redact it before it reaches prompts or logs
import re

def sanitize_input(text: str) -> str:
    patterns = [
        r'\b\d{3}-\d{2}-\d{4}\b',        # SSN
        r'\b\d{16}\b',                   # Credit card
        r'api_key=["\'][^"\']+["\']',    # API keys
    ]
    for pattern in patterns:
        text = re.sub(pattern, "[REDACTED]", text)
    return text
Cost Management
# Track token usage with a custom callback
from langchain_core.callbacks import BaseCallbackHandler

class TokenCountingCallback(BaseCallbackHandler):
    def __init__(self):
        self.total_tokens = 0

    def on_llm_end(self, response, **kwargs):
        # response is an LLMResult; OpenAI reports usage in llm_output
        usage = (response.llm_output or {}).get("token_usage", {})
        self.total_tokens += usage.get("total_tokens", 0)

# Set budgets
MAX_TOKENS_PER_REQUEST = 2000  # Budget control
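For OpenAI models specifically, the get_openai_callback context manager tallies tokens and estimated cost without writing a handler yourself:

from langchain_community.callbacks import get_openai_callback

with get_openai_callback() as cb:
    llm.invoke("Summarize LCEL in one sentence.")

print(f"Tokens used: {cb.total_tokens}, estimated cost: ${cb.total_cost:.4f}")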
Summary
LangChain in 2026 provides production-ready patterns for:
| Pattern | Use Case | Complexity |
|---|---|---|
| Chains | Simple LLM workflows | Low |
| RAG | Document Q&A | Medium |
| Agents | Autonomous task completion | High |
| Memory | Conversation persistence | Medium |
Start with chains, add retrieval for RAG, and introduce agents only when needed.
Building an AI application? I specialize in LangChain, RAG architectures, and production AI deployment. Let's discuss your project.
Related Content
- How to Integrate GPT-4o into Your React Application — Start here for basic GPT-4o setup before diving into LangChain
- AI Integration Services — Need help building production-ready AI features with LangChain and modern LLMs?
- Web Development Services — Full-stack web development with AI integration built-in
