Create a Custom AI Research Agent with LangGraph + Tavily in 2026
Introduction
Standard LLM-based research tools (like “deep research” modes in ChatGPT and Gemini) are black boxes. You don’t control the search strategy, the depth of analysis, or when to stop.
With LangGraph + Tavily, you can build a transparent, customizable research agent that:
- Plans its own search queries based on a user’s question
- Executes several web searches per iteration
- Analyzes results and identifies knowledge gaps
- Iterates — searches again to fill those gaps
- Produces a final structured report
Unlike the black-box approach, every step is logged, inspectable, and tunable.
Prerequisites
pip install langgraph langchain-openai langchain-community tavily-python python-dotenv
Get API keys:
export OPENAI_API_KEY="sk-..."
export TAVILY_API_KEY="tvly-..." # Free tier: 1000 searches/month
Architecture Overview
LangGraph models the agent as a state graph:
[User Query] → [Planner] → [Search] → [Analyzer] → [Need more info?]
    yes → back to [Planner]
    no  → [Final Report]
Each node is a function that transforms the shared state. The graph loops until the agent decides it has enough information.
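To make the control flow concrete before bringing in LangGraph, here is a minimal plain-Python sketch of the same loop. The stub nodes, the `enough` flag, and the `trace` list are illustrative assumptions, not part of the agent built below; the analyzer stub simply pretends it has enough information on the second pass.

```python
# Framework-free sketch of the plan -> search -> analyze loop.
# Each node takes the state and returns a partial update.

def planner(state: dict) -> dict:
    return {"iteration": state["iteration"] + 1,
            "trace": state["trace"] + ["planner"]}

def search(state: dict) -> dict:
    return {"trace": state["trace"] + ["search"]}

def analyzer(state: dict) -> dict:
    # Stub decision: "enough information" once two passes have run
    return {"enough": state["iteration"] >= 2,
            "trace": state["trace"] + ["analyzer"]}

def report(state: dict) -> dict:
    return {"trace": state["trace"] + ["report"]}

def run(state: dict) -> dict:
    while True:
        for node in (planner, search, analyzer):
            state = {**state, **node(state)}  # merge the partial update
        # Router: stop when satisfied or when the safety limit is hit
        if state["enough"] or state["iteration"] >= state["max_iterations"]:
            break
    return {**state, **report(state)}

final = run({"iteration": 0, "max_iterations": 3, "enough": False, "trace": []})
print(final["trace"])
```

LangGraph replaces the hand-written `while` loop with edges and a conditional edge, and takes care of merging each node's update into the shared state.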
Step 1: Define the State
# state.py
import operator
from typing import Annotated, List, TypedDict


class ResearchState(TypedDict):
    """The shared state across all graph nodes."""
    query: str                                            # Original user question
    plan: List[str]                                       # Search queries to execute
    search_results: Annotated[List[dict], operator.add]   # Accumulated results
    analyzed_findings: List[str]                          # Key findings so far
    gaps: List[str]                                       # Identified knowledge gaps
    iteration: int                                        # Current search iteration
    max_iterations: int                                   # Safety limit
    final_report: str                                     # Final output
    logs: Annotated[List[str], operator.add]              # Execution trace
    # Written by the analyzer and read by the router. They are declared here
    # because LangGraph only tracks keys that are part of the state schema.
    _has_sufficient: bool
    _confidence: str
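The `Annotated[..., operator.add]` fields are what let `search_results` and `logs` accumulate across iterations, while plain fields such as `plan` are overwritten by each update. The sketch below imitates that merge rule in plain Python to show the difference; `DemoState` and `apply_update` are hypothetical names for illustration, and this is not LangGraph's actual implementation.

```python
import operator
from typing import Annotated, List, TypedDict, get_type_hints


class DemoState(TypedDict):
    plan: List[str]                             # no reducer: last write wins
    logs: Annotated[List[str], operator.add]    # reducer: old + new


def apply_update(state: dict, update: dict, schema=DemoState) -> dict:
    """Merge a node's partial update into the state, honoring reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated types carry their extra arguments in __metadata__
        metadata = getattr(hints.get(key), "__metadata__", ())
        reducer = metadata[0] if metadata and callable(metadata[0]) else None
        if reducer is not None and key in state:
            merged[key] = reducer(state[key], value)
        else:
            merged[key] = value
    return merged


state = {"plan": ["first query"], "logs": ["[Planner] start"]}
state = apply_update(state, {"plan": ["second query"], "logs": ["[Search] done"]})
print(state)
```

This is why a node should return only the new log lines or new results for reducer-backed fields: returning the full accumulated list would duplicate entries.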
Step 2: Initialize the Agent
# agent.py
import os

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from tavily import TavilyClient

from state import ResearchState

# Initialize LLM and search client
llm = ChatOpenAI(model="gpt-4o", temperature=0.2)
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
Step 3: Planner Node
The planner generates targeted search queries based on the current state:
import json

PLANNER_PROMPT = """You are a research strategist. Given a research question and current findings,
generate 2-3 specific search queries that would help fill knowledge gaps.

Current question: {query}
Current findings: {findings}
Identified gaps: {gaps}

Return ONLY a JSON array of search query strings. Each query should be specific and keyword-optimized."""


def planner_node(state: ResearchState) -> dict:
    """Generate search queries based on research state."""
    findings_text = "\n".join(state.get("analyzed_findings", []))
    gaps_text = "\n".join(state.get("gaps", []))
    response = llm.invoke(PLANNER_PROMPT.format(
        query=state["query"],
        findings=findings_text or "No findings yet.",
        gaps=gaps_text or "None identified yet.",
    ))
    # Parse queries from the response, stripping an optional Markdown code fence
    raw = response.content.strip().removeprefix("```json").removesuffix("```").strip()
    try:
        queries = json.loads(raw)
    except json.JSONDecodeError:
        queries = None
    # Fallback: use the original question if the output is not a non-empty list
    if not isinstance(queries, list) or not queries:
        queries = [state["query"]]
    return {
        "plan": queries,
        "iteration": state.get("iteration", 0) + 1,
        "logs": [f"[Planner] Generated {len(queries)} queries: {queries}"],
    }
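Models do not always return a bare JSON array, even when asked to. The prefix and suffix stripping in the planner covers a Markdown fence, but not prose wrapped around the array. A slightly more tolerant parser could look like the sketch below; `parse_query_list` is a hypothetical helper, not part of LangChain or LangGraph.

```python
import json
import re
from typing import List


def parse_query_list(text: str, fallback: str) -> List[str]:
    """Extract a JSON array of strings from LLM output, or fall back."""
    # Grab everything from the first '[' to the last ']' in the text
    match = re.search(r"\[.*\]", text, flags=re.DOTALL)
    if match:
        try:
            parsed = json.loads(match.group(0))
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            # Keep only non-empty strings
            queries = [q.strip() for q in parsed if isinstance(q, str) and q.strip()]
            if queries:
                return queries
    return [fallback]


print(parse_query_list('```json\n["llm agents 2026", "tavily api"]\n```', "original question"))
print(parse_query_list("Sorry, I cannot help with that.", "original question"))
```

Inside `planner_node`, this would be called as `parse_query_list(response.content, state["query"])` in place of the `try`/`except` block.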
Step 4: Search Node
Execute each planned query and collect the results. The loop below runs the searches one after another:
def search_node(state: ResearchState) -> dict:
    """Execute all planned searches."""
    results = []
    logs = []
    for query in state["plan"]:
        try:
            response = tavily.search(
                query=query,
                search_depth="advanced",  # More thorough
                max_results=5,
                include_raw_content=True,
            )
            for r in response.get("results", []):
                results.append({
                    "query": query,
                    "title": r.get("title", ""),
                    "url": r.get("url", ""),
                    "content": r.get("content", ""),
                })
        except Exception as e:
            # Collect errors locally and return them; mutating state["logs"]
            # in place would bypass the reducer and may be lost
            logs.append(f"[Search] Error for '{query}': {e}")
    logs.append(f"[Search] Found {len(results)} results across {len(state['plan'])} queries")
    return {
        "search_results": results,
        "logs": logs,
    }
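The `search_node` loop above issues its requests one at a time. If parallel execution is wanted, one option is a thread pool, since each search is an I/O-bound HTTP call. The sketch below is written against a generic `search_fn` callable so it can run without network access; `run_searches_parallel` and `fake_search` are hypothetical names, and the assumption is that the real call would be something like `lambda q: tavily.search(query=q, search_depth="advanced", max_results=5)`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def run_searches_parallel(
    queries: List[str],
    search_fn: Callable[[str], dict],
    max_workers: int = 4,
) -> Tuple[List[dict], List[str]]:
    """Run search_fn for every query in a thread pool; return (results, logs)."""

    def one(query: str):
        # Catch errors per query so one failure does not cancel the rest
        try:
            return query, search_fn(query), None
        except Exception as exc:
            return query, None, exc

    results, logs = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields outcomes in the same order as the input queries
        for query, response, error in pool.map(one, queries):
            if error is not None:
                logs.append(f"[Search] Error for '{query}': {error}")
                continue
            for r in response.get("results", []):
                results.append({
                    "query": query,
                    "title": r.get("title", ""),
                    "url": r.get("url", ""),
                    "content": r.get("content", ""),
                })
    return results, logs


def fake_search(query: str) -> dict:
    """Stand-in for the real search client, used only for this demo."""
    if query == "bad":
        raise RuntimeError("boom")
    return {"results": [{"title": query.upper(),
                         "url": f"https://example.com/{query}",
                         "content": "text"}]}


results, logs = run_searches_parallel(["a", "bad", "b"], fake_search)
print(len(results), logs)
```

Keeping `max_workers` small is a reasonable default here, since the planner only produces two or three queries per iteration and the search API may rate-limit bursts.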
Step 5: Analyzer Node
Analyze search results and identify gaps:
import json

ANALYZER_PROMPT = """You are a research analyst. Given a research question and search results:
1. Extract key findings (facts, data points, quotes)
2. Identify what's still missing or unclear (gaps)
3. Assess the confidence level based on source quality and consistency

Research question: {query}
Search results:
{results}

Return a JSON object with:
- "findings": array of strings, each a concrete finding
- "gaps": array of strings, each a question that remains unanswered
- "confidence": "high" | "medium" | "low"
- "has_sufficient_info": boolean (true if you can write a good report)
- "sources_count": number of unique sources used"""


def analyzer_node(state: ResearchState) -> dict:
    """Analyze search results and determine if more research is needed."""
    # Format results for the LLM
    results_text = ""
    for i, r in enumerate(state["search_results"][-15:]):  # Last 15 results
        results_text += f"\n[{i+1}] {r['title']} ({r['url']})\n{r['content'][:500]}\n"
    response = llm.invoke(ANALYZER_PROMPT.format(
        query=state["query"],
        results=results_text,
    ))
    # Strip an optional Markdown code fence before parsing
    raw = response.content.strip().removeprefix("```json").removesuffix("```").strip()
    try:
        analysis = json.loads(raw)
    except json.JSONDecodeError:
        analysis = None
    if not isinstance(analysis, dict):
        # Fallback when the model does not return a JSON object
        analysis = {
            "findings": ["Analysis parsing failed - check raw results"],
            "gaps": [],
            "confidence": "low",
            "has_sufficient_info": False,
            "sources_count": 0,
        }
    return {
        "analyzed_findings": state.get("analyzed_findings", []) + analysis.get("findings", []),
        "gaps": analysis.get("gaps", []),
        "logs": [
            f"[Analyzer] Found {len(analysis.get('findings', []))} findings, "
            f"{len(analysis.get('gaps', []))} gaps, "
            f"confidence={analysis.get('confidence')}, "
            f"sufficient={analysis.get('has_sufficient_info')}"
        ],
        "_has_sufficient": analysis.get("has_sufficient_info", False),
        "_confidence": analysis.get("confidence", "low"),
    }
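Note: The _has_sufficient and _confidence fields are stored in state for the router; the final report does not use them. They need to be declared in ResearchState, since LangGraph only tracks keys that are part of the state schema.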
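Because the router acts directly on the analyzer's JSON, it can help to coerce that JSON into a predictable shape first. A model may return "HIGH" instead of "high", or the string "true" instead of a boolean. The sketch below shows one way to normalize it; `normalize_analysis` is a hypothetical helper, and treating anything unrecognized as low confidence is an assumption chosen so that the agent keeps researching when in doubt.

```python
from typing import Any, Dict, List

VALID_CONFIDENCE = {"high", "medium", "low"}


def _as_str_list(value: Any) -> List[str]:
    """Keep only non-empty strings; anything that is not a list becomes []."""
    if not isinstance(value, list):
        return []
    return [s.strip() for s in value if isinstance(s, str) and s.strip()]


def normalize_analysis(raw: Any) -> Dict[str, Any]:
    """Coerce the analyzer's parsed JSON into the shape the router expects."""
    if not isinstance(raw, dict):
        raw = {}
    confidence = str(raw.get("confidence", "low")).strip().lower()
    if confidence not in VALID_CONFIDENCE:
        confidence = "low"  # conservative default
    sufficient = raw.get("has_sufficient_info", False)
    if isinstance(sufficient, str):
        sufficient = sufficient.strip().lower() == "true"
    return {
        "findings": _as_str_list(raw.get("findings")),
        "gaps": _as_str_list(raw.get("gaps")),
        "confidence": confidence,
        "has_sufficient_info": bool(sufficient),
    }


print(normalize_analysis({"findings": ["a", 3, " "], "gaps": "none",
                          "confidence": "HIGH", "has_sufficient_info": "true"}))
```

In `analyzer_node`, the call would sit right after `json.loads`, so the rest of the function can rely on the keys and types being present.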
Step 6: Report Generator Node
When sufficient info is gathered, produce the final report:
REPORT_PROMPT = """You are a professional research report writer. Based on the following research,
write a comprehensive, well-structured report.

Research question: {query}
All findings:
{findings}
Number of search iterations: {iterations}
Total sources consulted: {sources}
Source URLs:
{source_urls}

Write a report with:
1. **Executive Summary** (2-3 sentences)
2. **Key Findings** (with data points and citations where available)
3. **Detailed Analysis** (structured by subtopics)
4. **Critical Assessment** (limitations, conflicting viewpoints, confidence level)
5. **Sources** (list referenced URLs, taken only from the Source URLs above)

Format as Markdown with proper headings.
Be objective - present facts and uncertainties, not hype.
Target 800-1200 words."""


def report_node(state: ResearchState) -> dict:
    """Generate the final research report."""
    findings_text = "\n".join(
        f"- {f}" for f in state.get("analyzed_findings", [])
    )
    # Unique source URLs in first-seen order, so the model can cite real links
    sources = list(dict.fromkeys(
        r["url"] for r in state.get("search_results", []) if r.get("url")
    ))
    report = llm.invoke(REPORT_PROMPT.format(
        query=state["query"],
        findings=findings_text,
        iterations=state.get("iteration", 0),
        sources=len(sources),
        source_urls="\n".join(f"- {url}" for url in sources),
    ))
    return {
        "final_report": report.content,
        "logs": [f"[Report] Generated final report ({len(report.content)} chars)"],
    }
Step 7: Router + Graph Construction
LangGraph's conditional edge is what makes the loop possible. The router inspects the state and decides whether to loop or finish:
def should_continue(state: ResearchState) -> str:
    """Decide whether to continue researching or generate the report."""
    # Safety: stop if we've done too many iterations
    if state.get("iteration", 0) >= state.get("max_iterations", 3):
        return "report"
    # Stop if we have sufficient info
    if state.get("_has_sufficient", False):
        return "report"
    # Stop if confidence is high enough
    if state.get("_confidence") == "high":
        return "report"
    # Otherwise, loop back to research
    return "research"


def build_research_graph():
    """Construct the research agent graph and return it compiled, ready to run."""
    workflow = StateGraph(ResearchState)

    # Add nodes
    workflow.add_node("planner", planner_node)
    workflow.add_node("search", search_node)
    workflow.add_node("analyzer", analyzer_node)
    workflow.add_node("report", report_node)

    # Set entry point
    workflow.set_entry_point("planner")

    # Add edges
    workflow.add_edge("planner", "search")
    workflow.add_edge("search", "analyzer")

    # Conditional: loop or finish
    workflow.add_conditional_edges(
        "analyzer",
        should_continue,
        {
            "research": "planner",  # Go back for more
            "report": "report",     # Generate final output
        },
    )
    workflow.add_edge("report", END)
    return workflow.compile()
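Since the router is a plain function of the state, its stopping rules can be checked without any API calls. The sketch below is a small table-driven check; it carries its own copy of the router logic so the block runs on its own, and `CASES` and `check_router` are hypothetical names for illustration.

```python
def should_continue(state: dict) -> str:
    """Copy of the router logic above, taking a plain dict."""
    if state.get("iteration", 0) >= state.get("max_iterations", 3):
        return "report"
    if state.get("_has_sufficient", False):
        return "report"
    if state.get("_confidence") == "high":
        return "report"
    return "research"


# (state, expected decision) pairs, one per stopping rule plus the loop case
CASES = [
    ({"iteration": 3, "max_iterations": 3}, "report"),                       # safety limit
    ({"iteration": 1, "max_iterations": 3, "_has_sufficient": True}, "report"),
    ({"iteration": 1, "max_iterations": 3, "_confidence": "high"}, "report"),
    ({"iteration": 1, "max_iterations": 3, "_confidence": "medium"}, "research"),
    ({}, "research"),                                                        # defaults
]


def check_router(router) -> list:
    """Return (state, expected, actual) for every case the router gets wrong."""
    return [(s, e, router(s)) for s, e in CASES if router(s) != e]


failures = check_router(should_continue)
print("router OK" if not failures else failures)
```

When the thresholds are tuned later, for example by requiring both sufficient info and high confidence, the table is the place to record the intended behavior first.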
Step 8: Run It
# run_research.py
from agent import build_research_graph

# Initialize the graph
graph = build_research_graph()

# Run research
result = graph.invoke({
    "query": "What are the latest advances in AI-powered code generation tools as of mid-2026?",
    "plan": [],
    "search_results": [],
    "analyzed_findings": [],
    "gaps": [],
    "iteration": 0,
    "max_iterations": 3,
    "final_report": "",
    "logs": [],
})

print(result["final_report"])
print("\n\n=== Execution Log ===")
for log in result["logs"]:
    print(f"  {log}")
Sample execution trace:
[Planner] Generated 3 queries: ['AI code generation tools 2026 advances',
'GitHub Copilot vs Cursor vs Claude Code 2026 comparison',
'latest AI coding assistants features review 2026']
[Search] Found 15 results across 3 queries
[Analyzer] Found 7 findings, 2 gaps, confidence=medium, sufficient=False
[Planner] Generated 2 queries: ['AI code generation agent mode capabilities 2026',
'open source AI code generation tools 2026']
[Search] Found 10 results across 2 queries
[Analyzer] Found 5 findings, 1 gaps, confidence=high, sufficient=True
[Report] Generated final report (2345 chars)
Step 9: CLI Interface with Streaming
Make it interactive:
# cli.py
from agent import build_research_graph


def main():
    graph = build_research_graph()
    print("🤖 AI Research Agent")
    print("====================")
    query = input("\nResearch question: ")
    print("\n🔍 Researching (this may take 30-60 seconds)...\n")

    final_report = ""
    # Stream node updates for real-time logs. Each event maps a node name
    # to the partial state update that node returned.
    for event in graph.stream({
        "query": query,
        "plan": [],
        "search_results": [],
        "analyzed_findings": [],
        "gaps": [],
        "iteration": 0,
        "max_iterations": 3,
        "final_report": "",
        "logs": [],
    }):
        for node, data in event.items():
            if not isinstance(data, dict):
                continue
            for log in data.get("logs", []):
                print(f"  {log}")
            # Capture the report from the stream instead of running
            # the whole graph a second time with invoke()
            if data.get("final_report"):
                final_report = data["final_report"]

    print("\n" + "=" * 50)
    print(final_report)


if __name__ == "__main__":
    main()
Customization Ideas
| Modification | Code Change | Effect |
|---|---|---|
| Add more search iterations | max_iterations: 5 | Deeper research, slower |
| Use basic Tavily search | search_depth="basic" | Faster but less thorough searches (roughly 2x speed, lower quality) |
| Switch to local LLM | Replace ChatOpenAI with ChatOllama | Zero API cost, slower |
| Add source quality filter | Filter results by domain (edu, gov, etc.) | Higher quality sources |
| Add summarization node | Condense long results before analysis | Better context window usage |
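As an illustration of the source quality filter row, the sketch below keeps only results whose hostname ends in a trusted suffix. `filter_by_domain` and the `TRUSTED_SUFFIXES` list are hypothetical, and which domains count as trustworthy is an assumption to adjust per topic. Tavily's search call also appears to accept an `include_domains` argument, which may be a simpler option when the allowed domains are known in advance.

```python
from typing import Iterable, List
from urllib.parse import urlparse

# Assumption: treat educational and government sites as higher quality
TRUSTED_SUFFIXES = (".edu", ".gov")


def filter_by_domain(results: List[dict],
                     suffixes: Iterable[str] = TRUSTED_SUFFIXES) -> List[dict]:
    """Keep results whose URL hostname ends with one of the given suffixes."""
    allowed = tuple(suffixes)
    kept = []
    for r in results:
        # hostname is None for empty or malformed URLs
        host = (urlparse(r.get("url", "")).hostname or "").lower()
        if host.endswith(allowed):
            kept.append(r)
    return kept


sample = [
    {"url": "https://www.nasa.gov/news"},
    {"url": "https://example.com/blog"},
    {"url": "https://mit.edu/research"},
    {"url": ""},
]
print([r["url"] for r in filter_by_domain(sample)])
```

In the agent, the filter would fit at the end of `search_node`, just before the results are returned, so that the analyzer only sees the filtered set.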
Comparison: LangGraph vs. Other Approaches
| Aspect | LangGraph Agent | Manual code | CrewAI |
|---|---|---|---|
| Loop control | Native graph | Custom logic | Fixed pipeline |
| State management | Built-in | Manual | Manual |
| Debugging | LangSmith tracing | Print statements | Limited |
| Code size | ~200 lines | ~400 lines | ~150 lines (steeper learning) |
Conclusion
You’ve built a transparent, multi-iteration research agent using LangGraph’s state graph and Tavily’s search API. Unlike ChatGPT’s black-box research mode, your agent logs every decision, search query, and analysis step. You control:
- When to stop researching (custom threshold logic)
- How deep to search (Tavily depth + number of iterations)
- What sources to trust (filtering)
- How the final report is structured (prompt templates)
The same plan, search, analyze loop carries over to larger workloads. The main difference is scale: add parallel search workers and a more sophisticated planner for enterprise use.
Next steps: Add a web UI with Gradio, or integrate with Slack so your team can start research with a /research command.