nanoDeepResearch

OpenAI's Deep Research is a really cool product. This project is inspired by ByteDance's DeerFlow project, an open-source Deep Research project. DeerFlow is using LangGraph to build the agentic workflow which abstracts away a lot of details, e.g. how the ReAct agent is working.

To make sure I understand how the DeerFlow's Deep Research agent works under the hood, I decided to build it from scratch without relying on any existing agentic framework, e.g. LangGraph.

Note that this project is currently pure backend without any frontend interface.

Recommended way of using this repo

This project is NOT for production use, since it lacks many production features, e.g. web UI, token streaming, async execution, etc.

But since it is build from scratch, it is easy to see the output of each step in the workflow, e.g.

for web search, what LLM suggests to search
for coding, what code the LLM generates
for planning, what research plan the LLM outputs
whatever step you are curious about

Prepare the API keys

We need to prepare the API keys for the following services and store them as environment variables:

OpenAI: OPENAI_API_KEY
Claude: ANTHROPIC_API_KEY
Tavily: TAVILY_API_KEY
Jina: JINA_API_KEY

Run the workflow

Go to the parent directory of this repo and run the following command:

python3 -m nanoDeepResearch.main \
--query "what is the area(land+water) ratio between the largest and smallest states in the US"

The example report can be found in example_reports/area_ratio_largest_smallest_state_in_us.md

ReAct Agent

The ReAct agent is a simple agent that uses a ReAct loop to reason and act. It is inspired by the ReAct paper: https://arxiv.org/abs/2210.03629

Given a task query, the ReAct agent will:

Reason about the task using the available tools
Act on the task using the available tools
Get the observation from the action results
Repeat the process until the task is completed

Note that all those steps are decided by the LLM without human intervention, really cool!

State Machine(Graph)

state_machine.py is the class for the whole Deep Research workflow.

Planner: the LLM of the planner agent will understand the user query and break it down into a list of task steps
Research Team: it will take the list of task steps and assign each step to either a researcher agent or a coder agent
Researcher: it is a ReAct agent and can use web search and crawler tools to solve the task
Coder: it is a ReAct agent and can use python to solve the task
Reporter: it will use the observations from the researcher and coder to generate the final report

Lint

make lint

Acknowledgement

Thanks to DeerFlow for open-sourcing their project and providing a lot of inspiration.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
agent		agent
crawler		crawler
example_reports		example_reports
llm		llm
prompt		prompt
state		state
state_machine		state_machine
tools		tools
Makefile		Makefile
README.md		README.md
__init__.py		__init__.py
main.py		main.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

nanoDeepResearch

Recommended way of using this repo

Prepare the API keys

Run the workflow

ReAct Agent

State Machine(Graph)

Lint

Acknowledgement

About

Uh oh!

Releases

Packages

Languages

wudi/nanoDeepResearch

Folders and files

Latest commit

History

Repository files navigation

nanoDeepResearch

Recommended way of using this repo

Prepare the API keys

Run the workflow

ReAct Agent

State Machine(Graph)

Lint

Acknowledgement

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages