DEV Community

How to Build a Research Assistant using Deep Agents

Anmol Baranwal on February 20, 2026

LangChain's Deep Agents provide a new way to build structured, multi-agent systems that can plan, delegate and reason across multiple steps.
Fliin

Really appreciated the thread isolation pattern inside the research tool. Running the internal Deep Agent in a separate thread to prevent callback propagation is such a clean way to avoid tool-call noise leaking into the frontend stream. Subtle detail, but super important for real-world DevX. 👏

Anmol Baranwal CopilotKit • Edited

Yep! When you actually build and stream these systems in production, you see how fast the event stream gets chaotic. Once tools start triggering subagents, debugging becomes painful. Isolating it keeps the mental model clean and the frontend predictable.
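A minimal sketch of the isolation idea, using only the standard library: run the inner agent in its own thread so none of its callbacks or tool-call events propagate into the caller's stream, and let only the final structured result cross the boundary. The `run_research` function here is a hypothetical stand-in for the real sub-agent invocation.

```python
import threading
import queue


def run_research(topic: str) -> dict:
    # Hypothetical stand-in for the inner Deep Agent run; the real
    # version would invoke the sub-agent graph and emit its own
    # tool-call events inside this thread only.
    return {"topic": topic, "summary": f"findings about {topic}"}


def isolated_research(topic: str, timeout: float = 60.0) -> dict:
    """Run the inner agent in a separate thread so its callbacks never
    leak into the outer event stream; return only the final result."""
    result_box: queue.Queue = queue.Queue(maxsize=1)

    def worker() -> None:
        result_box.put(run_research(topic))

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)
    # Only the final structured output crosses the thread boundary.
    return result_box.get_nowait()


print(isolated_research("deep agents"))
```

This is just the skeleton of the boundary; the real implementation also has to keep callback handlers from being inherited by the inner run.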

Guilherme Zaia

Deep Agents abstract the easy part (planning loops). The hard part? Debugging cascading failures when SubAgent B hallucinates because Agent A's file write hit filesystem latency. Where's your observability stack? Without distributed tracing across that LangGraph, production becomes a black box. Streaming state to UI is slick, but can you replay a failed run without re-executing 40 Tavily calls?

Anmol Baranwal CopilotKit

Great points. This post focuses on the execution pattern + UI streaming, not full production observability.

LangGraph's MemorySaver does provide checkpoint-level state snapshots so you can inspect where a run broke in something like LangSmith without blindly re-running everything. It's not distributed tracing, but it's not a total black box either.

On Tavily replay, you are right. There's no caching layer in this demo, so a mid-run failure would re-hit the API. A simple cache around do_internet_search would fix that.
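Along those lines, a minimal sketch of what that cache could look like, assuming `do_internet_search` takes a plain query string (the body here is a stub standing in for the actual Tavily call):

```python
import functools


@functools.lru_cache(maxsize=256)
def do_internet_search(query: str) -> str:
    # Stub standing in for the real Tavily call; with lru_cache, a
    # retried run re-uses prior results instead of re-hitting the API.
    return f"results for: {query}"


do_internet_search("deep agents")  # first call runs the search
do_internet_search("deep agents")  # repeat is served from the cache
print(do_internet_search.cache_info().hits)  # → 1
```

An in-process cache only survives for the life of the run; a replay-across-restarts story would need the results persisted (e.g. keyed by query in a small on-disk store) instead.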

One small clarification: the default Deep Agents "filesystem" is virtual and backed by LangGraph state, so there isn't actual disk latency involved (unless someone wires up a custom backend). In the default setup, the concern is state consistency across nested agents rather than disk I/O.

Matthew Hou

The thread isolation bit is the most interesting part to me — keeping the inner agent's tool-call noise out of the frontend stream sounds simple but I've seen that go wrong fast in multi-agent setups. Once you have agents calling tools that trigger other agents, the event stream becomes impossible to reason about without some kind of boundary.

Curious if you hit any edge cases where the research sub-agent needs to stream partial results back before it's fully done?

Anmol Baranwal CopilotKit

Yeah, once you let subagent events bleed into the outer stream, you lose the ability to reason about what's happening at the UI level.

On partial results: the current setup doesn't stream mid-research -- the sub-agent runs to completion inside the thread, then the result crosses the boundary as a single structured output. The research step shows up as a single tool call in the stream: you see it start and complete in the demo video, just not the internal searches it runs underneath.

The real-time feel comes from the main agent updating the plan and writing files as it goes.

Matthew Hou

That makes sense — treating the research step as a single structured output at the boundary is cleaner than trying to stream partial results. You avoid a whole class of ordering bugs that way. The tradeoff is latency (user waits for the full sub-agent run), but for research-type tasks that's probably fine since the user expects it to take a minute anyway. Thanks for the detailed explanation.

Rune Breinholt Andersen

This resonates a lot.

I’ve noticed the same shift while building AI-driven side projects, especially products where speed of creation is no longer the bottleneck.

The real bottleneck is clarity.

Deleting code is often the moment where the product actually improves:
less mental load,
fewer edge cases,
faster iteration.

AI makes adding trivial.
Taste makes removing valuable.

Curious - do you think this changes how we should teach junior developers?

Anmol Baranwal CopilotKit

I think it does. We probably over-index on "how to build" and under-index on "how to decide what not to build." Right now most tutorials teach juniors to build first, refactor later -- but if AI handles the building, that order breaks down.

The earlier skill to develop is judgment: what not to add, when something is done, what complexity is actually load-bearing. AI removes the friction of writing code but it doesn't remove the cost of maintaining it.

Harsh

Super clean implementation! 🔥 Finally someone solved the real challenge with Deep Agents - that black box problem. Live streaming every agent step to the UI is a game changer for debugging and user trust.

Anmol Baranwal CopilotKit

Yep, super important for users :)

Bonnie CopilotKit

This is a comprehensive and well written tutorial, Anmol.

Looking forward to building deep agents using LangGraph and CopilotKit.

Anmol Baranwal CopilotKit

Thanks Bonnie! The repo & demo are really good.

Gohar

Define the goal – Decide what tasks the assistant should handle (e.g., literature search, data summarization).

Eli Berman CopilotKit

This is literally a tutorial for building your own Manus, Claude Code, and Deep Research style applications! Really cool.

Anmol Baranwal CopilotKit

Great and practical way to build cool research agents 🙌 Really like the pattern of deep agents & how they handle context management.