langchain-ai · langsmith-fleet · May 29, 2026
diff --git a/src/langsmith/online-evaluations-multi-turn.mdx b/src/langsmith/online-evaluations-multi-turn.mdx
@@ -12,6 +12,18 @@ You can use multi-turn evaluations to measure:
 
 <Note> Running multi-turn online evals will auto-upgrade each trace within a thread to [extended data retention](/langsmith/administration-overview#data-retention-auto-upgrades). This upgrade will impact trace pricing, but ensures that traces meeting your evaluation criteria (typically those most valuable for analysis) are preserved for investigation. </Note>
 
+## How it works
+
+Multi-turn online evaluators follow this evaluation lifecycle:
+
+1. **Trace ingestion**: Each turn in a conversation is traced as a separate run and associated with a thread using a shared thread ID.
+2. **Idle time detection**: After the most recent trace in a thread is ingested, LangSmith waits for the configured idle time. If no new trace arrives in the thread before the idle time elapses, LangSmith treats the conversation as complete and ready for evaluation.
+3. **Message assembly**: LangSmith collects the `messages` from each trace in the thread and assembles them into a single conversation history. If each trace contains only the latest message, LangSmith stitches messages together across turns. If each trace contains the full history, LangSmith uses that directly.
+4. **LLM-as-a-judge evaluation**: The assembled conversation is passed to your configured LLM-as-a-judge prompt. The evaluator scores the full thread against your criteria, such as semantic intent, outcome, or trajectory.
+5. **Feedback recording**: The evaluator writes feedback to LangSmith using the feedback key you configured, associated with the thread.
+
+Because of this lifecycle, multi-turn evaluators run once per completed thread, not once per trace. For per-trace evaluation, use [run-level online evaluators](/langsmith/online-evaluations) instead.
+
 ## Prerequisites
 
 - Your tracing project must be using [threads](/langsmith/threads).
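
Illustrative note on the hunk above: the idle time detection and message assembly steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not LangSmith's implementation. The `Trace` dataclass, `is_thread_idle`, and `assemble_messages` are hypothetical names that do not come from the LangSmith SDK, and the prefix check used to tell a full-history trace from a latest-message-only trace is an assumed heuristic.

```python
from dataclasses import dataclass
from typing import Dict, List

Message = Dict[str, str]


@dataclass
class Trace:
    """Hypothetical stand-in for one traced turn; not a LangSmith SDK type."""
    thread_id: str
    end_time: float          # seconds since epoch
    messages: List[Message]  # messages recorded on this trace


def is_thread_idle(traces: List[Trace], now: float, idle_seconds: float) -> bool:
    """Return True once idle_seconds have passed since the most recent trace."""
    last_end = max(t.end_time for t in traces)
    return now - last_end >= idle_seconds


def assemble_messages(traces: List[Trace]) -> List[Message]:
    """Build one conversation history from the traces of a single thread.

    Assumption: a trace that carries the full history begins with the
    history assembled so far, so only its unseen tail is appended.
    Any other trace is treated as carrying only new messages.
    """
    history: List[Message] = []
    for trace in sorted(traces, key=lambda t: t.end_time):
        msgs = trace.messages
        if msgs[:len(history)] == history:
            # Full-history trace: append only the messages not seen yet.
            history.extend(msgs[len(history):])
        else:
            # Latest-messages-only trace: stitch onto the running history.
            history.extend(msgs)
    return history
```

Under these assumptions, a thread whose traces each carry only the newest messages and a thread whose traces each carry the full history both assemble to the same conversation, which is then evaluated once, after the idle check passes.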