---
name: email-thread-reducer
description: "Collapse a long email thread into a deduped timeline of decisions, open questions, and action items — for when you're added to a 40-message chain."
title: Email Thread Reducer
category: data-parsing
difficulty: intermediate
author: admin
icon: 🧵
input: text
output: structured-json
phase: transform
domain: communication
tags: email-processing,thread-summarization,deduplication,text-normalization,decision-extraction,action-items,structured-output,quote-stripping,timeline-generation,fuzzy-matching
best_for:
  - onboarding to long email discussions
  - meeting notes and decision tracking
  - project status synthesis
  - legal or compliance email reviews
---

## Description

Input: a raw email thread (mbox, eml, or pasted text). Output: a structured timeline with four sections: Context (one paragraph), Decisions Made (bulleted, with dates and attribution), Open Questions (unresolved), and Action Items (owner + due date). Quoted history and signature blocks are stripped automatically.
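As a rough illustration of the output shape, the four sections could be modeled like this. This is a minimal sketch: the class and field names (`Timeline`, `Decision`, `ActionItem`, `decided_by`, `due`) are assumptions made for the example, not a fixed schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Decision:
    text: str
    date: str        # ISO 8601 date string, e.g. "2024-03-05"
    decided_by: str  # attribution


@dataclass
class ActionItem:
    text: str
    owner: str = "unassigned"  # explicit marker instead of a missing owner
    due: str = "unassigned"    # explicit marker instead of a missing date


@dataclass
class Timeline:
    context: str  # one paragraph
    decisions: List[Decision] = field(default_factory=list)
    open_questions: List[str] = field(default_factory=list)
    action_items: List[ActionItem] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the timeline; keys follow field declaration order."""
        return json.dumps(asdict(self), indent=2)
```

Calling `Timeline(context="...").to_json()` yields a JSON object with the keys `context`, `decisions`, `open_questions`, and `action_items`, in that order.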

## Why it works

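When each reply quotes the messages before it, total thread size grows roughly quadratically with the number of messages, so naive summarization re-ingests the same content many times over. Deduplicating by normalized content first, then summarizing, keeps the input roughly linear in the number of messages and gives a cleaner output.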
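To put rough numbers on this, assume a simplified model in which every reply quotes the full thread so far and each new message adds the same amount of fresh text. Both function names and the 500-character figure are illustrative assumptions, not measurements.

```python
def naive_chars(n_messages: int, body_chars: int) -> int:
    """Total size when message k carries its own body plus k-1 quoted bodies."""
    return sum(k * body_chars for k in range(1, n_messages + 1))


def deduped_chars(n_messages: int, body_chars: int) -> int:
    """Total size once every body is kept exactly once."""
    return n_messages * body_chars


print(naive_chars(40, 500))    # 410000
print(deduped_chars(40, 500))  # 20000
```

Under this model a 40-message thread carries about 20 times more text than its unique content, since the naive total is `body_chars * n * (n + 1) / 2`.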

## How it works

1. Parse the thread and normalize each message (strip quoted history, signature blocks, tracking footers).
2. De-duplicate identical or near-identical content using hash + fuzzy match.
3. Order messages chronologically with speaker and timestamp.
4. Feed the deduped sequence to the LLM with a strict output schema (Context / Decisions / Open Questions / Action Items).
5. Post-check: every Action Item must name an owner and a date; otherwise mark 'unassigned' explicitly rather than dropping.
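The non-LLM steps above (normalize, de-duplicate, order, post-check) can be sketched with the standard library alone. Everything here is an assumption made for illustration: the `Message` type, the quote and signature heuristics, the 0.9 similarity threshold, and the `owner`/`due` key names. The sketch sorts before de-duplicating so that the earliest copy of a repeated message is the one kept, and it omits tracking-footer removal and the LLM call.

```python
import hashlib
import re
from dataclasses import dataclass
from datetime import datetime
from difflib import SequenceMatcher
from typing import List


@dataclass
class Message:  # hypothetical container for one parsed email
    sender: str
    sent_at: datetime
    body: str


# Heuristic for "On <date>, <name> wrote:" attribution lines (assumption).
QUOTE_HEADER = re.compile(r"^On .+ wrote:\s*$")


def strip_noise(body: str) -> str:
    """Drop quoted lines and attribution headers; stop at a signature delimiter."""
    kept = []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith(">") or QUOTE_HEADER.match(stripped):
            continue
        if line in ("--", "-- "):  # conventional signature separator
            break
        kept.append(stripped)
    return " ".join(part for part in kept if part)


def dedupe(messages: List[Message], threshold: float = 0.9) -> List[Message]:
    """Return cleaned messages in chronological order, without repeats."""
    seen_hashes = set()
    kept_keys: List[str] = []
    kept: List[Message] = []
    for msg in sorted(messages, key=lambda m: m.sent_at):
        text = strip_noise(msg.body)
        key = text.lower()
        if not key:
            continue  # nothing left once quotes and signature are removed
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate
        if any(SequenceMatcher(None, key, prev).ratio() >= threshold
               for prev in kept_keys):
            continue  # near duplicate
        seen_hashes.add(digest)
        kept_keys.append(key)
        kept.append(Message(msg.sender, msg.sent_at, text))
    return kept


def post_check(action_items: List[dict]) -> List[dict]:
    """Mark a missing owner or due date as 'unassigned' instead of dropping the item."""
    checked = []
    for item in action_items:
        fixed = dict(item)
        for name in ("owner", "due"):
            if not str(fixed.get(name) or "").strip():
                fixed[name] = "unassigned"
        checked.append(fixed)
    return checked
```

The pairwise fuzzy comparison is quadratic in the number of kept messages, which is acceptable for a thread of a few dozen emails. Real threads will also need client-specific quote patterns (for example, forwarded-message banners) that this sketch does not cover.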
