Unearth — expert discovery interviews at scale.
Conducts multi-session discovery interviews with domain experts and extracts structured product requirements.
The problem Unearth exists to solve
Product teams building for specialists — accountants, lawyers, clinicians, field engineers — need to know how the work actually runs before they ship anything. That knowledge lives in people's hands, compressed into habits nobody verbalises until prodded correctly. Conventional discovery needs a skilled interviewer, scales linearly with their calendar, and produces a pile of notes nobody aggregates. Experts describe solutions ("I need a better spreadsheet template") when the thing worth capturing is the underlying job ("I spend four hours each month categorising two hundred transactions"). Most interviewers accept the solution and build the wrong product. The judgement required to unwind solution-talk into a problem statement stays trapped inside the handful of people who can do it, and it leaves the room with them.
What Unearth does differently
Unearth runs the interview. A forensic agent reconstructs specific past events with sensory detail, redirects solution-talk into jobs-to-be-done, and walks the expert through Ulwick's universal job map phase by phase. A dual-process architecture separates conversational flow (System 1) from methodology steering (System 2, which tracks coverage gaps, detects fluff, and decides when to probe, reflect, or move on).
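The System 1 / System 2 split can be sketched as a small decision step. This is an illustrative reduction, not Unearth's implementation: `CoverageMap`, `is_fluff`, and `decide_next_move` are hypothetical names, and the fluff check here is a deliberately crude keyword stand-in for whatever detection the real harness uses.

```python
# Hypothetical sketch of the System 2 steering step. System 2 never
# speaks; it only picks a move, which System 1 then phrases as a turn.
from dataclasses import dataclass, field

@dataclass
class CoverageMap:
    # The eight phases of Ulwick's universal job map, each marked
    # covered or not as the interview progresses.
    phases: dict = field(default_factory=lambda: {
        p: False for p in (
            "define", "locate", "prepare", "confirm",
            "execute", "monitor", "modify", "conclude",
        )
    })

    def gaps(self):
        return [p for p, done in self.phases.items() if not done]

# Toy fluff markers: generalities instead of a specific past event.
FLUFF_MARKERS = ("usually", "generally", "in most cases", "typically")

def is_fluff(answer: str) -> bool:
    return any(m in answer.lower() for m in FLUFF_MARKERS)

def decide_next_move(answer: str, coverage: CoverageMap) -> str:
    if is_fluff(answer):
        return "probe"    # pull the expert back to a concrete past instance
    if coverage.gaps():
        return "advance"  # steer to the next uncovered job-map phase
    return "reflect"      # render a claim back for confirmation
```

The point of the split is that the methodology rules stay in deterministic code while the model only has to produce the next utterance.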
Every claim the agent forms is rendered back into the chat as a visual — a workflow diagram, four-forces quadrant, pain severity map, outcome statement card — and the expert responds with Agree, Not Quite, or Wrong. Agree advances coverage. Not Quite captures the delta between the agent's model and the expert's reality; that delta is the training signal. Wrong invalidates the claim outright. Sessions stitch together through a living discovery state, so the next conversation picks up at the unresolved edges of the last. When multiple experts describe the same workflow under different terminology — "rideshare drivers", "Uber clients", "gig workers" — semantic matching links them into one pattern and emits a confidence-scored specification to the consuming project via API and webhook.
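The confirm loop above can be pictured as a tiny state machine over feedback events. The class and field names below are assumptions for illustration, not Unearth's actual schema; only the three verdicts and their effects come from the description above.

```python
# Illustrative data model for the Agree / Not Quite / Wrong loop.
# Names (Verdict, ClaimFeedback, apply_feedback) are hypothetical.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Verdict(Enum):
    AGREE = "agree"
    NOT_QUITE = "not_quite"
    WRONG = "wrong"

@dataclass
class ClaimFeedback:
    claim_id: str
    verdict: Verdict
    # Only "Not Quite" carries a delta: the expert's correction of the
    # agent's model. That delta is the training signal.
    delta: Optional[str] = None

def apply_feedback(fb: ClaimFeedback, state: dict) -> dict:
    if fb.verdict is Verdict.AGREE:
        state["confirmed"].append(fb.claim_id)        # advances coverage
    elif fb.verdict is Verdict.NOT_QUITE:
        state["deltas"].append((fb.claim_id, fb.delta))
    else:
        state["invalidated"].append(fb.claim_id)      # claim discarded
    return state
```

Keeping the three outcomes as distinct event streams is what lets the "Not Quite" deltas be mined separately later.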
Why the architecture matters here
- feedback as interaction — The unit of progress is not a message but a confirmed visual. Agree / Not Quite / Wrong turns the chat surface into a labelling pipeline, and the "Not Quite" edits are the highest-value record because they expose precisely where the agent's model was wrong.
- harness over model — The Claude call is wrapped in a dual-process loop with phase state, coverage maps, fluff detection, and a catalogue of forbidden moves (no "Why?", no hypotheticals, no stacked questions, no filling silence). The methodology lives in the harness; the model is the voice.
- projections over features — Cross-expert aggregation is a projection over extraction events. Workflows, pains, and outcome statements from separate interviewees collapse into shared patterns via pgvector similarity, and the generated spec is a read model over confirmed extractions rather than a hand-authored artefact.
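The cross-expert matching idea can be shown in miniature. In Unearth this runs as pgvector similarity inside Postgres with a real embedding model; the pure-Python version below substitutes a toy character-bigram `embed` so the greedy clustering is visible end to end. Everything here (`embed`, `link_patterns`, the threshold) is an assumed sketch, not the production pipeline.

```python
# Pure-Python sketch of linking differently-worded labels into one
# pattern by vector similarity. embed() is a toy stand-in for a real
# embedding model; pgvector would do the cosine search in SQL.
import math

def embed(text: str) -> list:
    # Toy character-bigram hashing into 64 dimensions, illustration only.
    dims = 64
    v = [0.0] * dims
    t = text.lower()
    for a, b in zip(t, t[1:]):
        v[(ord(a) * 31 + ord(b)) % dims] += 1.0
    return v

def cosine(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def link_patterns(labels, threshold=0.4):
    # Greedy clustering: each label joins the first cluster whose
    # representative is similar enough, else starts a new cluster.
    clusters = []
    for label in labels:
        for cluster in clusters:
            if cosine(embed(label), embed(cluster[0])) >= threshold:
                cluster.append(label)
                break
        else:
            clusters.append([label])
    return clusters
```

With a real embedding model, "rideshare drivers" and "gig workers" land near each other semantically; the bigram toy above only links lexically similar labels, which is why it is a sketch of the shape of the computation rather than of its quality.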