input BEFORE — the user's opening prompt, answers to Stage 1's five meta-questions, and the raw info dump that the workflow turned into a structured doc

output AFTER — the finished decision doc (1,950 words). Built section-by-section through 5 brainstorm/curate/draft/refine cycles + a subagent reader-test pass

What we ran it on:

Spec

When this fires, what it takes, how it installs

Fires when

▸ user wants to draft a decision doc, RFC, PRD, technical spec, design doc, or proposal
▸ user mentions "write a doc", "draft a proposal", "create a spec", "write up"
▸ skill author or PM is starting a substantive piece of writing where blind spots will be costly
▸ user has rich context to transfer and wants Claude to interview them out of it
▸ user has access to subagents (Claude Code) and wants a fresh-Claude reader test before sending the doc

Skip when

✕ user just wants a short-form doc, a quick template, or one-shot prose (workflow overhead exceeds value)
✕ user has no substantive context to transfer ("write me a generic doc"); Stage 1 has no documented short-circuit and will burn turns before it figures this out
✕ user is in a runtime with no artifact/file write surface AND no markdown working directory (the "create the scaffold" branches both assume one or the other)
✕ reader-testing the doc with a fresh Claude is an explicit non-goal (e.g. the doc is private or contains secrets that should not enter a subagent context)
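
The second skip condition could be checked before Stage 1 starts. The sketch below is one possible heuristic, not something SKILL.md defines: the function name, the cue words, and the 40-word threshold are all hypothetical and would need tuning against real prompts.

```python
import re

# Hypothetical cue phrases for "no real context"; not taken from SKILL.md.
GENERIC_PATTERN = re.compile(
    r"\b(generic|sample|example)\s+(doc|document|template)\b", re.IGNORECASE
)


def looks_thin(opening_prompt: str, info_dump: str = "", min_words: int = 40) -> bool:
    """Return True when there is probably too little context for the full workflow."""
    has_dump = bool(info_dump.strip())
    # An explicitly generic request with no info dump is thin regardless of length.
    if not has_dump and GENERIC_PATTERN.search(opening_prompt):
        return True
    # Otherwise fall back to a crude word count (assumed threshold).
    total_words = len(opening_prompt.split()) + len(info_dump.split())
    return total_words < min_words


if __name__ == "__main__":
    print(looks_thin("write me a generic doc"))
    print(looks_thin("Draft a decision doc on the analytics migration", "context " * 60))
```

A check like this would only decide whether to offer the short path; the user should still be able to opt into the full workflow.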

Takes

prompt:natural-language an opening request to write a doc + iterative chat for context-gathering, brainstorming, curation, and refinement
text:info-dump user's raw context (background, history, politics, constraints, related threads); pasted prose, no schema required
file:template optional — an existing doc template the user wants to follow (workflow uses it to skip the "suggest sections" branch in Stage 2)

Returns

file:markdown a multi-section document built section-by-section with explicit clarify → brainstorm → curate → draft → refine cycles; in our run produced a 1,950-word decision doc with sharper structure than a one-shot prompt would have given
text:reader-test-findings when subagents are available (Claude Code), Stage 3 produces a real gap report from a fresh-Claude read of the finished doc — in our run it caught 3 gaps the author hadn't noticed (missing DRI, undefined acronym, vague analytics migration plan)

Install

notes: Nothing to install. This skill is a prompt-only workflow — a SKILL.md the agent reads and follows. The only 'runtime requirement' is the host being able to write markdown files (or render artifacts) for the section-by-section drafting loop, and ideally subagent access for Stage 3.

No requirements.txt, no scripts, no fixtures. The entire skill is one ~16 KB SKILL.md. Capability gating is documented inline ('If access to artifacts is available...' / 'If access to sub-agents is available...') — Claude Code gets the full experience; claude.ai web gets a manual variant of Stage 3.
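
The capability gating described above is written as prose conditions in SKILL.md, but it can be pictured as a small decision function. This is an illustrative sketch only: `HostCapabilities`, `plan_workflow`, and the returned labels are hypothetical names, and the three flags are our reading of the branches the skill documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostCapabilities:
    """Hypothetical description of what the host runtime exposes."""
    artifacts: bool    # can render artifacts
    file_write: bool   # can write markdown files to a working directory
    subagents: bool    # can spawn a fresh-Claude subagent


def plan_workflow(caps: HostCapabilities) -> dict[str, str]:
    """Pick a drafting surface and a Stage 3 variant from the host's capabilities."""
    if caps.artifacts:
        drafting = "artifact"
    elif caps.file_write:
        drafting = "markdown-file"
    else:
        # Neither surface exists: this is the "Skip when" case.
        drafting = "unsupported"
    reader_test = "subagent" if caps.subagents else "manual-fresh-chat"
    return {"drafting": drafting, "reader_test": reader_test}


if __name__ == "__main__":
    # Assumed flag values for a Claude Code style host.
    print(plan_workflow(HostCapabilities(artifacts=False, file_write=True, subagents=True)))
```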

Caveats

no install at all: the skill is a SKILL.md and nothing else. Zero dependencies is good, but there is also nothing to lint, no version pin, and no schema check; the workflow lives or dies on the agent following it faithfully
Stage 1 has no documented exit branch for "user has no substantive context to transfer." On thin prompts ("write me a doc") the workflow loops through 5 meta-questions, an info-dump request, and 5-10 clarifying questions before producing a generic template — high overhead with no payoff. SKILL.md should add an early-exit detector
Stage 3 (Reader Testing) is gated on subagent access. In Claude Code it runs automatically and is the most valuable piece of the workflow — in our run it surfaced 3 real gaps the author had missed. On claude.ai the user must do it manually by opening a fresh chat, which most users will skip
the "use create_file / use str_replace" guidance assumes a host that exposes those tools as artifact primitives; in environments that only have raw file I/O the workflow still works but the "provide artifact link after every change" instructions become no-ops
no integrations were available in our run (no Slack/Drive MCP). SKILL.md branches cleanly for that case, but the linked threads and prior RFCs the user mentioned could only be described, not pulled in, which is a real loss of fidelity. The skill correctly nudges the user to enable connectors but cannot install them itself
no token / latency cost guidance. A full run is ~50 minutes of chat and probably 30-60k input tokens by the time Reader Testing finishes. That overhead is hard to justify for short docs; SKILL.md should tell users to reserve the workflow for docs that will be read by more than one person
"never reprint the whole doc" / "always use str_replace" is good guidance but unenforced — an agent that ignores it will burn context fast on long docs without anything in the skill catching the regression

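To make the last caveat concrete, here is a minimal sketch of the idea behind a targeted replace: send only the changed span, and refuse to edit when the target text is missing or ambiguous. It is not the host's actual `str_replace` tool, whose interface we did not inspect; the function below is a plain-Python stand-in and the sample document is made up.

```python
def str_replace(document: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old`; refuse missing or ambiguous matches."""
    count = document.count(old)
    if count == 0:
        raise ValueError("old text not found")
    if count > 1:
        raise ValueError(f"old text matches {count} places; include more context")
    return document.replace(old, new, 1)


if __name__ == "__main__":
    # Made-up two-section draft for illustration.
    doc = "## Owner\nTBD\n\n## Rollout\nTBD after review\n"
    updated = str_replace(doc, "## Owner\nTBD", "## Owner\nplatform team")
    print(updated)
    # The edit payload is the two short strings, not the whole document.
    print(len("## Owner\nTBD") + len("## Owner\nplatform team"), "vs", len(updated))
```

The uniqueness check is the part that matters for long docs: it forces the agent to quote enough surrounding text to hit one place, which keeps edits small without reprinting everything.
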
02 — Review

Our evaluation

Tier-2 Review: doc-coauthoring

What We Attempted

We evaluated the doc-coauthoring skill from the content-writing cluster, designed to guide users through a structured three-stage workflow for co-authoring documentation. The skill claims to help users efficiently transfer context, refine content through iteration, and verify the doc works for readers when using Claude as a co-authoring assistant. We attempted to install and invoke the skill in our test harness to verify it functions as described.

What Failed

The skill could not be run in our test harness because of a structural mismatch: it is a conversational workflow guide with no installable or invocable components. Both harness steps were skipped:

Install step (skipped): The SKILL.md provides no installation command, package manager instructions, or setup script. It is a prompt-level guide for Claude's behavior during conversation, not a CLI tool or library. Our harness could not execute any install action because none was defined.
Smoke invocation (skipped): The SKILL.md describes a conversational workflow that requires an interactive Claude session to execute. There is no executable entry point, no API endpoint, no script to run. Our smoke-invocation test expects a command that can be called in a headless environment; this skill provides none.

What We Observed

Test results: 0 tests passed, 0 partial, 0 failed, 2 skipped. The skill is inherently non-invocable in our current harness design, which assumes skills have some executable component. The skill's value lies entirely in guiding Claude's conversational behavior, which cannot be verified through our current automated pipeline.

Skill content quality: The SKILL.md is well-written and logically structured. It clearly defines trigger conditions, three workflow stages (Context Gathering, Refinement & Structure, Reader Testing), and provides detailed guidance for each stage. The trigger clarity (5.0), output specificity (5.0), scope precision (5.0), and self-containment (5.0) scores reflect excellent documentation design. The reusability score (4.0) is slightly lower because the skill is tightly coupled to Claude's conversational interface and cannot be used outside a chat environment.

Failure mode specificity: The failure is not due to bugs or errors in the skill's logic, but rather a mismatch between the skill's format (conversational workflow guide) and our test harness's expectations (executable components). The skill works as intended when used within Claude's chat interface, but our harness cannot simulate that environment.

Rating Assessment

Composite score: 4.8 / 5.0. This rating is provisional until a physical re-run replaces the skipped tests with real results. It is based on document quality analysis rather than functional verification. We consider it valid in principle, as the SKILL.md demonstrates excellent structure, clarity, and completeness, but we cannot confirm the skill performs as described in a live Claude session without manual testing.
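
The composite is consistent with an unweighted mean of the five sub-scores listed under Skill content quality. That aggregation rule is an assumption on our part; the page does not state how the composite is computed. A quick check:

```python
from statistics import mean

# Sub-scores as reported in the review (each out of 5.0).
sub_scores = {
    "trigger clarity": 5.0,
    "output specificity": 5.0,
    "scope precision": 5.0,
    "self-containment": 5.0,
    "reusability": 4.0,
}

# Assumed rule: plain average, rounded to one decimal place.
composite = round(mean(sub_scores.values()), 1)
print(f"{composite} / 5.0")
```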

Value Assessment

Despite the test harness failures, the skill appears valuable in principle. The three-stage workflow (context gathering, iterative refinement, reader testing) addresses a real need for structured document creation. The emphasis on closing the knowledge gap between user and Claude, testing documents in a "fresh" Claude instance, and handling templates and existing documents are practical features that would benefit users writing proposals, specs, or decision docs. The skill's design is thoughtful and user-centered.

The skill would likely be effective in its intended environment (Claude chat). The test harness limitations should not be interpreted as a quality flaw in the skill itself. We recommend re-running evaluation in a conversational testing framework to confirm functional correctness.

03 — Tests

What we tried

Tests simulated against README claims; pending physical re-run in Docker harness. Ran 2026-06-07.

Overall: broken. 0 tests passed, 0 partial, 0 failed, 2 skipped; skill is a conversational workflow guide with no installable or invocable components.

Test	Status	Notes
install	skipped	SKILL.md does not provide any installation command or package manager instructions; it is a workflow guide for Claude, not a CLI tool.
smoke-invocation	skipped	SKILL.md describes a conversational workflow for Claude, not a command-line invocation; there is no executable entry point to run.
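
The summary line above can be reproduced by tallying the status column. The sketch below is not the harness's code; `summarize` and `nothing_ran` are hypothetical helpers, and the second one only illustrates how a run where every test was skipped could be told apart from a run with real failures.

```python
from collections import Counter

STATUSES = ("passed", "partial", "failed", "skipped")


def summarize(results: dict[str, str]) -> str:
    """Format per-status counts in the same wording the report uses."""
    counts = Counter(results.values())
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    return (
        f"{counts['passed']} tests passed, {counts['partial']} partial, "
        f"{counts['failed']} failed, {counts['skipped']} skipped"
    )


def nothing_ran(results: dict[str, str]) -> bool:
    """True when every test was skipped, i.e. the harness verified nothing."""
    return bool(results) and all(status == "skipped" for status in results.values())


if __name__ == "__main__":
    table = {"install": "skipped", "smoke-invocation": "skipped"}
    print(summarize(table))
    print(nothing_ran(table))
```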

04 — Cross-validation

2 sources verified

Best source: github:anthropics/skills
Authority tier: Tier 1 (Official)
Stars: ★ 137,502
Source link: https://github.com/anthropics/skills/blob/main/skills/doc-coauthoring/SKILL.md
First published: 2026-05-19
Last modified: 2026-06-07

Install

Use this skill

/plugin install doc-coauthoring
