Document Generation  ·  Official

xlsx

Use this skill any time a spreadsheet file is the primary input or output.

  • libreoffice

What we ran it on:

Input: 15-row Q1-Q2 sales dump (raw_sales) + 3 model assumptions + empty Q1_summary stub
Output: same workbook with a revenue formula column, a SUMIFS-driven Q1 P&L sheet, and color coding (blue input / black formula / green cross-sheet / yellow assumption) per SKILL.md. Cached values are None until LibreOffice recalculates the workbook.

Composite: 3.9 (Craft 4.8 · Adoption 3.3)

How we got there

Craft · D1–D5

D1 · Trigger clarity 5.0
D2 · Output specificity 5.0
D3 · Scope precision 4.5
D4 · Self-containment 5.0
D5 · Reusability 4.0

Adoption · A1–A5

A1 · Maintenance 2.5
A2 · Documentation 3.8
A3 · License 2.5
A4 · Adoption 5.0
A5 · Authorship 2.0

Spec

When this fires, what it takes, how it installs

Fires when

  • user has an .xlsx/.xlsm/.csv/.tsv file they want read, edited, or fixed
  • user wants to build a financial model with industry-standard color coding (blue input, black formula, green cross-sheet, red external, yellow assumption)
  • user wants to convert messy tabular data into a proper spreadsheet
  • user wants formulas computed (SUMIFS, SUMPRODUCT, growth rates) instead of Python-side hardcoded values
  • user references a spreadsheet by name or path in passing and wants something done to it
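
As a rough illustration of the formula-first, color-coded style these triggers describe, the sketch below builds a tiny workbook with openpyxl. The sample rows and the hex colors are assumptions made for illustration (SKILL.md may specify different values); only the sheet names raw_sales and Q1_summary come from the test input described above.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

# Hypothetical color choices; SKILL.md may specify different hex values.
BLUE_INPUT = Font(color="0000FF")     # hardcoded inputs
BLACK_FORMULA = Font(color="000000")  # same-sheet formulas
GREEN_XSHEET = Font(color="008000")   # cross-sheet references

wb = Workbook()
raw = wb.active
raw.title = "raw_sales"
raw.append(["quarter", "units", "unit_price", "revenue"])

sample_rows = [("Q1", 10, 2.5), ("Q1", 4, 3.0), ("Q2", 7, 2.0)]  # made-up data
for quarter, units, price in sample_rows:
    raw.append([quarter, units, price])
    r = raw.max_row
    raw.cell(row=r, column=2).font = BLUE_INPUT
    raw.cell(row=r, column=3).font = BLUE_INPUT
    # Write the formula as text instead of computing units * price in Python.
    revenue = raw.cell(row=r, column=4, value=f"=B{r}*C{r}")
    revenue.font = BLACK_FORMULA

summary = wb.create_sheet("Q1_summary")
summary["A1"] = "Q1 revenue"
# Cross-sheet SUMIFS: total the revenue column where quarter equals "Q1".
summary["B1"] = '=SUMIFS(raw_sales!D:D,raw_sales!A:A,"Q1")'
summary["B1"].font = GREEN_XSHEET
```

Calling `wb.save(path)` on this workbook stores the formulas as text; the numbers only appear once a spreadsheet engine recalculates the file.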

Skip when

  • user wants the deliverable as a Word doc, HTML report, or Google Sheets API call
  • user is on a system without LibreOffice and cannot install it (recalc.py is unusable, formulas stay un-evaluated)
  • user needs human-text commentary containing strings like "#N/A" or "#DIV/0!" in cells (recalc.py's scanner will misreport them as errors)
  • user expects working examples to ship as runnable .py files (only inline snippets ship; no requirements.txt)

Takes

  • file:xlsx any .xlsx/.xlsm; .csv/.tsv supported via pandas
  • structured-data:pandas DataFrame for fresh-build workflows, an in-memory DataFrame is the typical seed

Returns

  • file:xlsx formulas, formatting, multi-sheet structure preserved by openpyxl write; cached values remain None until recalc.py succeeds
  • structured-data:JSON error report (from scripts/recalc.py) only produced when LibreOffice is available; scanner uses substring match so plain-text cells containing error tokens are false-flagged
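
The "cached values remain None" behavior can be reproduced in a few lines. This is a minimal sketch that round-trips an in-memory workbook; the cell layout is invented for illustration.

```python
import io

from openpyxl import Workbook, load_workbook

wb = Workbook()
ws = wb.active
ws["A1"] = 2
ws["A2"] = 3
ws["A3"] = "=SUM(A1:A2)"  # stored as formula text, never evaluated by openpyxl

buf = io.BytesIO()
wb.save(buf)

buf.seek(0)
formula_text = load_workbook(buf).active["A3"].value  # the formula string

buf.seek(0)
# data_only=True asks for the cached result, which no engine has written yet.
cached_value = load_workbook(buf, data_only=True).active["A3"].value

print(formula_text, cached_value)
```

A read-only consumer that opens the file with `data_only=True` before any recalculation therefore sees `None` in `A3`, even though the formula is present in the file.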

Install

pip install openpyxl pandas
  • macos: brew install --cask libreoffice (REQUIRED for scripts/recalc.py to evaluate any formula)
  • linux: apt-get install libreoffice (or equivalent — required for recalc.py)

No requirements.txt or pyproject.toml ships with the skill folder; dependencies must be read from the SKILL.md prose. recalc.py imports from a sibling 'office' package (scripts/office/) and must be run from the scripts/ parent directory or with PYTHONPATH set. The script does not handle FileNotFoundError when soffice is missing: it crashes with a raw traceback instead of returning the documented JSON contract.
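
A caller can guard against the missing-soffice crash with a preflight check before invoking recalc.py. The sketch below is an assumption-laden illustration: `soffice_preflight` is a hypothetical helper, and the error dictionary is an invented shape, not the JSON contract recalc.py documents.

```python
import json
import shutil


def soffice_preflight(which=shutil.which):
    """Return None if soffice is on PATH, else a JSON-ready error dict.

    `which` is injectable so the check can be exercised without LibreOffice.
    """
    if which("soffice"):
        return None
    # Hypothetical error shape; the real recalc.py contract may differ.
    return {"status": "error", "error": "soffice not found on PATH"}


problem = soffice_preflight()
if problem is not None:
    # Emit machine-readable output instead of letting a traceback escape.
    print(json.dumps(problem))
```

Running this check first lets a JSON-parsing caller handle the missing dependency as data rather than as an unparseable traceback.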

Caveats

  • scripts/recalc.py hard-crashes with an unhandled FileNotFoundError if soffice is missing — it does NOT emit the documented JSON error contract, breaking any caller that parses stdout as JSON
  • recalc.py's error scanner uses a substring match (`if err in cell.value`), so any cell whose text mentions "#N/A", "#DIV/0!", "#REF!", etc. as plain commentary is wrongly flagged as a formula error, sending the SKILL.md verify-and-fix loop after errors that do not exist
  • openpyxl-written formulas have no cached value, so a downstream consumer that opens the file with `data_only=True` sees `None` for every formula cell until LibreOffice has touched the file at least once — the workbook looks "empty" in any read-only pipeline
  • opening with data_only=True and saving silently strips all formulas (SKILL.md mentions this in Best Practices but it is one bullet buried far from the workflow section — agents reading top-down often miss it)
  • no requirements.txt or runnable example files ship; the SKILL.md inline snippets are copy-paste, not importable, and the scripts/ folder contains only recalc.py + office/ helpers (no end-to-end build example)
  • recalc.py relies on installing a LibreOffice user-level macro on first run — sandboxed CI without ~/Library or ~/.config write access will silently fail at setup_libreoffice_macro()
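
The substring-match caveat above is easy to demonstrate. The sketch below contrasts the described `if err in cell.value` check with an exact-match alternative; both function names are hypothetical, the sample cell values are invented, and the exact-match version is one possible fix rather than anything the skill ships.

```python
# Standard Excel error literals.
ERROR_TOKENS = ("#VALUE!", "#DIV/0!", "#REF!", "#NAME?", "#NULL!", "#NUM!", "#N/A")


def substring_scan(value):
    """Mimics the substring check: flags any text that contains a token."""
    return isinstance(value, str) and any(err in value for err in ERROR_TOKENS)


def exact_scan(value):
    """Flags a cell only when its whole value is an error token."""
    return isinstance(value, str) and value.strip() in ERROR_TOKENS


cells = ["#DIV/0!", "Returns #N/A when the lookup misses", 42, "#REF!"]
substring_hits = [v for v in cells if substring_scan(v)]
exact_hits = [v for v in cells if exact_scan(v)]

print(substring_hits)
print(exact_hits)
```

The commentary cell appears only in `substring_hits`, which is the false positive the caveat describes.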

02 — Review

Our evaluation


Our take

The xlsx skill is what you reach for when an agent task touches a spreadsheet. It reads, writes, transforms, and creates .xlsx files (Excel 2007+) with workbook-level fidelity — multiple sheets, formulas, named ranges, basic formatting. Like pdf, it benefits from being part of the official Anthropic catalog: scoped narrowly, documented clearly, and unsurprising in behavior.

What it does well

Sheet-level operations are the strongest suit. The skill can extract data from named ranges, copy formatted regions between workbooks, apply formulas, and write back without disturbing existing styles. For data-pipeline tasks — pull from a source, transform, push into a templated workbook — it's the right tool.

Formula support relies on openpyxl, which writes formulas into cells as strings and never evaluates them. Results appear only once a spreadsheet engine recalculates the file: Excel when the workbook is opened, or LibreOffice via scripts/recalc.py. Leaving evaluation to an engine is the right call; computing results in Python would make this a different (and much more complex) tool.

Charts and conditional formatting are partially supported. Bar charts, line charts, and simple pivot tables work. Sparklines and advanced conditional formatting rules do not round-trip reliably — the skill is honest about this in its docs.

What it doesn't do

xlsx does not handle .xls (the legacy binary format) — for that, you need a different toolchain (often xlrd for read-only). The skill also does not preserve macros (.xlsm) or embedded objects (images, OLE objects). If your workbook is a complex Excel-as-application document, expect lossy round-trips.

Performance on very large sheets (>100k rows) degrades because openpyxl, in its default mode, loads the whole workbook into memory rather than streaming it. For genuine big-data workloads, the right move is to convert to CSV or Parquet outside the skill and use a streaming reader.
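
For the CSV route, a streaming reader can be as small as the standard-library sketch below, which aggregates revenue one row at a time without holding the file in memory. The column names and sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def revenue_by_quarter(lines):
    """Sum units * unit_price per quarter, reading one row at a time."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row["quarter"]] += float(row["units"]) * float(row["unit_price"])
    return dict(totals)


# In-memory stand-in for a large export; a real run would pass an open file.
sample = io.StringIO("quarter,units,unit_price\nQ1,10,2.5\nQ1,4,3.0\nQ2,7,2.0\n")
totals = revenue_by_quarter(sample)
print(totals)
```

With a real file, the same function works on `open(path, newline="")`, since `csv.DictReader` consumes any iterable of lines.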

When to reach for it

Reach for xlsx when the input or output is an .xlsx file and the operation is read, write, transform, or template-fill. Reach for something else if you're working with .xls legacy files, .xlsm macro-enabled workbooks, multi-million-row datasets, or need real spreadsheet engine semantics (formula execution). For tabular data without Excel-specific requirements, the pdf skill (for PDF tables) or simple CSV processing in bash will usually be faster.

04 — Cross-validation

2 sources verified

Install

Use this skill

/plugin install xlsx

Use cases

Tasks this skill helps with