Tagged PDFs: Accessibility, Reading Order & Structure

Untagged PDFs give screen readers very little to work with. A tagged PDF carries an internal structure tree, much like HTML markup, that says "this is a heading, this is a list, this is a figure with alt text." That tree is what makes a document accessible, lets it meet standards such as PDF/UA, and lets its content be reused in other formats.

Standard Tag Types

Tag                   | HTML equivalent       | Purpose
H1 – H6               | <h1>…<h6>             | Document and section headings (no skipped levels)
P                     | <p>                   | Paragraphs of body text
L / LI / Lbl / LBody  | <ul> / <ol> / <li>    | Lists with explicit labels and bodies
Table / TR / TH / TD  | <table>               | Tables with headers; scope and span where needed
Figure                | <img>                 | Images, with alt text in /Alt attribute
Link                  | <a>                   | Hyperlinks (also need accessible label)
Artifact              | (decorative, hidden)  | Decorative content ignored by screen readers
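
To make the HTML analogy concrete, here is a minimal sketch that walks a toy structure tree and emits the rough HTML equivalents listed above. The `Elem` class, the `HTML_FOR` mapping, and `to_html` are hypothetical names invented for illustration; a real PDF structure tree is stored as dictionaries inside the file, and real artifacts are marked in the page content, not as nodes in the tree.

```python
from dataclasses import dataclass, field
from html import escape


# Hypothetical, simplified stand-in for one node of a structure tree.
@dataclass
class Elem:
    tag: str                      # e.g. "H1", "P", "L", "LI", "Figure"
    text: str = ""
    alt: str = ""                 # only meaningful for Figure
    kids: list["Elem"] = field(default_factory=list)


# Rough tag-to-HTML mapping, following the table of standard tag types.
HTML_FOR = {"P": "p", "L": "ul", "LI": "li", "Table": "table",
            "TR": "tr", "TH": "th", "TD": "td", "Link": "a"}
HTML_FOR.update({f"H{n}": f"h{n}" for n in range(1, 7)})


def to_html(elem: Elem) -> str:
    """Render a toy structure tree as HTML, in tree order."""
    if elem.tag == "Artifact":
        return ""                             # decorative: skipped entirely
    if elem.tag == "Figure":
        return f'<img alt="{escape(elem.alt)}">'
    inner = escape(elem.text) + "".join(to_html(k) for k in elem.kids)
    name = HTML_FOR.get(elem.tag)
    # Tags without a direct equivalent (Document, Lbl, LBody) pass through.
    return f"<{name}>{inner}</{name}>" if name else inner


doc = Elem("Document", kids=[
    Elem("H1", "Report"),
    Elem("Artifact", "Page 1"),
    Elem("P", "Intro."),
    Elem("Figure", alt="Sales chart"),
])
print(to_html(doc))  # <h1>Report</h1><p>Intro.</p><img alt="Sales chart">
```

The output follows the order of the tree, not the position of anything on the page, which is the same rule a screen reader applies.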

Reading Order Matters Most

Once tags exist, the order in which they appear in the structure tree determines how a screen reader speaks the document. Common mistakes:

  • Page headers and footers tagged ahead of body content, so the screen reader announces the running title and page number in the middle of the text on every page.
  • Two-column layouts read as a single column, zig-zagging left to right across the page instead of going column by column.
  • Images that interrupt a sentence in the tree instead of sitting before or after the paragraph they belong to.

Always listen to the result with a screen reader such as NVDA or VoiceOver and adjust the order until it reads logically.
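
The two-column problem can be illustrated with a small sketch. It assumes each text block is known by its top-left corner, with `y` measured downward from the top of the page, and that the columns are of equal width; `Block`, `column_order`, and `zigzag_order` are hypothetical names, and real auto-tagging tools use far more elaborate layout analysis.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    x: float   # left edge of the block
    y: float   # top edge, measured downward from the top of the page


def zigzag_order(blocks):
    """Naive order: top to bottom, then left to right (the mistake)."""
    return sorted(blocks, key=lambda b: (b.y, b.x))


def column_order(blocks, page_width, columns=2):
    """Read each column top to bottom before moving to the next one."""
    col_width = page_width / columns

    def key(b):
        col = min(int(b.x // col_width), columns - 1)
        return (col, b.y, b.x)

    return sorted(blocks, key=key)


blocks = [
    Block("B1", 320, 100), Block("A2", 50, 200),
    Block("A1", 50, 100), Block("B2", 320, 200),
]
print([b.text for b in zigzag_order(blocks)])       # ['A1', 'B1', 'A2', 'B2']
print([b.text for b in column_order(blocks, 612)])  # ['A1', 'A2', 'B1', 'B2']
```

The second ordering is the one the structure tree should encode for a two-column page.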

Building an Accessible PDF

  1. Author with semantic styles (heading styles, list styles, table styles) — the export then produces correct tags automatically.
  2. Add alt text to every meaningful figure; mark decorative ones as Artifact.
  3. Declare the document language in the catalog so screen readers pick the right pronunciation engine.
  4. Set the document title and configure it to display in the title bar instead of the filename.
  5. Validate with PAC 3 (or PAC 2024) plus Acrobat's accessibility checker, then test with a screen reader.
  6. Target PDF/UA for compliance; PDF/UA + PDF/A-2a covers both accessibility and archival needs.
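
Several of the steps above leave traces in the document catalog, so a quick preflight can be sketched before running a full checker. This is an illustration only: it assumes the catalog has already been read into a plain nested `dict` keyed by PDF names (`/Lang`, `/MarkInfo`, `/StructTreeRoot`, `/ViewerPreferences`), and `preflight` is a hypothetical helper, not part of PAC, Acrobat, or any PDF library.

```python
from typing import Optional


def preflight(catalog: dict, title: Optional[str]) -> list[str]:
    """Return a list of basic accessibility problems found in the catalog.

    `catalog` is an assumed, simplified dict view of the PDF catalog;
    `title` is the document title however the caller obtained it.
    """
    problems = []
    if not catalog.get("/MarkInfo", {}).get("/Marked", False):
        problems.append("document is not marked as tagged (/MarkInfo /Marked)")
    if "/StructTreeRoot" not in catalog:
        problems.append("no structure tree (/StructTreeRoot)")
    if not catalog.get("/Lang"):
        problems.append("no document language (/Lang)")
    if not (title and title.strip()):
        problems.append("no document title")
    if not catalog.get("/ViewerPreferences", {}).get("/DisplayDocTitle", False):
        problems.append("title bar shows the filename (/DisplayDocTitle is not true)")
    return problems


good = {
    "/MarkInfo": {"/Marked": True},
    "/StructTreeRoot": {},
    "/Lang": "en-US",
    "/ViewerPreferences": {"/DisplayDocTitle": True},
}
print(preflight(good, "Annual Report"))  # []
print(len(preflight({}, None)))          # 5
```

A clean result here says nothing about tag quality or reading order; it only catches the omissions that are cheap to detect, which is why validation and screen-reader testing still follow.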

Common Mistakes

  • Skipping heading levels (H1 → H3) — confuses outline navigation.
  • Empty alt text on meaningful images, or generic "image" alt text that conveys nothing.
  • Decorative borders and watermarks tagged as figures instead of artifacts.
  • Tables built with tab characters instead of real table tags — screen readers can't navigate cells.
  • No language declared, so all content reads in the screen reader's default voice.
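
Two of these mistakes are easy to catch mechanically. The sketch below assumes the heading levels and alt texts have already been extracted from the structure tree into plain Python lists; `skipped_headings`, `weak_alt`, and the `GENERIC_ALT` word list are hypothetical and chosen only for illustration.

```python
# Assumed list of alt texts that convey nothing; extend as needed.
GENERIC_ALT = {"image", "picture", "photo", "graphic", "figure"}


def skipped_headings(levels):
    """Return (previous, current) pairs where a heading level is skipped.

    `levels` is the sequence of heading levels in reading order,
    e.g. [1, 2, 2, 3] for H1, H2, H2, H3.
    """
    skips = []
    prev = 0                      # 0 means "no heading seen yet"
    for lvl in levels:
        if lvl > prev + 1:        # going deeper by more than one level
            skips.append((prev, lvl))
        prev = lvl
    return skips


def weak_alt(alt):
    """True if the alt text is empty or a generic placeholder."""
    return alt.strip().lower().rstrip(".") in GENERIC_ALT | {""}


print(skipped_headings([1, 3, 2, 3, 5]))   # [(1, 3), (3, 5)]
print(weak_alt("Image."))                  # True
print(weak_alt("Bar chart of revenue by region"))  # False
```

Moving back up the outline (H3 to H2) is not flagged, since only jumps to a deeper level skip a step. An empty alt text is only a problem on a meaningful image; decorative images belong in artifacts and should not reach this check at all.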

Frequently Asked Questions

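What is a tagged PDF?
A PDF with a structure tree that describes its headings, paragraphs, lists, tables, and figures.

What is the difference between visual order and reading order?
Visual order is the layout on the page; reading order is the logical sequence that screen readers follow.

What is PDF/UA?
ISO 14289, the accessibility standard for PDF; it aligns with WCAG as applied to PDFs.

How do I check whether a PDF is tagged and accessible?
Check Document Properties, run the PAC checker and Acrobat's accessibility checker, then test with a screen reader.

Can an existing PDF be tagged after the fact?
Yes, with auto-tagging tools, but expect manual correction on complex layouts.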