Introduction to PDF/A - English
DESCRIPTION
Why PDF/A?
• Guaranteed Access to Content: PDF/A ensures that documents can still be reproduced far into an unforeseeable future.
• All Documents Accepted: Electronic or paper source, it doesn’t matter. Almost any document can become PDF/A-compliant.
• An International Standard: ISO 19005 (PDF/A) is managed by the international community in an open and democratic process.
• Conformance May Be Validated: PDF/A conformance imposes stringent technical requirements. A wide variety of vendors produce tools for checking and correcting documents to ensure they meet PDF/A standards.
• Better for Business and the Environment: PDF/A is electronic paper. As such, PDF/A enables better document workflows, reduces waste and carbon emissions, and improves overall efficiency.
TRANSCRIPT
Introduction to PDF/A
David van Driessche
Chief Technology Officer, Four Pees
Treasurer, GWG
The Problem
Better when electronic?
• I wrote my university thesis in 1994
  - MS-DOS 3.31 & Windows for Workgroups 3.11
  - WordPerfect 5 & Word for Windows 6.0
  - Saved on a floppy disk
  - Backup on an Iomega ZIP disk
• That was only 19 years ago…
What is long enough?
• For businesses in Belgium
  - Direct tax documents: 5 years
  - Value-added tax (VAT): 7 years
  - Medical records for employees: 15 years
• For engineers
  - Lifetime of a building / construction
• For libraries
  - Forever…
The Solution!
Why is this the solution?
• Invented by Adobe in 1993
  - Originally as electronic product documentation and Internet format
  - But rapid adoption elsewhere from 1996
• A good format!
  - Focuses on exact visual representation
  - Printer and platform independent
  - Compact and complete
  - Random access
And standardized!
• Adopted by ISO
  - ISO 32000
• So also vendor independent!
Oooops
• Missing fonts
• Font problems
• Complex features
• Incorrect file structure
• Corrupt PDF files
• …
• Room for interpretation!
The Solution!
ISO PDF/A
• A subset of the PDF format
• Developed and maintained by ISO
• Designed to remove ambiguities from the PDF file format
  - With the aim of creating documents that are archivable for 50 years (at least)
• An ISO standard that will never expire!
Vanilla or Strawberry?
• Different archives require different things…
• Different flavors of the PDF/A standard cater to that:
  - PDF/A-1b, the basic flavor
    • Guarantees visual reproduction
  - PDF/A-1a, the accessible (or advanced) flavor
    • Incorporates all requirements of the basic flavor
    • Also focuses on the meaning of the document content
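A PDF/A file states which flavor it claims in its XMP metadata, through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below reads that claim from an XMP packet. The sample packet and the function name `claimed_flavor` are illustrative assumptions, and extracting the packet from a real PDF is left out. A claim is only a claim: whether the file really conforms still has to be checked with a validation tool.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"

# Hypothetical, minimal XMP packet used only for illustration.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def claimed_flavor(xmp_packet):
    """Return e.g. 'PDF/A-1b' if the packet claims a PDF/A flavor, else None."""
    root = ET.fromstring(xmp_packet)
    found = {}
    for desc in root.iter("{%s}Description" % RDF_NS):
        for name in ("part", "conformance"):
            key = "{%s}%s" % (PDFAID_NS, name)
            # XMP allows a property as a child element or as an attribute.
            child = desc.find(key)
            if child is not None and child.text:
                found[name] = child.text.strip()
            elif key in desc.attrib:
                found[name] = desc.attrib[key].strip()
    if "part" not in found:
        return None
    return "PDF/A-%s%s" % (found["part"], found.get("conformance", "").lower())


print(claimed_flavor(SAMPLE_XMP))  # prints PDF/A-1b
```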
PDF/A-1b
Parijs Visual reproduction
PDF/A-1a
Parijs Visual reproduction + Meaning
PDF/A-1a, Structure
Document Title: Parijs
Paragraph
Paragraph
Paragraph
PDF/A-1a, Tagging
Parijs
Description: “WWII: American soldiers watch as the Tricolor flies from the Eiffel Tower again”
Usability / Complexity
• PDF/A-1b
  - Relatively easy to make
  - Easy to make automatically (without human intervention)
• PDF/A-1a
  - Contains much more usable content -> great for searchability of archives and additional intelligence about archived pieces
  - Very hard to create automatically unless the source already contains a lot of information
Demo
• PDF/A-1b versus PDF/A-1a
Evolution
• PDF/A-1a and PDF/A-1b will always remain valid standards
• But new versions of the PDF/A standard have been developed as well.
PDF/A-2
• Allows more modern PDF features
  - Transparency
  - Layers (optional content)
  - JPEG 2000 compression
• Adds support for embedded PDF/A files
• Adds a new flavor: “u” for Unicode
  - Intermediate between “a” and “b”
  - Requires only that fonts be correctly Unicode-mapped, with no structure or tagging
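The ordering of the three levels ("b" as the baseline, "u" adding Unicode mapping, "a" adding structure and tagging on top) can be sketched as a small decision helper. This is a deliberate simplification: the function name and its two boolean inputs are hypothetical, and it assumes the document already meets the basic "b" requirements. Real conformance involves many more checks than these two flags.

```python
def strongest_level(part, unicode_mapped, tagged):
    """Pick the strongest conformance level a document could target.

    part           -- PDF/A part number (1, 2 or 3)
    unicode_mapped -- all fonts are correctly mapped to Unicode
    tagged         -- the document carries structure and tagging
    Assumes the basic ("b") requirements are already met.
    """
    if part not in (1, 2, 3):
        raise ValueError("this sketch only covers PDF/A-1, -2 and -3")
    if unicode_mapped and tagged:
        return "a"
    if unicode_mapped and part >= 2:
        return "u"  # "u" was introduced with PDF/A-2
    return "b"


# A scanned invoice with Unicode-mapped text but no tagging:
print("PDF/A-2" + strongest_level(2, unicode_mapped=True, tagged=False))
```

Under PDF/A-1 the same untagged document would fall back to "b", since "u" does not exist there.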
PDF/A-3
• Adds support for embedding arbitrary files
• Examples
  - An email archived with its attachments
  - A spreadsheet archived as PDF/A with the original Excel file embedded
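The feature lists for PDF/A-2 and PDF/A-3 can be turned into a lookup that answers "which part do I need at least?". A minimal sketch, using only the features named on the slides; the feature keys and the function name `minimum_part` are made up for illustration and do not come from any tool or from the standard itself.

```python
# Earliest PDF/A part that permits each feature, per the slides above.
EARLIEST_PART = {
    "transparency": 2,
    "layers": 2,  # optional content
    "jpeg2000": 2,
    "embedded_pdfa_file": 2,
    "embedded_arbitrary_file": 3,
}


def minimum_part(features):
    """Return the lowest PDF/A part (1, 2 or 3) that allows every feature used."""
    unknown = set(features) - set(EARLIEST_PART)
    if unknown:
        raise ValueError("unknown feature(s): %s" % ", ".join(sorted(unknown)))
    # A document using none of the listed features can stay at PDF/A-1.
    return max([1] + [EARLIEST_PART[f] for f in features])


print(minimum_part(["transparency", "jpeg2000"]))    # prints 2
print(minimum_part(["embedded_arbitrary_file"]))     # prints 3
```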
Thanks! Questions?