The PDF 2.0 Document Object Model (DOM) was extracted from ISO 32000-2:2020 and encoded into a specification-derived, machine-readable form known as the "Arlington PDF Model". This model is encoded as TSV (tab-separated values) data, with one file per PDF object. A custom set of declarative functions also captures various relationships that are explicitly stated in ISO 32000-2:2020. The Arlington PDF Model is available on GitHub at https://github.com/pdf-association/PDF20_Grammar.
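As a rough illustration of the "one TSV file per PDF object" layout, the sketch below parses a single tab-separated table and lists its outgoing links as object::key labels. The column names (`Key`, `Type`, `Link`), the bracketed link syntax, the object names, and the sample rows are all assumptions made for this example; consult the GitHub repository for the real schema.

```python
import csv
import io

# Hypothetical contents of one per-object TSV file. Column names and
# rows are illustrative assumptions, not the real model data.
SAMPLE_TSV = (
    "Key\tType\tLink\n"
    "Type\tname\t\n"
    "Child\tdictionary\t[ObjectB]\n"
    "Items\tarray\t[ObjectC,ObjectD]\n"
)


def outgoing_links(object_name, tsv_text):
    """Return (label, target) pairs, with label formatted as object::key."""
    links = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Strip the assumed [..] wrapper and split comma-separated targets.
        targets = row["Link"].strip("[]")
        for target in filter(None, targets.split(",")):
            links.append((f"{object_name}::{row['Key']}", target.strip()))
    return links


if __name__ == "__main__":
    for label, target in outgoing_links("ObjectA", SAMPLE_TSV):
        print(label, "->", target)
```

Each returned pair corresponds to one link between two nodes, labelled in the same object::key style that the visualisation displays.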
This is a 3D/VR visualisation of the model, where nodes are PDF objects and links are object::key references.
Instructions are on screen. Nodes (PDF objects) can be dragged around and re-arranged.
For mouse and keyboard navigation, click the "VR" button in the bottom right corner to enter full-screen mode, then click anywhere on the screen to start. The mouse controls the camera viewpoint, while the arrow keys move you around. Pressing "ESC" exits full-screen mode and returns to normal web browsing. If you look at a node, you will see the name of the PDF object. If you look at a link, you will see the name of the object::key reference. The experience is much better with a VR headset!
Data last updated: 10 Nov 2020
The PDF 2.0 Normative References tree was extracted from the Normative References clause of ISO 32000-2:2020, covering the references that impact the parsing of PDF and its many nested formats. Every normative reference was researched and classified, and a comprehensive database of the cascading tree of references was created; see https://github.com/pdf-association/PDF2NormRefs. The PDF Association also maintains a resource page with links to all immediate Normative References at https://reference.pdfa.org/iso/32000/.
This is a 3D/VR visualisation of the GitHub data, where each node is a document.
Instructions are on screen. Documents can be dragged around and re-arranged.
For mouse and keyboard navigation, click the "VR" button in the bottom right corner to enter full-screen mode, then click anywhere on the screen to start. The mouse controls the camera viewpoint, while the arrow keys move you around. Pressing "ESC" exits full-screen mode and returns to normal web browsing. If you look at a node, you will see the name of the document. The experience is much better with a VR headset!
Data last updated: 8 Nov 2020
The PDF specification was originally conceived by Adobe in the early 1990s, based on their experience with PostScript. Until 2007, Adobe controlled the core PDF specification and regularly released new versions of PDF. During this time, ISO standardization of PDF subsets also began, originally driven by the Graphic Arts community to support commercial printing via PDF/X. Since then, many groups, organizations and countries have published a wide variety of PDF subsets or extensions. Some of these PDF specifications found adoption and you will see such PDFs 'in the wild', while others did not.
This interactive timeline plots the publication date of major PDF specifications. When a precise publication date is not known, the start of the appropriate month or year is used, based on other data sources (such as PDF file metadata or media announcements). The icon of each publication indicates the primary publishing body.
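The fallback rule for imprecise publication dates can be sketched as follows. This is a minimal illustration only; the function name `plot_date` and its parameters are hypothetical and do not come from the timeline's own code.

```python
from datetime import date


def plot_date(year, month=None, day=None):
    """Return the date to plot for a publication.

    Unknown parts default to the start of the period: a missing day
    becomes the 1st of the month, and a missing month becomes January.
    """
    return date(year, month or 1, day or 1)


if __name__ == "__main__":
    print(plot_date(2008))         # only the year is known
    print(plot_date(2008, 7))      # year and month are known
    print(plot_date(2008, 7, 14))  # full date is known
```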
To navigate, use the mouse or touchscreen to click and drag the timeline left and right. Use the mouse scroll wheel to zoom the timeline in and out. If you click on a data point (document), a dialog shows the abstract of the related PDF specification, which often includes a URL. The Filter text box below the timeline filters the data points in the top part of the timeline based on a title text match (e.g. try "ISO" or "Adobe"). The Highlight text boxes below the timeline change the marker color in the lowest timeline bar for all data points that match.
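The difference between filtering and highlighting can be sketched as below. How the real widget matches text is not stated here, so the sketch assumes a plain case-insensitive substring test on the title; the function names and the sample titles are hypothetical.

```python
def title_matches(title, query):
    """Assumed matching rule: case-insensitive substring test."""
    return query.lower() in title.lower()


def filter_points(points, query):
    """Keep only the data points whose title matches; empty query keeps all."""
    if not query:
        return list(points)
    return [p for p in points if title_matches(p["title"], query)]


def highlight_points(points, query, color):
    """Map matching titles to a marker color; other points are left alone."""
    if not query:
        return {}
    return {p["title"]: color for p in points if title_matches(p["title"], query)}


if __name__ == "__main__":
    # Placeholder titles for illustration only.
    points = [
        {"title": "ISO example specification"},
        {"title": "Adobe example technical note"},
    ]
    print(filter_points(points, "iso"))
    print(highlight_points(points, "Adobe", "red"))
```

Filtering removes non-matching points from view, whereas highlighting keeps every point and only changes the color of the matches.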
Click here to visit the interactive timeline!
Sorry, but there is no 3D/VR here.
Based on SIMILE Timeline Widget.
Data last updated: 16 Dec 2020
The PDF Association has established a public GitHub repo (pdf-issues) to capture and discuss issues with ISO 32000-2:2020 (the latest dated revision of the PDF 2.0 specification). SafeDocs researchers can review open issues and proposed solutions at https://github.com/pdf-association/pdf-issues.
The PDF Association has established a public GitHub repo (pdf-corpora) as a centralized index to various public sources of PDF-centric corpora. See https://github.com/pdf-association/pdf-corpora.