[1] Like ALTO (XML), it describes the layout and structure of a page and its contents.
PAGE XML can be used to describe: The format is developed by the Pattern Recognition & Image Analysis Lab (PRImA) at the University of Salford in Greater Manchester.
It was designed for use with automatic segmentation and transcription techniques (OCR and HTR): PAGE aims to support each step in the processing chain for document image analysis, from image enhancement through layout analysis to OCR.
The PAGE XML schema is notably used as an export and import format by automatic transcription software such as eScriptorium[2] and Transkribus.
[3] It is also an export format used by Kraken, a turnkey OCR system optimised for documents in historical and non-Latin scripts.
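As an illustration of how a PAGE XML file exported by such tools might be read, the following is a minimal sketch using only the Python standard library. The element names (`PcGts`, `Page`, `TextRegion`, `TextLine`, `Coords`, `TextEquiv`, `Unicode`) come from the PAGE schema; the sample document, the file name `page_0001.png`, and the helper names `parse_points` and `read_page` are invented for this example. The namespace URI differs between schema versions, so the sketch reads it from the root element and assumes the document declares one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a real export carries more metadata and usually more regions.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page_0001.png" imageWidth="1200" imageHeight="1800">
    <TextRegion id="r1">
      <Coords points="100,100 1100,100 1100,400 100,400"/>
      <TextLine id="r1l1">
        <Coords points="100,100 1100,100 1100,200 100,200"/>
        <TextEquiv><Unicode>First line of text</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1l2">
        <Coords points="100,220 1100,220 1100,320 100,320"/>
        <TextEquiv><Unicode>Second line of text</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def parse_points(points):
    """Convert a 'x1,y1 x2,y2 ...' string into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_page(xml_text):
    """Return the image file name and the text lines of one PAGE XML document."""
    root = ET.fromstring(xml_text)
    # Assumption: the root tag is namespaced, i.e. looks like '{uri}PcGts'.
    ns = {"pc": root.tag[1:root.tag.index("}")]}
    page = root.find("pc:Page", ns)

    lines = []
    for region in page.iterfind("pc:TextRegion", ns):
        for line in region.iterfind("pc:TextLine", ns):
            coords = line.find("pc:Coords", ns)
            lines.append({
                "region": region.get("id"),
                "id": line.get("id"),
                "polygon": parse_points(coords.get("points")),
                # Lines without a transcription yield an empty string.
                "text": line.findtext("pc:TextEquiv/pc:Unicode",
                                      default="", namespaces=ns),
            })
    return {"image": page.get("imageFilename"), "lines": lines}


if __name__ == "__main__":
    result = read_page(SAMPLE)
    for entry in result["lines"]:
        print(entry["id"], entry["polygon"][0], entry["text"])
```

The sketch only covers text regions and lines; other region types and the reading-order information defined by the schema are ignored here.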