UVC-based preservation

The proposed solutions of relying on standards and migrations are labeled time consuming and ultimately incapable of preserving digital documents in their original form.

Rothenberg suggests: "an ideal approach should provide a single, extensible, long-term solution that can be designed once and for all and applied uniformly, automatically, and in synchrony (for example, at every future refresh cycle) to all types of documents and all media, with minimal human intervention."

Rothenberg's approach was met with skepticism and considered too technically challenging, too expensive and too time-consuming, and therefore an economic risk (without the support of empirical evidence). (See further reading section.)

Raymond A. Lorie, during his employment at IBM Research Centre Almaden, initiated the development of a UVC-based solution to long-term digital preservation.[2]

He describes the approach as ‘Universal’ because its definition is so basic that it will endure forever, ‘Virtual’ because it will never have to be physically built, and a ‘Computer’ in its functionality.

Raymond van Diessen is responsible for extending the application of the UVC concept to preserve more complex objects.

The National Library of the Netherlands (Koninklijke Bibliotheek, KB) played a major role in demonstrating that emulation based on the UVC concept is a viable option for long-term digital preservation.

Rothenberg's method was to use software emulation to reproduce the behaviour of obsolete computing platforms on newer platforms, offering a way of running a digital document’s original software in the far future and thereby recreating the content, behaviour, and ‘look and feel’ of the original document.

Raymond A. Lorie recognized the difficulties in trying to create a program to emulate a 'real' machine on a future platform and realised that this approach was overkill for the purpose of preserving digital objects.

The UVC-based approach resulted in the UVC becoming one of the permanent access tools for JPEG/GIF87 images within the Preservation Subsystem of the KB’s e-Depot.[6]

The Universal Virtual Computer is part of a broader concept, called the UVC-based preservation method.

This method allows digital objects (like text documents, spreadsheets, images, sound waves, etc.)

The text itself needs to be exported i.e. in ASCII format and can be saved as a sequence of homogeneous elements (all presentation attributes like font, size, etc.

A decoding algorithm (method) extracts the various data elements from the internal representation and returns them tagged according to the schema.

The schema is clearly application-dependent as it describes the structure and meaning of the tags as parts of a specific information type.
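The relationship between the decoding algorithm and the schema can be sketched as follows. This is only an illustration: the internal representation (a 2-byte width, a 2-byte height, then one byte per pixel), the tag names and the `decode` function are all hypothetical, and in the actual method the decoder would be a UVC program rather than Python.

```python
import struct

# Hypothetical schema: the tags the decoder may emit and what each one means.
IMAGE_SCHEMA = {
    "width": "number of pixels per row",
    "height": "number of rows",
    "pixel": "grey value of one pixel (0-255), listed row by row",
}

def decode(bitstream: bytes):
    """Yield (tag, value) pairs extracted from the assumed internal representation."""
    # Assumed layout: big-endian 16-bit width and height, then raw pixel bytes.
    width, height = struct.unpack(">HH", bitstream[:4])
    yield ("width", width)
    yield ("height", height)
    pixels = bitstream[4:4 + width * height]
    if len(pixels) != width * height:
        raise ValueError("bit stream is shorter than the header announces")
    for value in pixels:
        yield ("pixel", value)

# A 2x2 example image in the assumed internal representation.
archived = struct.pack(">HH", 2, 2) + bytes([0, 64, 128, 255])
elements = list(decode(archived))

# Every returned element carries a tag that the schema explains.
assert all(tag in IMAGE_SCHEMA for tag, _ in elements)
```

A future reader would need only the tagged elements and the schema to interpret the data, not the original file format.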

In this case, using a method similar to the one described above, the following needs to be stored at archiving time:

In the future, the UVC interprets the UVC code, which will yield the same result as the original program running on the original operating system.

When Input/Output interactions are involved, things become more complicated, as an additional UVC program that mimics the functioning of the Input/Output device processor must be archived.

Like the Java Virtual Machine and the Common Language Runtime, the UVC is actually an emulator which allows a program to run on virtual instances of the necessary, usually obsolete, hardware, and will continue to emulate the necessary hardware as technology continues to evolve.

This is part of the normal design of an application.

Step 3 – Write the UVC program for data interpretation.

Step 4 – Archive the schema information by storing an internal representation of the schema information in the bit stream, together with a UVC program Q to decode it.

Because of the simplicity of the UVC concept, it is fairly easy for skilled software developers to construct a UVC emulator for a particular platform of the time.

Step 2 – Develop a Logical Data Viewer (a restore program to restore the data).

Since the logical view for the schema information is fixed, a single restore program may actually support all applications.

If the future client already knows the logical view for the documents being restored, then the schema does not necessarily need to be retrieved.

This format decoder program runs on the UVC, which is the platform-independent layer, independent of future hard- and software changes.

The Logical Data View (LDV) is an instantiation of the Logical Data Schema (LDS), which describes the structure and meaning of the tags as parts of a specific information type.

In return it retrieves an LDV and reconstructs a specific representation of the original object’s meaning.

The architecture relies on concepts that have existed since the beginning of the computer era: memory, registers and basic instructions without secondary features often introduced to improve the execution performance.
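To give a feel for how little such an architecture requires, the following is a minimal sketch of an interpreter for a toy machine with memory, a few registers and a handful of basic instructions. The instruction names and their encoding are invented for this illustration and are not the UVC instruction set; the example program multiplies two numbers by repeated addition.

```python
def run(program, memory):
    """Interpret (opcode, operand, operand) tuples on a toy register machine."""
    registers = [0] * 4
    pc = 0  # index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "LOADC":      # LOADC r, constant
            registers[args[0]] = args[1]
        elif op == "LOAD":     # LOAD r, address
            registers[args[0]] = memory[args[1]]
        elif op == "STORE":    # STORE r, address
            memory[args[1]] = registers[args[0]]
        elif op == "ADD":      # ADD r_dst, r_src
            registers[args[0]] += registers[args[1]]
        elif op == "SUB":      # SUB r_dst, r_src
            registers[args[0]] -= registers[args[1]]
        elif op == "JUMPNZ":   # JUMPNZ r, target: jump if the register is non-zero
            if registers[args[0]] != 0:
                pc = args[1]
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

# memory[2] = memory[0] * memory[1], assuming memory[0] >= 1
multiply = [
    ("LOAD", 0, 0),    # r0 = loop counter
    ("LOAD", 1, 1),    # r1 = value to add each round
    ("LOADC", 2, 0),   # r2 = accumulator
    ("LOADC", 3, 1),   # r3 = constant 1
    ("ADD", 2, 1),     # accumulator += value
    ("SUB", 0, 3),     # counter -= 1
    ("JUMPNZ", 0, 4),  # repeat while the counter is non-zero
    ("STORE", 2, 2),
    ("HALT",),
]
result = run(multiply, [3, 4, 0])
```

Because the interpreter needs nothing beyond lists, integers and a loop, an equivalent could be rewritten on any future platform with modest effort, which is the property the UVC design aims for.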

For example, a PDF document can be displayed as a series of JPEG images, thereby retaining the 'look and feel' of the original digital object but sacrificing its functionality.

The technology continues to be developed by Raymond van Diessen (IBM) to include dynamic objects by exploiting the communication facility between the UVC program and a future application.

This approach is a strategy with high risks.

Emulation Virtual Machine (EVM)

The EVM was presented by Jeff Rothenberg in 1999. It involves introducing an additional layer between the host platform and the emulator, and is said to be platform and time independent.

It is quite complex as an emulation specification needs to be written for the computer platform on which the original software runs.