The XML Archiver - Documentation - Maintaining Archives

MAINTAINING ARCHIVES

IO Drivers

IO drivers abstract away from the data model of archives and databases. A driver is basically a wrapper around any tool that generates XML output. Instead of writing the output to a file, the driver calls the appropriate methods of a provided callback handler, which transform elements, attributes, and text into internal node objects. We currently provide several drivers for reading and writing data in different formats.
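The callback idea can be illustrated with a minimal sketch based on Python's standard xml.sax module. This is not XArch code: the Node and NodeBuilder classes and the read_xml function are hypothetical names chosen for this example, and the internal node objects used by XArch may look quite different.

```python
import xml.sax


class Node:
    """Hypothetical internal node object: an element with attributes, text, and children."""

    def __init__(self, name, attributes=None):
        self.name = name
        self.attributes = dict(attributes or {})
        self.text = ""
        self.children = []


class NodeBuilder(xml.sax.ContentHandler):
    """Callback handler that turns SAX events into a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # open elements, innermost last

    def startElement(self, name, attrs):
        node = Node(name, dict(attrs.items()))
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, content):
        # Text may arrive in several chunks, so it is accumulated.
        if self._stack:
            self._stack[-1].text += content

    def endElement(self, name):
        node = self._stack.pop()
        node.text = node.text.strip()


def read_xml(data: bytes) -> Node:
    """Play the role of a driver: parse the input and report it to the handler."""
    handler = NodeBuilder()
    xml.sax.parseString(data, handler)
    return handler.root


root = read_xml(b'<COMPANY id="1"><DEPARTMENT><NAME>Marketing</NAME></DEPARTMENT></COMPANY>')
print(root.name, root.attributes, root.children[0].children[0].text)
```

The point of the design is that the producer of the events never needs to know what the handler builds, so any tool that can emit element, attribute, and text events can feed the same handler.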

Creating Archives

Archives are created using the following command:

CREATE ARCHIVE name OF type {HAVING PROPERTIES property-list}

The name of an archive has to be unique within an archiver instance. The type of an archive specifies the format, i.e., the IO driver used to read and write the archive's content. We currently support two IO drivers for archives:

W3X	XML-based read-only driver to access archives on the Web.

XML	SAX-based XML Archive and Document Driver.

The main driver for creating and storing archives is currently the XML driver. The W3X driver is intended to allow read-only access to local XML archives over the Web.

When creating an archive, a set of driver-dependent properties may be specified. For the XML driver, the most notable properties are:

COMPRESSED	The archive is either stored as plain text (=false) or in gzip'ed format (=true).

DEFAULT_SCHEMA	The absolute path of the default key specification for the archive.

DIRECTORY	The directory in which the archive file(s) will be stored.

INDENTION	The indentation used when writing the XML data file. Possible values are NONE, TAB, or a number of space characters.

TIMESTAMP_ATTRIBUTE	The attribute name used to represent timestamps in the data file.

TIMESTAMP_ELEMENT	The element name used to represent timestamps in the data file.
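To make the effect of COMPRESSED and INDENTION more concrete, the following sketch writes an XML tree either as plain text or gzip'ed, with the requested indentation. It only illustrates the idea under stated assumptions and is not how XArch itself writes data files: the functions indent_unit and write_data_file and the file name company.xml.gz are made up for this example, and ET.indent requires Python 3.9 or later.

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET


def indent_unit(indention: str):
    """Map an INDENTION-style value (NONE, TAB, or a number) to a whitespace string."""
    if indention == "NONE":
        return None
    if indention == "TAB":
        return "\t"
    return " " * int(indention)


def write_data_file(root, path, compressed, indention):
    """Write the tree to path, gzip'ed if compressed is true, plain text otherwise."""
    unit = indent_unit(indention)
    if unit is not None:
        ET.indent(root, space=unit)  # Python 3.9+
    data = ET.tostring(root, encoding="utf-8")
    opener = gzip.open if compressed else open
    with opener(path, "wb") as f:
        f.write(data)


root = ET.fromstring("<COMPANY><DEPARTMENT><NAME>Marketing</NAME></DEPARTMENT></COMPANY>")
path = os.path.join(tempfile.mkdtemp(), "company.xml.gz")  # hypothetical file name
write_data_file(root, path, compressed=True, indention="2")

# Read the file back to check that it is valid gzip and indented as requested.
with gzip.open(path, "rb") as f:
    text = f.read().decode("utf-8")
print(text)
```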

The W3X driver supports only a single property:

URL	The URL of the archive description file on the Web.

The archive description file pointed to by the URL is a modified version of the name.archive file that is created locally for every XML-archive.

The current properties of an archive can be listed using the following command:

DESCRIBE ARCHIVE name

Key Specification

Keys are fundamental for the archiver as they are used to identify corresponding objects when merging different database versions. Thus, each database version has to follow a given key specification. One can think of the key specification as the schema of the database version. The key specification is provided as a text file. The DEFAULT_SCHEMA property allows one to specify a default key specification when creating an archive. Some of the IO drivers also allow a separate schema to be specified for each database version. In general, there are three types of keys:

EXISTENCE	The specified element is unique among its siblings.

VALUES (path-expr)	The element is uniquely identified by the values of the specified sub-elements.

SUBTREE	The element is uniquely identified by the value of its complete sub-tree.

Note that all keys are relative keys, i.e., they uniquely identify an object among its siblings. Consider the following XML document snippet:

<NAME>Marketing</NAME>

...

We assume that each department is identified by its name and that, within each department, each employee is identified by their Social Security Number. Furthermore, each department has only one name, and each employee has only one Social Security Number, name, and salary. The following key specification then represents the document's key structure:

KEY /COMPANY BY EXISTENCE,

KEY /COMPANY/DEPARTMENT BY VALUES (NAME),

KEY /COMPANY/DEPARTMENT/NAME BY EXISTENCE,

KEY /COMPANY/DEPARTMENT/EMPLOYEE BY VALUES (SSN),

KEY /COMPANY/DEPARTMENT/EMPLOYEE/SSN BY EXISTENCE,

KEY /COMPANY/DEPARTMENT/EMPLOYEE/NAME BY EXISTENCE,

KEY /COMPANY/DEPARTMENT/EMPLOYEE/SALARY BY EXISTENCE,
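The following sketch shows what "relative" means for the key specification above. It is an illustration only, not XArch's key handling: the sample document, the KEYS table, and the functions relative_key and keyed_paths are assumptions made for this example, and SUBTREE keys are left out. Each element gets a key that is unique only among its siblings; a globally unique identity is obtained by chaining the relative keys from the root downwards.

```python
import xml.etree.ElementTree as ET

# Hypothetical document that follows the key specification above.
DOC = """<COMPANY>
  <DEPARTMENT>
    <NAME>Marketing</NAME>
    <EMPLOYEE><SSN>111</SSN><NAME>Ann</NAME><SALARY>50000</SALARY></EMPLOYEE>
    <EMPLOYEE><SSN>222</SSN><NAME>Bob</NAME><SALARY>45000</SALARY></EMPLOYEE>
  </DEPARTMENT>
  <DEPARTMENT>
    <NAME>Sales</NAME>
    <EMPLOYEE><SSN>333</SSN><NAME>Cy</NAME><SALARY>48000</SALARY></EMPLOYEE>
  </DEPARTMENT>
</COMPANY>"""

# Path -> sub-elements whose values form the key; an empty list means BY EXISTENCE.
KEYS = {
    "/COMPANY": [],
    "/COMPANY/DEPARTMENT": ["NAME"],
    "/COMPANY/DEPARTMENT/NAME": [],
    "/COMPANY/DEPARTMENT/EMPLOYEE": ["SSN"],
    "/COMPANY/DEPARTMENT/EMPLOYEE/SSN": [],
    "/COMPANY/DEPARTMENT/EMPLOYEE/NAME": [],
    "/COMPANY/DEPARTMENT/EMPLOYEE/SALARY": [],
}


def relative_key(element, path):
    """Key of an element among its siblings: its tag plus its key values."""
    values = tuple(element.findtext(name, default="").strip() for name in KEYS[path])
    return (element.tag,) + values


def keyed_paths(element, path=None, prefix=()):
    """Yield the chain of relative keys from the root down to every keyed element."""
    path = path or "/" + element.tag
    if path not in KEYS:
        return
    chain = prefix + (relative_key(element, path),)
    yield chain
    for child in element:
        yield from keyed_paths(child, path + "/" + child.tag, chain)


chains = list(keyed_paths(ET.fromstring(DOC)))
for chain in chains:
    print(chain)
```

Note that the relative key of every NAME element is just its tag, so it says nothing on its own; only together with the keys of its ancestors does it identify one particular name in the document.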

Merging Database Versions into an Archive

After creating an archive, new versions of a database (or document) can be merged into the archive using the following command:

INSERT INTO name {AS version} {IF HAS CHANGES} {NO VALIDATION} FROM [SOURCE data-source | TYPE type document-properties]

There are some optional components in the INSERT INTO statement:

AS version	Specifies a version label that will be used when displaying the content of an archive. If omitted, the system time at the time of the merge will be used as the version label.

IF HAS CHANGES	The new version will only be merged into the archive if it differs from the last version that has been merged into the archive.

NO VALIDATION	By default, the resulting archive after merging is validated for correctness before the existing data file is replaced with the new one. This validation may be skipped for performance reasons.
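The behaviour of the AS version and IF HAS CHANGES options can be sketched with a toy in-memory archive. This is a simplified model under stated assumptions, not the merge algorithm used by XArch: the TinyArchive class, its insert method, and the flat string keys are invented for this example, and a database version is reduced to a dictionary from key to value.

```python
import time


class TinyArchive:
    """Toy archive: records, per key, the versions at which its value changed."""

    def __init__(self):
        self.versions = []  # version labels in merge order
        self.history = {}   # key -> list of (version label, value or None if deleted)
        self._last = {}     # the most recently merged version

    def insert(self, snapshot, version=None, if_has_changes=False):
        """Merge a version; return False if it was skipped as unchanged."""
        if if_has_changes and self.versions and snapshot == self._last:
            return False
        # Without an explicit label, fall back to the system time of the merge.
        label = version if version is not None else time.strftime("%Y-%m-%d %H:%M:%S")
        for key in sorted(set(snapshot) | set(self._last)):
            new = snapshot.get(key)
            if self._last.get(key) != new:
                self.history.setdefault(key, []).append((label, new))
        self.versions.append(label)
        self._last = dict(snapshot)
        return True


archive = TinyArchive()
v1 = {"Marketing/111/SALARY": "50000"}
v2 = {"Marketing/111/SALARY": "52000"}
archive.insert(v1, version="v1")
skipped = not archive.insert(v1, version="v1-again", if_has_changes=True)
archive.insert(v2, version="v2", if_has_changes=True)
print(archive.versions)
print(archive.history)
```

Matching objects by key is what lets the archive store a value once and note only the versions at which it changed, rather than keeping a full copy of every version.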

You also have to specify the IO driver (type) that is to be used for reading the data to be merged. XArch currently supports three drivers for reading a new version of a database (or document):

DEP	Data Export Program - Relational Database Reader

RDBE	Relational database export driver

XML	SAX-based XML Archive and Document Driver

The first two drivers may be used to merge data directly from a relational database. The XML driver is used to merge XML documents into an archive. Depending on the driver used, a set of driver-dependent properties may be specified. Please refer to the examples on how to use these IO drivers.

You may also create and use data sources when merging data into an archive. A data source is basically a shortcut that maintains the IO driver and properties to be used when reading the data (see examples for more details).

Dropping Archives

Archives are dropped using the following command:

DROP ARCHIVE name

Note that this will delete the archive and all of its related files from your disk!