New XML Validation Technologies in Action

Track: Core Technologies, Case Studies, Deploying XML

Audience Level: Technical view

Time: Wednesday, November 16 14:45

Author: Alex Brown, Griffin Brown Digital Publishing Ltd

Keywords: Validation, DTD, DSDL, RELAX NG, Schema

Abstract:

This paper is based on a number of real-world XML validation projects, and compares experience 'in the trenches' with the current state of the art in XML validation standards.

Validation is a topic of some controversy in the XML community. While there has been movement beyond the basic validation offered by XML 1.0 DTDs, there is little consensus on whether that movement has been in the right direction. Two rather different XML schema languages, one from W3C (XML Schema Definition Language) and one from ISO (RELAX NG), are perceived to compete, and continuing ISO and W3C standardisation work is doing nothing to reconcile the differences.

Meanwhile, the emerging practice of XML pipelining holds out the prospect of 'mixing and matching' technologies to arrive at a more complete solution.
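
As a rough illustration of what 'mixing and matching' in a pipeline might look like, the following Python sketch chains a well-formedness parse, a grammar-style structural check and a rule-style constraint into a single report. Everything in it is an assumption made for illustration: the element and attribute names (article, title, first-page, last-page), the function names and the Report class are hypothetical and are not taken from any of the projects or standards discussed here. In practice the hand-written stages would be replaced by real schema processors.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Report:
    """Collected problems; the document counts as valid when there are none."""
    messages: List[str] = field(default_factory=list)

    @property
    def valid(self) -> bool:
        return not self.messages

# A stage inspects the parsed document and returns a list of problem messages.
Stage = Callable[[ET.Element], List[str]]

def check_structure(root: ET.Element) -> List[str]:
    """Grammar-style stage (hypothetical vocabulary): <article> with a <title> child."""
    problems = []
    if root.tag != "article":
        problems.append(f"root element is <{root.tag}>, expected <article>")
    if root.find("title") is None:
        problems.append("missing <title> child")
    return problems

def check_page_range(root: ET.Element) -> List[str]:
    """Rule-style stage: a constraint between two attribute values."""
    first, last = root.get("first-page"), root.get("last-page")
    if first and last and first.isdigit() and last.isdigit():
        if int(last) < int(first):
            return [f"last-page ({last}) precedes first-page ({first})"]
    return []

STAGES: List[Stage] = [check_structure, check_page_range]

def run_pipeline(xml_text: str, stages: List[Stage]) -> Report:
    """Parse once, then pass the same tree through every stage in order."""
    report = Report()
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Not well-formed: later stages cannot run, so report and stop.
        report.messages.append(f"not well-formed: {exc}")
        return report
    for stage in stages:
        report.messages.extend(stage(root))
    return report
```

Calling run_pipeline(document_text, STAGES) returns a Report whose messages list gathers the findings of every stage. Note that the sketch already has to decide what happens when parsing fails and whether later stages run after an earlier one reports problems, which are exactly the kind of choices a validation model would need to make explicit.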

This presentation will give some real examples of 'hard case' XML validation problems and suggest that standards development in this space can be analysed and advanced with more clarity if considered within a conceptual framework which defines what validation actually 'is', and which explores validation within the context of 'real world' use cases.

Currently validation tends, in practice, to be a process which incorporates many activities, possibly including parsing, transformation, data binding, pipelining and report generation. The result of a validation process is often poorly specified and becomes effectively dependent on the 'quality of implementation' of validation tools.

A validation model must determine which of these activities are in scope and which are not. This presentation will consider the features of existing schema languages and argue that some traditional aspects of validation, carried over from the SGML past, represent a confusion of concerns which leads to needlessly complex architectures and implementations. Similarly, some features introduced into recent validation languages (such as the PSVI) blur our understanding not just of what validation is, but of what XML itself is.

Users faced with the need to validate XML often have requirements which haven't been addressed by existing standards, and which might even conflict with 'the philosophy of XML' (whatever that may mean). Custom coding or new approaches to validation standards are required to get the job done. The presentation will give examples of such problem cases, and solutions that have been adopted in response.