XML Data Binding: Integrating XML and Object-Oriented Technologies

Track: Deploying XML, End-User Applications, Core Technologies

Audience Level: High Level/Technical view

Time: Thursday, November 17 16:45

Author: Neil Chaudhuri, LMI Government Consulting

Keywords: XML, W3C XML Schema, XML Data Binding, Java, JAXB, Castor, Object-oriented, Marshalling, Unmarshalling, DOM, SAX, XPath

Abstract:

Data are the essence of business processes and technical applications, and managing data effectively is critical for success in any industry. To that end, XML has emerged as the dominant syntax for data management. The fundamental organizing principle of XML is hierarchy. Parent-child relationships among data are maintained to arbitrary depth through markup. Hierarchies also serve as a critical component of XML’s validation capability. An XML Schema document defines the rules for structuring data within an XML instance by describing which elements may nest within which others and the order in which child elements may appear. Hierarchy, therefore, is the underlying principle of data management in XML.

While XML is a relatively recent arrival on the technology landscape, object-oriented (OO) programming has long been venerated as the dominant paradigm for developing complex, mission-critical software. In languages from Smalltalk and C++ to Java and C#, the fundamental organizing principles of OO data management are encapsulation and inheritance. Encapsulation is the principle whereby objects hide their data and allow other objects to have access to them only through defined APIs (Application Programming Interfaces). Inheritance is the principle whereby data and behavior are passed from a parent class to its children. Encapsulation and inheritance, therefore, are the underlying principles of data management in OO.
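As a minimal sketch of these two principles in Java, consider the following. The class names `Account` and `SavingsAccount`, and the notion of a balance held in cents, are hypothetical and chosen only for illustration; they do not come from the paper.

```java
// Hypothetical classes for illustration only.
public class EncapsulationDemo {

    /** Parent class: the balance is hidden and reachable only through methods. */
    static class Account {
        private long balanceCents; // encapsulated state, invisible to other classes

        public void deposit(long cents) {
            if (cents <= 0) {
                throw new IllegalArgumentException("deposit must be positive");
            }
            balanceCents += cents;
        }

        public long getBalanceCents() {
            return balanceCents;
        }
    }

    /** Child class: inherits deposit() and getBalanceCents(), then adds behavior. */
    static class SavingsAccount extends Account {
        private final int ratePercent;

        SavingsAccount(int ratePercent) {
            this.ratePercent = ratePercent;
        }

        public void addInterest() {
            // The subclass cannot touch balanceCents directly; it uses the public API.
            long interest = getBalanceCents() * ratePercent / 100;
            if (interest > 0) {
                deposit(interest);
            }
        }
    }

    public static void main(String[] args) {
        SavingsAccount savings = new SavingsAccount(5);
        savings.deposit(10_000);   // inherited from Account
        savings.addInterest();     // defined in SavingsAccount
        System.out.println(savings.getBalanceCents()); // prints 10500
    }
}
```

Note that the data here are reached only through behavior (method calls), whereas in an XML instance the data are directly visible in the markup.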

As powerful as XML and OO principles are, it is no surprise that these two threads have been interwoven so often into the fabric of application development. However, it is just as predictable that integrating the two is not without its challenges, for their approaches to data management are nearly incompatible. XML relies on a static hierarchy of data elements, while OO relies on dynamic data exchange among multiple entities through method calls and inheritance.

A solution to the problem of integrating these two approaches is XML data binding. This approach seeks to generate objects from XML Schema documents and populate them with data in instance documents validated against the schemas. Then, after interacting and perhaps evolving in such a way as to meet business requirements, the objects are converted back into XML instances valid against the original schemas. The promise offered by XML data binding is enormous, yet the extent to which this promise has been realized remains unclear.
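The round trip described above (XML instance to object, object back to XML instance) can be sketched by hand with the DOM and transformer APIs that ship with the JDK. This is not the JAXB or Castor API: a binding framework would generate the class from the schema and validate the instance, both of which this sketch omits. The `Customer` class and the `customer`, `name`, and `city` element names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class BindingSketch {

    /** Hypothetical class standing in for one a framework would generate from a schema. */
    static class Customer {
        private String name;
        private String city;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getCity() { return city; }
        public void setCity(String city) { this.city = city; }
    }

    /** Unmarshal: populate an object from an XML instance (no schema validation here). */
    static Customer unmarshal(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            Customer c = new Customer();
            c.setName(root.getElementsByTagName("name").item(0).getTextContent());
            c.setCity(root.getElementsByTagName("city").item(0).getTextContent());
            return c;
        } catch (Exception e) {
            throw new IllegalStateException("could not unmarshal customer", e);
        }
    }

    /** Marshal: write the object back out with the same element structure. */
    static String marshal(Customer c) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("customer");
            doc.appendChild(root);
            Element name = doc.createElement("name");
            name.setTextContent(c.getName());
            root.appendChild(name);
            Element city = doc.createElement("city");
            city.setTextContent(c.getCity());
            root.appendChild(city);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not marshal customer", e);
        }
    }

    public static void main(String[] args) {
        Customer c = unmarshal("<customer><name>Ada</name><city>London</city></customer>");
        c.setCity("Paris");             // the object evolves through its own API
        System.out.println(marshal(c)); // the updated state returns to XML
    }
}
```

The hand-written mapping code in `unmarshal` and `marshal` is exactly what a data binding framework aims to eliminate by deriving it from the schema.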

This paper addresses this question for the benefit of intermediate and advanced XML and OO developers who have sought to move data seamlessly between the two. The discussion begins with an overview of XML data binding and describes the potential benefits it offers to application developers. It then introduces the reader to two popular Java-XML binding frameworks, JAXB from Sun Microsystems and the open-source tool Castor. The core of this discussion is a comparative and painstaking evaluation of these tools against criteria deemed to be of greatest significance to XML analysts and Java developers in this space. Finally, conclusions are drawn regarding the relative effectiveness of the tools, and suggestions are made for further study.