Performance Analysis of XML APIs

Track: Late Breaking News, Core Technologies

Audience Level: Technical view

Time: Tuesday, November 15 11:45

Author: Eric Perkins, IBM Corporation

Author: Margaret Kostoulas, IBM Corporation

Author: Abraham Heifets, IBM Corporation

Author: Morris Matsa, IBM Corporation

Author: Noah Mendelsohn, IBM Corporation

Keywords: Parsing, XML, Schema, Performance

Abstract:

XML, as a data interchange technology, delivers key advantages in interoperability due to its flexibility, expressiveness, and platform neutrality. The broad range of applications and the growing base of users for XML technologies have driven the development of common tooling, providing a consistent, robust infrastructure on which to build applications. These advantages have spurred widespread adoption of SOAP and web services as a key component of the next generation of business computing infrastructure. It is increasingly clear, however, that the advantages of XML come with a heavy performance penalty, and that current parsing technologies are unable to meet the performance demands of an XML-based computing infrastructure.

Current implementations of XML parsers use a variety of different APIs, including DOM, SAX, and, in the web services world, JAX-RPC. Some progress has been made in recent years in improving XML parsing performance: heavyweight APIs like DOM, which transfer data from the parser to the application as a complete document tree, have been replaced with lighter-weight methods such as SAX, or with application-specialized options like JAX-RPC. Although it is lighter-weight, the event-based SAX API is more difficult for application developers to program to than the straightforward document model of DOM. Nonetheless, SAX is seen as a performance-enabling API. Even with lighter-weight APIs, however, performance remains a significant obstacle to many web services applications. This paper discusses various aspects of SAX and other current XML APIs, and analyzes their performance through micro-benchmarks that demonstrate how much of current XML parsing time is lost to inefficiencies in these APIs. A new or modified API for XML, offering better performance along with advantages in usability and robustness, would enable fast and easy XML parsing and validation across broad classes of XML applications.
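
The contrast between the document model of DOM and the event model of SAX can be illustrated with a minimal sketch. It is not taken from the paper: it uses Python's standard-library bindings (xml.dom.minidom and xml.sax) rather than the Java APIs the paper is likely to have measured, and the sample document and the names DOC, TotalHandler, sum_totals_dom, and sum_totals_sax are hypothetical, chosen only for illustration. Both functions compute the same sum, but the SAX version has to track parser state by hand.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, used only for this illustration.
DOC = (
    "<orders>"
    "<order id='1'><total>10.50</total></order>"
    "<order id='2'><total>4.25</total></order>"
    "</orders>"
)


def sum_totals_dom(text):
    """Document model: build the whole tree first, then navigate it."""
    dom = minidom.parseString(text)
    return sum(
        float(node.firstChild.data)
        for node in dom.getElementsByTagName("total")
    )


class TotalHandler(xml.sax.ContentHandler):
    """Event model: the application keeps its own state across callbacks."""

    def __init__(self):
        super().__init__()
        self.in_total = False
        self.buffer = []
        self.total = 0.0

    def startElement(self, name, attrs):
        if name == "total":
            self.in_total = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered until the
        # matching endElement event.
        if self.in_total:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "total":
            self.total += float("".join(self.buffer))
            self.in_total = False


def sum_totals_sax(text):
    """Stream the document through the handler without building a tree."""
    handler = TotalHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.total
```

Calling sum_totals_dom(DOC) and sum_totals_sax(DOC) yields the same value. A micro-benchmark in the spirit of the paper could time each function over larger inputs, for example with the standard timeit module; no timings are claimed here.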