HTML was written with one main goal in mind: to define a language that allows information, whether text or graphics, to be formatted and presented across as many platforms as possible. Scripting languages have since added flexibility by allowing interaction within the browser, and server databases have caused an explosion in the volume of information now available.
All is not well, though. Developers are struggling to keep pace with the complexities of maintaining large sites that are compatible across browsers, where a lot of the information is changing on a daily basis. There is also a lot of repetition of information both within a single site and across the Web as a whole and somehow the ends need to be tied up. Surely databases are the answer? Well, yes and no. True, a database can act as the shared source, but not everybody uses or has access to a particular database format. Enter stage right, XML – eXtensible Markup Language.
Why Another Language?
XML, like HTML, has its roots in the Standard Generalized Markup Language (SGML), but retains even more of SGML's features than HTML does. A great limitation of HTML is that it has driven Web development down a one-way street: content gets produced and displayed, but it is very difficult to extract meaningful information later from the mixture of tags and text.
Unlike HTML, XML has no fixed tags. The user must define tags that mark individual items of data within the file. XML holds the promise of a common source definition for the Web, in much the same way as Rich Text Format (RTF) files allow formatted text to be exchanged between different word processors.
How Does it Work?
Take a simple address book as an example. The XML file will contain a list of names and addresses, all appropriately ‘marked up’ with your chosen tags. It is also normal to define a Document Type Definition (DTD), either within the same file or in a separate file that can be referenced by other XML files. The DTD is not essential, but it does allow the XML content to be validated and properly interpreted by other users.
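As a rough illustration of such a file, the Python sketch below parses a small address list marked up with made-up tags. The tag names (`addressbook`, `person`, `name`, `address`), the sample entries and the inline DTD are all assumptions invented for this example. Note that Python's standard `xml.etree.ElementTree` parser only checks that the document is well-formed; it reads past the DTD without validating against it, so a validating parser would be needed for that step.

```python
import xml.etree.ElementTree as ET

# Hypothetical address list: tag names, DTD and data are invented for this sketch.
ADDRESS_BOOK = """<!DOCTYPE addressbook [
  <!ELEMENT addressbook (person*)>
  <!ELEMENT person (name, address)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT address (#PCDATA)>
]>
<addressbook>
  <person>
    <name>Ann Example</name>
    <address>1 High Street, Anytown</address>
  </person>
  <person>
    <name>Bob Sample</name>
    <address>2 Low Road, Othertown</address>
  </person>
</addressbook>"""


def read_people(xml_text):
    """Return a list of (name, address) pairs found in the XML text."""
    # fromstring() checks well-formedness only; the DTD is not enforced here.
    root = ET.fromstring(xml_text)
    people = []
    for person in root.findall("person"):
        people.append((person.findtext("name"), person.findtext("address")))
    return people


if __name__ == "__main__":
    for name, address in read_people(ADDRESS_BOOK):
        print(f"{name}: {address}")
```

Because each item of data sits inside its own named tag, the program can pick out names and addresses directly, which is exactly what is hard to do with a page of HTML formatting tags.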
In addition to this, a style sheet is required. This style sheet provides a template that defines how each of the elements in the XML file is to be displayed, by referencing the named tags in the XML file and embedding them within the required HTML formatting tags. A variety of style sheet languages can be used to define the look and feel of the resulting page. Most familiar will be Cascading Style Sheets (CSS), as used with Dynamic HTML.
DSSSL (Document Style Semantics and Specification Language), apart from being a mouthful, is powerful but tricky to get to grips with. Showing most promise for the future is XSL (eXtensible Stylesheet Language). It was written specifically for XML, has the power to handle and format XML data, and already has some direct browser support.
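The idea of a style sheet as a template can be sketched in a few lines of Python. This is not CSS, DSSSL or XSL; it is a plain-Python stand-in, assuming the same invented tag names (`person`, `name`, `address`) and two hypothetical template strings, that shows how text is pulled from named tags and wrapped in HTML formatting.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source data; the tag names are invented for this sketch.
SOURCE = """<addressbook>
  <person><name>Ann Example</name><address>1 High Street, Anytown</address></person>
  <person><name>Bob Sample</name><address>2 Low Road, Othertown</address></person>
</addressbook>"""

# A toy "style sheet": placeholders are named after the XML tags
# whose text they display, surrounded by HTML formatting tags.
PERSON_TEMPLATE = "<li><b>{name}</b><br>{address}</li>"
PAGE_TEMPLATE = "<html><body><ul>{items}</ul></body></html>"


def render_html(xml_text):
    """Apply the toy templates to every <person> element."""
    root = ET.fromstring(xml_text)
    items = []
    for person in root.findall("person"):
        # Map each child tag name to its (HTML-escaped) text content.
        fields = {child.tag: escape(child.text or "") for child in person}
        items.append(PERSON_TEMPLATE.format(**fields))
    return PAGE_TEMPLATE.format(items="".join(items))


if __name__ == "__main__":
    print(render_html(SOURCE))
```

Changing the look of the page means editing only the two template strings; the XML source is left alone.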
Pushing all of the display and formatting into the selected style sheet brings real benefits. Producing cross-browser content is tedious at the best of times, but the XML route requires only a style sheet for each browser variant you need to support. The source files containing the text content and navigation do not have to be modified.
That’s not all: it gets better. Using a single set of common XML files, a range of different style sheets can be used together with an appropriate parser (which checks and processes the XML) to produce not only HTML code for standard computer-based Web browsers but also output for all sorts of other applications.
A style sheet containing the necessary formatting and filtering of data for WAP can turn your XML Web content into WML (Wireless Markup Language) suitable for mobile phones, where only a small, selected amount of information can be displayed. Alternatively, style sheets could render the XML for direct use with WebTV, the format commonly used for digital television. Though it shares most of the features of standard HTML, WebTV is different enough that it usually requires separate development. It is even possible to produce common document types for distribution and other purposes: packages already exist for producing Acrobat PDF files directly from XML.
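The one-source, many-outputs idea might look something like the Python sketch below. It is only a stand-in for real style sheets: two hypothetical functions, `to_html` and `to_wml`, read the same invented address list, with the WML version filtering the data down to names only for a small phone display. The tag names, sample data and the `id` and `title` values on the WML card are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared source; the tag names are invented for this sketch.
SOURCE = """<addressbook>
  <person><name>Ann Example</name><address>1 High Street, Anytown</address></person>
  <person><name>Bob Sample</name><address>2 Low Road, Othertown</address></person>
</addressbook>"""


def to_html(root):
    """Full listing (name and address) for a desktop browser."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for person in root.findall("person"):
        p = ET.SubElement(body, "p")
        p.text = f"{person.findtext('name')}, {person.findtext('address')}"
    return ET.tostring(html, encoding="unicode")


def to_wml(root):
    """Filtered listing (names only) as a single WML card for a phone."""
    wml = ET.Element("wml")
    card = ET.SubElement(wml, "card", id="people", title="People")
    for person in root.findall("person"):
        p = ET.SubElement(card, "p")
        p.text = person.findtext("name")
    return ET.tostring(wml, encoding="unicode")


if __name__ == "__main__":
    source_root = ET.fromstring(SOURCE)
    print(to_html(source_root))
    print(to_wml(source_root))
```

Both outputs come from the same unmodified source; supporting another device means adding another output function rather than rewriting the content.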
Unfortunately, direct support for displaying XML in the browser is limited. Microsoft first introduced XML support in version 4 of Internet Explorer via bolt-on XML and XSL parsers. At the moment IE7, with integrated XML support, is the best option. Until there is a critical mass of XML-aware browsers on the Web, the best way forward for XML development is to create standard HTML from pre-processed XML and style sheets. This can be done with a parser, either off-line in advance or on-line on the server in real time.
Making Web pages more data-aware through the use of XML has the potential to transform the Web landscape. Large organizations and businesses that are concentrating on e-commerce, transaction processing and information sharing require a common language and means of identification to work efficiently.
XML will also make integrating new technologies, such as speech recognition, easier for developers. As the Web expands and transforms itself, XML will only become more important.