XML Notes, Pitfalls, and "Gotchas"

by Rob Locher

The purpose of this document is to record the many tricks and "gotchas" that I had to learn in order to be able to work with XML, that weren't adequately documented elsewhere.  It is mainly meant for developers using Visual Studio.NET, but others might find parts useful too.

If you are not familiar with the American slang word "gotcha" ("got you"), it means the same thing as "pitfall": a problem to avoid.


A DOM "Gotcha"

I was writing some DOM code using the MSXML 4 DOM (ProgID "Msxml2.DOMDocument.4.0" hosted in msxml4.dll).  I'm a little new to the DOM, but I wasn't having any luck whatsoever with several methods that search through an XML document, particularly selectSingleNode(), selectNodes(), and getElementsByTagName().  All the examples in MSDN and other web sites said that I should be retrieving something from my simple XML document, but the most I was retrieving was an empty NodeList.  One of my coworkers suggested that I try the MSXML 3 DOM, and that worked fine!  What was happening?  It turned out to be a problem with namespaces -- the MSXML 4 DOM is more strict than the MSXML 3 DOM if a namespace is involved, because the W3C recommendations require it to be.  My XML document uses a namespace, and selectNodes() wanted an XPath expression that specified a namespace.  The cure turned out to be a setProperty() statement (example is in JavaScript):

 doc.setProperty("SelectionNamespaces", "xmlns:n='my_namespace:xml_v1_0'");

When you call a statement like selectNodes(), you must explicitly use a namespace prefix, like this:

 var nlist = doc.selectNodes("//n:fuel");

This problem is mentioned in MSDN, but in such an obscure place that you are unlikely to find it until you have already discovered the solution to the problem.  I wish MSDN called more attention to pitfalls, but until then I will keep publishing articles like this one.  By the way, the documentation in MSDN just mentioned says that "you can use this property to set default namespace as well", but I found that this doesn't work.  You must use an explicit namespace prefix.

By the way, John Tai points out that explicitly using a namespace prefix works for selectSingleNode() and selectNodes(), but not for getElementsByTagName().  Odd...
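
The same behavior can be reproduced outside MSXML. The sketch below is only an illustration: it uses Python's standard xml.etree.ElementTree module rather than the MSXML DOM, and the tank/fuel document is made up for the example. Only the namespace URI and the "n" prefix come from the statements above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element sits in the default namespace.
DOC = """<tank xmlns="my_namespace:xml_v1_0">
  <fuel>diesel</fuel>
  <fuel>gasoline</fuel>
</tank>"""

root = ET.fromstring(DOC)

# Map the prefix "n" to the namespace URI, much as SelectionNamespaces does.
ns = {"n": "my_namespace:xml_v1_0"}

# Without a prefix the query looks for <fuel> in no namespace: nothing matches.
unprefixed = root.findall(".//fuel")

# With the prefix, the query matches the namespaced elements.
prefixed = root.findall(".//n:fuel", ns)

print(len(unprefixed), [e.text for e in prefixed])
```

The prefix used in the query does not have to match anything in the document; only the namespace URI it is mapped to matters.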


How Do You Test Your Schema?

If you are creating a schema, you probably have a draft schema and a test document to try and validate.  You tested that your XML document is well-formed by loading it in your browser, and you tested that your schema is valid by loading it in Visual Studio.NET and looking at it in the Schema view (the graphical view, rather than the source view).  But, strangely enough, no tool has been provided to allow you to validate your document against its schema.  You have to create your own.  Here is an ASP page I wrote that does the job.  (I derived it from a code snippet in the MSDN Library).  By the way I would advise you to validate your test document against your test schema constantly when you're developing the schema, because it frequently happens that your XML document is well formed and your schema is valid, but the document fails validation and the error message points to a place in the document far from the actual error.  When that happens, it helps to know that the error is actually in the place you just changed.

<%@ Language=JScript %>
<meta http-equiv="expires" content="0">
<title>XML Document Validator</title>
<body style="font-family: verdana">
<form action="validatexml.asp" method="post" ID="Form1">
<input type="hidden" name="isPostback" value="true" ID="Hidden1">
<input type="submit" name="sbutton" value="CheckXML" ID="Submit1">
</form>
<%
if ("true" == Request.Form("isPostback"))
{
    try
    {
        // Load synchronously, and validate against the schema while parsing.
        var x = new ActiveXObject("MSXML2.DOMDocument.4.0");
        x.async = false;
        x.resolveExternals = true;
        x.validateOnParse = true;
        var filename = "D:\\Projects\\TFWNeurs\\documentation\\XML Documents\\"
            + "NEURS message.xml";
        if (x.load(filename))
        {
            Response.Write("Your XML document is well-formed and valid.");
        }
        else if (x.parseError.errorCode != 0)
        {
            Response.Write("parse error (line " + x.parseError.line
                + ", character " + x.parseError.linepos + "): "
                + x.parseError.reason + "<br><br>\n");
            Response.Write("error source:<br><span style='font-family: lucida console'>\n");
            // Escape angle brackets so the offending markup shows up as text.
            Response.Write(x.parseError.srcText.replace(/</g, "&lt;").replace(/>/g, "&gt;"));
            Response.Write("</span>");
        }
        else
        {
            Response.Write("===NO PARSE ERROR===\n" + x.xml);
        }
    }
    catch (e)
    {
        // Report any unexpected failure, such as the MSXML object not being installed.
        Response.Write("unexpected error: " + e.description);
    }
}  // if
%>
<p>&nbsp;</p>
</body>


Visual Studio.NET's "Help" in Creating Schemas

The quickest (and perhaps easiest) way to create a schema for a complex XML document is to load the XML document in Visual Studio.NET, right-click, and choose "Create Schema".  Well, watch out, because when I tried it, the document didn't validate against the generated schema!  I had to fix two minor errors.  I don't blame the folks at Microsoft -- the feature is pretty ambitious, and in this case I would rather have a buggy feature that almost works than no feature.  Just beware, and fix any minor errors that crop up before you do anything else with the schema.


A Big "Gotcha"

I found a great big "gotcha" that I haven't seen documented anywhere: if your document refers to a schema and the parser can't find your schema, then the parser will only test to see if the document is well-formed, and won't test to see if the document is valid.  Your application will be none the wiser, and won't be notified that the parser could not find the schema.  So, if you want an application to validate a document against a schema, make sure that you test the application with a document that is well-formed but not valid to see that the application catches the error.  Also keep in mind that the method of referring to a schema differs depending on whether you are using Microsoft's XML DOM version 3.0 or 4.0 -- if you are migrating code to the new DOM, you might not be using the schema when you think you are.
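
Here is a rough sketch of the negative test suggested above. It is written in Python with a toy stand-in for a schema, not MSXML: the check_document function, the required_attrs dictionary, and the message/from document are all hypothetical and exist only to show why a known-bad document makes a useful test case.

```python
import xml.etree.ElementTree as ET

def check_document(xml_text, required_attrs=None):
    """Return a list of problems; an empty list means the document passed.

    required_attrs is a toy stand-in for a schema: {tag: [attribute names]}.
    Passing None simulates a parser that could not find the schema and
    therefore silently checks well-formedness only.
    """
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    problems = []
    if required_attrs is None:
        return problems
    for elem in root.iter():
        for attr in required_attrs.get(elem.tag, []):
            if attr not in elem.attrib:
                problems.append("<%s> is missing attribute '%s'" % (elem.tag, attr))
    return problems

# Well-formed, but invalid under the toy schema: <from> has no uic attribute.
well_formed_but_invalid = "<message><from name='Kitty Hawk'/></message>"
toy_schema = {"from": ["uic", "name"]}

with_schema = check_document(well_formed_but_invalid, toy_schema)
without_schema = check_document(well_formed_but_invalid, None)
print(with_schema, without_schema)
```

If the known-bad document comes back with no problems, as it does when the schema is missing, the application is not really validating anything.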


XSL Transformations (XSLT)

Where to Use XSL Transformations

I see XSL Transformations described as "powerful and flexible" all over the place.  Maybe they are talking about XSL Formatting Objects, about which I know nothing.  If they are referring to using XSLT to transform an XML document into some other text file, I totally disagree.  While it is true that XSLT is a programming language, its limitations are many.  By way of example, I quote from the XSL Transformations 1.0 Recommendation, paragraph 11.5: "XSLT does not provide an equivalent to the Java assignment operator 'x = "value";' because this would make it harder to create an implementation that processes a document other than in a batch-like way, starting at the beginning and continuing through to the end."  That's right, you can't assign a new value to a variable!  You can do it with a DOS batch file, but you can't do it in XSLT.

Allow me to briefly enumerate some of XSLT's many shortcomings as a programming language:

So, what does XSLT have going for it, you ask?

As I mentioned, XSLT has the "node set" as an elementary type in the language.  This makes it the ideal language if you want to merely rearrange an XML document in a simple way, without much programmatic manipulation.  No other language allows you to refer to a chunk of an XML document as a single entity.  But if your application requires anything more than the simplest programmatic manipulation, or if your output format isn't HTML or XML, then you might do well to consider the alternatives.  Allow me to suggest using your favorite programming language and the DOM instead.


A HUGE "Gotcha"

This one kept me at a standstill for hours.  Here's what happened: I had an XML document which was both valid and well-formed that I wanted to transform with an XSL (XSLT) stylesheet.  I was using the MSXML 4.0 DOM.  I copied a very simple XSLT transform out of a book that merely output the value of a single attribute as text.  When I applied the transform to the example out of the book, it worked.  When I applied the XSLT transform to my XML document, there was no output.  Of course I tried changing the syntax fifteen different ways to do the same thing.  I also suspected that there was a bug in the application, so I wrote an ASP page to do the transformation.  Still no luck.  I sent the files to my coworker Guy, and he was stumped too.

In desperation I tried a Google Groups search.  (In retrospect, I have been able to find the answer to all sorts of thorny problems fast with a well-chosen Google Groups search.  In future I plan to go to Google Groups sooner!)  The third article I saw, from the newsgroup comp.text.xml, contained the answer to my problem.  It turns out that if the nodes in your XML document belong to a namespace, you must reference the namespace in the XPath queries in the XSLT transform.  Not only must you refer to the namespace, but you must do so explicitly by using a namespace prefix, rather than making the default namespace in your transform be the same as the namespace in your XML document.  According to Microsoft's Knowledge Base article 313372, this peculiarity of the MSXML 4.0 DOM is "by design".  Well, thanks a lot, Microsoft, for that design decision!  Actually they are following the W3C specification, as you can plainly see in paragraph 2.4 of the W3C XSL Transformations Version 1.0 Recommendation.  After discovering the gotcha and thinking about it for a while, it seems to me that the default namespace was probably reserved for HTML, since HTML is a very popular format for the output of an XSLT transform.  Still, I didn't expect it; this just reinforces my determination to use a different programming language and the DOM next time, because my favorite programming language includes features that XSLT doesn't, like compiler warnings.

Here's an example of what I'm talking about.

XML document:

<?xml version="1.0" encoding="utf-8" ?>
	xsi:schemaLocation="opnavinst4100:11 neurs.xsd">
	<from uic="N03363" name="Kitty Hawk" />



XSLT transform: if you take away the namespace declaration xmlns:n and the n: prefix in the select expression, you might think that it should still work, but it won't!

<?xml version="1.0" encoding="utf-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:n="opnavinst4100:11" version="1.0">
<xsl:output method="text" />

<xsl:template match="/">
<xsl:apply-templates select="//n:from"/>
</xsl:template>

<xsl:template match="*">
uic attribute contents: <xsl:value-of select="@uic"/>
</xsl:template>

</xsl:transform>

Processing Order:

  1. The processor takes the current node set, and goes looking for a template whose "match" attribute fits the first node.  If starting from the beginning, the root node "/" is the starting node set.
  2. If the processor finds a template that matches the node, it calls that template on that node.  As far as that template is concerned, that node is the context node.
  3. Any template is free to use an apply-templates instruction to pass the nodes returned by its select expression to further templates.  Each of those nodes is handled starting at step 1.  (A call-template instruction, by contrast, invokes a template by name and leaves the context node unchanged.)
  4. After a template has finished processing the current node, the processor handles the rest of the nodes in the node set the same way.
  5. If any nodes are not matched by a template, then the default templates are applied to those nodes.  The default templates can be very confusing for people just learning XSLT, because it isn't obvious at all what is going on when, for example, all the unprocessed text nodes dump their text to the output willy-nilly.  Just keep in mind that if you define a template for each node you won't see the default templates.  To see exactly what the default templates are, check out paragraph 5.8 of the XSL Transformations version 1.0 Recommendation.
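
The steps above can be imitated in a few lines of ordinary code. This is only a simplified model written in Python, not how any real XSLT processor is implemented: templates are looked up by element name alone, and the order/item/note document and the function names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

def apply_templates(nodes, templates, out):
    """Steps 1, 2 and 4: handle each node in turn with its matching template."""
    for node in nodes:
        # Step 5: fall back to the built-in rule when nothing matches.
        template = templates.get(node.tag, built_in_template)
        template(node, templates, out)

def built_in_template(node, templates, out):
    """Rough model of the default rules: copy text, recurse into children."""
    if node.text:
        out.append(node.text)
    for child in node:
        apply_templates([child], templates, out)  # step 3
        if child.tail:
            out.append(child.tail)

def item_template(node, templates, out):
    """A user-written template for <item> elements."""
    out.append("[" + node.text + "]")

doc = ET.fromstring("<order><item>pen</item><note>rush</note></order>")
out = []
apply_templates([doc], {"item": item_template}, out)
result = "".join(out)
print(result)  # [pen]rush
```

The word "rush" lands in the output even though no template mentions note, which is exactly the willy-nilly text dumping described in step 5.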


If You Need to Control Line Feeds: Hints

If you need explicit control over line feeds ("\n" in C++), the best way that I have found to do so is to sprinkle non-breaking spaces into your code.  In HTML a non-breaking space is &nbsp;.  In XML, a non-breaking space is &#160;.  The way that the processor normally processes white space is that it examines the space between every pair of XML tags.  If that space contains any literal text, that is to say any non-white space, then everything between the tags including the line feed at the end of a line is preserved; otherwise the white space is ignored.  So, if you want to insert a line feed in your output, put a non-breaking space on the end of the previous line, which presumably had no literal text in it.  Then, since the non-breaking space is literal text, the line feed at the end of the previous line is preserved, and sent to your document.  Yes, it's screwy!
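
The rule can be modeled in a few lines. The sketch below is an approximation in Python of the white space stripping described above, not code taken from any XSLT processor; the function names and sample strings are hypothetical.

```python
# The four characters that XML (and therefore XSLT) treats as white space.
# Python's str.isspace() is deliberately not used here, because it also
# counts the non-breaking space (U+00A0) as white space.
XML_WHITESPACE = " \t\r\n"

def is_whitespace_only(text):
    """True if the text between two tags would be thrown away."""
    return all(ch in XML_WHITESPACE for ch in text)

def surviving_text_nodes(text_nodes):
    """Keep only the text nodes that contain some literal text."""
    return [t for t in text_nodes if not is_whitespace_only(t)]

# A bare line feed is dropped; a line feed after &#160; survives intact.
nodes = ["\n  ", "\u00a0\n", "total: "]
kept = surviving_text_nodes(nodes)
print(kept)
```

Because the non-breaking space is not in the white space set, the text node that contains it is kept whole, line feed included.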


XPath in a Nutshell

This section is a refresher course for me.  If you haven't learned XPath yet, you probably won't figure it out from this section.

An XPath expression is built out of one or more location steps, each of which can consist of three parts: an axis, a node test, and an optional predicate.  Slashes separate the location steps.  If there is a leading slash, then the first location step is relative to the root of the document (not the root element), otherwise it is relative to the current node.  A "node" is any distinct part of the document.  Typically a "node" means an element or an attribute, but a node can also be a comment, a processing instruction, etc.  Each location step selects a set of nodes, which are then processed by the next location step.  The optional predicate is in brackets; typically it is seen in the last location step, but there is nothing to prevent it from being in any of the location steps or all of them.

Here is what an XPath expression actually looks like:  /axis::node_test[predicate]/axis::node_test[predicate]
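
As a small illustration of location steps and predicates, the sketch below runs a two-step path with Python's standard xml.etree.ElementTree module. That module only understands a subset of the abbreviated syntax, so the path is written as ./ship[@class='carrier']/from; in full form it would read child::ship[attribute::class='carrier']/child::from. The fleet/ship document and the second uic value are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document for illustration.
DOC = """<fleet>
  <ship class="carrier"><from uic="N03363" name="Kitty Hawk"/></ship>
  <ship class="tender"><from uic="N00001" name="Example"/></ship>
</fleet>"""

root = ET.fromstring(DOC)

# Step 1: child <ship> elements, filtered by a predicate on the class attribute.
# Step 2: child <from> elements of whatever step 1 selected.
matches = root.findall("./ship[@class='carrier']/from")

uics = [elem.get("uic") for elem in matches]
print(uics)
```

The predicate sits on the first location step here, which shows that it is not restricted to the last one.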

Axis                Principal Node Type  Comments
child               element              default axis (left out of abbreviated form); gets the children of the context node
self                element              gets the context node
parent              element              gets the parent of the context node
preceding-sibling   element              gets siblings of the context node that precede it in the document
following-sibling   element              gets siblings of the context node that follow it in the document
descendant          element              gets every descendant of the context node, not just the immediate children
ancestor            element              gets every ancestor of the context node, not just the immediate parent
preceding           element              gets every node that comes before the context node in the document, except its ancestors (so not only the preceding siblings and their descendants)
following           element              gets every node that comes after the context node in the document, except its descendants (so not only the following siblings and their descendants)
attribute           attribute            gets all the attribute nodes of the context node (which must be an element)
namespace           namespace            gets all the namespaces that are in scope for the context node (not only the namespace to which the context node belongs)
descendant-or-self  element              gets the context node and all its descendants


Node Tests
Node Test                 Node Type Returned      Comments
name                      element or attribute    most common test
text()                    text                    selects text nodes
comment()                 comment
processing-instruction()  processing instruction
node()                    all types               matches any node, whatever its type


Abbreviation        Abbreviation Of A ...  Meaning
*                   node test              returns all nodes on the current axis that are of the principal node type
//                  location step          /descendant-or-self::node()/
.. (double period)  location step          parent::node()
. (single period)   location step          self::node()
[x]                 predicate              [position()=x]
@                   axis                   attribute:: (can be used in any location step, not only inside a predicate)
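
Several of these abbreviations can be tried out with Python's standard xml.etree.ElementTree module, which supports a subset of the abbreviated syntax (it does not accept the unabbreviated axis names, or @ as a location step of its own). The sketch below is only an illustration, and the a/b/c/d document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical document for illustration.
DOC = "<a><b><c>one</c><c>two</c></b><d><c>three</c></d></a>"
root = ET.fromstring(DOC)

# "//" reaches <c> elements at any depth below the context node.
all_c = [e.text for e in root.findall(".//c")]

# "*" selects every element child of the context node.
children = [e.tag for e in root.findall("./*")]

# "[1]" is short for [position()=1]: the first <c> child of <b>.
first_c = [e.text for e in root.findall("./b/c[1]")]

# ".." steps up from each <c> to its parent element.
parents = sorted({e.tag for e in root.findall(".//c/..")})

print(all_c, children, first_c, parents)
```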


Operators (skipping the obvious ones like >)
Operator  Meaning
|         Union (of node sets)
=         Equality
!=        Inequality