Using E4X With XHTML? Watch Your Namespaces!

One of the best things about AS3 so far, for me, is the decision to make it much noisier about failing. If there was one thing that was frustrating before, it was trying to track down what failed silently and where, only seeing the effects far downstream, with a barely workable debugger. Things are sooo much better now.

Nonetheless! There are always going to be little things that trip up every new programmer until they learn them, or that trip you up over and over because they’re just so hard to remember. Certainly there will be fewer of these in AS3, but new is exciting, right? OK, enough intro. I post stupid mistakes. You learn from my mistakes. Somewhere, an old woman makes waffles. Read on.

This one is more of a user error, but it’s a fair accident that I think anyone could make. I was using E4X to search for a node in some XHTML source, but it kept coming up empty. I doubted first my knowledge of E4X, then my sanity. Finally it struck me: XHTML is namespaced! Let’s take a look at the header of a well-formed XHTML 1.0 Transitional document:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
	<title>Title</title>
</head>

What was that, you say? <html xmlns="http://www.w3.org/1999/xhtml">? Is that a default XML namespace, opened for the whole document?

Because the namespace is declared with no prefix, every element in the document falls under that namespace. Makes sense: XHTML tags are defined in the XHTML namespace.
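To see the default namespace at work, here is a minimal sketch that prints the namespace URI of the root element and of its element children. The class name InspectNamespaces and the trimmed-down document are made up for illustration; only the XML methods (localName(), namespace(), elements()) come from the standard AS3 XML class.

```actionscript
package
{
	import flash.display.Sprite;

	// Hypothetical document class, used only to host the example.
	public class InspectNamespaces extends Sprite
	{
		public function InspectNamespaces()
		{
			// Trimmed-down stand-in for the XHTML header shown above.
			var doc:XML =
				<html xmlns="http://www.w3.org/1999/xhtml">
					<head>
						<title>Title</title>
					</head>
				</html>;

			// Print the root's local name next to the namespace URI it resolved to.
			trace(doc.localName(), doc.namespace().uri);

			// Do the same for each element child of the root.
			for each (var child:XML in doc.elements())
			{
				trace(child.localName(), child.namespace().uri);
			}
		}
	}
}
```

Neither <html> nor <head> carries a prefix, yet both should report http://www.w3.org/1999/xhtml as their namespace URI, because <head> inherits the default namespace declared on <html>.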

What this means for us is that when we’re parsing well-formed XHTML documents, we also have to specify the XHTML namespace. If you just do some sort of naive search:

myXhtmlDocument.head.title;

you will get an empty XMLList every time. You are searching for non-namespaced nodes, and all XHTML nodes exist in the XHTML namespace. Not the same!
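Here is a small sketch of the failure and the fix side by side. The class name NamespaceLookup and the inline document are assumptions for the sake of the example; the namespace is held in a plain Namespace variable called xhtml.

```actionscript
package
{
	import flash.display.Sprite;

	// Hypothetical document class, used only to host the example.
	public class NamespaceLookup extends Sprite
	{
		public function NamespaceLookup()
		{
			// Stand-in for myXhtmlDocument.
			var myXhtmlDocument:XML =
				<html xmlns="http://www.w3.org/1999/xhtml">
					<head>
						<title>Title</title>
					</head>
				</html>;

			// Namespace object bound to the XHTML URI.
			var xhtml:Namespace = new Namespace("http://www.w3.org/1999/xhtml");

			// Unqualified names look for nodes in no namespace: nothing matches.
			var naive:XMLList = myXhtmlDocument.head.title;
			trace(naive.length()); // 0

			// Qualifying each step with the XHTML namespace finds the node.
			var qualified:XMLList = myXhtmlDocument.xhtml::head.xhtml::title;
			trace(qualified.length()); // 1
			trace(qualified.toString()); // Title
		}
	}
}
```

Note that the naive search does not throw anything. It quietly hands back an empty list, which is exactly why this one is easy to miss.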

So the moral of the story is: beware the namespace with XHTML. But I want to take this opportunity to look at some different ways to deal with it.

  1. The Reduce, Reuse, Recycle

    xhtmlns.as:

    package
    {
    	public namespace xhtmlns = "http://www.w3.org/1999/xhtml";
    }

    You can define a namespace in its own file, as public, and give the file the same name as the namespace object. Now client code can import this and you won’t have to re-declare the namespace.

  2. The Globally Open

    MyClass.as:

    package
    {
    	public class MyClass
    	{
    		use namespace xhtmlns;

    		...
    	}
    }

    Open the namespace for the whole file. Now all your E4X should look inside this namespace! This might not be ideal if you need to work with several different namespaces.

  3. The Per-Node Namespace

    myXhtmlDocument..xhtml::a.(@class=="red");

    Hey! Cool! This is like the CSS selector a.red. E4X ain’t so bad! The point here is that the <a> node is specifically qualified with the xhtml namespace (here, xhtml is a namespace bound to the XHTML URI, like the variable in item 5). Without opening the namespace, you can apply it to individual nodes with the name qualifier operator (::).

  4. The Joker

    root..*::Button;

    This one will match all nodes with the node name Button, no matter what namespace they are in. So say you were parsing an MXML file. This would match <mx:Button> as well as a custom class you might have defined, <custom:Button>. Or, for us, it lets us skip the step of defining the xhtml namespace in the first place.

  5. The Namespace Variable

    var xhtml:Namespace = new Namespace("http://www.w3.org/1999/xhtml");

    Creating a namespace as a variable won’t allow you to open it for a whole file, but it will allow you to use it in an E4X expression by its variable handle, as in myXhtmlDocument..xhtml::title.
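To tie a few of these together, here is a rough sketch that runs the per-node, wildcard, and Namespace-variable approaches against one small document. The class name FindLinks and the sample links are invented for the example. It also swaps @class for attribute("class") in the filter, which is my substitution rather than anything from the snippets above: attribute() simply returns an empty list for elements that have no class attribute, so the filter does not blow up on them.

```actionscript
package
{
	import flash.display.Sprite;

	// Hypothetical document class, used only to host the example.
	public class FindLinks extends Sprite
	{
		public function FindLinks()
		{
			// Made-up sample document in the XHTML namespace.
			var myXhtmlDocument:XML =
				<html xmlns="http://www.w3.org/1999/xhtml">
					<body>
						<a class="red" href="#one">One</a>
						<a class="blue" href="#two">Two</a>
					</body>
				</html>;

			// The Namespace Variable: a handle for qualifying node names.
			var xhtml:Namespace = new Namespace("http://www.w3.org/1999/xhtml");

			// The Per-Node Namespace: every XHTML <a> whose class is "red".
			var redLinks:XMLList =
				myXhtmlDocument..xhtml::a.(attribute("class") == "red");
			trace(redLinks.length()); // 1
			trace(redLinks[0].@href); // #one

			// The Joker: every <a> in any namespace, no Namespace object needed.
			var anyLinks:XMLList = myXhtmlDocument..*::a;
			trace(anyLinks.length()); // 2
		}
	}
}
```

The attributes themselves (class, href) need no qualifier. Unprefixed attributes do not pick up the default namespace, only elements do.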