Roger Braunstein

Using E4X With XHTML? Watch Your Namespaces!

One of the best things about AS3 so far, for me, is the decision to make it much noisier about failing. If there was one thing that was frustrating before, it was trying to track down what failed silently and where, only seeing the effects far downstream, with a barely workable debugger. Things are sooo much better now.

Nonetheless! There are always going to be little things that trip up every new programmer until they learn them, or that trip you up over and over because they’re just so hard to remember. Certainly there will be fewer of these in AS3, but new is exciting, right? OK, so enough intro. I post stupid mistakes. You learn from my mistakes. Somewhere, an old woman makes waffles. Read on.

This one is more of a user error, but it’s a fair accident that I think anyone could make. I was using E4X to search for a node in some XHTML source, but it kept coming up empty. I doubted first my knowledge of E4X, then my sanity. Finally it struck me: XHTML is namespaced! Let’s take a look at the header of a well-formed XHTML 1.0 Transitional document:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
	<title>Title</title>
</head>

What was that, you say? <html xmlns="http://www.w3.org/1999/xhtml">? Is that a default XML namespace, opened for the whole document?

Because the namespace is declared with no prefix, every element in the document falls under that namespace. Makes sense: XHTML tags are defined in the XHTML namespace.
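To see the inheritance for yourself, here is a minimal sketch that could be pasted into a frame script or a method body. The sample markup and the variable names (doc, head) are made up for illustration, and it assumes the default XML.ignoreWhitespace setting so that the first child is the <head> element.

```actionscript
// Illustrative document: only the root declares the default namespace.
var doc:XML =
    <html xmlns="http://www.w3.org/1999/xhtml">
        <head><title>Title</title></head>
    </html>;

// First child element of <html>; it carries no xmlns attribute of its own.
var head:XML = doc.children()[0];

// Both the root and its child report the XHTML namespace URI.
trace(doc.namespace().uri);  // http://www.w3.org/1999/xhtml
trace(head.namespace().uri); // http://www.w3.org/1999/xhtml
trace(head.localName());     // head
```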

What this means for us is that when we’re parsing well-formed XHTML documents, we also have to specify the XHTML namespace. If you just do some sort of naive search:

myXhtmlDocument.head.title;

you will get nothing back every time (strictly speaking, E4X hands you an empty XMLList rather than null). You are searching for non-namespaced nodes, and all XHTML nodes exist in the XHTML namespace. Not the same!
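Here is a small side-by-side sketch of the failing and working lookups. The document and the xhtml variable are illustrative, and it assumes no namespace has been opened and no default xml namespace has been set in the surrounding code.

```actionscript
// Illustrative XHTML fragment using the default XHTML namespace.
var doc:XML =
    <html xmlns="http://www.w3.org/1999/xhtml">
        <head><title>Title</title></head>
    </html>;

// A handle for the XHTML namespace (hypothetical variable name).
var xhtml:Namespace = new Namespace("http://www.w3.org/1999/xhtml");

// Unqualified names only match elements in no namespace: nothing is found.
trace(doc.head.title.length());               // 0

// Qualifying each step with the XHTML namespace finds the node.
trace(doc.xhtml::head.xhtml::title.length()); // 1
trace(doc.xhtml::head.xhtml::title);          // Title
```

Note that the naive lookup fails quietly: there is no error, just a list with nothing in it.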

So the moral of the story is beware the namespace with XHTML. But I want to take this opportunity to look at some different ways to achieve this.

  1. The Reduce, Reuse, Recycle

    xhtmlns.as:

    package
    {
    	public namespace xhtmlns = "http://www.w3.org/1999/xhtml";
    }

    You can define a namespace in its own file, as public, and give the file the same name as the namespace. Now client code can import it and you won’t have to re-declare the namespace everywhere.

  2. The Globally Open

    MyClass.as:

    package
    {
    	public class MyClass
    	{
    		use namespace xhtmlns;

    		...
    	}
    }

    Open the namespace for the whole file. Now all your E4X should look inside this namespace! This might not be ideal if you need to work with several different namespaces.

  3. The Per-Node Namespace

    myXhtmlDocument..xhtml::a.(@class=="red");

    Hey! Cool! This is like the CSS selector a.red. E4X ain’t so bad! The point here is that the <a> node specifically uses the xhtml namespace. Without opening the namespace, you can apply it to individual nodes with the name qualifier operator (::).

  4. The Joker

    root..*::Button;

    This one will match all nodes with node name Button, no matter what namespace they are in. So say you were parsing an MXML file. This would match <mx:Button> as well as a custom class you might have defined, <custom:Button>. Or, for us, it lets us skip the step of defining the xhtml namespace in the first place.

  5. The Namespace Variable

    var xhtml:Namespace = new Namespace("http://www.w3.org/1999/xhtml");

    Creating a namespace as a variable won’t allow you to open it for a whole file, but it will allow you to use it in an E4X expression by its variable handle, as in myXhtmlDocument..xhtml::title.
14 Responses to “Using E4X With XHTML? Watch Your Namespaces!”

  1. Claus Wahlers Says:

    That’s nice, but how do i parse/access XML with E4X that contains a processing instruction as the first thing in the document?

    http://www.mail-archive.com/flexcoders@yahoogroups.com/msg58230.html

  2. Roger Braunstein Says:

    I think in this example the processing instruction is a red herring. You were testing:

    <?xml-stylesheet href="my.css" type="text/css"?>
    <root xmlns="http://example.com/"; xml:id="wtf" />

    right?
    But I think that’s not well formed. What’s up with the semicolon, and xml: should be xmlns: Try this:

    var a:XMLList = new XMLList(
       '<?xml-stylesheet href="my.css" type="text/css"?>' +
       '<root xmlns="http://example.com/" xmlns:id="wtf" />');
    var b:XML = a[1];
    trace(b.namespaceDeclarations()); //wtf
    trace(b.namespace()); //http://example.com/

    namespaceDeclarations() gets the namespaces the node defines, and namespace() gets the namespace the node is in. You can see that the processing instruction has no impact on whether this works.

  3. Claus Wahlers Says:

    What’s up with the semicolon

    Whoops, that must have been an email typo, sorry for that. There’s no lost semicolon in my original code.

    and xml: should be xmlns:

    Nope, see here: http://www.w3.org/TR/xml-id/

    Shouldn’t b.namespaceDeclarations() also include the default namespace declaration (xmlns="http://example.com/"), or does that get stripped out because it’s already accessible via b.namespace()?

  4. Roger Braunstein Says:

    Interesting! I didn’t know about xml:id. It appears that for processors that support it, the xml namespace is bound implicitly to http://www.w3.org/XML/1998/namespace, and id is an attribute scoped to this namespace.

    AS3 seems to support this. Hurray! Here are some ways you could extract the id attribute. This shows how you can use namespaced attributes as well.

    var b:XML = XML('<root xmlns="http://example.com/" xml:id="wtf" />');
    trace(b.attributes()[0]); //wtf
    trace(b.@id); //empty XMLList, not null
    trace(b.@*::id); //wtf
    namespace xml="http://www.w3.org/XML/1998/namespace";
    trace(b.@xml::id); //wtf

    Note that you didn’t have to declare xmlns:xml="http://www.w3.org/XML/1998/namespace" in the <root/> node, but the attribute was clearly scoped to that namespace. That means AS3 respects its implicit existence. :)

  5. Claus Wahlers Says:

    Note that you didn’t have to declare xmlns:xml="http://www.w3.org/XML/1998/namespace" in the node, but the attribute was clearly scoped to that namespace. That means AS3 respects its implicit existence. :)

    Yep, that’s the way it’s supposed to be. However, toXMLString() seems to behave incorrectly, as it’s adding the xml namespace declaration (which is not valid).
    Also, namespaceDeclarations() seems to have a general problem when used with XMLLists downcast to XML (or something).

  6. Roger Braunstein Says:

    It seems like a limitation that toXMLString() can’t handle default namespaces or implicitly defined namespaces. For instance, when you use xml:id, you’re using a namespace xml that you don’t declare, and when you’re using the default namespace http://example.com/, that namespace doesn’t have a prefix bound to it. toXMLString() apparently wants everything to be explicit, so it makes up prefixes for the two namespaces (aaa and aab) and uses them to explicitly namespace the id attribute as well as the <root> node. I think the output XML string is functionally equivalent, but it’s clearly not literally equivalent :(

  7. Claus Wahlers Says:

    Yeah something like that, i don’t know. What worries me more though is the buggy (?) behavior of namespaceDeclarations() in that case.

    Sorry btw for hijacking your post ;)

  8. Roger Braunstein Says:

    No, thank you for making things more interesting! Not to mention, I learned something about XML-ID!
    I also think it’s weird that just because the node exists in a namespace, it’s not considered as declaring that namespace…

  9. Peter Hall Says:

    It’s not really that toXMLString() is limited. E4X works on a canonicalised data model. There are several ways to achieve the same XML object from different source strings; using default vs. explicit namespaces is just one example. Canonicalisation makes XML simpler to work with, at the expense of losing how it was originally declared.

  10. dispatchEvent » AS3 E4X Rundown Says:

    [...] I can write more about namespaces in a future revision, but for now check out the good discussion started in my previous post, Using E4X? Watch Your Namespaces. [...]

  11. Mark Says:

    Don’t forget:
    if (myXhtmlDocument.namespace("") != undefined) {
        default xml namespace = myXhtmlDocument.namespace("");
    }

    If your XML has a default namespace, this sets it as the default for your E4X expressions, so your calls won’t need the namespace prepended.

  12. Mitch Says:

    This is invaluable. Thank you for publishing this.

  13. Philip Bulley Says:

    Great post… figured I had a namespace issue, but had no idea of how to reference it! Thanks mate.

  14. Monokai » Blog Archive » Flash SEO: graceful degradation to add meaning and structure Says:

    [...] Actionscript 3 E4X rundown Using E4X with XHTML? Watch your namespaces! [...]
