Using E4X With XHTML? Watch Your Namespaces!
One of the best things about AS3 so far, for me, is the decision to make it much noisier about failing. If there was one thing that was frustrating before, it was trying to track down what failed silently and where, only seeing the effects far downstream, with a barely workable debugger. Things are sooo much better now.
Nonetheless! There are always going to be little things that trip up every new programmer until you learn them, or maybe that trip you up over and over because they’re just so hard to remember. Certainly there will be fewer of these in AS3, but new is exciting, right? OK, so enough intro. I post stupid mistakes. You learn from my mistakes. Somewhere, an old woman makes waffles. Read on.
This one is more of a user error, but it’s a fair accident that I think anyone could make. I was using E4X to search for a node in some XHTML source, but it kept coming up empty. I doubted first my knowledge of E4X, then my sanity. Finally it struck me: XHTML is namespaced! Let’s take a look at the header of a well-formed XHTML 1.0 Transitional document:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    <title>Title</title>
</head>
What was that, you say? <html xmlns="http://www.w3.org/1999/xhtml">? Is that a default XML namespace, opened for the whole document?
Because the namespace is declared with no prefix, every element in the document falls under that namespace. Makes sense: XHTML tags are defined in the XHTML namespace.
What this means for us is that when we’re parsing well-formed XHTML documents, we also have to specify the XHTML namespace. If you just do some sort of naive search:
myXhtmlDocument.head.title;
you will get an empty XMLList every time (not null, and no error either). You are searching for non-namespaced nodes, and all XHTML nodes exist in the XHTML namespace. Not the same!
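A quick way to confirm that this is what is biting you is to check the length of the result and ask the root node which namespace it lives in. This is only a sketch: the inline document and the variable names (page, naive) are made up for illustration, and only the namespace URI comes from the header above.

```actionscript
// Hypothetical stand-in for a loaded XHTML document.
var page:XML =
    <html xmlns="http://www.w3.org/1999/xhtml">
        <head>
            <title>Title</title>
        </head>
    </html>;

// Unqualified names look for <head> and <title> in no namespace at all.
var naive:XMLList = page.head.title;
trace(naive.length());       // 0

// The root element reports the namespace that the search ignored.
trace(page.namespace().uri); // http://www.w3.org/1999/xhtml
```

If the length is 0 and the URI is not empty, the search itself is probably fine and the namespace is the missing piece.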
So the moral of the story is: beware the namespace with XHTML. But I want to take this opportunity to look at some different ways to deal with it.
-
The Reduce, Reuse, Recycle
xhtmlns.as:
package {
    public namespace xhtmlns = "http://www.w3.org/1999/xhtml";
}
-
The Globally Open
MyClass.as:
package {
    public class MyClass {
        use namespace xhtmlns;
        ...
    }
}
Open the namespace for the whole file. Now all your E4X should look inside this namespace! This might not be ideal if you are working with more than one namespace.
-
The Per-Node Namespace
myXhtmlDocument..xhtml::a.(@class=="red");
Hey! Cool! This is like the CSS selector a.red. E4X ain’t so bad! The point here is that the <a> node specifically uses the xhtml namespace. Without opening the namespace, you can apply it to individual nodes with the name qualifier operator (::).
-
The Joker
root..*::Button;
This one will match all nodes with node name Button, no matter what namespace they are in. So say you were parsing an MXML file. This would match <mx:Button> as well as a custom class you might have defined as <custom:Button>. Or, for us, it lets us skip the step of defining the xhtml namespace in the first place.
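Here is a small sketch of the wildcard at work. The document is made up for illustration, and the custom namespace URI (com.example.*) and the variable name mxmlDoc are assumptions, not anything from a real project.

```actionscript
// Made-up MXML-like document; the custom namespace URI is an assumption.
var mxmlDoc:XML =
    <mx:Application xmlns:mx="http://www.adobe.com/2006/mxml"
                    xmlns:custom="com.example.*">
        <mx:Button label="first"/>
        <custom:Button label="second"/>
    </mx:Application>;

trace(mxmlDoc..Button.length());    // 0: no Button exists outside a namespace
trace(mxmlDoc..*::Button.length()); // 2: both Buttons match the wildcard
```

The flip side is that the wildcard cannot tell the two Buttons apart, so it is best kept for documents where a node name only ever appears in one namespace.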
-
The Namespace Variable
var xhtml:Namespace = new Namespace("http://www.w3.org/1999/xhtml");
Creating a namespace as a variable won’t allow you to open it for a whole file, but it will allow you to use it in an E4X expression by its variable handle, as in myXhtmlDocument..xhtml::title.
As with xhtmlns.as above, you can define a namespace in its own file, as public, and give the file the same name as the namespace. Now client code can import it and you won’t have to re-declare the namespace.
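To see a namespace variable doing real work, here is a minimal sketch that lists every link in a page. The document and the names xhtmlNs, linkPage and link are invented for the example.

```actionscript
var xhtmlNs:Namespace = new Namespace("http://www.w3.org/1999/xhtml");

// Made-up document for illustration.
var linkPage:XML =
    <html xmlns="http://www.w3.org/1999/xhtml">
        <body>
            <a href="/one">One</a>
            <p><a href="/two">Two</a></p>
        </body>
    </html>;

// Elements need the namespace; the unprefixed href attribute does not.
for each (var link:XML in linkPage..xhtmlNs::a) {
    trace(String(link.@href) + " -> " + link.toString());
}
// /one -> One
// /two -> Two
```

Note that @href is written without a namespace: a default namespace declaration applies to elements, not to unprefixed attributes.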
March 22nd, 2007 at 8:59 am
That’s nice, but how do i parse/access XML with E4X that contains a processing instruction as the first thing in the document?
http://www.mail-archive.com/flexcoders@yahoogroups.com/msg58230.html
March 22nd, 2007 at 12:08 pm
I think in this example the processing instruction is a red herring. You were testing:
right?
But I think that’s not well formed. What’s up with the semicolon? And xml: should be xmlns:. Try this:
namespaceDeclarations() gets the namespaces the node defines, and namespace() gets the namespace the node is in. You can see that the processing instruction has no impact on whether this works.
March 22nd, 2007 at 1:24 pm
Whoops, that must have been an email typo, sorry for that. There’s no lost semicolon in my original code.
Nope, see here: http://www.w3.org/TR/xml-id/
Shouldn’t b.namespaceDeclarations() also include the default namespace declaration (xmlns="http://example.com/"), or does that get stripped out because it’s already accessible via b.namespace()?
March 22nd, 2007 at 1:50 pm
Interesting! I didn’t know about xml:id. It appears that for processors that support it, the xml namespace is bound implicitly to http://www.w3.org/XML/1998/namespace, and id is an attribute scoped to this namespace.
AS3 seems to support this. Hurray! Here are some ways you could extract the id attribute. This shows how you can use namespaced attributes as well.
var b:XML = XML('<root xmlns="http://example.com/" xml:id="wtf" />');
trace(b.attributes()[0]); //wtf
trace(b.@id); //null
trace(b.@*::id); //wtf
namespace xml = "http://www.w3.org/XML/1998/namespace";
trace(b.@xml::id); //wtf
Note that you didn’t have to declare xmlns:xml="http://www.w3.org/XML/1998/namespace" in the <root/> node, but the attribute was clearly scoped to that namespace. That means AS3 respects its implicit existence. :)
March 22nd, 2007 at 3:09 pm
Yep that’s the way it’s supposed to be. However, toXMLString() seems to behave wrong as it’s adding the xml namespace declaration (which is not valid).
Also, namespaceDeclarations() seems to have a general problem when used with XMLLists downcast to XML (or something).
March 22nd, 2007 at 6:25 pm
It seems like a limitation that toXMLString() can’t handle default namespaces or implicitly defined namespaces. For instance, when you use xml:id, you’re using a namespace xml that you don’t declare, and when you’re using the default namespace http://example.com, that namespace doesn’t have a prefix bound to it. toXMLString() apparently wants everything to be explicit, so it makes up prefixes for the two namespaces (aaa and aab) and uses them to explicitly namespace the id attribute as well as the <root> node. I think the output XML string is functionally equivalent, but it’s clearly not literally equivalent :(
March 22nd, 2007 at 7:09 pm
Yeah something like that, i don’t know. What worries me more though is the buggy (?) behavior of namespaceDeclarations() in that case.
Sorry btw for hijacking your post ;)
March 22nd, 2007 at 7:13 pm
No, thank you for making things more interesting! Not to mention, I learned something about XML-ID!
I also think it’s weird that just because the node exists in a namespace, it’s not considered as declaring that namespace…
March 28th, 2007 at 6:50 am
It’s not really that toXMLString() is limited. E4X works on a canonicalised data model. There are several ways to achieve the same XML object from different source strings - using default vs explicit namespaces is just one example. Canonicalisation makes XML simpler to work with, at the expense of losing how it was originally declared.
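A rough illustration of that point, assuming nothing beyond the example.com namespace used earlier in this thread (the ex prefix and the variable names are made up): two different source strings end up with the same local name and namespace URI once parsed.

```actionscript
// Same element, written once with a default namespace and once with a prefix.
var viaDefault:XML = XML('<root xmlns="http://example.com/"/>');
var viaPrefix:XML  = XML('<ex:root xmlns:ex="http://example.com/"/>');

trace(viaDefault.localName() == viaPrefix.localName());         // true
trace(viaDefault.namespace().uri == viaPrefix.namespace().uri); // true
```

What identifies a node is the pair of URI and local name; the prefix, or the lack of one, is just notation in the source text.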
July 5th, 2007 at 9:52 am
Don’t forget:
if (myXhtmlDocument.namespace("") != undefined) {
    default xml namespace = myXhtmlDocument.namespace("");
}
If your XML has a default namespace, this sets it as the default xml namespace, so your E4X calls won’t need the namespace prepended.
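One way to keep that directive from affecting unrelated E4X code is to put it inside a small function. This is only a sketch: readTitle and its parameter are hypothetical names, and it assumes the directive is scoped to the function it appears in, which is how the AS3 documentation describes it as far as I can tell.

```actionscript
// Hypothetical helper; the function name and parameter are made up.
function readTitle(doc:XML):String {
    // Assumed to apply only inside this function body.
    default xml namespace = new Namespace("http://www.w3.org/1999/xhtml");

    // Unqualified names now resolve in the XHTML namespace.
    return doc.head.title.toString();
}
```

Called with a well-formed XHTML document, this should return the text of its <title> without any xhtml:: prefixes in the expression.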
July 17th, 2007 at 7:27 am
This is invaluable. Thank you for publishing this.
April 2nd, 2008 at 11:33 am
Great post… figured I had a namespace issue, but had no idea of how to reference it! Thanks mate.