
Why php5 DomDocuments need to be un/serialize()able

(Looks like there's little understanding as to why we would need to do this, even on the php xml devs' list. You might want to have a look at this and, more arrogant, this statement.)

The new native dom extension is one of the really great pieces of news in php5 - no doubt. There are many really important use cases and no one would ever want to miss it again! We can easily read, manipulate and write xml elements and do more really nifty stuff with them. Cool :)

We can also extend the native php DomDocument class to add our own behaviour to it. We've shown just one example in "Template playing with php's Dom". So let's take that example.
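To make "extending DomDocument" concrete, here is a minimal sketch. The class name TemplateDocument and its fill() method are made up for illustration and are not taken from the "Template playing with php's Dom" example; the sketch only assumes the standard DOMDocument and DOMXPath classes.

```php
<?php
// Hypothetical subclass: adds one template-oriented helper to DOMDocument.
class TemplateDocument extends DOMDocument
{
    // Replace the content of the first element whose id attribute equals $id.
    // Returns true if such an element was found, false otherwise.
    public function fill($id, $text)
    {
        $xpath = new DOMXPath($this);
        foreach ($xpath->query('//*[@id]') as $node) {
            if ($node->getAttribute('id') === $id) {
                // Drop the old children, then append the new text node.
                while ($node->firstChild) {
                    $node->removeChild($node->firstChild);
                }
                $node->appendChild($this->createTextNode($text));
                return true;
            }
        }
        return false;
    }
}

$doc = new TemplateDocument();
$doc->loadXML('<html><body><h1 id="title">placeholder</h1></body></html>');
$found = $doc->fill('title', 'Hello & welcome');
$html = $doc->saveXML($doc->documentElement);
echo $html, "\n";
// prints: <html><body><h1 id="title">Hello &amp; welcome</h1></body></html>
```

The subclass keeps everything DomDocument already does (parsing, escaping, output) and only adds the bit of behaviour the template needs.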

Here we use the DomDocument class to parse an html (xml) template and build up a controller tree that carries the necessary behaviour (unfortunately, the php dom extension doesn't provide any hooks, e.g. to add our behaviour to DomElements). Then we let the controller tree react to some user input (request parameters, ...) and manipulate the DomDocument's elements and data. Afterwards we let DomDocument output the html.

That's great. And in our days of 1001 different php template engines that usually don't care about standards compliance at all, it could probably be a way to bring the php community some inches closer to recognizing and using standards.

But of course: we can't do it this way. Using a template with more than some 50 tags gets terribly slow. With each request php starts up the whole environment from a dark void of nothingness, and therefore most of php's greatest template engines like Wact or Smarty use some "template-compile" stage and cache the results to files to recreate them later on.

Ok, there are some php templating engines/frameworks out there that go a comparable way, except that they don't use php5's native dom support but some homegrown stuff of their own to parse their templates. Prado is possibly one of the best known of them.

Prado builds up each component from its templates and component-definition xml files and caches them. After having done that once, these ready-made components get fetched from the cache and used. The parsing of the templates and their translation to php objects is bypassed this way and performance improves considerably.
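The general "build once, then fetch from the cache" pattern can be sketched roughly like this. This is not Prado's actual code: the function cached_component(), the cache file name and the array standing in for a component are all assumptions made for illustration. The point is that it only works because plain php arrays and objects survive serialize()/unserialize().

```php
<?php
// Hypothetical helper: return the cached component, or build and cache it on a miss.
function cached_component($cacheFile, $build)
{
    if (is_file($cacheFile)) {
        $cached = unserialize(file_get_contents($cacheFile));
        if ($cached !== false) {
            return $cached; // cache hit: the expensive build step is skipped
        }
    }
    $component = call_user_func($build); // cache miss: parse template, build objects
    file_put_contents($cacheFile, serialize($component), LOCK_EX);
    return $component;
}

$cacheFile = sys_get_temp_dir() . '/component_' . md5('greeting.tpl') . '.cache';
if (is_file($cacheFile)) {
    unlink($cacheFile); // start from a cold cache for this demonstration
}

$builds = 0;
$build = function () use (&$builds) {
    $builds++; // count how often the expensive step really runs
    return array('tag' => 'h1', 'id' => 'title', 'text' => 'Hello');
};

$first  = cached_component($cacheFile, $build); // builds and writes the cache file
$second = cached_component($cacheFile, $build); // served from the cache file
echo $builds, "\n"; // prints: 1

unlink($cacheFile); // clean up the demonstration file
```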

In our experiment, striving for a templating engine that uses php5's native dom extension as described here, we've tried to get around the limitation that dom objects simply can't be un/serialized with php's native un/serialize() functions. But there's no simple way around it.

We can save some milliseconds by recovering data that has already been set on our controller tree. But we'll always have to ...

  • look up and read the template (that's fast, depending on our system)
  • re-create the DomDocument from the template (ok, that's even very fast) and
  • re-create the references pointing from our controllers to the appropriate DomElement (that's worse, depending on the complexity of the template and the needs of our controller).
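
The three steps above can be sketched as follows. This is just one possible workaround under stated assumptions, not a fix: we cache the xml as a string plus the path of each element a controller cares about (via DOMNode::getNodePath()), and on the next request we re-parse and re-resolve every path with DOMXPath. The variable names and the id-to-path map are made up for illustration.

```php
<?php
// First request: parse the template and remember where each interesting element lives.
$template = '<html><body><h1 id="title">placeholder</h1>'
          . '<p id="intro">placeholder</p></body></html>';

$doc = new DOMDocument();
$doc->loadXML($template);

$paths = array();
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//*[@id]') as $node) {
    // Strings can be serialized, DOMElement references cannot.
    $paths[$node->getAttribute('id')] = $node->getNodePath();
}

// What we can put into a cache: plain strings and arrays only.
$cache = serialize(array('xml' => $doc->saveXML(), 'paths' => $paths));

// Later request: re-create the document, then re-create every reference.
$data = unserialize($cache);
$restored = new DOMDocument();
$restored->loadXML($data['xml']);

$resolver = new DOMXPath($restored);
$elements = array();
foreach ($data['paths'] as $id => $path) {
    $elements[$id] = $resolver->query($path)->item(0); // one lookup per reference
}

$elements['title']->nodeValue = 'Welcome back';
$html = $restored->saveXML($restored->documentElement);
echo $html, "\n";
// prints: <html><body><h1 id="title">Welcome back</h1><p id="intro">placeholder</p></body></html>
```

The loop over the stored paths is exactly the cost we can't get rid of: it grows with every element our controllers point to.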

This is a basic problem. It's a serious limitation on the possible use cases of php5's dom extension. We won't ever be able to attach our own behaviour to DomDocument/Element objects unless we are willing to:

  • re-create the references to the DomDocument/Element instances at every request or
  • throw away the idea of having complete components cached in a ready-made state that can be used to set up the application in a (comparatively) very short time.

Conclusion: php5 DomDocuments need to be un/serialize()able.

Let's advocate for it.