MS Workshop on Open XML

Over the last few days, I was invited to participate in a workshop about Open XML targeted at developers. It took place at the Microsoft Technology Center in Paris and was organized by Guillaume Renaud (from Microsoft France), Doug Mahugh (Office 2007 Technical "Evangelist", from Microsoft Corp.) and Wouter van Vugt (from InfoSupport). They kindly invited me to present the ODF Converter project, which I was happy to do. Since I was at the technology center on Wednesday afternoon, I was also invited to repeat the presentation (in a much shorter time!) for the "developers Wednesdays", a weekly event organized by the MSDN France team (a nice report was posted - in French! - by Julien Chable on his blog). I must say that it was a real pleasure to meet all those people whose writing I had read many times on the internet. We had some very interesting discussions about several aspects of the project: differences between ODF and Open XML, interoperability, Open Source...

The subtable issue, part two

Having recently faced a critical issue related to Open XML while working on the converter, I did not miss the chance to ask such world-famous experts for advice. But let me first explain the problem. It is closely linked to the subtable issue I reported a few weeks ago on this blog. In OpenDocument, when a table inside a cell has its "subtable" attribute set to "true", the rendering engine has to join the borders of the two tables, so that the result looks like a single table whose cell has been split. But subtable does not have any equivalent in Open XML. As I explained in my previous post, we ended up simply embedding tables inside cells and removing the outer borders to obtain an acceptable result. It is more of a "better than nothing" workaround than a satisfying solution. But anyway, we did not have any better alternative.
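To make the ODF side concrete, here is a minimal sketch of how a nested table flagged as a subtable could be detected. It assumes the attribute in question is the `table:is-sub-table` attribute of the OpenDocument table namespace; the sample fragment and the `subtable_names` helper are made up for illustration and are not taken from the converter.

```python
import xml.etree.ElementTree as ET

# OpenDocument table namespace
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"

# Simplified, hypothetical fragment: an outer table whose only cell
# contains a nested table flagged as a subtable.
SAMPLE = f"""<table:table xmlns:table="{TABLE_NS}" table:name="Outer">
  <table:table-row>
    <table:table-cell>
      <table:table table:name="Inner" table:is-sub-table="true">
        <table:table-row><table:table-cell/></table:table-row>
      </table:table>
    </table:table-cell>
  </table:table-row>
</table:table>"""


def subtable_names(xml_text):
    """Return the names of all tables flagged as subtables, in document order."""
    root = ET.fromstring(xml_text)
    names = []
    # iter() visits the root table as well as every nested table
    for table in root.iter(f"{{{TABLE_NS}}}table"):
        if table.get(f"{{{TABLE_NS}}}is-sub-table") == "true":
            names.append(table.get(f"{{{TABLE_NS}}}name"))
    return names


print(subtable_names(SAMPLE))
```

This flag is exactly the piece of information that has nowhere to go once the table has been converted to Open XML.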

When converting our table (containing a subtable) back to ODF, we would like to recover our subtable attribute, so that nothing is lost over the whole round trip. To achieve this, we need to find a way to attach a custom property to the converted table. This property must be:

  • transparent for the user (Word rendering engine must ignore it)
  • preserved by Word when saving the document
  • recognized by our converter during the reverse conversion.

How to extend Open XML with custom properties

Among all the extensibility features provided by Open XML, I truly thought that we would find some candidates to solve our problem. Let's examine them one after the other.

Custom XML Markup & Smart Tags

Those two features would have been good options to store a custom property for our subtable. Unfortunately, they are not transparent for the user: Smart Tags appear underlined, whereas custom XML markup is shown as little boxes. It is possible to ask Word not to show them, but that depends on the user's configuration.

Custom XML Parts & Content Controls

Open XML introduced the notion of content controls and custom XML parts. But content controls are always visible to the user in some way, even if they are empty. That does not prevent us from defining custom XML parts in our document in order to store some additional information, e.g. "subtable" properties for tables. Unfortunately, there is no way to refer to an element from outside its own definition with an identifier that would not change upon user actions (such as saving).
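To illustrate why a missing stable identifier is a problem, here is a small sketch of the most obvious fallback: remembering the subtable by its position in the document. Everything in it is an assumption made for the example (the flat, simplified WordprocessingML body, the helper names, and the idea of storing positions in a custom XML part); it is not how the converter works.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def make_table(text):
    """Build a minimal one-cell table whose text lets us recognize it."""
    return f"<w:tbl><w:tr><w:tc><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:tc></w:tr></w:tbl>"


def make_body(*texts):
    """Build a simplified document body holding one table per given text."""
    return f'<w:body xmlns:w="{W}">' + "".join(make_table(t) for t in texts) + "</w:body>"


def flagged_tables(body_xml, positions):
    """Return the text of the tables found at the stored positions."""
    root = ET.fromstring(body_xml)
    texts = [
        "".join(t.text or "" for t in tbl.iter(f"{{{W}}}t"))
        for tbl in root.iter(f"{{{W}}}tbl")
    ]
    return [texts[i] for i in sorted(positions) if i < len(texts)]


# Hypothetical content of a custom XML part: "the table at position 1 was a subtable"
SUBTABLE_POSITIONS = {1}

original = make_body("outer", "inner")
edited = make_body("new", "outer", "inner")  # the user inserted a table at the top

print(flagged_tables(original, SUBTABLE_POSITIONS))  # the intended table
print(flagged_tables(edited, SUBTABLE_POSITIONS))    # now points at a different table
```

As soon as the user adds or removes a table, the stored position designates the wrong element, which is why the custom XML part alone does not solve the problem.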

Processing instructions

We also thought of using processing instructions, but it turned out that Word does not preserve them when saving a document.
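This kind of behaviour is easy to check by comparing the document XML before and after a save. The sketch below only shows the comparison itself: the `odf-converter` target name and both sample fragments are invented for the example, and the "after" fragment simply stands for a document from which the instruction has disappeared.

```python
from xml.dom import Node, minidom

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_processing_instructions(xml_text, target):
    """Return the data of every processing instruction with the given target."""
    doc = minidom.parseString(xml_text)
    found = []

    def walk(node):
        for child in node.childNodes:
            if (child.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                    and child.target == target):
                found.append(child.data)
            walk(child)

    walk(doc)
    return found


# Hypothetical markup written by a converter: a PI carrying the subtable flag
before_save = f'<w:tbl xmlns:w="{W}"><?odf-converter subtable="true"?><w:tr/></w:tbl>'
# The same table with the instruction gone, as after a save that drops it
after_save = f'<w:tbl xmlns:w="{W}"><w:tr/></w:tbl>'

print(find_processing_instructions(before_save, "odf-converter"))
print(find_processing_instructions(after_save, "odf-converter"))
```

An empty result for the saved document means the property did not survive, which rules this option out for the round trip.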

Participate in the contest!

I discussed this issue with Doug and Wouter, but we could not find any satisfying solution - I mean, one that would be totally transparent to the user. I find it quite disappointing, because this sounds like a basic extensibility feature to me. That does not mean that it is impossible, only that we have not found the right way to do it yet. One of the workshop attendees suggested improving the ECMA specification by adding an attribute to custom properties that would prevent Word from rendering them. Well, that's not a bad idea. If only some of the Open XML schema designers could hear it...

But in the meantime, any suggestion will be much appreciated!

UPDATE: Doug and Wouter posted nice reports on the Open XML Workshop in Paris on their blogs: here and here