I would have loved to call this post something like "Pushing the limits", but that would not have been very honest... We do sometimes reach the limits of both document formats we are working with: OpenDocument and OpenXML. I don't intend to compare the pros and cons of each in detail here (I think there are people in the "blogosphere" who do that much better than I would ;-), but just to give two examples showing that neither format is perfect. By that I mean that a perfect format should be totally independent of the way an application renders it, and nothing should be lost in a conversion (for features covered by both formats, of course).

In OpenDocument, page style changes can be implicit. For instance, if you want to put a landscape-oriented page inside a portrait-oriented document, you have to declare a page style with landscape orientation, and then insert a page break associated with this page style. OK, that's fine. But a page style also has a property that specifies the style of the following page, just as paragraph styles do. The difference between pages and paragraphs is that a paragraph always ends with an explicit character or element (the user has to press the Enter key), while a page usually ends "by itself" within the text flow: there is not necessarily a "page break" instruction. So in many cases, unless you actually render the page to see how its content fills it, you don't know where the page ends, and therefore you don't know where the page style changes for the next one. In practice, we have some "bugs" in our conversion that are directly linked to this issue and that simply cannot be fixed. For instance, headers or footers sometimes change in the document, but nothing in the file tells us where they change, and of course OpenXML needs explicit page style changes (otherwise it would have been far too easy). Users should normally always declare explicit page breaks when they want to modify the page layout (document maintenance would be a lot easier, just as it is when you prefer styles to direct formatting), but unfortunately this is not common practice... Users have always loved to insert empty paragraphs to fill empty space! I personally think that the ODF specification should forbid page style changes without an explicit declaration.
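To make the problem more concrete, here is a minimal Python sketch of what a converter would have to do to find an implicit page style change. Everything in it is an assumption made for illustration: the style names, the paragraph heights, and above all the naive "layout engine" that just stacks paragraphs until a page is full. The `next_style` mapping stands for the "following style" property (which, as far as I know, ODF stores in the `style:next-style-name` attribute of a master page).

```python
def style_of_each_paragraph(paragraph_heights, page_height, first_style, next_style):
    """Assign a page style to each paragraph by simulating a naive layout.

    paragraph_heights: height of each paragraph once rendered (arbitrary units).
    page_height: usable height of a page, in the same units.
    next_style: maps a page style name to the style of the following page.
    """
    styles = []
    current = first_style
    used = 0
    for height in paragraph_heights:
        if used > 0 and used + height > page_height:
            # The page is full: it ends "by itself", with no page break
            # instruction, and the following style silently takes over.
            current = next_style.get(current, current)
            used = 0
        used += height
        styles.append(current)
    return styles


# Hypothetical style names: one landscape page followed by portrait pages.
NEXT_STYLE = {"Landscape": "Portrait"}
PARAGRAPHS = [10, 10, 10, 10, 10, 10]  # made-up rendered heights

short_pages = style_of_each_paragraph(PARAGRAPHS, 30, "Landscape", NEXT_STYLE)
tall_pages = style_of_each_paragraph(PARAGRAPHS, 40, "Landscape", NEXT_STYLE)
print(short_pages)
print(tall_pages)
```

With a page height of 30 the style changes after the third paragraph, with a page height of 40 after the fourth. The document is the same in both cases; only the rendering differs, and that is exactly the information a converter that does not render the document is missing.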

Another example, this time illustrating a limit of OpenXML: cell splitting in tables. In OpenDocument, cell splitting can be handled in two ways: either by dividing the table into smaller cells and merging them where necessary, or by defining subtables (the way OpenOffice.org works). In OpenXML, the second alternative does not exist: when you want to split a cell, you have to modify the whole table to define new columns or rows, and then merge all the new cells that are not affected by the split. The following example should make it clearer:

Consider a 2x2 table:

| cell A1 | cell B1 | 
|- - - - -|- - - - -| 
| cell A2 | cell B2 | 

You want to split the first cell (A1) into two cells, one above the other:

| cell A1a |         | 
|- - - - - | cell B1 | 
| cell A1b |         | 
|- - - - - |- - - - -| 
| cell A2  | cell B2 | 

In OpenDocument, the simplest way is to define a 2x2 table and put a subtable with one column and two rows inside the first cell (A1). The "subtable" property ensures that the cell borders will join. In OpenXML, to achieve the same result, you have to declare a table with two columns and three rows, with the first two cells of the second column merged. OK, that doesn't seem so terrible. But now, suppose you want to split the B1 cell into three stacked cells, to get something like:

| cell A1a | cell B1a | 
|          |- - - - - | 
|- - - - - | cell B1b | 
| cell A1b |- - - - - | 
|          | cell B1c | 
|- - - - - |- - - - - | 
| cell A2  | cell B2  | 

In OpenXML, the number of rows depends on the relative heights of the cells: in the previous example, where the borders of the A cells and the B cells do not line up, 5 rows must be declared to describe the whole table. But if the cells were laid out like this:

| cell A1a | cell B1a | 
|- - - - - |- - - - - | 
|          | cell B1b | 
| cell A1b |- - - - - | 
|          | cell B1c | 
|- - - - - |- - - - - | 
| cell A2  | cell B2  | 

then you would only need 4 rows to describe the table! As a cell's height is often implicit (it depends on the cell's content), it is sometimes impossible to reproduce the correct table layout... To avoid this issue, when converting a subtable from ODF to OpenXML, we simply define a new table inside the cell, exactly as ODF does. This solves the layout problem, but has a drawback: the table is no longer a single unit, but is made up of several nested tables. The simplest way to improve the OpenXML specification in this case would be to allow subtables.
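The row counting in the two layouts above can be sketched in a few lines of Python. This is only an illustration and not the converter's code: the function name `flatten`, the way a column is described (a list of label and height pairs), and the heights themselves are all made up. The idea is simply that every distinct horizontal border becomes a row boundary in the flat table, and each cell then has to span the rows it covers.

```python
from itertools import accumulate


def flatten(columns):
    """Turn stacked cells of given heights into a flat grid with row spans.

    columns: one list per column, each a list of (label, height) pairs from
    top to bottom. Heights are integers in arbitrary units; in a real
    document they would only be known after rendering.
    Returns (row_count, spans), where spans holds, for each column, a list
    of (label, number_of_rows_spanned).
    """
    # Bottom edge of every cell, measured from the top of the table.
    edges = [list(accumulate(height for _, height in col)) for col in columns]
    # Every distinct edge becomes a row boundary in the flat grid.
    grid = sorted({edge for col_edges in edges for edge in col_edges})

    spans = []
    for col, col_edges in zip(columns, edges):
        top = 0
        col_spans = []
        for (label, _), bottom in zip(col, col_edges):
            # A cell spans every grid row whose bottom edge lies inside it.
            rows = sum(1 for edge in grid if top < edge <= bottom)
            col_spans.append((label, rows))
            top = bottom
        spans.append(col_spans)
    return len(grid), spans


# First layout: the border between A1a and A1b falls inside B1b.
layout_1 = [
    [("A1a", 3), ("A1b", 3), ("A2", 1)],
    [("B1a", 2), ("B1b", 2), ("B1c", 2), ("B2", 1)],
]
# Second layout: the border between A1a and A1b lines up with B1a / B1b.
layout_2 = [
    [("A1a", 2), ("A1b", 4), ("A2", 1)],
    [("B1a", 2), ("B1b", 2), ("B1c", 2), ("B2", 1)],
]

rows_1, spans_1 = flatten(layout_1)
rows_2, spans_2 = flatten(layout_2)
print(rows_1, spans_1)
print(rows_2, spans_2)
```

With these made-up heights the first layout needs 5 rows and the second only 4, although both contain exactly the same cells. Since the heights come from the rendered content, a converter that only reads the file cannot tell which of the two grids to declare.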

These two examples aim to illustrate that neither format can be seen as "better" than the other: each has its own characteristics, its own strengths and weaknesses. One of our goals when working on the converter is to find the "incompatibilities" between the two formats, that is, features that cannot be converted from one to the other. We try to keep a list that will be made public at the end of the project, hoping that the organizations behind each format will have a look at it and maybe get some ideas for taking a step in the direction of the other format. Sweet dreams...