Customizing the XML Authoring Environment

Author: Managing director, Elkera Pty Limited

Date: 07 February 2006

Presentation to the Society for Technical Communication, Wisconsin Chapter on Wednesday, February 7, 2006.

The production of technical and knowledge management documentation increasingly requires the use of specialized tools that support single-source publishing requirements. More and more enterprises are finding that XML is the core technology in documentation systems. Content writers need to use XML editing applications to create content in a structured form that is self-describing and separated from presentation.

Out of the box, XML editing applications are not particularly easy to use. They need extensive customization to suit the DTD or schema and the authoring requirements. XML makes it possible to create much more powerful aids for writers than is possible with conventional word processing software but individual customizations are expensive. If the needs of content writers are not properly considered in application planning, writer productivity can be badly affected. Ongoing training, support and data correction costs can be excessive.

In this presentation, Peter Meyer discusses the key issues involved in customizing XML editors to achieve high levels of usability for content writers. Usability is based on DTD design as well as editor configuration. The presentation will compare standard, out-of-the-box applications with carefully tailored applications using the Blast Radius XMetaL Author and the Elkera Utilities for XMetaL Author developed by Elkera to show the levels of usability that can be achieved.

The use of XML in production of technical documentation

There is increasing use of XML in the production of technical documentation. Particular reasons include:

Increasingly complex systems require complex documentation that is not easily managed by manual processes. Relevant needs include content sharing and re-use, linking and the use of multimedia.
There are increasing demands for multi-language documentation and a need to coordinate changes and minimize the costs of translation.
There is a need to provide documentation in a form that meets user needs in both print and various electronic formats.
There is a growing range of content management systems and other applications that can help to manage XML content and extract information from it.
XML is designed for use in automated document workflow, publishing and information retrieval systems.

The many flavours of XML

XML is not a markup language but a tool to define markup languages. There are many XML markup languages for documentation including Microsoft Office Open XML format (WordprocessingML), OASIS OpenDocument, XHTML 1, XHTML 2, DocBook, DITA, TEI, S1000D, Elkera BNML and many others.

Each of these schemas or DTDs is designed for a particular purpose. Some are highly presentation oriented and rely on style names to tell us about the nature of the document content. Office Open XML format (WordprocessingML), OASIS OpenDocument and XHTML 1 fall into this category. The others provide a more hierarchical representation of the document and can better describe the nature of different components in the document.

Structural schemas such as DITA, DocBook, S1000D and Elkera BNML impose rules on content writers that may require the use of elements in a particular sequence and mandate the use of certain elements. These rules are enforced through validation of the document against the schema. Some or all of these rules must be understood by the writer before he or she can effectively write content using an XML editor.
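As a rough illustration of how sequence rules and mandatory elements are enforced, the following minimal Python sketch checks a document against a toy content model. The content models shown are hypothetical and far simpler than the real DocBook or DITA rules, and the regular expression approach is only an assumption made for illustration. It is not how any particular editor or validating parser is implemented.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models, written as regular expressions over
# space-terminated child element names. Illustrative only.
CONTENT_MODELS = {
    "section": r"title (para |list )+(section )*",  # title first, then at least one block
    "list": r"(item )+",                            # a list needs at least one item
}


def validate(element):
    """Return a list of error messages for element and its descendants."""
    errors = []
    model = CONTENT_MODELS.get(element.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in element)
        if re.fullmatch(model, children) is None:
            errors.append(
                f"<{element.tag}>: children [{children.strip()}] "
                "do not match the content model"
            )
    for child in element:
        errors.extend(validate(child))
    return errors


# The first document follows the required sequence. The second places
# the title after the para, so validation reports an error.
valid_doc = ET.fromstring("<section><title>T</title><para>Text</para></section>")
invalid_doc = ET.fromstring("<section><para>Text</para><title>T</title></section>")
```

Calling validate(valid_doc) returns an empty list, while validate(invalid_doc) returns one message for the section element. A writer who does not know that the title must come first has no way to predict that result.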

The points discussed in this presentation relate only to the use of structural schemas with an XML editor.

What is unique about using an XML editor?

When using an XML editor, a writer must insert a valid element defined by the schema before typing content. This is quite different to a word processor where the writer can type, press Enter and continue typing to create new content.

After completing the content for an element in an XML editor, the writer has to:

(1) identify and choose the next element they wish to insert either inside or after the current element;
(2) if the element is outside the current element, find the valid location at which to insert the new element, usually by moving the insertion point;
(3) insert the selected element;
(4) if elements are nested, the writer may have to go back and pick another element to insert before writing new content; and
(5) continue entering content.

In many XML editors, steps 1 and 2 are bound together. Commonly, the editor provides a pick list of valid elements at each location. The problem is that when the writer finishes a paragraph, the allowable elements that may follow that element cannot be seen until the insertion point is moved to the correct location. This can be quite frustrating, particularly for writers who are new to XML editors.
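The pick list problem can be shown with a very small Python sketch. The element names and the ALLOWED_CHILDREN table are hypothetical and greatly simplified. Real editors derive this information from the DTD or schema.

```python
# Hypothetical, simplified content models: each container maps to the
# elements an editor would offer in its pick list at that location.
ALLOWED_CHILDREN = {
    "section": ["para", "list", "section"],
    "para": ["emphasis", "link"],
}


def pick_list(container_tag):
    """Return the elements an editor could offer inside container_tag."""
    return ALLOWED_CHILDREN.get(container_tag, [])


# While the insertion point is still inside the para, only inline
# elements are offered. The block elements appear only once the
# insertion point has been moved out into the section.
inside_para = pick_list("para")
inside_section = pick_list("section")
```

Here "section" is absent from inside_para and present in inside_section, which is why the writer cannot see the element they want until the insertion point has been moved.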

Other demands imposed on writers in XML editors include:

The schema may require an element to be inserted or an attribute value to be completed. If it is omitted, the document is invalid.
When an element has to be moved, it can be moved only to a valid location. If the writer does not know how to find the valid location, moving content can be very frustrating.

Many XML editors provide multiple views of the document, including a normal or Tags Off View, a Tags On View and a Structure View. To deal with the problems mentioned, writers often have to work in a Tags On or Structure View.

The problem with XML editors

The particular characteristics of XML editors impose several burdens that are not as pronounced when using a word processor. This leads some organizations to try to let technical writers work in Word using style templates. Documents may be converted to XML later in the workflow. This is rarely entirely satisfactory. It is difficult, if not impossible, to reliably convert word processor documents to structural XML without some human intervention to correct errors.

Writers using an XML editor may need a good understanding of the schema rules to understand how to find valid locations for new and moved elements. This imposes a high training burden and may lead some writers to be intimidated by XML editors at first.

The additional steps introduced into the writing process by XML editors can slow down work and break concentration while writing.

If the use of XML editors is to become more widespread, it will be necessary to simplify the process for writers and enable them to work more easily.

The role of schema design

Schema design can accentuate the problems encountered with XML editors:

Element names and their purpose ought to be distinct and logical to persons who work in the relevant subject domain. When many elements perform similar but slightly distinct functions, they are easily misused. Sometimes subtle distinctions are important but often they are not.
Some schemas define very loose content models, permitting the same content to be marked up in several different ways. This makes it hard for the writer to understand the basic patterns in the content and it produces inconsistent markup that may cause problems in published outputs. As will be seen later, it also makes it more difficult to provide a tailored interface for content writers.

The solution

Consider the writers

Most XML editors are designed to be customized in some way to provide writers with various shortcuts and aids to their work. Because XML editors must work with a wide range of schemas, it is not practicable to treat them as ready-to-use products. Sometimes, extensive customization is required to achieve a highly usable writing interface. The capacity of an XML editor to facilitate that customization should be carefully considered during evaluation.

The key to customizing an XML editor is to fully understand the content to be created and the conventions followed by writers. A customized interface need not try to deal with every situation. It should aim to make work easier for the common actions that will likely amount to 80% of the writer's work.

A good test of an effective interface is if the writer can perform most work without seeing any tags. While some editors attempt to work wholly without tags, Tags On Views can be very helpful to enable writers to quickly and accurately perform unusual editing actions.

Schema selection and design

As observed earlier, it is important that the schema is carefully designed to minimize confusion and inconsistency in the markup. If an existing schema is used, it may be necessary to customize it to meet these requirements. Some schemas attempt to provide elements to meet a wide range of possible circumstances. If these are not absolutely necessary in your application, they should be removed. Otherwise, they will confuse writers, make it harder for them to recognize the common markup patterns, and make it difficult to provide predictable behaviour in interface commands.

Ideally, the schema should be the simplest model that will create the desired markup structure. Most documents conform to very standard patterns for section/clause, paragraph and list structures. Ideally, the schema should make it easy for the writer to understand how these work. If the interface is then designed carefully, the writer won't really need to know the finer points of the schema rules until they are ready to learn them.

Next valid location element insertion

Next valid location insertion allows the writer to insert markup structures (elements) using a simple command without having to find the valid location first. Taking DocBook as an example, the writer may have created a section with one or more para elements. The insertion point is after the period at the end of the last para. The writer now wants to create a new section after the current section. The writer should be able to initiate an “insert section” command from a keyboard shortcut or other command to insert the new section without having to find the valid location for the section element. The application should take care of this.

Effective use of this strategy requires that the schema use tight content models. If there are many possible valid locations for the section element, constraints must be created in the customization code, thereby making it more expensive to develop. These constraints operate as a pseudo schema change. Ideally, the constraints should be imposed by the schema.
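The idea can be sketched in Python using the standard xml.etree.ElementTree module. This is a minimal sketch under stated assumptions, not the behaviour of any particular editor. The ALLOWED_CHILDREN table is a hypothetical tight content model in which a section cannot contain another section, so there is only one valid location for a new section. The function name insert_at_next_valid_location is also invented for illustration, and the sketch ignores sequence rules within the parent.

```python
import xml.etree.ElementTree as ET

# Hypothetical tight content model: which child elements each
# container allows. A section may not nest inside a section here.
ALLOWED_CHILDREN = {
    "article": {"section"},
    "section": {"title", "para"},
}


def insert_at_next_valid_location(root, current, new_tag):
    """Insert an empty new_tag element at the nearest valid place after
    the element holding the insertion point. Returns the new element,
    or None if no ancestor allows new_tag."""
    # ElementTree keeps no parent pointers, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = current
    while node in parents:
        parent = parents[node]
        if new_tag in ALLOWED_CHILDREN.get(parent.tag, set()):
            new_element = ET.Element(new_tag)
            # Place the new element immediately after the branch that
            # contains the insertion point.
            parent.insert(list(parent).index(node) + 1, new_element)
            return new_element
        node = parent  # not allowed at this level, so climb one level
    return None
```

With the insertion point in a para, asking for a "section" climbs past the enclosing section and adds the new section after it, while asking for a "para" adds a sibling directly after the current para. If sections were allowed to nest, the first request would have two valid answers and the customization code would have to choose between them.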

Next valid location element insertion commands can be made more powerful if they recognize shield elements. For example, a para may be followed by a blockquote element. If the insertion point is in the para and the writer fires a command to insert a new para after, it may be inserted inside the blockquote. This may not be desired. Ideally, the application should allow the blockquote to be excluded from the valid locations in that circumstance. Instead, the application might only insert a para inside blockquote if the insertion point is already inside the blockquote. These rules should be carefully planned.
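A shield rule of this kind might be expressed as a simple test applied to each candidate location. The following Python sketch is illustrative only. SHIELD_TAGS, parent_map and is_shielded are hypothetical names, and the rule shown (a shield element is excluded unless the insertion point is already inside it) is one possible reading of the behaviour described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: elements that shield their content from
# insertion commands fired outside them.
SHIELD_TAGS = {"blockquote"}


def parent_map(root):
    """Map each element to its parent (ElementTree has no parent pointers)."""
    return {child: parent for parent in root.iter() for child in parent}


def ancestors_or_self(node, parents):
    """Return node followed by each of its ancestors up to the root."""
    chain = [node]
    while node in parents:
        node = parents[node]
        chain.append(node)
    return chain


def is_shielded(container, cursor_element, parents):
    """True if container is, or lies inside, a shield element that does
    not also contain the insertion point."""
    cursor_chain = ancestors_or_self(cursor_element, parents)
    for node in ancestors_or_self(container, parents):
        inside = any(node is candidate for candidate in cursor_chain)
        if node.tag in SHIELD_TAGS and not inside:
            return True
    return False
```

An insertion command would skip any candidate container for which is_shielded returns True. With the insertion point in the para before a blockquote, the blockquote is skipped. With the insertion point in a para inside the blockquote, it is not.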

Context driven element insertion commands

In word processor software, writers are accustomed to pressing the Enter key to create new paragraphs and list items etc. In a word processor, this is achieved through chaining styles. The limitation of this is that the sequence can never change according to context.

With a structural schema and some XML editors, it is possible to define different behaviour for element insertion commands according to the context. It may even be possible to use the text characters adjacent to the insertion point to further refine context rules. Thus, pressing Enter after a list item that is terminated by a semicolon may create a new list item while pressing Enter after one that is terminated by a period may create a new paragraph after the list.

These rules must be based on conventions that are convenient to the writers who will use the application.
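The Enter key rule described above can be sketched as a small decision function in Python. The element names "item" and "para", the location labels and the fallback behaviour for other endings are all assumptions made for illustration. A real customization would follow the conventions agreed with the writers.

```python
def enter_key_action(current_tag, text_before_cursor):
    """Decide what pressing Enter should insert.

    Returns a (new element, location) pair. The names are hypothetical.
    """
    trimmed = text_before_cursor.rstrip()
    if current_tag == "item":
        if trimmed.endswith("."):
            # A period closes the list: continue with a normal paragraph.
            return ("para", "after-list")
        # A semicolon (or, by assumption, anything else) continues the list.
        return ("item", "after-item")
    # Outside a list, Enter simply starts another paragraph.
    return ("para", "after-current")
```

For example, enter_key_action("item", "the first condition;") returns ("item", "after-item"), while enter_key_action("item", "the last condition.") returns ("para", "after-list"). A word processor style chain cannot make this distinction because it always produces the same next style.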

Templates

The use of templates is a simple way to create the major structures in a new document, just as with word processor documents. Very often, the major structural divisions in a document are only created once. If they are included in the template, the writer does not have to bother about creating them and it is probably not necessary to create specific interface commands for them.

Pre-formed structures

Pre-formed structures are packaged element structures inserted on a single command. If it is common to insert several nested elements before typing content, a template with those structures can be created and inserted by an interface command. Often these are context dependent so use of context rules may be helpful to make this feature work well. For example, in one context a command may insert a section with a para element and in another it may insert just the section element.
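One way to picture context dependent pre-formed structures is as a lookup from a command and a context to a small XML fragment. The Python sketch below is an assumption about how such a table could look. The command name "insert-section", the context names and the PREFORMED table are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical table of pre-formed structures, keyed by the command
# and by the element that contains the insertion point.
PREFORMED = {
    ("insert-section", "article"): "<section><para/></section>",
    ("insert-section", "appendix"): "<section/>",
}


def build_structure(command, context_tag):
    """Return a new element tree for the command in the given context."""
    template = PREFORMED.get((command, context_tag))
    if template is None:
        raise KeyError(f"no structure defined for {command!r} in <{context_tag}>")
    # Parse a fresh copy each time so every insertion gets its own elements.
    return ET.fromstring(template)
```

In this sketch the same command yields a section containing an empty para when fired inside an article, and a bare section when fired inside an appendix.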

Data entry forms

In most XML editors it is possible to create forms to enter element or attribute data. These can be very helpful for graphics, metadata, cross references and tables.

Keyboard shortcuts

Keyboard shortcuts can be defined to execute common element insertion or other commands. The underlying action may be to insert pre-formed structures at the next valid location, according to a defined context. In this way, a great deal of markup can be inserted very easily without the writer having to break concentration.

Menu items

Pull down menus and toolbar buttons should be used to organize the more common and the less common commands so that the main interface is not overloaded.

Element selection utilities

Writers frequently need to select elements, cut them and paste them into a new location. Some editors provide keyboard shortcuts to allow the writer to progressively select containing elements. However, for common actions, specific commands may be created to select specific elements and speed up the writer's work. These are more accurate because the writer knows exactly which elements are selected.

Custom utilities

In most applications writers have to perform common editing commands. They may need to insert elements by pasting copied content in particular contexts. Commands can be created to transform markup and to control the destination of pasted content so the writer does not have to find the valid location in Tags On View before pasting.

Utilities can be created to assist writers to create cross references so they don't have to write in ID values at targets.
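A cross reference utility of this kind might generate the target ID on the writer's behalf. The Python sketch below assumes DocBook style markup, where an xref element points at its target through a linkend attribute matching the target's id attribute. The function names and the "id-N" naming pattern are hypothetical.

```python
import itertools
import xml.etree.ElementTree as ET


def ensure_id(root, target, prefix="id"):
    """Return the target's id, generating a unique one if it has none."""
    if target.get("id"):
        return target.get("id")
    existing = {el.get("id") for el in root.iter() if el.get("id")}
    for number in itertools.count(1):
        candidate = f"{prefix}-{number}"
        if candidate not in existing:
            target.set("id", candidate)
            return candidate


def make_xref(root, target):
    """Build an xref element pointing at target, creating its id if needed."""
    return ET.Element("xref", {"linkend": ensure_id(root, target)})
```

The writer only has to choose the target, for example from a list of section titles. The utility supplies an ID that does not clash with any existing one and reuses it on later references.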

Benefits

Reduce change management barriers

All too often, XML applications are developed by technical people who are focussed on content management and publishing systems. The needs of writers are given little or no consideration until it is time to train them on the new system. Inadequate attention to writer needs can delay rollout of a new system and incur excessive training and support costs.

Using a thoughtful customization strategy, transition times for XML authors can be reduced from months to a few days or weeks.

Reduce effort training new writers

One of the biggest hurdles for writers new to XML is to learn about XML and the rules of the schema. The aim of an effective customization should be to reduce the interface to a small set of commands that enable writers to create common markup structures without being concerned about the schema rules. Naturally, writers must understand the basic patterns and rules but these only have to be understood at a high level. In this way, writers can begin work quickly without even seeing XML markup. This builds confidence and allows writers to learn the detail as it is needed. Over time they will build up a good understanding of the schema.

Improve markup consistency and reduce editorial costs

The processes described will ensure that the schema and the user interface are coordinated. Redundant elements should be removed and excessively loose content models tightened. When writers work through defined interface commands, the XML markup created will be more consistent than if they must choose elements and contexts through the standard interface components.

Improved markup accuracy at source means fewer problems and lower rectification costs at later stages in the content life cycle.

Improve writer productivity

Standard interfaces in XML editors do not allow writers to work smoothly. They must frequently stop to perform complex or fiddly element selection and insertion actions in Tags On or Structure Views. A more usable interface will enable writers to work more productively.

Conclusions

In most XML editors, this customization work can involve very considerable effort. In some editors, not all these options are available. With the right tools, the benefits are substantial and continue indefinitely. With larger teams of writers, the benefits should clearly justify the development effort.

Greater use of standard schemas should facilitate the development of applications with inbuilt configurations for those schemas. This will reduce the cost of customization efforts and make improved XML authoring interfaces available to smaller teams.

         Updated: 10-12-2006