Published on November 19th, 2010 | by Rahel Bailie
Surfacing content: ways to keep content in sync
One important aspect of publishing content on a website is keeping all content in sync. The content manager’s nightmare is having content that requires an update duplicated in multiple places on a website. Tracking where the content lives can be nightmarish, particularly when versions fall out of sync, making subsequent updates more laborious. Having “one source of truth” is the holy grail of content developers.
A recent inquiry on the content strategy group asked about efficient ways of surfacing content in multiple ways. Yet avoiding content duplication isn’t as easy as it may sound. Managing content is more complex than managing data. Data moves from one point to another with little problem. The number “12” is the number “12” no matter where it ends up. But when it comes to the content equivalents (a dozen, December, an above-average family size), context becomes critical.
This is where having a technical communication background comes in handy. Surfacing content in multiple places is a cornerstone of creating technical documentation, online help, training, and user support material, all of which can often come from a single content repository and be published with variations. The “one source of truth” has long been articulated as “single-sourcing” with its corollary, “multi-channel publishing.”
Creating content once and pushing it out to multiple surfaces is a different process from conventional web publishing, and uses different tools. It also moves responsibility for content further up the production process. How this is put into effect is a key aspect of a content strategy, yet it is rarely addressed because of the deep divide between types of content authors.
One source of truth in a Web CMS (WCMS)
The WCMS generally takes in content that is edited in form fields. A simple example would be changing your account settings on Facebook, and having those changes show on your home page. How this happens is programmed by the developers, and any changes to how the content is surfaced are made either by the writer (de)selecting a check box, or by the developer customizing the code of the WCMS. Thus, the WCMS is the gatekeeper for the content flow.
A practical example in the WCMS world
I recently worked on a project where a number of hotel resorts were described in various ways on the website. On the home page, there would be a one-line description accompanying a photo. On another page, there would be a short teaser paragraph. On yet another page, the hotel contact information would be shown.
In the background, all of the content was in a single database, with one record per hotel. The database fields included: hotel name, city, state, country, reservations phone number, front desk phone number, one-line teaser, short teaser, property description, and at least a dozen more content blurbs. These were provided in an Excel spreadsheet to the technical team, who then programmed how the content would be surfaced, and ensured that the right images matched the content as it was displayed.
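As a rough illustration of that arrangement, here is a minimal Python sketch in which one record per hotel feeds three different page fragments. The field names follow the list above; the sample values and the function names (`home_page_caption`, `teaser_block`, `contact_block`) are hypothetical placeholders, not the project's actual code.

```python
# One record per hotel is the single source; the values are made-up placeholders.
HOTEL = {
    "hotel_name": "Example Resort",
    "city": "Sampletown",
    "state": "CA",
    "country": "USA",
    "reservations_phone": "555-0100",
    "front_desk_phone": "555-0101",
    "one_line_teaser": "A quiet beachfront escape.",
    "short_teaser": "Spend a weekend by the water, minutes from downtown.",
}


def home_page_caption(hotel):
    """One-line description that accompanies the photo on the home page."""
    return f"{hotel['hotel_name']}: {hotel['one_line_teaser']}"


def teaser_block(hotel):
    """Short teaser paragraph shown on a listing page."""
    location = f"{hotel['city']}, {hotel['state']}"
    return f"{hotel['hotel_name']} ({location})\n{hotel['short_teaser']}"


def contact_block(hotel):
    """Contact information shown on the contact page."""
    return (
        f"{hotel['hotel_name']}\n"
        f"{hotel['city']}, {hotel['state']}, {hotel['country']}\n"
        f"Reservations: {hotel['reservations_phone']}\n"
        f"Front desk: {hotel['front_desk_phone']}"
    )


if __name__ == "__main__":
    # Every surface reads from the same record.
    for surface in (home_page_caption, teaser_block, contact_block):
        print(surface(HOTEL))
        print("---")
```

Because each function reads from the same record, correcting a phone number or a hotel name in that one place updates every surface the next time the page is built. The writer never touches the functions; deciding which fields appear where stays with the developer.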
Single sourcing in a Component CMS (CCMS)
In a CCMS situation, the responsibility for surfacing content is moved upstream, to the writer. The writer uses an XML authoring tool to create content and determine the variations. (As the industry matures, vendors are starting to leverage common tools like Word to do XML publishing, though this is still in its infancy.) The authoring tool creates individual content files, which then get managed in the CCMS. In other words, the CCMS is not the gatekeeper; it becomes simply the “traffic cop” that supports the author’s work.
Once the writer has created the content and set up the dependencies for surfacing content, the CCMS does an automated generation of the content through some sort of publishing pipeline. This reads all of the XML metadata and determines what content is shown where. At this point, the content is generally pushed out to some area in the WCMS reserved for the content, and then the WCMS picks up its gatekeeping duties.
A practical example in the CCMS world
To publish a travel advisory that needs to be shown to three audiences, you would create the entire long-form advisory and tag each of the sections with an audience, as shown:
<public>Don’t go to country X, effective immediately.
<doctors> <industry_stakeholders>There is a suspected outbreak of a mystery disease. If called by the media, assure them that they will be informed as soon as developments are known.</industry_stakeholders> If someone comes into your office with the known symptoms, quarantine them and get them to a hospital as soon as possible.</doctors>
Stay tuned for details.</public>
The publishing pipeline would send out three separate messages to the appropriate output channels, presumably different places on the website, or a combination of the website and other forms of communication.
- The public would see the preamble plus the concluding statement (the text that sits directly inside the public element).
- Industry stakeholders would see the public message plus the statement intended only for them (the text inside the industry_stakeholders element).
- Doctors would see the entire message.
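The filtering step can be sketched in a few lines of Python using the standard library's XML parser. This is only an illustration of the idea, not how any particular CCMS implements its pipeline: the `VISIBLE_TAGS` mapping and the `collect` and `publish` functions are assumptions made for this example, and the advisory text uses a straight apostrophe for simplicity.

```python
import xml.etree.ElementTree as ET

# The advisory from the example above, tagged by audience.
ADVISORY = (
    "<public>Don't go to country X, effective immediately. "
    "<doctors><industry_stakeholders>There is a suspected outbreak of a "
    "mystery disease. If called by the media, assure them that they will be "
    "informed as soon as developments are known.</industry_stakeholders> "
    "If someone comes into your office with the known symptoms, quarantine "
    "them and get them to a hospital as soon as possible.</doctors> "
    "Stay tuned for details.</public>"
)

# Assumed rule set: which elements' own text each audience is allowed to see.
VISIBLE_TAGS = {
    "public": {"public"},
    "industry_stakeholders": {"public", "industry_stakeholders"},
    "doctors": {"public", "industry_stakeholders", "doctors"},
}


def collect(elem, allowed):
    """Gather the text owned by elem and its descendants if the tag is allowed."""
    parts = []
    show = elem.tag in allowed
    if show and elem.text:
        parts.append(elem.text)
    for child in elem:
        # A hidden parent can still contain a visible child, so always recurse.
        parts.extend(collect(child, allowed))
        # Text that follows a child belongs to the enclosing element.
        if show and child.tail:
            parts.append(child.tail)
    return parts


def publish(xml_source, audience):
    """Return the message for one audience with whitespace normalized."""
    root = ET.fromstring(xml_source)
    text = "".join(collect(root, VISIBLE_TAGS[audience]))
    return " ".join(text.split())


if __name__ == "__main__":
    for audience in VISIBLE_TAGS:
        print(f"[{audience}] {publish(ADVISORY, audience)}")
```

The writer's tags carry all of the decisions about who sees what; the script only applies them. Adding a fourth audience would mean tagging the content and adding one entry to the mapping, with no change to the advisory's wording.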
Other critical differences in surfacing content
The first difference is in what constitutes the “single source of truth”:
- In a Web CMS, content changes are made to the database, and the content is changed everywhere it’s programmed to appear. The database is the single source of truth.
- In a CCMS, what you have is, in effect, two “single sources of truth”: one is the pre-published source; the other is the published version. The nature of publishing is that these get out of sync after version 1. Think of publishing a multi-hundred-page HTML manual. Version 1 uses all new content. Then there are updates to one section, say pages 20-30. In the source repository, these pages now exist as both Version 1 and Version 2 content, while the published content is all in the Version 2 manual.
So the second difference is how content is versioned:
- The authors will be concerned with the versions of each content file; after a while, a body of published content can be made up of a range of versions that co-exist in the same repository and get mixed-and-matched by the author to be published.
- The published content is another single source of truth. It is the aggregated “publication” that is for consumption. The consumers of this content have no idea that any given page may be made up of multiple content chunks aggregated together for a seamless reading experience.
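To make the two "sources of truth" concrete, here is a small Python sketch of a repository that keeps every version of each content file, plus a manifest per publication that pins the versions it uses. The chunk names, the sample text, and the `assemble` function are all hypothetical, chosen only to illustrate the mix-and-match idea.

```python
# Hypothetical repository: every saved version of a content file is kept,
# keyed by (chunk id, version number).
REPOSITORY = {
    ("intro", 1): "Welcome to the manual.",
    ("setup", 1): "Install the product from the CD.",
    ("setup", 2): "Install the product from the download page.",
    ("support", 1): "Call the help desk for support.",
}

# Each publication lists which version of each chunk it is built from.
MANUAL_V1 = [("intro", 1), ("setup", 1), ("support", 1)]
MANUAL_V2 = [("intro", 1), ("setup", 2), ("support", 1)]


def assemble(manifest, repository=REPOSITORY):
    """Aggregate the pinned chunks into one seamless publication."""
    return "\n".join(repository[key] for key in manifest)


if __name__ == "__main__":
    print(assemble(MANUAL_V2))
```

In this sketch the Version 2 manual is made of two Version 1 chunks and one Version 2 chunk. The author sees that mixture in the repository; the reader of the assembled manual sees only a single document.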
Another difference is in the gatekeeping functions:
- In a WCMS, the code written by the developer provides virtually all gatekeeping functionality.
- In the CCMS, the writer is the primary gatekeeper, but there is another gatekeeping function: the publishing pipeline code. The content is generally published using XSLT (an automated transformation of XML content using XML stylesheet language scripts). The code automates the output process, and changes to the output mean tweaking the scripts.
The WCMS world vastly overwhelms the CCMS world in the number of systems sold and implemented, though the amount of content published by CCMS systems on a given site often dwarfs that output by the WCMS. The next few years will be interesting, as content management systems try to capture market share by enabling “the other type” of authoring experience for organizations that need to adopt more robust methods of creating and surfacing content in more flexible ways.