
Published on August 5th, 2013 | by Rahel Bailie


Holes in the Template: Piping content into the CMS

When companies want to publish lots of information on their website about their products, it is likely that the product information doesn’t originate in their Web CMS. They will have an ERP (Enterprise Resource Planning) system that stores the SKUs and prices, a PIM (Product Information Management) system that stores the descriptions and variants, and probably some other software that processes other bits of information. The front-end Web CMS then integrates all of this information into beautiful product descriptions, converging it onto a single page.

Content integration: lots of details about footwear coming from many systems

After all, it doesn’t make sense to store all of this information in the Web CMS. You store the information in the system that is meant to manipulate it at a granular level. Once the information has been richly tagged, it is pushed to the display layer, the Web CMS.

Content convergence

This is done by putting “holes in the template” and calling some scripts to get the right information to populate those template holes. Sounds simple, right? OK, it doesn’t sound simple. Deane Barker, in his post, Editors Live in the Holes, describes the folly of not paying enough attention to what happens in the holes. And as the saying goes, therein lies the problem.
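To make the “holes in the template” idea concrete, here is a minimal sketch in Python. The template belongs to the Web CMS; small resolver functions stand in for the scripts that pull each hole’s content from the system of record. All function names, fields, and sample values here are hypothetical, invented purely for illustration.

```python
# Minimal sketch of "holes in the template": the Web CMS owns the layout,
# and resolver functions fill each hole from the appropriate back-end system.

TEMPLATE = "{name} - {description} | SKU {sku} | ${price:.2f}"

def from_erp(sku):
    # In reality this would call the ERP's API; stubbed for illustration.
    return {"sku": sku, "price": 89.99}

def from_pim(sku):
    # Likewise a stand-in for a PIM lookup.
    return {"name": "Trail Runner", "description": "Lightweight red trail shoe"}

def render_product_page(sku):
    fields = {}
    fields.update(from_erp(sku))   # data: SKU, price
    fields.update(from_pim(sku))   # content: name, description
    return TEMPLATE.format(**fields)

print(render_product_page("TR-8812"))
# -> Trail Runner - Lightweight red trail shoe | SKU TR-8812 | $89.99
```

The point is the division of labour: the template never stores product data, it only declares where each piece lands.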

If you remember nothing else of this post, remember these next few lines.

  • An ERP system pushes data (SKUs, prices, etc) into the Web CMS.
  • A PIM system pushes content into the Web CMS.
  • A DITA-based CCM system pushes content into the Web CMS.
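The three pushes above can be sketched as independent jobs writing into the Web CMS’s content store, each on its own schedule. The store structure and payloads below are hypothetical stand-ins, not any particular product’s API.

```python
# Sketch of the parallel pushes: each source system runs its own import job
# and writes its slice of a product record into the Web CMS store.

cms_store = {}

def push(source, item_id, payload):
    # Each source owns its own namespace within the item's record.
    cms_store.setdefault(item_id, {})[source] = payload

# ERP pushes data, PIM pushes product content, the CCM pushes DITA-sourced copy.
push("erp", "TR-8812", {"sku": "TR-8812", "price": 89.99})
push("pim", "TR-8812", {"name": "Trail Runner", "colour": "red"})
push("ccm", "TR-8812", {"care": "<p>Wipe clean with a damp cloth.</p>"})

print(sorted(cms_store["TR-8812"]))  # -> ['ccm', 'erp', 'pim']
```

Because the jobs are parallel and independent, any one system can be updated or replaced without touching the others.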

These are parallel processes. And just as content can be manipulated in many different ways, it can be shown according to reader criteria. That could mean that a reader logs in as a premium-package member and sees something different than a standard-package member. Or an administrator sees something different than an end user. Or a reader chooses some filters (women’s shoes, red, heeled, size 8) and sees content specific to their criteria.
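The shoe-filtering scenario can be shown in a few lines: the same content store, queried with whatever criteria the reader chooses (or is entitled to). The catalogue entries and field names are invented for the example.

```python
# Reader-driven filtering: one catalogue, different views per reader criteria.

catalogue = [
    {"name": "City Pump", "category": "women", "colour": "red", "heel": True, "size": 8},
    {"name": "Trail Runner", "category": "women", "colour": "blue", "heel": False, "size": 8},
    {"name": "Oxford Classic", "category": "men", "colour": "red", "heel": False, "size": 10},
]

def filter_shoes(items, **criteria):
    # Keep only items matching every criterion the reader selected.
    return [i for i in items if all(i.get(k) == v for k, v in criteria.items())]

matches = filter_shoes(catalogue, category="women", colour="red", heel=True, size=8)
print([m["name"] for m in matches])  # -> ['City Pump']
```

The same mechanism covers entitlement: swap the reader’s filter choices for attributes derived from their login (premium vs. standard, administrator vs. end user).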

Tagging content richly enough to meet all of these types of demands is hard to do in a Web CMS. Not just hard, but exceptionally hard. That is why companies that need to respond to market conditions in a hurry, or that want to output to multiple devices, channels, markets, or audiences, don’t put their content directly into a Web CMS. They put their content into a heavy-duty authoring system, and then let the Web CMS do what it does best: pipe the content into the right holes.

Someone asked me whether using DITA meant losing out on the ability to easily re-use and re-purpose content for different media and devices. Actually, it’s the other way around. Creating highly semantic content, what Ann Rockley would call Intelligent Content (I’d link to the site but each conference has a different URL; search for “intelligent content conference” and see what comes up), means being able to re-use and re-purpose content with infinitely more ease and agility. Intelligent content and DITA are not for the company that has fifty pages of highly crafted marketing content that never changes. They are for the companies whose writers are closer to being content engineers.




About the Author

Rahel Anne Bailie is a synthesizer of content strategy, requirements analysis, information architecture, and content management to increase the ROI of content. She has consulted for clients in a range of industries, and on several continents, whose aim is to better leverage their content as business assets. Founder of Intentional Design, she is now the Chief Knowledge Officer of London-based Scroll. A Fellow of the Society for Technical Communication, she has worked in the content business for over two decades. She is co-author of Content Strategy: Connecting the dots between business, brand, and benefits, co-editor of The Language of Content Strategy, and is working on her third content strategy book.



11 Responses to Holes in the Template: Piping content into the CMS

  1. Scott Abel says:

    Thanks for the mention (and the great content, as usual). The Intelligent Content Conference website can always be found at http://www.intelligentcontentconference.com. We’re in the process of revamping it for the next event, February 26-28, 2014 in San Jose, CA. More to come.

  2. Don Day says:

    What a great summary of how templating systems work, Rahel. There is an understandable tension between allowing the editing view to be influenced by the presentation style versus separating the editing view from the presentation. The XML writing pipeline is an extreme example of that separation. I’m actually not going to weigh in on which viewpoint is better because there are valuable points to both propositions. However, I do strongly believe that most such editing interfaces have been poorly integrated with their applications, something that Deane’s post also confirmed for me. Better tools will make for better and happier editors, and making those tools to produce intelligent content will make for happier end users, for sure.

  3. Joe Pairman says:

    Rahel, I’m very much enjoying this series of posts. It’s great that you’re bridging the culture gap between the worlds of structured content (in the nuts-and-bolts, WYSIWYM sense) and Web content. Having worked on various DITA-to-Web implementations over the last few years, I’m keenly aware of this gap. In part, it arises from misunderstandings of what actually fills the holes in the template.

    Of course, Web CMS developers/customizers understand very well the idea of importing a simple piece of content with a title field and some keywords. But I can think of at least three difficult concepts in the DITA-to-Web approach:

1. Compared with product info composed of discrete data types and sourced from a relational database, structured XML content can be far more complex, requiring more elaborate parsing on the WCMS side.

    2. The WCMS needs to be able to import large batches of content and also perform appropriate actions when pages are modified or removed in subsequent updates (e.g. don’t lose user comments, don’t change the sequence if that’s important, and only remove the pages that are supposed to be removed).

    3. The content may have a greater impact on the architecture of the final generated pages. For example, a given paragraph may need to become the meta description element in the generated page (as a suggestion for search result preview), while other elements may become WCMS search / categorization metadata, and still others require careful CSS treatment in order that the formatting reflects the semantic structure of the source content. (The latter being a good argument for a distinct block of CSS to manage that particular hole in the template.)
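Joe’s third point, routing different elements of imported structured content to different parts of the generated page, can be sketched with the standard library’s XML parser. The DITA-like snippet and the field mapping below are invented for illustration; real imports would handle namespaces, specialization, and much larger documents.

```python
# Mapping imported structured content onto page architecture: the short
# description becomes the meta description, keywords become WCMS metadata.
import xml.etree.ElementTree as ET

topic = ET.fromstring("""
<topic id="care-guide">
  <title>Caring for leather shoes</title>
  <shortdesc>How to clean and condition leather footwear.</shortdesc>
  <prolog><metadata><keywords>
    <keyword>leather</keyword><keyword>care</keyword>
  </keywords></metadata></prolog>
</topic>
""")

page = {
    "title": topic.findtext("title"),
    "meta_description": topic.findtext("shortdesc"),  # search result preview
    "tags": [k.text for k in topic.iter("keyword")],  # WCMS categorization
}
print(page["meta_description"])  # -> How to clean and condition leather footwear.
```

Even this toy version shows the shift Joe describes: the source content, not the template, is driving parts of the page’s architecture.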

    These are all solvable problems — very much so. But us structured content folk are going to have to continue explaining things for some time to come, I think.

  4. I have to admit that after signing up for the LinkedIn STC Single Sourcing SIG, I didn’t really revisit it. I will certainly rectify that mistake in the future. My experience with DITA and single-sourcing is limited but, purely from the standpoint of someone who appreciates efficiency and good sense, I am a total fan, maybe a groupie. Then I read an article such as this one and find out there are whole areas of use that I never even got to experience. Suddenly a content management system makes sense beyond a place to store all my bits and pieces.
    Your explanation of the holes in the template was elegant in its simplicity and I am humbled before you. I can’t wait to read through all of your other posts.

  5. Don Day says:

    Joe, I appreciate your candor about issues with DITA for the Web–I’m really trying to understand the concerns and not just be the DITA car salesman. At this point, if those of us from the structured content world are to have any influence on the future of content, our message is not about DITA but about the lessons we’ve learned from it and other structured content initiatives that can inform on future standards and conventions.

    I’ve just put up a test version of my nextgen XML server, so these impressions are very real to me:

1. Regarding complexity, DITA is hardly the only bad boy in the room. I’ll share an example someone showed me of what the Web world is starting to accept–first look at the visual organization of this page, then hit the Edit > View Form option on the right to see the actual data organization: http://docs.webplatform.org/w/index.php?title=css/properties/font-family. If you’ve followed the buzz about NPR’s COPE architecture, their forms are right up there in complexity as well. And 5 lines of PHP (with appropriate XSL passed as a parameter) will handle most output rendering requirements for XML as source. The problem is that our expectations of what to do with data have gotten more complex, and we are all in the growing pains of learning to manage down the visibility of the problem. And we techies, whether from the Web or structured content world, have to do some hard refactoring of our systems to keep today’s problems from becoming far worse in 5 to 10 years.
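Don’s “5 lines of PHP with XSL” claim is easy to believe once you see how little glue is needed to render XML source. Here is a hedged equivalent in Python instead of the PHP+XSL pair he mentions, using a hand-rolled transform rather than a real XSLT processor; the markup and element names are invented.

```python
# Tiny renderer for XML-as-source content: parse, then emit HTML.
# A production pipeline would use a proper XSLT processor instead.
import xml.etree.ElementTree as ET

def render(xml_text):
    root = ET.fromstring(xml_text)
    items = "".join(f"<li>{li.text}</li>" for li in root.iter("li"))
    return f"<h1>{root.findtext('title')}</h1><ul>{items}</ul>"

html = render("<doc><title>Fit guide</title><li>Measure feet</li><li>Check width</li></doc>")
print(html)
# -> <h1>Fit guide</h1><ul><li>Measure feet</li><li>Check width</li></ul>
```

The rendering step really is the small part; as Don says, the hard part is managing the complexity of what we now expect to do with the data.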

    2. This observation is spot on, and it highlights the huge difference between disposable assets (product pages especially are notorious for link rot) and resources on the Web that are more true to the original vision of persistence at a fixed address. I’m willing to bet it is more of a policy matter than a technology matter. XML is just an asset–how are you going to manage it, or images, or email, or data on thumb drives, or myriad other squirming pieces in this content wasteland? By the way, my view is that renditions and behaviors may change (a la Geocities and Myspace pages), but content addresses (for the stuff that goes in the holes on the ever-changing page) should not–in an ideal world.

    3. I think this observation could be made about relational data models as well–our uses for data may change, and that will change how we map that data model to our changing visualizations, something artists have been doing for a long time. I think it is remarkable that WordPress proponents speak of their platform as a full-fledged CMS. I’m sorry, but blog content authored with typical web editors compares to DITA or DocBook in what meaningful semantic way? Structured content is a blessedly unsolved problem because we’ve got real room to keep tweaking out new uses for it.

    Does that mean there is job security in structured content? Well, we’ve got to keep selling it, and following onto your conclusion, keep the messages honest and show compelling value.

  6. Ellis Pratt says:

    I think the first sentence is meant to say
    it is likely

  7. rahelab says:

    Ah. Will take care of this when I am out of hospital.

  8. Joe Pairman says:

    Rahel, glad to see you posting on Twitter again – was a little concerned after reading your comment here. Hope all’s OK.

    Don, I always enjoy reading your thoughts and explorations in this area. Over and above the technical part, I sense your excitement in doing truly groundbreaking stuff.

    My comments certainly weren’t intended to be pessimistic — the innovations and successes come through addressing these kinds of challenges on all levels, technical and psychological.

    To give more context to my comments, I was specifically addressing the DITA-to-Web scenario, where content is initially assembled (e.g. DITA-OT publishing to HTML) before being brought into the WCMS. There’s a difference in WCMS customizers’ perception of content complexity depending on whether the content has been developed within the WCMS or is imported from outside. The former uses familiar structures and the complexity may have gradually increased over time — the frog starts in cold water and doesn’t notice the gradual increase in temperature! In contrast, importing a bunch of highly-structured DITA-sourced content can be like putting the frog straight into hot water.

    Again in this context, content lifecycle management becomes not only a matter of managing that within the WCMS, with the attendant policy issues that you mentioned, but also interpreting the intended actions based on the content packages that are brought in. So in this way there can actually be technical challenges depending on the assumptions of the WCMS architecture and how easily it can be customized.

    Also on that second point of mine, you raise an interesting point about stable content addresses. For sure, they shouldn’t change. But what exactly should a single content address refer to? The answer may vary depending on just how much dynamic rendering you expect the WCMS to do. If you take the DITA-for-the-Web approach, where content is dynamically assembled on demand, it may well be possible to cover multiple product variations with a single topic and a single content address. In contrast, if you’re bringing in ready-assembled content blocks, there may well need to be different addresses for a single logical topic depending on the product. There are interesting implications for SEO — on the one hand you want people to be able to land on appropriately customized info straight away; on the other hand external search engines don’t deal particularly well with near-identical pages that differ only in small product-specific details.
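Joe’s single-address-versus-per-variant question can be made concrete with a small sketch: one stable content address, with the product variant resolved at request time from a parameter. The addresses, variants, and copy below are all invented for illustration.

```python
# One stable content address, many product variants: the variant is
# resolved at request time rather than baked into separate addresses.

variants = {
    ("care-guide", "pro"): "Pro model: condition monthly.",
    ("care-guide", "lite"): "Lite model: wipe clean only.",
}

def serve(address, product="lite"):
    # Same address for every reader; the product parameter picks the variant.
    return variants[(address, product)]

print(serve("care-guide", product="pro"))  # -> Pro model: condition monthly.
```

This is the dynamic-assembly end of the trade-off Joe describes; the ready-assembled alternative would mint one address per variant, with the SEO consequences he notes.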

    On my third point, yes, again it’s nothing new to use data in various ways, but it can still be a psychological barrier when taken together with the other requirements. You’re bringing in all this complex content, AND it’s in large batches that need careful lifecycle management, AND you want it to do a bunch of different things on the pages and the navigation!

    So those are some of the challenges for the DITA-to-Web scenario. There are plenty of advantages of doing things that way too, of course. I’ve gone on a bit long here, but very briefly, they include:
    1. Performance. When delivering content on a large scale, dynamic rendering of complex content can be a huge challenge.
    2. Content customizations that go above and beyond what can reasonably be expected using standard DITA reuse mechanisms (i.e. instead of conditions and keyrefs, you’re forced to use a different topic to fulfill the same kind of logical role).
    3. Separation of concerns on organizational or architectural levels. Sometimes it’s easier and more cost-effective to separate content delivery from content customization.

    Looking forward to hearing your further thoughts and also seeing Rahel’s next post in this very useful series.

  9. Don Day says:

    I’ll be a bit more brief, Joe, because I’m revising the “DITA as blog post” interface in expeDITA just to make sure it is ready for us to test some of these concerns against. 😉

    1. Performance of dynamic rendering of complex content on large scale: I completely grok this concern. I think it is workable on a smaller scale, but just how small is still a justifiable application? Most large scale content delivery applications are also highly optimized for very narrow ranges of service (WPEngine here in Austin is an example), but there are many general sites that manage their lower traffic levels just fine, like my own self-hosted DITA per Day blog. We need tools like expeDITA and SPFE to be available asap so that we can start to test those concerns and find which new communities might actually benefit from these exploratory investments.

    2. Content customizations: yes, these need to be completely re-explored taking into account ideas like Steve Pemberton’s “Invisible XML” (http://www.balisage.net/Proceedings/vol10/html/Pemberton01/BalisageVol10-Pemberton01.html) and all the clever functional markup coming from the Markdown and reStructured Text communities. Inserting non-text content and other linking associations ought to be far easier than current markup systems support. The new figure and figcaption elements in HTML5 are great for attaching regular styling to, but do we have a prayer of hope that an SME inserting an image will put the right elements in the right nesting with the right attributes? I’m a bit frustrated on these issues because I’ve worked with vendors to improve the behaviors of systems, but the development investment to make user actions seem simpler remains a high barrier to bringing simple interfaces to the masses. It’s an exquisitely quixotic conundrum. So how are projects like Open Web Platform and Drupal enabling distributed contributors to create functional, semantically rich, and effective content for their projects? If anything, “DITA for the Web” should imply that we need wide open and creative conversation on how to improve the smartness of content on the Web.

    3. Separation of concerns on organizational or architectural levels: yes, and yes. But in my mind, some of that is already behind us: the schemas that manage the data models in our XML stores are just a different encoding of the essentially equivalent schemas that manage the data models in our SQL databases. Both web and XML storage systems (by whatever name you call them) implicitly have the right genetics to instance that data into views in nearly equivalent ways (“xsl:template” is a view with holes for content, right?). Both systems offer rich tools for doing live content adaptation. Given this near equivalence, does XML deserve its reputation as a bad fit for direct to web publishing? Probably so, and for reasons that we should learn from so that we can avoid those potholes on the road to a converged architecture.

    Joe, let me ask you a question that was put to me this past week: What does the Web world need to know that the XML world knows, and vice versa? Those seem to be the talking points that really matter in this discussion. I’ve hit on a few from my perspective here, but I’ve also recognized some dogma potholes in my own working process that I need to lay aside as no longer as important as I once thought.

  10. Hi Rahel,

    Great post. I call this pattern the CMS Decorator Pattern or Data Decorator Pattern. Here is how I explain it: http://contenthere.net/2009/07/the-cms-decorator-pattern.html

  11. rahelab says:

    Seth, your post is so spot-on. It’s the difference between design-driven content and content-driven design. Thanks for pointing this out.
