Tuesday, September 1, 2015

Generating DocBook XML from C#.NET

DocBook is a semantic markup language for technical documentation. As a semantic language, DocBook enables its users to create document content in a presentation-neutral form that captures the logical structure of the content; that content can then be published in a variety of formats, including HTML, XHTML, EPUB, PDF, man pages, Web help and HTML Help, without requiring users to make any changes to the source. Conversion from DocBook to other formats can be performed via XSLT transformations. Hence DocBook is quite useful in content authoring as a generic format for storing data.
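To illustrate the XSLT step, here is a minimal sketch using XslCompiledTransform from System.Xml.Xsl. The stylesheet is a toy written for this example (an assumption, not the official DocBook XSL stylesheets), and the class and method names are hypothetical. It only maps an article title and its paragraphs to HTML.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class DocBookXsltSketch
{
    // Toy stylesheet (assumption): handles only <article>, <title> and <para>.
    private const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:db='http://docbook.org/ns/docbook'
    exclude-result-prefixes='db'>
  <xsl:output method='html' indent='no'/>
  <xsl:template match='/db:article'>
    <html><body>
      <h1><xsl:value-of select='db:title'/></h1>
      <xsl:apply-templates select='db:para'/>
    </body></html>
  </xsl:template>
  <xsl:template match='db:para'>
    <p><xsl:value-of select='.'/></p>
  </xsl:template>
</xsl:stylesheet>";

    // Small DocBook 5.0 document used as sample input.
    private const string Article = @"<article xmlns='http://docbook.org/ns/docbook' version='5.0'>
  <title>Hello DocBook</title>
  <para>First paragraph.</para>
</article>";

    // Applies the toy stylesheet to a DocBook string and returns the HTML.
    public static string ToHtml(string docBookXml)
    {
        var transform = new XslCompiledTransform();
        using (var xsl = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsl);
        }

        using (var input = XmlReader.Create(new StringReader(docBookXml)))
        using (var output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ToHtml(Article));
    }
}
```

For real publishing you would load the DocBook XSL stylesheets instead of the toy one; they are much larger and may need extra XSLT settings, which this sketch does not cover.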

If maintainability and ease of implementation are the priorities, a good approach is to use a C#.NET equivalent of Java's JAXB. With JAXB, POJO (Plain Old Java Object) classes can be generated from XML schema definition files. The generated POJOs can then be used to serialize data to, and deserialize it from, XML files.
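As a rough picture of what this looks like on the .NET side, the sketch below round-trips a hand-written class through XmlSerializer. SimpleArticle is a heavily simplified stand-in invented for this example (an assumption); classes generated from the real DocBook schema are far larger.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hand-written, simplified stand-in for a generated class (assumption).
[XmlRoot("article", Namespace = "http://docbook.org/ns/docbook")]
public class SimpleArticle
{
    [XmlAttribute("version")]
    public string Version { get; set; }

    [XmlElement("title")]
    public string Title { get; set; }

    [XmlElement("para")]
    public List<string> Paragraphs { get; set; }
}

public static class PocoRoundTripSketch
{
    // Object -> XML text
    public static string Serialize(SimpleArticle article)
    {
        var serializer = new XmlSerializer(typeof(SimpleArticle));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, article);
            return writer.ToString();
        }
    }

    // XML text -> object
    public static SimpleArticle Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(SimpleArticle));
        using (var reader = new StringReader(xml))
        {
            return (SimpleArticle)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var article = new SimpleArticle
        {
            Version = "5.0",
            Title = "Hello DocBook",
            Paragraphs = new List<string> { "First paragraph." }
        };

        string xml = Serialize(article);
        Console.WriteLine(xml);

        SimpleArticle copy = Deserialize(xml);
        Console.WriteLine(copy.Title);
    }
}
```

The attributes play the role that JAXB annotations play in Java: they tell the serializer which XML element or attribute each property maps to.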

Tools such as xsd.exe and Xsd2Code, both covered below, can be used to generate POCO (Plain Old CLR Object) classes from the DocBook schema files.


I first tried xsd.exe, which ships with the Visual Studio SDK, to generate C# classes from the DocBook XSD file. Here is how to use it.

  1. Download the XSD schema file from http://www.docbook.org/xml/5.0/xsd/docbook.xsd and place it in a folder. (I used the DocBook version 5.0 schema.)
  2. Open the Visual Studio command prompt and cd to the folder that contains docbook.xsd.
  3. Run the command: xsd docbook.xsd /c /l:CS
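If you would rather script these steps than type them, the sketch below does the same from C#. It assumes network access and that xsd.exe is on the PATH (as it is inside the Visual Studio command prompt). The class and method names are hypothetical; only the URL and the /c /l:CS switches come from the steps above.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Net;

public static class XsdExeSketch
{
    // Builds the same arguments as step 3: xsd docbook.xsd /c /l:CS
    public static string BuildArguments(string schemaFile)
    {
        return "\"" + schemaFile + "\" /c /l:CS";
    }

    // Downloads the schema into workingFolder and runs xsd.exe there.
    // Returns the exit code of xsd.exe.
    public static int GenerateClasses(string workingFolder)
    {
        string schemaPath = Path.Combine(workingFolder, "docbook.xsd");
        using (var client = new WebClient())
        {
            client.DownloadFile("http://www.docbook.org/xml/5.0/xsd/docbook.xsd", schemaPath);
        }

        var startInfo = new ProcessStartInfo
        {
            FileName = "xsd.exe",               // assumed to be on the PATH
            Arguments = BuildArguments("docbook.xsd"),
            WorkingDirectory = workingFolder,
            UseShellExecute = false,
            RedirectStandardOutput = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Echo the tool's output so warnings and errors stay visible.
            Console.Write(process.StandardOutput.ReadToEnd());
            process.WaitForExit();
            return process.ExitCode;
        }
    }

    public static void Main()
    {
        int exitCode = GenerateClasses(Directory.GetCurrentDirectory());
        Console.WriteLine("xsd.exe exited with code " + exitCode);
    }
}
```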

If you get the following errors while generating classes from the DocBook XSD, the easiest solution is to comment out the lines named in the warnings and run the same xsd.exe command again :)

Microsoft (R) Xml Schemas/DataTypes support utility
[Microsoft (R) .NET Framework, Version 4.0.30319.18020]
Copyright (C) Microsoft Corporation. All rights reserved.
Schema validation warning: The 'http://www.w3.org/1999/xlink:href' attribute is not declared. Line 46, position 6.
Schema validation warning: The 'http://www.w3.org/1999/xlink:type' attribute is not declared. Line 47, position 6.
Schema validation warning: The 'http://www.w3.org/1999/xlink:role' attribute is not declared. Line 48, position 6.
Schema validation warning: The 'http://www.w3.org/1999/xlink:arcrole' attribute is not declared. Line 49, position 6.
Schema validation warning: The 'http://www.w3.org/1999/xlink:title' attribute is not declared. Line 50, position 6.
Schema validation warning: The 'http://www.w3.org/1999/xlink:show' attribute is not declared. Line 51, position 6.
Schema validation warning: The 'http://www.w3.org/1999/xlink:actuate' attribute is not declared. Line 52, position 6.
Schema validation warning: The 'http://www.w3.org/1999/xlink:label' attribute is not declared. Line 7515, position 8.
Schema validation warning: The 'http://www.w3.org/1999/xlink:from' attribute is not declared. Line 7522, position 8.
Schema validation warning: The 'http://www.w3.org/1999/xlink:to' attribute is not declared. Line 7523, position 8.
Warning: Schema could not be validated. Class generation may fail or may produce incorrect results.
Error: Error generating classes for schema 'docbookV5'.
  - The attribute href is missing.
If you would like more help, please type "xsd /?".

Although I was able to generate POCOs with xsd.exe, the generated classes threw a StackOverflowException when I tried to initialize an XmlSerializer with them.

As a remedy, Xsd2Code can be used instead of xsd.exe to generate the POCOs. The steps for using Xsd2Code can be found at http://xsd2code.codeplex.com/. The classes generated by Xsd2Code worked without any issues.

After generating the POCOs, you can include them in your Visual Studio project and use them to write to and read from DocBook XML files by serializing and deserializing. System.Xml.Serialization.XmlSerializer can be used for this purpose, and the following utility class wraps it.

using System.IO;
using System.Text;
using System.Xml.Serialization;

namespace DocBookXml
{
    /// <summary>
    /// Converts objects of type T to and from XML using XmlSerializer
    /// </summary>
    public static class XmlConverter<T>
    {
        // One serializer per closed generic type, created once and reused
        private static readonly XmlSerializer _serializer;

        /// <summary>
        /// Static constructor that initialises the serializer for this type
        /// </summary>
        static XmlConverter()
        {
            _serializer = new XmlSerializer(typeof(T));
        }

        /// <summary>
        /// Deserialize the supplied XML into an object
        /// </summary>
        /// <param name="xml">XML text to deserialize</param>
        /// <returns>The deserialized object</returns>
        public static T ToObject(string xml)
        {
            using (var reader = new StringReader(xml))
            {
                return (T)_serializer.Deserialize(reader);
            }
        }

        /// <summary>
        /// Serialize the supplied object into XML
        /// </summary>
        /// <param name="obj">Object to serialize</param>
        /// <returns>The XML text</returns>
        public static string ToXML(T obj)
        {
            using (var memoryStream = new MemoryStream())
            {
                _serializer.Serialize(memoryStream, obj);

                // XmlSerializer writes UTF-8 to a stream by default
                return Encoding.UTF8.GetString(memoryStream.ToArray());
            }
        }
    }
}
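The utility class above works on strings. For reading and writing files directly, a similar sketch can serialize to and from a file stream. SimpleBook is a hypothetical minimal type standing in for a generated DocBook class, and the helper names and the temp-file path are assumptions made for this example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical minimal type standing in for a generated DocBook class.
[XmlRoot("book", Namespace = "http://docbook.org/ns/docbook")]
public class SimpleBook
{
    [XmlElement("title")]
    public string Title { get; set; }
}

public static class XmlFileSketch
{
    // Serializes obj as XML into the file at path, overwriting it if present.
    public static void Save<T>(T obj, string path)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var stream = File.Create(path))
        {
            serializer.Serialize(stream, obj);
        }
    }

    // Reads the XML file at path back into an object of type T.
    public static T Load<T>(string path)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var stream = File.OpenRead(path))
        {
            return (T)serializer.Deserialize(stream);
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "book-sketch.xml");
        Save(new SimpleBook { Title = "Hello DocBook" }, path);
        Console.WriteLine(Load<SimpleBook>(path).Title);
    }
}
```

Unlike the utility class, this sketch creates a new XmlSerializer on every call, which keeps it short; caching the serializer as the utility class does is the better choice for repeated use.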

Reference documentation for DocBook XSL transforms
HTML edition of book explaining the use of DocBook XSL