
# How Format-Agnosticism Enables the Semantic Web?

semweb, web 3.0 @ 24 July 2008

Sitting in my RSS feed reader was an article by Paul Wlodarczyk titled “How XML Enables the Semantic Web”, which I think is incredibly wrong. XML is just one option for realising the overall idea behind the Semantic Web; it is not the only one.

I actually think that we should see the Semantic Web (and more specifically the Linked Data Web) as Format Agnostic. The truth of the matter is that the average web user (and business client) does not care at all about the underlying formats. They just want something that is useful and works (the more useful and the more stable, the better). Linked Data is all about interconnecting information, so as long as that information is objective, meaningful and processable, it can be turned into whatever format/framework you like from whatever format/framework is available (whether that's Linked Data RDF, Topic Maps, DITA or even Microformats/RDFa-based XHTML, on either side of the conversion)… this is what being Format Agnostic means!
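To make that a bit more concrete, here is a minimal Python sketch (standard library only) that takes the same handful of statements and writes them out twice: once as N-Triples and once as an RDFa-flavoured XHTML fragment. The `example.org` subject URI, the choice of Dublin Core properties, and the function names `to_ntriples` and `to_rdfa` are all assumptions made up for illustration. A real project would more likely reach for a proper RDF toolkit than hand-rolled string formatting.

```python
from html import escape

# Illustrative assumptions: the subject URI is hypothetical, and Dublin Core
# is used only because its title/date properties are widely known.
DC = "http://purl.org/dc/elements/1.1/"
POST = "http://example.org/posts/format-agnosticism"

# The data itself: plain (subject, predicate, literal) tuples, tied to no format.
TRIPLES = [
    (POST, DC + "title", "How Format-Agnosticism Enables the Semantic Web?"),
    (POST, DC + "date", "2008-07-24"),
]


def escape_literal(value):
    """Escape backslash, quote and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples):
    """Serialise the tuples as N-Triples, one statement per line."""
    return "\n".join(
        f'<{s}> <{p}> "{escape_literal(o)}" .' for s, p, o in triples
    )


def to_rdfa(triples, prefix="dc", namespace=DC):
    """Serialise the same tuples as an XHTML fragment with RDFa attributes.

    Simplification: all tuples must share one subject and one namespace.
    """
    subject = triples[0][0]
    spans = []
    for s, p, o in triples:
        if s != subject or not p.startswith(namespace):
            raise ValueError("this sketch handles one subject and one namespace")
        local_name = p[len(namespace):]
        spans.append(
            f'  <span property="{prefix}:{local_name}">{escape(o)}</span>'
        )
    opening = f'<div xmlns:{prefix}="{namespace}" about="{escape(subject)}">'
    return "\n".join([opening, *spans, "</div>"])


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
    print(to_rdfa(TRIPLES))
```

Nothing in `TRIPLES` knows or cares which serialisation it ends up in, which is really the whole point: the meaning lives in the statements, and the format is a choice you make at the edge.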

So when Paul writes:

In fact, creating the Semantic Web might be as easy as authoring content in DITA.

I think that’s right, but I think it’s equally right to say one of the following:

  • In fact, creating the Semantic Web might be as easy as authoring content in RDF/XML, RDF/N3 or RDF/Turtle
  • In fact, creating the Semantic Web might be as easy as authoring content in (POSH) XHTML+RDFa
  • In fact, creating the Semantic Web might be as easy as authoring content in (POSH) XHTML+Microformats
  • In fact, creating the Semantic Web might be as easy as authoring content in XTM

The format itself is irrelevant. The key questions we have to ask ourselves (as developers) when choosing a format/framework are:

  1. Does the format express the full and true meaning of this data?
  2. Does the format show the data in an objective manner?
  3. Does it allow the data to be interconnected across the web using dereferenceable URIs? (In other words, is it capable of providing Linked Data?)
  4. Can it be sponged/scraped/transformed into another format/framework? (This might be done using technologies such as GRDDL, Fresnel, XSLT or something like a Virtuoso Sponger.)

Let’s be pro-format-agnostic!

I invite your comments (especially from Paul, if you’re reading).

2 Responses to “How Format-Agnosticism Enables the Semantic Web?”

  1. Paul Wlodarczyk Says:

    Hi Daniel, thanks for your comments on my commentary. I agree with your assertion in favor of “format agnosticism” (although that is a bit of a mouthful - you may want to rethink the branding on that a bit :) ). I never meant for a moment to suggest that XML (let alone DITA) was the *only* way to create content ready for the semantic web. Instead, the point of the commentary was to point out to the legion of existing producers of structured content that semantic search is an emerging motivation for why they should start thinking about adding more semantic markup. Also, those current users of XML document technology are in an advantaged position to automate semantic markup in the content they are already generating in XML or as DITA. Users of DITA face a particular dilemma - authoring documents in chunks (DITA topics) requires classification of those topics at check-in time to a CMS. A classification web service is one means of addressing the need for automating specification of CMS metadata. So technology like OpenLink’s can meet a need in the Information Development market. Also, business users are “semantics agnostic” - they don’t understand or care about tagging entities. “Tag as you type” utilities - integrated with authoring tools - could improve the quantity, quality, and consistency of markup. Finally, when that XML content is rendered to HTML pages, one could (with a small modification to the DITA Open Tool Kit) automatically render all of the markup required for linked data. In summary, the point was to connect the dots between emerging practices in the document publishing world (which is getting more structured every day) and the web publishing world (which is getting more linked and more semantic every day). Cheers, Paul

  2. Paul Wlodarczyk Says:

    FYI the post is now being maintained here along with comments: https://thecontentguy.net/blog/2008/08/15/connecting-the-dots-how-xml-authoring-enables-the-semantic-web/
