trees

Archive for 'technical'

Two Linked Data Seminars

To all my readers,

Many of you know that I’ve been working with/for OpenLink Software for a while now as a Consultant in the realm of Linked Data, particularly in the UK and Europe. Well, I would like to announce the launch of two seminars that I will be running for OpenLink Software in late November 2011 in London (England):

  • “Linked Data – Commercial Perspective for Strategic Decision Makers and Executives” is an exciting new seminar for Strategic Decision Makers, Executives, Investors, Directors, Management, CxOs etc. It will cover, in non-technical fashion, the simplicity of Linked Data, its rich and mature history, its business opportunities, its business challenges and its societal implications.
    • Visit http://www.eventbrite.com/event/2069248177 to register your interest in the “Linked Data – Commercial Perspective for Strategic Decision Makers and Executives” seminar. Late November 2011 in London.
  • “Linked Data – Commercial Perspective for Technologists” is also an exciting seminar, specifically tailored for technologists of any level (Technical Directors, Senior or Junior Programmers, Analysts, Knowledge Engineers, Knowledge Managers, Information Architects, Web Developers). It will be introductory in style, and will cover the technical side of the rich historic tapestry of Linked Data, some of the more technical issues that Linked Data solves, and the simplicity of implementing Linked Data.

If you are interested in attending then please do register your interest on the Eventbrite pages listed above. Once you have registered interest you will receive updates about the date, time, location and cost of the seminars. Please note that, as these are commercially-orientated seminars, there will be a charge to attend. Registering interest, however, is free and will not commit you to purchasing a ticket.

If you would like to get in touch with me directly about these seminars, or any of my work with OpenLink Software, then please do so by email: dlewis@openlinksw.com

I look forward to hearing your thoughts on the course, and I hope to see some of you at my seminars in the future.

Daniel Lewis

  • Professional Services Consultant for OpenLink Software

Facebook is the home of profiles for People, Comments, Groups, Pages, Games and Interests. It has traditionally been a very closed, walled system, with only the ability to link internally (e.g. many Facebook People Profiles link to a Facebook Group Profile) or to link outward (e.g. this Person gave this comment about this website).

However, this is slowly changing. We now see:

  • “Facebook Like” buttons on websites around the web, allowing a Facebook user to easily say that they like a page.
  • “Facebook Comments” on websites rather than being only within Facebook itself. For instance, the commenting system on Techcrunch.com is provided by the Facebook Comments system.
  • The Social Graph API, which allows developers to use Facebook’s proprietary schema, serialised as JSON. With a few mappings, this effectively allows people to link into the Facebook system – and potentially grab the data (for data portability, or query purposes).

You see that all this Facebook data is starting to whirl around the world-wide-web in an increasingly “open” fashion. So we should start using it for our own good, and not just for the good of the Facebook Corporation.

So how can we do this? The first way is to see what you can do with the new RDF/Turtle API interface that Facebook has developed. If you have curl installed then you’ll be able to do this:

curl -L -H 'Accept: text/turtle' graph.facebook.com/danieljohnlewis

Which returns the results:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix api: <tag:graph.facebook.com,2011:/> .
@prefix og: <http://ogp.me/ns#> .
@prefix fb: <http://ogp.me/ns/fb#> .
@prefix : <http://graph.facebook.com/schema/~/> .
@prefix user: <http://graph.facebook.com/schema/user#> .
</277003772#>
user:id "277003772" ;
user:name "Daniel Lewis" ;
user:first_name "Daniel" ;
user:last_name "Lewis" ;
user:link <http://www.facebook.com/danieljohnlewis> ;
user:username "danieljohnlewis" ;
user:gender "male" ;
user:locale "en_GB" .

There are of course other things you can do and “grab” once you’ve used authentication etc. I should also point out that neither the RDF/Turtle format nor the RDF framework is actually required for “Linked Data”; all that is really required is the use of URIs/IRIs as dereferenceable object identifiers.
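For those who would rather script this than use the command line, here is a minimal Python sketch of the same content-negotiated request. It only builds the request, using the same URL and Accept header as the curl command above. The function name is my own, and whether the endpoint still answers with Turtle is an assumption based on the behaviour described in this post.

```python
from urllib.request import Request


def build_turtle_request(username: str) -> Request:
    """Build (but do not send) a GET request that asks for Turtle.

    The Accept header is what tells the server which representation
    of the resource we would like back.
    """
    url = "http://graph.facebook.com/" + username
    return Request(url, headers={"Accept": "text/turtle"})


request = build_turtle_request("danieljohnlewis")
print(request.get_method(), request.full_url, request.get_header("Accept"))

# To actually send it (urllib follows redirects, much like curl -L):
#   from urllib.request import urlopen
#   turtle_text = urlopen(request).read().decode("utf-8")
```

The point of the sketch is that the identifier stays the same and only the Accept header changes, which is what makes the URI dereferenceable in more than one format.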

However, this is all very developer-centric, and wouldn’t make much sense to the average user. So why not plug the Linked Data from Facebook into a Data Exploration engine such as the Virtuoso Description Page view? See my version here:
http://linkeddata.uriburner.com/about/html/http/www.facebook.com/danieljohnlewis

For more information about exploring a Facebook Person Profile using Virtuoso and OpenLink Data Explorer see the documentation: http://ode.openlinksw.com/FacebookPersonProfile.html

In conclusion, Facebook, which has traditionally been a data silo, is becoming a linkable data set. This is a good thing. Granted, Facebook does still have many rough edges (particularly regarding privacy/security issues), but hopefully we will see more of a progression into a truly Distributed/Decentralised Data-orientated Web. Facebook’s progression into “opening up” should be a call to many other data-silo/walled-garden type data services to better Facebook by becoming truly user-friendly systems, and by opening up their data, which is rightfully owned by their users.

[UPDATE - Highly recommend you read the following]

On 30th September my good friend Kingsley Idehen summarised “Facebook and Linked Data” in a wonderfully understandable Google+ Post (available here: https://plus.google.com/112399767740508618350/posts/6cqa1Sxk5KV (last accessed: 13th October 2011 at about 3:30pm BST)). Kingsley highlights how Facebook has given the Linked Data Web a bit of an evolutionary bump, using its Graph URIs and accessibility functions.

I would very highly recommend reading through Kingsley’s post, as it seems to make a lot more sense than my own “quick post” from earlier on.

[/UPDATE]

Quick Post: Whirling Databases

For the true technologist there is a clear progression from Relational Databases to Object Databases (OO or ORM) to Graph Databases (including Linked Data Triple/Quad Stores). It is possible to “automatically devolve” (for want of a better phrase) newer data structures into the old data structures… but that’s not what I am trying to get to today.

I’m coming across many technologists who are forming cliques, and their language is becoming restricted to their cliques. This is worrying, because it forms islands which don’t trade (to use business terminology). Not only this, but it also restricts access for the average person in the street: the technologies and tools that these islands create can become more and more distant from their potential users.

The idea of “Whirling Databases” is not to see “Databases” in terms of a specific data structure or data management system, but to see databases as a generic repository for information, capable of inputting and outputting data in different formats and frameworks. In a Linked Data system, data needs to “whirl” around the web using “links” as their travelling routes. We should work together, collaboratively and collectively to achieve this.

As some of you know, I’ve recently been working quite closely with OpenLink Software to help them help others learn about Linked Data. Linked Data, as a generic term, is an incredibly powerful tool – and a tool that should never get bogged down in frameworks (such as RDF) or formats (such as RDF/XML). It should be applicable to all frameworks and formats capable of providing outbound links and of receiving inbound links. I’ve been working with Virtuoso Universal Server solidly for over a year now (not just with OpenLink Software, but with other businesses too), and I truly believe that it allows for this travelling via “links” in Linked Data for a variety of frameworks and formats – this is powerful stuff!

Note: This guide is part two of the previous blog post on Importing Linked Data into a Spreadsheet.

Introduction and Theory:

Say you don’t want your data in Google Spreadsheet, but would prefer it in Excel, OpenOffice, LibreOffice or some kind of standalone desktop application on your computer. There is still potential to work with dynamic Linked Data – via the powers of WebDAV (which is a technology allowing the establishment of an “online hard drive” over the protocols that power the world-wide-web).

A WebDAV URL is also a Data Source Name (aka an “address”). Because it is a URL, it can be linked to – so it is still Linked Data, and yet it can be treated as a store. This is one of the many powers of the Linked Data Web.

Once Method Part One is done there are two options for the tutorial: the first “Part Two” (A) deals with the data in LibreOffice (and I presume that the process is very similar in contemporary versions of OpenOffice), and the second “Part Two” (B) deals with the data in Excel (I’ve used version 2010 on Windows 7).

Prerequisites:

  • You will need a copy of Virtuoso on your machine up and running (the enterprise edition and the open source edition should both work). You must also have administrative access to it.
  • A new-ish version of LibreOffice, OpenOffice or Microsoft Excel
  • An operating system that can cope with WebDAV (which seems to be most of them these days – to varying degrees of success)

An Example Query:

This method is ideal for fast-paced data – data that changes often, such as statistics or locations of crime etc. However, for now I’ve just used a simple lat-long search of those “areas” that touch my local area of “Long Ashton and Wraxall”.

SELECT DISTINCT ?TouchesAreaURI ?TouchesName ?TouchesAreaLat ?TouchesAreaLong
WHERE {
  <http://data.ordnancesurvey.co.uk/id/7000000000000770> <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?TouchesAreaURI .
  GRAPH ?TouchesAreaURI {
    ?TouchesAreaURI <http://www.w3.org/2000/01/rdf-schema#label> ?TouchesName ;
      <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?TouchesAreaLat ;
      <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?TouchesAreaLong
  }
}

Method Part One: Generic

As mentioned in the prerequisites – you will need administrative access to Virtuoso in order to fully run through this tutorial. This is because we need to create folders which need to be attached to the SPARQL user in order for the /sparql endpoint to save to WebDAV.

  1. Administrative Setup:
    1. Login to Conductor
    2. Go to System Admin > User Accounts
    3. Click “Edit” next to the SPARQL user
    4. Change the following:
      • DAV Home Path: /DAV/home/QL/ (you could call the “QL” folder whatever you like – just remember what you’ve changed it to)
      • DAV Home Path “create”: Checked
      • Default Permissions: all checked
      • User Type: SQL/ODBC and WebDAV
    5. Click Save.
    6. Go to Web Application Server > Content Management > Repository
    7. Navigate the WebDAV to: DAV/home/QL (or whatever you named “QL”)
    8. Click the New Folder Icon (it looks like a folder with an orange splodge on the top-left)
    9. Make a new folder:
      • Name: saved-sparql-results (must not be different!)
      • Permissions: all checked
      • Folder Type: Dynamic Resources
    10. Click Create
  2. Query and Data Setup:
    1. Hit http://<server>:<port-usually-8890>/sparql
    2. Enter the SPARQL query (e.g. The Example Query above)
    3. Change the following:
      • Change to a grab everything type query – i.e. “Try to download all referenced resources (this may be very slow and inefficient)”. Or one of the other options – dependent on the locations, data and the query.
      • Format Results as: Spreadsheet (or CSV)
      • “Display the result and do not save” change this to “Save the result to the DAV and refresh it periodically with the specified name:”
      • Add a filename (with file extension). For example testspreadsheet.xls (or testspreadsheet.csv)
    4. Click “Run Query”
    5. You’ll now see a “Done” screen, with the URI of the result; this is a WebDAV accessible URL. Please take note of the URL of the “saved-sparql-results” folder, it should look a little like this: http://<server>:<port-usually-8890>/DAV/home/QL/saved-sparql-results
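If you end up doing this for several machines or folders, the URL pattern from the steps above can be captured in a few lines of Python. This is only a sketch of the naming convention used in this tutorial: the function name and its parameters are my own, and it assumes the default port of 8890 and the “QL” home folder unless told otherwise.

```python
from urllib.parse import quote


def saved_results_url(server, port=8890, home_folder="QL", filename=None):
    """Return the WebDAV URL of the saved-sparql-results folder,
    or of one saved file inside it when a filename is given."""
    base = "http://{}:{}/DAV/home/{}/saved-sparql-results".format(
        server, port, quote(home_folder)
    )
    if filename is None:
        return base
    return base + "/" + quote(filename)


# The folder URL is what you browse to from the spreadsheet application;
# the file URL points at one saved result.
print(saved_results_url("localhost"))
print(saved_results_url("localhost", filename="testspreadsheet.csv"))
```

Both LibreOffice and Excel are given the folder URL (without the filename) in the steps that follow, which is why the sketch returns the two forms separately.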

Method Part Two A: For LibreOffice and OpenOffice users

You will need to do the following in order to connect to a WebDAV folder:

  1. Go to Tools > Options > LibreOffice/OpenOffice > General and ensure that “Use LibreOffice/OpenOffice dialogue boxes” is turned on.

You will be linking dynamically from your spreadsheet to the resource on your WebDAV instance:

  1. Start a new spreadsheet, or load up a spreadsheet where you want the resource to go.
  2. Go to Insert > Link to External Data
  3. Click the “…” button
  4. Enter your “saved-sparql-results” URL (not including the filename itself!), and press enter
  5. You should now see your “saved-sparql-results” WebDAV directory. Select the file, and click insert. The program will then probably ask you for your dav login details (you may want to make the program remember the details), it may also ask you about the format of the file – just follow that through how you would normally when importing/opening a file. You may also have to select “HTML_all” if you chose the “Spreadsheet option” in the sparql interface.
  6. Check the “Update every” box, and change the time to a suitable time based on the data.
  7. Finally, press the “OK” button… and you’ll see your lovely Linked Data inside your spreadsheet. Then you’ll be able to do whatever you want to your data (e.g. create a graph, do some calculations etc etc) – and everything will update when the data is updated. Funky!

Method Part Two B: For Excel users

OK, so I’m not a native Windows user (I used Mac OS and Amiga OS in my childhood, before moving to Unix and Linux based operating systems in about 2001). What I have found is that Windows 7 and Excel go a little strange with WebDAV, they like certain configurations – so I’ll be showing you a reasonably bodgy way of doing this :-P

  1. Prerequisite: In step 2.3 of Method Part One (“Change the following”), save the results as HTML, and make sure the file extension is also .html
  2. Open Excel (I’m using Excel 2010)
  3. Click on the Data menu
  4. Click “From Web”
  5. In the address bar enter your “saved-sparql-results” URL, press enter – this will probably ask you to enter your dav username and password
  6. Click on your <filename>.html file – you should then see the HTML Table
  7. Press the Import button
  8. A dialogue will pop up asking about where you would like to place your data – for ease I use the default.
  9. You’ll see the data! The important thing to note is that this is Linked Data – however, it is not quite self-updating yet. In order to do that we need to set the connection properties… so…
  10. Select the imported data
  11. Click “properties” which is in the “Connections” subpanel of the “Data” menu
  12. Change the “Refresh Every”, and/or check the “Refresh data when opening the file”. Click ok.
  13. Self-updating Excel spreadsheets from Linked Data. Funky!

Documentation Resources

Software Resources

I hope that all works for you, and feel free to share any ideas or findings!

RESTful WebID Verification

What?

This article goes through the details of verifying a WebID certificate using REST built in a PHP client. It will connect to an OpenLink Virtuoso service for WebID verification.

WebID is a new technology for carrying your identity with you. Essentially, you store your identity in the form of a certificate in your browser, and this certificate can be verified against a WebID service. WebID is a combination of technologies (notably FOAF (Friend of a Friend) and SSL (Secure Sockets Layer)). If you haven’t got yourself a WebID yet, then you can pick one up at any ODS installation (for instance: http://id.myopenlink.net/ods ); you will find it under the Security tab when editing your profile. To learn more about the WebID standard please visit: http://www.w3.org/wiki/WebID For more information about generating a WebID through ODS please see: http://www.openlinksw.com/wiki/main/ODS/ODSX509GenerateWindows

REST (or Representational State Transfer) is a technique for dealing with a resource at a location. The protocol used is usually HTTP (HyperText Transfer Protocol), the resource is usually in some standardised format (such as XML or JSON) and the location is specified by a URL (Uniform Resource Locator). These are pretty standardised and contemporary tools and techniques that are used on the World Wide Web.

PHP is a programming/scripting language usually used for server-side development. It is a very flexible language due to its dynamic, weak typing and its capability of doing both object-oriented and procedural programming. Its server-side usage is often “served” using hosting software such as Apache HTTP Server or OpenLink Virtuoso Universal Server. To learn more about PHP visit: http://www.php.net/

Virtuoso is a “Universal Server” – it contains within it, amongst other things, a database server, a web hosting server and a semantic data triple store. It is capable of working with all of the technologies above – REST, PHP and WebID – along with other related technologies (e.g. hosting other server-side languages, dealing with SQL and SPARQL, providing WebDAV etc etc). It comes in two forms: an enterprise edition and an open source edition, and is installable anywhere (including cloud-based servers such as Amazon EC2). To learn more about Virtuoso please visit: http://virtuoso.openlinksw.com/

ODS (OpenLink Data Spaces) is a linked data web application for hosting and manipulating personal, social and business data. It holds within it packages for profiling, webdav file storage, feed reading, address book storage, calendar, bookmarking, photo gallery and many other functions that you would expect from a social website. ODS is built on top of Virtuoso. To learn more about ODS please visit: http://ods.openlinksw.com/wiki/ODS/

Why?

Identity is an important issue for trust on the web, and it comes from two perspectives:

  • When a user accesses a website they want to know that their identity remains theirs, and that they can log in easily without duplicating effort.
  • When a developer builds a web application they want to know that the users accessing their site are who they say they are.

WebID handles this through interlinking using URIs over HTTP, profiling using the FOAF standard, and security using the SSL standard. From a development point of view it is necessary to verify a user, and this is the reason for writing this article.

How?

To make things a lot easier OpenLink Software have created a service built into their ODS Framework which verifies a certificate provider with an issued certificate. The URL for the web service is: https://id.myopenlink.net/ods/webid_verify.vsp

This webservice takes the following HTTP Get Parameter:

callback string

The callback is the URL that you want the success/failure information to be returned to. The cleverness actually comes from the fact that the service also tests your SSL certificate information, which is stored in the header information that the browser sends across; this makes it a three-agent system. The three-agent system could be shown a bit like this:

So we can start to build up a picture of what a “Verification Requester” might look like:

  1. First Page: Send user to the “Verifier” with the relevant Callback URL
  2. Callback Page: Receive details from the verifier – details will be found in the HTTP Parameters.
    1. If a WebID URI is returned then you know everything is ok
    2. If an error is returned then the WebID has not been verified

Let’s build something then. We shall build a simple single-page script which does different things based on whether it is in the first pass through or the second…

(example code based on code written by OpenLink Software Ltd)…

<?php
  // Build the absolute URL of this demo page, which the verifier calls back.
  function apiURL()
  {
    $pageURL = (isset ($_SERVER['HTTPS']) && $_SERVER['HTTPS'] == 'on') ? 'https://' : 'http://';
    $pageURL .= $_SERVER['SERVER_PORT'] <> '80' ? $_SERVER['SERVER_NAME'] . ':' . $_SERVER['SERVER_PORT'] : $_SERVER['SERVER_NAME'];
    return $pageURL . '/ods/webid_demo.php';
  }

  // Values returned by the verifier (second pass), plus the "go" flag from the form (first pass).
  $_webid = isset ($_REQUEST['webid']) ? $_REQUEST['webid'] : '';
  $_error = isset ($_REQUEST['error']) ? $_REQUEST['error'] : '';
  $_action = isset ($_REQUEST['go']) ? $_REQUEST['go'] : '';
  if (($_webid == '') && ($_error == ''))
  {
    if ($_action <> '')
    {
      // A client certificate can only be presented over HTTPS.
      if (!isset ($_SERVER['HTTPS']) || ($_SERVER['HTTPS'] <> 'on'))
      {
        $_error = 'No certificate';
      }
      else
      {
        // Redirect the browser to the verifier, passing our callback URL.
        $_callback = apiURL();
        $_url = sprintf ('https://id.myopenlink.net/ods/webid_verify.vsp?callback=%s', urlencode($_callback));
        header (sprintf ('Location: %s', $_url));
        exit;
      }
    }
  }
?>

This first bit of code (above) simply deals with redirecting the user process to the Verifier service with the relevant (dynamic) Callback URL. You will notice that it only redirects when the “go” request is set – this is for demonstration purposes. We shall continue….

<html>
  <head>
    <title>WebID Verification Demo - PHP</title>

  </head>
  <body>
    <h1>WebID Verification Demo</h1>
    <div>
      This will check your X.509 Certificate's WebID  watermark. <br/>Also note this service supports ldap, http, mailto, acct scheme based WebIDs.
    </div>

    <br/>
    <br/>
    <div>
      <form method="get">
        <input type="submit" name="go" value="Check"/>

      </form>
    </div>
    <?php
      if (($_webid <> '') || ($_error <> ''))
      {
    ?>
      <div>
      	The return values are:
  	    <ul>

          <?php
            if ($_webid <> '')
            {
          ?>
  	      <li>WebID -  <?php print (htmlspecialchars ($_webid)); ?></li>
  	      <li>Timestamp in ISO 8601 format - <?php print (htmlspecialchars (isset ($_REQUEST['ts']) ? $_REQUEST['ts'] : '')); ?></li>

          <?php
            }
            if ($_error <> '')
            {
          ?>
  	      <li>Error - <?php print (htmlspecialchars ($_error)); ?></li>
          <?php
            }
          ?>
  	    </ul>

      </div>
    <?php
      }
    ?>
  </body>
</html>

This second part of the code is twofold:

  • Firstly, it displays a simple form with a “go” button – this is simply to demonstrate the “redirection” part of the code
  • Secondly, this is where we print out the results from what we’ve callback’d. You’ll see that we try to print out the WebID URI, the Timestamp and any Error message.

What is great about the above code is that it can be run on any server that has PHP installed. It doesn’t need to be installed specifically on Apache HTTP Server, nor on OpenLink Virtuoso – it could be installed on any HTTP server with PHP hosting. It could even be adapted to Ruby, Python, Perl, ASP or any other server-side language, scripting language (including Javascript), or standalone programming language.
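To give a flavour of such an adaptation, here is a rough Python take on the “print the results” half of the page. It is a sketch rather than a port: the function name is my own, it returns the HTML as a string instead of being wired into any particular web framework, and it escapes the returned values before they go into the page, since they arrive as request parameters.

```python
from html import escape


def render_results(webid="", timestamp="", error=""):
    """Return the <ul> of returned values, or an empty string when
    neither a WebID nor an error has come back yet."""
    items = []
    if webid:
        items.append("<li>WebID - {}</li>".format(escape(webid)))
        items.append(
            "<li>Timestamp in ISO 8601 format - {}</li>".format(escape(timestamp))
        )
    if error:
        items.append("<li>Error - {}</li>".format(escape(error)))
    if not items:
        return ""
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(render_results(error="No certificate"))
```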

The thing is, this not only works with http: WebIDs, it can work with ldap:, mailto:, or acct: WebIDs too! Kingsley Idehen demonstrates this to us in his twitpic

Grab the full code here

I’ve recently come into contact with the usefulness of the ImportXML() function (and the related ImportHTML() and ImportFeed() functions) found in the Google Docs Spreadsheet app [1], and their usefulness is to do with Uniform Resource Locators (URLs). This was partly thanks to Kingsley Idehen’s example Google Spreadsheet on the “Patriots Players” (and his tweet dated 22nd June at 10:11am), and so I wanted to see how it was done and maybe make something that wasn’t about American Football.

URLs are the actual addresses for Data, they are the “Data Source/Location Names” (DSNs). With the ImportHtml() (et al) functions you just have to plug in a URL and your data is displayed in the spreadsheet. Of course this is dependent on the structure of the data at that address, but you get the picture.

Let me show you an example based on a SPARQL query I created many moons ago:

SELECT DISTINCT ?NewspaperURI ?Newspaper ?Stance WHERE {
?NewspaperURI rdf:type dbpedia-owl:Newspaper ;
rdfs:label ?Newspaper ;
dcterms:subject <http://dbpedia.org/resource/Category:Newspapers_published_in_the_United_Kingdom>;
<http://dbpedia.org/property/political> ?Stance .
FILTER (lang(?Stance) = "en") .
FILTER (lang(?Newspaper) = "en")
}
ORDER BY ?Stance

Briefly, what the above does is show a Newspaper URI, a Newspaper name and the Newspaper’s political stance – and these are limited to just those newspapers published in the United Kingdom. The result is a little messy, as not all of the newspapers have the same style of label, and some of the newspapers’ stances will be hidden behind a further set of URIs – but this is just an example.

Now we can plug this in to a sparql endpoint such as:

  1. http://dbpedia.org/sparql or
  2. http://lod.openlinksw.com/sparql

We shall use the 2nd for now – if you hit that in your browser then you’ll be able to plug the query into a nicely made HTML form and fire that off to generate an HTML webpage. [2]

However, it isn’t entirely useful as we want to get it into a Google Spreadsheet! So, we need to modify the URL that the form creates. Firstly, copy the URL as it is… for instance…

http://lod.openlinksw.com/sparql?default-graph-uri=&should-sponge=&query=SELECT+DISTINCT+%3FNewspaperURI+%3FNewspaper+%3FStance+WHERE+{%0D%0A%3FNewspaperURI+rdf%3Atype+dbpedia-owl%3ANewspaper+%3B%0D%0A++rdfs%3Alabel+%3FNewspaper+%3B%0D%0A++dcterms%3Asubject+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FCategory%3ANewspapers_published_in_the_United_Kingdom%3E%3B%0D%0A+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fpolitical%3E+%3FStance+.+%0D%0AFILTER+%28lang%28%3FStance%29+%3D+%22en%22%29+.+%0D%0AFILTER+%28lang%28%3FNewspaper%29+%3D+%22en%22%29%0D%0A}%0D%0AORDER+BY+%3FStance&debug=on&timeout=&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&save=display&fname=

Then where it says “text%2Fhtml” (which means that it is of text/html type) change it to “application%2Fvnd.ms-excel” (which means that it is of application/vnd.ms-excel type – in other words a spreadsheet-friendly table). In our example this would make the URL look like…
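If you would rather not edit the URL by hand, the same swap can be scripted. This is a minimal Python sketch using only the standard library; the function name is my own, and the only thing it assumes about the endpoint is the “format” parameter visible in the URL above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_result_format(url, mime_type):
    """Return the same endpoint URL with its 'format' parameter replaced.

    All other parameters (including empty ones such as 'fname=') are kept,
    in their original order.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs = [(key, mime_type if key == "format" else value) for key, value in pairs]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# Usage: paste the URL copied from the form in place of copied_url.
#   spreadsheet_url = with_result_format(copied_url, "application/vnd.ms-excel")
```

One thing to be aware of is that the query string is decoded and re-encoded along the way, so characters such as { and } may come back percent-encoded; the URL means the same thing either way.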

We then need to plug this into our Google Spreadsheet – so open a new spreadsheet, or create a new sheet, or go to where you want to place the data on that sheet.

Then click on one of the cells, and enter the following function:

=ImportHtml("http://lod.openlinksw.com/sparql?default-graph-uri=&should-sponge=&query=SELECT+DISTINCT+%3FNewspaperURI+%3FNewspaper+%3FStance+WHERE+{%0D%0A%3FNewspaperURI+rdf%3Atype+dbpedia-owl%3ANewspaper+%3B%0D%0A++rdfs%3Alabel+%3FNewspaper+%3B%0D%0A++dcterms%3Asubject+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FCategory%3ANewspapers_published_in_the_United_Kingdom%3E%3B%0D%0A+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fpolitical%3E+%3FStance+.+%0D%0AFILTER+%28lang%28%3FStance%29+%3D+%22en%22%29+.+%0D%0AFILTER+%28lang%28%3FNewspaper%29+%3D+%22en%22%29%0D%0A}%0D%0AORDER+BY+%3FStance&debug=on&timeout=&format=application%2Fvnd.ms-excel&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&save=display&fname=", "table", 1)

The first parameter is our query URL, the second is for pulling out the table elements, and the third is to grab the first of those table elements.
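Since the formula is just a URL wrapped in a function call, it can also be generated for any query. Here is a small Python sketch of that idea. The function name is my own, and it assumes the endpoint accepts the “query” and “format” parameters seen in the URL above; the other parameters from the form are left out, on the assumption that the endpoint’s defaults are fine.

```python
from urllib.parse import urlencode


def import_html_formula(endpoint, sparql_query, table_index=1):
    """Build an =ImportHtml(...) formula for a SPARQL query.

    The query is URL-encoded, so any double quotes inside it become %22
    and cannot clash with the quotes of the spreadsheet formula.
    """
    params = urlencode(
        {"query": sparql_query, "format": "application/vnd.ms-excel"}
    )
    return '=ImportHtml("{}?{}", "table", {})'.format(endpoint, params, table_index)


formula = import_html_formula(
    "http://lod.openlinksw.com/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
)
print(formula)
```

The printed line can be pasted straight into a cell, just like the hand-built formula above.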

The spreadsheet will populate with all the data that we asked for. Pretty neat!

Now that the data is in a Google Spreadsheet you can do all kinds of things that spreadsheets are good at. One idea based on our example might be statistics of political stance or, if you modify the query a bit to pull out more than just those in the UK, statistics of those stances by country and how they relate to the political parties currently in power.

It also doesn’t have to specifically be data found in DBpedia, it could be anything (business data, science data, personal data etc etc). The key to all of this is the power of URLs, and how they allow the dynamic linking to hyperdata! This is the killer-power of Linked Data.

If you do the above tutorial then let us know if you find anything of interest, and share any experiences you may have.

Resources

  1. Google Documentation of using ImportXml, ImportHtml and ImportFeed functions in Google Spreadsheets: https://docs.google.com/support/bin/answer.py?answer=75507
  2. OpenLink Software Documentation of the SPARQL implementation in Virtuoso and its endpoint: http://docs.openlinksw.com/virtuoso/rdfsparql.html
  3. Kingsley has been adding to demos on his bookmarks page: http://www.delicious.com/kidehen/linked_data_spreadsheet_demo

Mobile Intelligent Software Agents

Back in 2005 when I was first learning about “Agent Oriented Development” I was taught that an agent must Perform in some Environment, which it manipulates using Actuators and perceives through its Sensors (this is known as PEAS).
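To make PEAS a little more concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the environment is a hypothetical one-dimensional world, the agent’s sensor reports how far away the goal is, its actuator moves one step at a time, and its performance is simply how close it ends up.

```python
from dataclasses import dataclass


@dataclass
class LineWorld:
    """Environment: the agent sits at `position` and wants to reach `goal`."""
    position: int = 0
    goal: int = 3


class SimpleAgent:
    def sense(self, env):
        """Sensor: perceive the signed distance to the goal."""
        return env.goal - env.position

    def act(self, env, percept):
        """Actuator: move one step towards the goal."""
        if percept > 0:
            env.position += 1
        elif percept < 0:
            env.position -= 1

    def performance(self, env):
        """Performance measure: 0 is best, more negative is worse."""
        return -abs(env.goal - env.position)


def run(agent, env, max_steps=10):
    """Sense-act loop; returns the number of steps taken."""
    steps = 0
    while agent.performance(env) < 0 and steps < max_steps:
        agent.act(env, agent.sense(env))
        steps += 1
    return steps


world = LineWorld()
print(run(SimpleAgent(), world), world.position)
```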

What reminded me about this was Kingsley Idehen’s 2006 blog post on the Dimensions of the Web… which is very much still a valid concept. What came to me though was that if Intelligent Agents are to be truly Autonomous on a Linked Data Web they will probably have to be mobile – and I don’t mean mobile as in developed for mobile devices, what I mean is that they should probably be capable of moving themselves from one computer to another… otherwise they will just become some kind of “clever” web service which is stored in one place and gets its data from other web services (i.e. we revert back to Dimension 2 in Kingsley’s post).

But there is a problem with this model… it sounds a bit like a virus, and something initially quite good could potentially become a bad thing with an evolutionary technique such as Genetic Programming. An intelligent agent capable of adapting to its environment through genetic programming techniques is incredibly powerful, but with great power comes great responsibility.

I have mixed feelings about Singularity philosophy, and I am particularly wary about the Singularity University, however maybe we should be thinking about the ethical/security/identity implications of Autonomous Intelligent Agents on the Linked Data Web.

Some food for thought… on this lovely Monday morning… ;-)

Professional Memberships (Computing)

I could do with your opinion…

I’ve been a full Member of the British Computer Society (BCS) since 2007, and I have mentioned in the past to various people that I don’t feel like I get much benefit from it. It is probably one of those things that if you put effort into it then you’ll get benefit back, however nothing incredibly suited to me happens in the Bristol or South West regions of the BCS. One benefit that I may eventually take up is becoming “Chartered” as either an Engineer (CEng) or as an IT Practitioner (CITP), which can be done through the BCS.

The fact that the BCS seems to be targeting the IT Business niche, and trying to keep fingers in a few other pies means that my interest in the society is lacking. Therefore I am seriously considering resigning from the BCS in July, particularly as the cost of maintaining membership is also quite high when I’m trying to save some money so that I can put it into other interests (for personal, business and family interests).

So if I decide to leave the BCS, I will still be a member of the three-year Journeyman Scheme with the Information Technologists’ Company (a Livery Company of the City of London), also known as WCIT. Although the WCIT is quite similar in niche to the BCS, it provides a framework of support and development for its members and is also backed up with lots of lovely history and tradition from the ancient Guilds and Liveries of London. So I shall maintain my affiliation with the WCIT, even though it does cost quite a bit.

If I did resign from the BCS, however, I would feel like I had lost my professional body (and I would not have the postnominals “MBCS” anymore). But there are some alternatives, which might be more suited to my interests, skills, style and political-views and are potentially a lot cheaper than maintaining a membership at the BCS:

  • The ACCU – originally a society for C and C++ developers, but it has expanded its interests into other areas of programming and software development.
  • The IEEE Computer Society – an international society with a lot more of a practical feel to it than the BCS, primarily because it is part of the Institute of Electrical and Electronics Engineers (IEEE). It has a huge amount of free stuff for its members, and is good value for the price.
  • The ACM – an international society, but mainly based in the USA. It has more of an academic feel to it than the BCS. It has some nice benefits for its members, it isn’t incredibly cheap, but I think it might be cheaper than the BCS. I used to be a student member of the ACM, but decided that the BCS might be better for localised stuff when it came to full membership.
  • The IAP – an interesting British society, about the same price as the BCS. It has a very practical feel to it, and some nice simple benefits for its members (particularly those who are consultants).
  • The Association for Logic, Language and Information – a rather interesting European society for the bridges between Logic, Language and Information. Sounds quite me. It is free to join, but it costs to receive their journal.

So, to the reader – what should I do?

  1. Stay with the BCS and the WCIT, don’t join anything else
  2. Stay with the BCS, the WCIT and join one of the above (which?)
  3. Stay with the BCS, the WCIT and join something else (which?)
  4. Leave the BCS, stay with the WCIT, but don’t join any professional body
  5. Leave the BCS, stay with the WCIT and join one of the above (which one?)
  6. Leave the BCS, stay with the WCIT and join something else (what?)
  7. Leave the BCS, stay with the WCIT, but come back to the BCS in a couple of years, particularly for the CEng/CITP status

What do you think? Do you have any experience of any of the above societies, or have something else to share? Has membership of a professional body helped you to attain/maintain work? Has it benefited you in other ways? Please do share – either publicly using the comments system or privately by email ( daniel [at] vanirsystems [dot] com ).

Thank you,

Daniel

The Web & Data Objects

Tim Berners-Lee conceptualised the web when he was at CERN. He built a directory of the people, groups and projects – this was the ENQUIRE project. This eventually morphed into the World Wide Web, or a “Web of Documents”. Documents are indeed “objects”, but often not structurally defined. A great book about this is “Weaving the Web” by Berners-Lee.

The boom in this way of thinking came when businesses started to use XML to define “objects” within their processes. Being able to define something and pass it between one system and another was incredibly useful. For instance, I was in a development team quite a while back where we were communicating with BT using XML to describe land-line telephone accounts.
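As a rough illustration of what passing an “object” between systems as XML looks like, here is a minimal Python sketch. The element names, the account fields and the values are all hypothetical; they are not the real schema we used with BT, just an assumed shape for a land-line account.

```python
import xml.etree.ElementTree as ET


def account_to_xml(account_id, line_number, status):
    """Serialise a (hypothetical) land-line account as an XML string."""
    root = ET.Element("account", attrib={"id": account_id})
    ET.SubElement(root, "lineNumber").text = line_number
    ET.SubElement(root, "status").text = status
    return ET.tostring(root, encoding="unicode")


def account_from_xml(xml_text):
    """Parse the XML string back into a plain dictionary on the receiving side."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "lineNumber": root.findtext("lineNumber"),
        "status": root.findtext("status"),
    }


# One system serialises the object, the other parses it.
message = account_to_xml("A-001", "01170000000", "active")
received = account_from_xml(message)
```

The point is only that both sides must agree on the element names in advance, which is exactly the kind of private agreement that XML on its own does not standardise.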

At the same time, people started developing and using a new flavour of the “web” which was a lot more sociable. We saw the rise of “profiles”, “instant messaging”, “blogging” and “microblogging/broadcasting”. All of these are very easy to understand in terms of “objects”: Person X is described as (alpha, beta, omega) and owns a Blog B which is made up of Posts [p] and a Twitter account T which is made up of Updates [u].
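The Person X description above can be sketched as a handful of Python dataclasses. This is only one possible modelling, and the class and field names are my own assumptions rather than anything a real social network exposes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Post:
    title: str


@dataclass
class Update:
    text: str


@dataclass
class Blog:
    name: str
    posts: List[Post] = field(default_factory=list)


@dataclass
class TwitterAccount:
    handle: str
    updates: List[Update] = field(default_factory=list)


@dataclass
class Person:
    name: str
    attributes: Tuple[str, ...]  # the "(alpha, beta, omega)" description
    blog: Blog
    twitter: TwitterAccount


# Person X owns Blog B (made of Posts) and Twitter account T (made of Updates).
person_x = Person(
    name="X",
    attributes=("alpha", "beta", "omega"),
    blog=Blog("B", [Post("p1"), Post("p2")]),
    twitter=TwitterAccount("T", [Update("u1")]),
)
```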

The first reaction to the use of objects for web communications was to provide developers with object descriptions using XML via “Application Programming Interfaces” (APIs) over the HyperText Transfer Protocol (HTTP). This was the birth of “Web Services”. Using web services a business could “talk” over the web, and social networks could “grab” data.

Unfortunately this wasn’t enough, as XML does not provide a standardised way of describing things, and also does not provide a standardised way of describing things in a distributed fashion. Distributed description is incredibly useful: it allows things to stay up to date and, not only that, it is philosophically/ethically more sound as people/groups/businesses “keep” their own data objects. The Semantic Web provided the Resource Description Framework (RDF) to deal with this, and the Linked Data initiative extends this effort to its true potential by providing analogues in well-used situations, processes and tools. (For instance, see my previous blog post on how Linked Data is both format and model agnostic).
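To make “distributed description” a little more concrete, here is a small Python sketch that treats RDF statements as plain (subject, predicate, object) tuples. It is not a real RDF library, the example.org and example.com URIs are hypothetical, and only the FOAF property names come from the actual FOAF vocabulary.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
PERSON_X = "http://example.org/people/x#me"  # hypothetical identifier for Person X

# Each publisher "keeps" its own statements about the same identified thing.
personal_site = {
    (PERSON_X, FOAF + "name", "Person X"),
    (PERSON_X, FOAF + "weblog", "http://example.org/blog/b"),
}
employer_site = {
    (PERSON_X, FOAF + "name", "Person X"),
    (PERSON_X, FOAF + "workplaceHomepage", "http://example.com/"),
}

# Because both sources use the same URI for Person X, merging is a set union.
merged = personal_site | employer_site


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


description = describe(merged, PERSON_X)
```

The shared identifier does the work here: neither source needs to know about the other, yet a consumer can combine them without any bespoke mapping between formats.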

This is all good. However, we live in an age where Collaboration is essential. With Collaboration comes the ability to edit, and with the ability to edit comes the need to have systems in place that can handle the areas of “identification”, “authentication”, “authorisation” and “trust”. We are starting to see this in systems such as OpenID, OAuth and WebID – particularly when coupled with FOAF [1].

These systems will be essential for getting distributed collaboration (via Linked Data) right, and distributed collaboration should be dealt with by the major Semantic Web / Linked Data software providers – and should be what web developers are thinking about next. Distributed Collaboration is going to be incredibly useful for both business-based and social-based web applications.

Please do feel free to comment, or to share any links relevant to the topic.

Many thanks,

Daniel

Footnotes

  1. There is some nice information about WebID at the W3C site. Also see FOAF+SSL on the OpenLink website, and the WebID protocol page on the OpenLink website.

I decided to give an overview of the best books that you can buy to make yourself well-rounded in terms of Linked Data, the Semantic Web and the Web of Data.

Knowledge Working

Let us start with what I like to call “Knowledge Working”. This is essentially the realm of technical knowledge-management and knowledge acquisition/modelling. There are three books that I would promote in this respect.

The first is Information Systems Development: Methodologies, Techniques and Tools written by Avison & Fitzgerald. This book, although it gets down and dirty with some technical detail, can be incredibly useful as an overview of information systems and knowledge design in general.

The second is Knowledge Engineering and Management: The CommonKADS Methodology (A Bradford book) by Guus Schreiber et al (aka the CommonKADS team). I was taught from this book on one of my modules during my undergraduate degree. It is less “programmer-based” than the Avison & Fitzgerald book, but has very useful information about acquisition, modelling and analysis of knowledge. I highly recommend this, particularly as it is incredibly useful when coupled with Semantic Web technology.

The third is a book whose second edition we are waiting for (at the time of writing it was estimated to be available 13th July 2011). This book is Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL by Allemang & Hendler (first edition here). Allemang and Hendler are heavyweights in the Semantic Web world, with Allemang being chief scientist at TopQuadrant and Hendler being one of the writers of the earliest and most well known articles on “The Semantic Web” (Scientific American: 2001 with Berners-Lee and Lassila). This book is quite technical in places, but it does focus on ontologies and metadata (and metametadata?).

Web Science

Web Science is another important area of interest, particularly in the earlier stages of Linked Data and Semantic Web development and actually applying the theory into practice.

The earliest official book by the Web Science Trust team was A Framework for Web Science (Foundations and Trends in Web Science) by Berners-Lee et al. This is quite an expensive book, and quite academic in style, but useful nonetheless.

You may want to look at something a little cheaper, and a little more practical too. This is where The Web’s Awake: An Introduction to the Field of Web Science and the Concept of Web Life by Tetlow comes in. This interesting book takes a common sense approach to Web Science, and I would certainly recommend it.

Tools and Techniques: The Evolution from the Semantic Web to Linked Data

There are the classics such as Practical RDF by Powers and A Semantic Web Primer by Antoniou and van Harmelen. There are introductory books such as Semantic Web For Dummies by Pollock. These are all good foundational books which are recommended, but they often don’t get to the essence of the Semantic Web and especially not Linked Data.

For the essential Semantic Web and Linked Data we may want to look at Programming the Semantic Web by Segaran (who also authored the wonderful Programming Collective Intelligence: Building Smart Web 2.0 Applications) et al, and Semantic Web Programming by Hebeler (of BBN Technologies) et al. Plus there are the quite useful, although very specific, books Linking Enterprise Data and Linked Data: Evolving the Web into a Global Data Space (Synthesis Lectures on the Semantic Web, Theory and Technology) (which is also available free at “Linked Data by Heath and Bizer” ( http://linkeddatabook.com/ ) ).

Other topics of interest

There are of course other areas of interest which are very relevant to true Linked Data and the Semantic Web. These include:

  • Semantic Networks and Frames – from the fields of logic and artificial intelligence. This also inspired Object-Oriented theory.
  • Pointers and References – yep the ones from programming (such as in C++).
  • HyperText Transfer Protocol
  • Graph Theory

I hope this list of interesting and useful books is handy. Please do comment if you have any other books that you wish to share with us.

Thank you,

Daniel