trees

Archive for July, 2011

Quick Snippet: DBPedia Categories Example

Just a quick example of a category-based search on DBPedia. It finds all “Latin Words and Phrases” on DBpedia, the results are the DBPedia URIs, the Labels and the Wikipedia URIs.

SELECT DISTINCT ?uri, ?label, ?WikipediaURI
WHERE {
    ?uri dcterms:subject category:Latin_words_and_phrases ;
        rdfs:label ?label ;
        foaf:page ?WikipediaURI .
    FILTER (lang(?label) = "en")
}
ORDER BY ?label

To make it work feel free to plug it into the DBPedia SPARQL endpoint.

Hopefully that’s nice and easy to understand, its nothing complicated at all. Enjoy! Feel free to comment.

Ciao BCS

A couple of months ago I wrote about “Professional Memberships“, and how I was considering leaving the BCS (British Computing Society). Well, I’ve got a few months left, and I have decided to definitely not renew my membership.

The 7 reasons for my leaving:

  1. It costs quite a bit of money, which in this current economic period isn’t great. Granted its not as much as some societies, but there are more reasons.
  2. The only thing that I really “get out” of my membership is the use of the “MBCS” letters after my name. Granted this does mean that other people can understand that I am at a certain level of professionalism, but I can be at that level of professionalism without the letters and without the membership.
  3. The BCS is far far too business orientated, its audience is primarily involved with the running of computing businesses. Granted it does have some technology based articles, and many of its interest-groups are academic in nature, but it just doesn’t fit with my niche.
  4. The BCS has no events that I’m particularly interested in, and nothing really happens in Bristol / North Somerset. Granted that its probably one of those societies that you only get things out of it if you can put stuff in, but putting stuff in seems incredibly strenuous.
  5. What I was trying to find in the BCS, I have found in the WCIT. The WCIT is practical, it has a better framework for personal development, it is both business and technology focused, it has some rather interesting panels and it is easy to “put stuff in” and “take stuff out” - therefore I am pleased to be a WCIT Journeyman.
  6. There are events out there which are no-membership required, that are more interesting and more suited to my (technical and non-technical) interests.
  7. I can rejoin BCS in the future if I wish to.

Of course, my reasons for leaving are clear for the here and now, but I should say that I’m sure that the BCS is useful and interesting for many. I do believe that the reason for the existence of the BCS is pure, and I do hope that I don’t sway people to leave it, or put off people from joining it just because of my current dissatisfaction.

At this present time I’ll be focusing on my computing-society-related involvement with the WCIT, but I may explore other areas in the future.

Feel free to comment!

Note: This guide is part two of the previous blog post on Importing Linked Data into a Spreadsheet.

Introduction and Theory:

Say you don’t want your data in Google Spreadsheet, but would prefer it in Excel, OpenOffice, LibreOffice or some kind of standalone desktop application on your computer. There is still potential to work with dynamic Linked Data - via the powers of WebDAV (which is a technology allowing the establishment of an “online hard drive” over the protocols that power the world-wide-web).

A WebDAV URL is also a Data Source Name (aka an “address”), you see it is capable of being linked to as it is a URL - it is still Linked Data, and yet it can be treated as a store. This is the one of the many powers of the Linked Data Web.

Once the Method Part One is done there are two options for the tutorial, the first “Part 2″ is dealing with the data in LibreOffice (and I presume that the process is very similar in contemporary versions of OpenOffice), the second “Part 2″ is for dealing with the data in Excel (I’ve used version 2010 on Windows 7).

Prerequisites:

  • You will need a copy of Virtuoso on your machine up and running (the enterprise edition and the open source edition should both work). You must also have administrative access to it.
  • A new-ish version of LibreOffice, OpenOffice or Microsoft Excel
  • An operating system that can cope with WebDAV (which seems to be most of them these days - to varying degrees of success)

An Example Query:

Ideally this method is ideal for fast-paced data, the data that changes often - such as statistics or locations of crime etc. However, for now I’ve just used a simple lat-long search of those “areas” that touch my local area of “Long Ashton and Wraxall”.

SELECT DISTINCT ?TouchesAreaURI, ?TouchesName, ?TouchesAreaLat, ?TouchesAreaLong
WHERE {
<https://data.ordnancesurvey.co.uk/id/7000000000000770>  <https://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches>  ?TouchesAreaURI .
GRAPH ?TouchesAreaURI {
?TouchesAreaURI <https://www.w3.org/2000/01/rdf-schema#label> ?TouchesName;
<https://www.w3.org/2003/01/geo/wgs84_pos#lat> ?TouchesAreaLat;
<https://www.w3.org/2003/01/geo/wgs84_pos#long> ?TouchesAreaLong
}
}

Method Part One: Generic

As mentioned in the prerequisites - you will need administrative access to Virtuoso in order to fully run through this tutorial. This is because we need to create folders which need to be attached to the SPARQL user in order for the /sparql endpoint to save to WebDAV.

  1. Administrative Setup:
    1. Login to Conductor
    2. Go to System Admin > User Accounts
    3. Click “Edit” next to the SPARQL user
    4. Change the following:
      • DAV Home Path: /DAV/home/QL/ (you could call the “QL” folder whatever you like - just remember what you’ve changed it to)
      • DAV Home Path “create”: Checked
      • Default Permissions: all checked
      • User Type: SQL/ODBC and WebDAV
    5. Click Save.
    6. Go to Web Application Server > Content Management > Repository
    7. Navigate the WebDAV to: DAV/home/QL (or whatever you named “QL”)
    8. Click the New Folder Icon (it looks like a folder with an orange splodge on the top-left)
    9. Make a new folder:
      • Name: saved-sparql-results (must not be different!)
      • Permissions: all checked
      • Folder Type: Dynamic Resources
    10. Click Create
  2. Query and Data Setup:
    1. Hit https://<server>:<port-usually-8890>/sparql
    2. Enter the SPARQL query (e.g. The Example Query above)
    3. Change the following:
      • Change to a grab everything type query - i.e. “Try to download all referenced resources (this may be very slow and inefficient)”. Or one of the other options - dependent on the locations, data and the query.
      • Format Results as: Spreadsheet (or CSV)
      • “Display the result and do not save” change this to “Save the result to the DAV and refresh it periodically with the specified name:”
      • Add a filename (with file extension). For example testspreadsheet.xls (or testspreadsheet.csv)
    4. Click “Run Query”
    5. You’ll know see a “Done” screen, with the URI of the result, this is a WebDAV accessible URL. Please take note of the URL of the “saved-sparql-results”, it should look a little like this: https://<server>:<port-usually-8890>DAV/home/QL/saved-sparql-results

Method Part Two A: For LibreOffice and OpenOffice users

You will need to do the following in order to connect to a WebDAV folder:

  1. Tools > Options > LibreOffice/OpenOffice > General
  2. and ensure that “Use LibreOffice/OpenOffice dialogue boxes” is turned on.

You will be linking dynamically from your spreadsheet to the resource on your WebDAV instance:

  1. Start a new spreadsheet, or load up a spreadsheet where you want the resource to go.
  2. Go to Insert > Link to External Data
  3. Click the “…” button
  4. Enter your “saved-sparql-results” URL (not including the filename itself!), and press enter
  5. You should now see your “saved-sparql-results” WebDAV directory. Select the file, and click insert. The program will then probably ask you for your dav login details (you may want to make the program remember the details), it may also ask you about the format of the file - just follow that through how you would normally when importing/opening a file. You may also have to select “HTML_all” if you chose the “Spreadsheet option” in the sparql interface.
  6. Check the “Update every” box, and change the time to a suitable time based on the data.
  7. Finally, press the “OK” button… and you’ll see your lovely Linked Data inside your spreadsheet. Then you’ll be able to do whatever you want to your data (e.g. create a graph, do some calculations etc etc) - and everything will update when the data is updated. Funky!

Method Part Two B: For Excel users

OK, so I’m not a native Windows user (I used Mac OS and Amiga OS in my childhood, before moving to Unix and Linux based operating systems in about 2001). What I have found is that Windows 7 and Excel go a little strange with WebDAV, they like certain configurations - so I’ll be showing you a reasonably bodgy way of doing this :-P

  1. Prerequisite: In step 2.c of Generic Method One - save the results as HTML, and make sure the file extension is also .html
  2. Open Excel (I’m using Excel 2010)
  3. Click on the Data menu
  4. Click “From Web”
  5. In the address bar enter your “saved-sparql-queries” URL, press enter - this will probably ask you to enter your dav username and password
  6. Click on your <filename>.html file - you should then see the HTML Table
  7. Press the Import button
  8. A dialogue will pop up asking about where you would like to place your data - for ease I use the default.
  9. You’ll see the data! The important thing to note is that this is Linked Data - however, it is not quite self-updating yet. In order to do that we need to set the connection properties… so…
  10. Select the imported data
  11. Click “properties” which is in the “Connections” subpanel of the “Data” menu
  12. Change the “Refresh Every”, and/or check the “Refresh data when opening the file”. Click ok.
  13. Self-updating Excel spreadsheets from Linked Data. Funky!

Documentation Resources

Software Resources

  • OpenLink Virtuoso
  • Virtuoso Universal Server - Proprietary Edition
  • Virtuoso Open Source Edition
  • LibreOffice
  • OpenOffice
  • Microsoft Office

I hope that all works for me, and feel free to share any ideas or findings!

RESTful WebID Verification

What?

This article goes through the details of verifying a WebID certificate using REST built in a PHP client. It will connect to an OpenLink Virtuoso service for WebID verification.

WebID is a new technology for carrying your identity with you, essentially you store your identity in the form of a certificate in your browser, and this certificate can be verified against a WebID service. WebID is a combination of technologies (notably FOAF (Friend of a Friend) and SSL (Secure Socket Layer)). If you haven’t got yourself a WebID yet, then you can pick one up at any ODS installation (for instance: https://id.myopenlink.net/ods ) and you can find them under then Security tab when editing your profile. To learn more about the WebID standard please visit: https://www.w3.org/wiki/WebID For more information about generating a WebID through ODS please see: https://www.openlinksw.com/wiki/main/ODS/ODSX509GenerateWindows

REST (or Representational State Transfer) is a technique for dealing with a resource at a location. The technologies used are usually HTTP (HyperText Transfer Protocol), the resource is usually in some standardised format (such as XML or JSON) and the location is specified by a URL (Uniform Resource Locator). These are pretty standardised and contemporary tools and techniques that are used on the World Wide Web.

PHP is a programming/scripting language usually used for server-side development. It is a very flexible language due to its dynamic-weak typing and its capability of doing both object-oriented and proceedural programming. Its server-side usage is often “served” using hosting software such as Apache HTTP Server or OpenLink Virtuoso Universal Server. To learn more about PHP visit: https://www.php.net/

Virtuoso is a “Universal Server” - it contains within it, amongst other things, a database server, a web hosting server and a semantic data triple store. It is capable of working with all of the technologies above - REST, PHP and WebID - along with other related technologies (e.g. hosting other server-side languages, dealing with SQL and SPARQL, providing WebDAV etc etc). It comes in two forms: an enterprise edition and an open source edition, and is installable anywhere (including cloud-based servers such as Amazon EC2). To learn more about Virtuoso please visit: https://virtuoso.openlinksw.com/

ODS (OpenLink Data Spaces) is a linked data web application for hosting and manipulating personal, social and business data. It holds within it packages for profiling, webdav file storage, feed reading, address book storage, calendar, bookmarking, photo gallery and many other functions that you would expect from a social website. ODS is built on top of Virtuoso. To learn more about ODS please visit: https://ods.openlinksw.com/wiki/ODS/

Why?

Identity is an important issue for trust on the web, and it comes from two perspectives:

  • When a user accesses a website they want to know that their identity remains theirs, and that they can log in easily without duplicating effort.
  • When a developer builds a web application they want to know that the users accessing their site are who they say they are.

WebID handles this through interlinking using URIs over HTTP, profiling using the FOAF standard, and security using the SSL standard. From a development point of view it is necessary to verify a user, and this is the reason for writing this article.

How?

To make things a lot easier OpenLink Software have created a service built into their ODS Framework which verifies a certificate provider with an issued certificate. The URL for the web service is: https://id.myopenlink.net/ods/webid_verify.vsp

This webservice takes the following HTTP Get Parameter:

callback string

The callback is the URL that you want the success/failure information to be returned to. The cleverness actually comes from the fact that the service also tests your SSL certificate information which is stored in the header information that the browser sends across, this is a three agent system. The three agent system could be shown a bit like this:

So we can start to build up a picture of how a “Verification Requester” might look like:

  1. First Page: Send user to the “Verifier” with the relevant Callback URL
  2. Callback Page: Receive details from the verifier - details will be found in the HTTP Parameters.
    1. If a WebID URI is returned then you know everything is ok
    2. If an error is returned then the WebID has not been verified

Lets build something then, we shall build a simple single page script which does different things based on whether it has in the first pass through or the second….

(example code based on code written by OpenLink Software Ltd)…

<?php
  function apiURL()
  {
    $pageURL = $_SERVER['HTTPS'] == 'on' ? 'https://' : 'https://';
    $pageURL .= $_SERVER['SERVER_PORT'] <> '80' ? $_SERVER['SERVER_NAME'] . ':' . $_SERVER['SERVER_PORT'] : $_SERVER['SERVER_NAME'];
    return $pageURL . '/ods/webid_demo.php';
  }
	$_webid = isset ($_REQUEST['webid']) ? $_REQUEST['webid'] : '';
	$_error = isset ($_REQUEST['error']) ? $_REQUEST['error'] : '';
	$_action = isset ($_REQUEST['go']) ? $_REQUEST['go'] : '';
  if (($_webid == '') && ($_error == ''))
  {
    if ($_action <> '')
    {
      if ($_SERVER['HTTPS'] <> 'on')
      {
        $_error = 'No certificate';
      }
      else
      {
        $_callback = apiURL();
        $_url = sprintf ('https://id.myopenlink.net/ods/webid_verify.vsp?callback=%s', urlencode($_callback));
        header (sprintf ('Location: %s', $_url));
        return;
      }
    }
  }
?>

This first bit of code (above) simply deals with redirecting the user process to the Verifier service with the relevant (dynamic) Callback URL. You will notice that it only redirects when the “go” request is set - this is for demonstration purposes. We shall continue….

<html>
  <head>
    <title>WebID Verification Demo - PHP</title>
  </head>
  <body>
    <h1>WebID Verification Demo</h1>
    <div>
      This will check your X.509 Certificate's WebID  watermark. <br/>Also note this service supports ldap, http, mailto, acct scheme based WebIDs.
    </div>
    <br/>
    <br/>
    <div>
      <form method="get">
        <input type="submit" name="go" value="Check"/>
      </form>
    </div>
    <?php
      if (($_webid <> '') || ($_error <> ''))
      {
    ?>
      <div>
      	The return values are:
  	    <ul>
          <?php
            if ($_webid <> '')
            {
          ?>
  	      <li>WebID -  <?php print ($_webid); ?></li>
  	      <li>Timestamp in ISO 8601 format - <?php print ($_REQUEST['ts']); ?></li>
          <?php
            }
            if ($_error <> '')
            {
          ?>
  	      <li>Error - <?php print ($_error); ?></li>
          <?php
            }
          ?>
  	    </ul>
      </div>
    <?php
      }
    ?>
  </body>
</html>

This second part of the code is twofold:

  • Firstly, it displays a simple form with a “go” button - this is simply to demonstrate the “redirection” part of the code
  • Secondly, this is where we print out the results from what we’ve callback’d. You’ll see that we try to print out the WebID URI, the Timestamp and any Error message.

What is great about the above code is that this can be run on any server that has PHP installed, it doesn’t need to be installed specifically on Apache HTTP Server, nor on OpenLink Virtuoso - it could be installed on any HTTP server with PHP hosting. It could even be adapted to be the Ruby programming language, Python, Perl, ASP or any server-side language, scripting language (including Javascript), or standalone programming language.

Thing is this not only works with http: WebIDs it can work with ldap:, mailto:, or acct: WebIDs too! Kingsley Idehen demonstrates this to us in his twitpic

Grab the full code here