Please note that this is my old blog. My new blog is available at https://www.vanirsystems.com/blog

This blog is kept here for archival reasons, as it has a lot of interesting old posts that I am sure people will find useful.

Looking at the OpenLink Data Explorer extension you may think that you *have* to use an external “RDFizer”, but you don’t! You can indeed use localhost! Here’s a quick how-to using a local installation of Virtuoso Universal Server:

Pre-requisites for this example

  • Virtuoso Universal Server (either the Commercial Virtuoso version or Virtuoso Open Source version)
  • OpenLink Data Explorer browser extension (on Firefox)

Configuration in 5 very simple steps

  1. Open the Tools menu
  2. Select “OpenLink Data Explorer”
  3. In the SPARQL Endpoint field, change “https://demo.openlinksw.com/sparql?query=” to “https://localhost:8890/sparql?query=”
  4. Close the window
  5. All done! :-)

To Run

Simply

  • Use View > “Linked Data Sources”, or…
  • Right-click (Ctrl-click with a one-button Mac mouse) to bring up the contextual menu, then select “View Linked Data Sources”

The OpenLink Data Explorer will then fetch all triples via your local Virtuoso installation. You can also configure it to work with other Virtuoso installations across the web, with Triplr, or with any other third-party RDFizer.
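
Once that’s changed, it’s worth checking that the local endpoint actually answers. Here’s a minimal sanity check in Python; it assumes a stock Virtuoso installation listening on port 8890 over plain HTTP (use https and the exact endpoint you configured above if your installation serves SSL):

    # Minimal sanity check against a local Virtuoso SPARQL endpoint.
    import urllib.parse
    import urllib.request

    ENDPOINT = "http://localhost:8890/sparql"
    QUERY = "SELECT DISTINCT ?g WHERE { GRAPH ?g { ?s ?p ?o } } LIMIT 10"

    # Virtuoso accepts the query and the response format as GET parameters.
    url = ENDPOINT + "?" + urllib.parse.urlencode({
        "query": QUERY,
        "format": "application/sparql-results+json",
    })
    with urllib.request.urlopen(url) as response:
        print(response.read().decode("utf-8"))

If that prints a JSON result set listing some graphs, the extension will be able to use the endpoint too.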

I got all of my new hardware yesterday evening… and, as usual when putting together a new PC from parts, there are problems.

Every part of the computer is awesome… except that the BIOS version on the motherboard only works with *some* Quad Cores, and unfortunately my new Quad Core isn’t one of them. Fortunately, future versions of the BIOS (which are downloadable and easily installable) do include support for my new Quad Core. Unfortunately, because of the BIOS<->processor incompatibility the computer won’t even start up, so it won’t even get into the BIOS setup mode.

So the only way to fix this is to use an older processor to start the computer up, get into the BIOS, upgrade it (using a USB memory stick), and then put the new processor in so that it boots properly. And unfortunately, because nobody in our near vicinity owns a Socket 775 processor, we’ve had to buy an über-cheap one from eBay, which (probably) won’t arrive until tomorrow.

So, as you can probably understand, I’m going to be a little bit (a lot) annoyed until we’ve got this working.

Just so that you are aware, here are the model numbers:

  • Processor: Intel Core 2 Quad Q6600 Energy Efficient 95W edition Socket 775 (2.40GHz) G0 Stepping L2 8MB Cache OEM Processor
  • Motherboard: Asus P5N-E SLI 650i Socket 775 PCI-E Onboard Audio ATX Motherboard (Revision 1.01G, which ships with an 05XX BIOS… while the newest BIOS versions are in the 11XX range)

Things will be a lot smoother once it’s all working. I’ll be able to work a lot more efficiently.

There is quite a subtle but incredibly important difference between “Transient Linked Data Sets” and “Materialised Linked Data Sets”.

Materialised Linked Data Sets

First of all, some example URIs from Materialised Linked Data Sets:

  • https://cb.semsol.org/company/opera-software
  • https://dbpedia.org/resource/Opera_Software

A Materialised Linked Data Set is a Linked Data Set which is pre-generated and stored in a separate location with different URIs.

Transient Linked Data Sets

Some example URIs from Transient Linked Data Sets:

  • https://www.crunchbase.com/company/opera-software
  • https://en.wikipedia.org/wiki/Opera_Software

Transient Linked Data Sets use the same URIs that people are used to, so a human user stays within the interface that they know and love. These URIs are put through an RDFizer (such as the Virtuoso Spongers or Triplr) and the output is Linked Data generated on the fly. This removes a lot of complexity when semantically tagging (e.g. “Opera Software” points directly to the Crunchbase article AND provides a dereferenceable URI for a Transient Linked Data Set), ensures that you don’t have more than one copy of the data, and ensures that you always have the most recent information.

In conclusion

I believe that Transient Linked Data Sets are a lot better, particularly for regularly edited and collaborative content such as you find on Crunchbase and Wikipedia, which people have got used to. With Materialised Linked Data Sets it is usually the case that you can only read the data: you have to edit the information on the original site, and then wait until the Materialised Linked Data system regenerates it.

Example

The best way to experience these two kinds of Linked Data Set is via a Linked Data Browser. For example, let’s use the OpenLink Data Explorer:

  • Materialised Linked Data Set, Opera Software object through OpenLink Data Explorer: https://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fcb.semsol.org%2Fcompany%2Fopera-software (notice the use of an external URI)
  • Transient Linked Data Set, Opera Software object through OpenLink Data Explorer: https://demo.openlinksw.com/rdfbrowser2/?uri=http%3A%2F%2Fwww.crunchbase.com%2Fcompany%2Fopera-software (notice the use of the original Crunchbase URI)

What’s even better is if you have the OpenLink Data Explorer extension installed: you can browse to a Wikipedia or Crunchbase article, go to the View menu, and select “Linked Data Sources”, just as you would with the Page Source (which shows the HTML and any JavaScript or CSS coded in).

[UPDATE]

Just to clarify: Transient Linked Data Sets use Proxy/Wrapper URIs, which are a combination of the RDFizer service URL + the actual resource URI. For example, an RDFization using Virtuoso:

  • https://demo.openlinksw.com/proxy/rdf/https://en.wikipedia.org/wiki/Opera_Software

or alternatively through Triplr:

  • https://triplr.org/rdf/en.wikipedia.org/wiki/Opera_Software

[/UPDATE]
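
To make that concrete, here’s a small sketch in Python of how these Proxy/Wrapper URIs are assembled. The service roots are the ones used in this post; the string handling is the only thing I’ve added, and the exact path conventions are of course up to the services themselves:

    from urllib.parse import quote

    resource = "http://en.wikipedia.org/wiki/Opera_Software"

    # Virtuoso-style sponger proxy: service root + the resource URI, verbatim.
    virtuoso_proxy = "http://demo.openlinksw.com/proxy/rdf/" + resource

    # Triplr-style: service root + the resource URI with its scheme stripped.
    triplr = "http://triplr.org/rdf/" + resource.split("://", 1)[1]

    # OpenLink Data Explorer (rdfbrowser2): the resource URI travels as a
    # query parameter, so it has to be percent-encoded.
    browser = ("http://demo.openlinksw.com/rdfbrowser2/?uri="
               + quote(resource, safe=""))

    print(virtuoso_proxy)
    print(triplr)
    print(browser)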

Apple’s new MobileMe service… it looks flashy and could be quite useful.

However, I think they’ve missed the point. I ask:

  • What if you don’t want to sync using the MobileMe service? Why can’t I redirect it to my own server, given that my server represents me?
  • How about other devices? Surely this should work with devices and programs other than Apple software, Apple hardware and Microsoft Outlook?
  • What if the calendar information that I want to sync is stored in, say, Google Calendar, and I have further information about what I will be doing in my Yahoo! Upcoming service?
  • What about personal representation? OK, so this can sync some of my stuff together, but can I use it to represent me on various websites? There doesn’t seem to be a “Me” in “MobileMe”.

It’s essentially a data silo, and it doesn’t adhere to the WUPnP concept. It’s an entirely proprietary route, which I am highly disappointed at Apple for, because they would do a lot better if it used standardised techniques (SyncML, OpenID, OAuth, distributivity, Linked Data).

These techniques, and more, are all in the OpenLink Data Spaces (ODS) system. Hey, I’d love to be able to take some of the user interfaces off the MobileMe system and plug them into ODS, or even into Google Calendar and Yahoo! Upcoming! And vice versa: I’d love to be able to plug the Google Calendar GUI into the MobileMe server.

The Web Universal Plug and Play (WUPnP) Cheatsheet

Essentially, if you build an application using the technologies suggested in the “glue” section, then your web application/service (whether it’s front-end or back-end) will fit into many, many other web applications/services… and will therefore also be more manageable in the future! This is WUPnP.

Key technologies for making your services/applications as sticky as possible:

  • Dereferenceable URIs (which imply HTTP networking)
  • OpenID
  • OAuth
  • SPARQL
  • Linked Data RDF (or RDFa) and OWL
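
As a small taste of the first item on that list, here’s a sketch (in Python) of dereferencing a Linked Data URI with content negotiation; the DBpedia URI is just a convenient, well-known example:

    import urllib.request

    uri = "http://dbpedia.org/resource/Opera_Software"
    request = urllib.request.Request(uri,
                                     headers={"Accept": "application/rdf+xml"})

    # A Linked Data server typically 303-redirects the *thing* URI to an RDF
    # *document* describing it; urllib follows the redirect for us.
    with urllib.request.urlopen(request) as response:
        print(response.geturl())                     # the document we landed on
        print(response.headers.get("Content-Type"))  # should be an RDF media type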

Web-based plug and play fun!

Just a simple post; I don’t want anything too complex, because dereferenceable URIs are actually really simple to understand.

I’ve been working on the Dereferenceable URI page on Wikipedia, and I started writing a section which I thought I would share with you all here, as it is so important… it’s about the benefits of dereferenceable URIs:

When an object has a dereferenceable URI, not only does that object have an insignia that can be used across the web, but that URI also provides a key to unlocking more detail and links to relevant information about that object.

Dereferenceable URIs used in this way also enhance OpenID: not only can the URI be an OpenID, it can also provide information about that agent (the user). The combination of insignia, access to more information, and OpenID provides an all-round solution for the representation of human users on the Internet.

(note that the above will probably get edited on Wikipedia a few times, but here is the essential information)
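
To make the “key to unlocking” idea concrete, here’s a rough sketch using Python and the rdflib library: dereference a personal URI and list everything the returned RDF says about its owner. The URI below is purely hypothetical; substitute your own insignia:

    from rdflib import Graph, URIRef

    # Purely hypothetical personal URI.
    me = URIRef("http://example.org/people/daniel#me")

    g = Graph()
    g.parse(str(me))  # HTTP GET on the URI; rdflib parses the RDF that comes back

    # Everything the dereferenced document asserts about the URI's owner:
    for predicate, obj in g.predicate_objects(subject=me):
        print(predicate, obj)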

I mean the quoted section above quite literally. I have a web insignia (similar to a graffiti tag, but on the web, and providing a bit more; continue reading), which also provides the key to my public information and is also my OpenID.

So when I go to a site, I should be able to log in using my unique tag (as an OpenID)… that site can then get information about me in order to provide a visualisation of my profile, so that other people can see that I have a presence in that location. Also, when I am writing on my blog, or writing a comment on a forum or someone else’s blog, I should be able to tag it with my insignia… like I’ll do below ;-)

Thanks for reading
Daniel Lewis

The time is drawing nearer for me to start my MSc in Machine Learning and Data Mining at the University of Bristol (I actually start at the end of September 2008).

And because my Master’s project (and other bits of coursework) will be processor- and memory-intensive, it will be way too much for my Apple MacBook (2GHz Intel Core 2 Duo, 1GB RAM, 64GB HD, Mac OS X 10.5.4) to cope with. So…

With the project funding money (I still haven’t had anybody interested in funding me yet, although it’s looking more likely now), I decided to get parts and build my own PC (based on recommendations from my girlfriend and my girlfriend’s dad). Here are the specs:

  • Processor: Intel Core 2 Quad 64-bit 2.4GHz (Energy Efficient)
  • Motherboard: Asus P5N-E SLI 650i (on-board audio circuit, on-board gigabit network circuit)
  • RAM: 4GB DDR2 with Heat Spreader
  • Processor Cooler: Arctic Cooling Freezer 7
  • Disc Drive: Optiarc 20x DVD RW/DL/RAM SATA
  • Hard Drive: Samsung 750GB (32MB Cache, 7200RPM) SATA
  • Graphics: nVidia GeForce 7300LE on a Microstar International PCI-E Card (256MB GDDR with 512 TurboCache, 450MHz Core Clock) (VGA, DVI and HDTV Output)
  • Case: Antler ATX Midi Tower with 350w PSU (white)
  • Wireless Keyboard and Mouse (comes with the case)
  • Monitor: Acer 19inch TFT Widescreen VGA (Contrast Ratio = 2000:1)

Plus: I’ll be putting the latest version (8.04 at the time of writing) of Ubuntu Linux (with the Gnome UI and AiXgl) on it.

My requirements are essentially that it needs to run complex artificial intelligence and statistics algorithms quickly, and cope with reasonably large databases and knowledge bases. I think I’ve covered that with the above.

I will need it to be able to run the following programs/systems (and reasonably smoothly):

  • Java Runtime and Compilation Tools
  • Mono Runtime and Compilation Tools
  • SWI-Prolog Environment
  • Ruby Programming Environment
  • Haskell Programming Environment
  • C and C++ Programming Tools (gcc)
  • Weka Machine Learning Environment
  • possibly also… MATLAB, Squeak, Croquet/Cobalt and GNUstep
  • and most importantly… OpenLink Software’s Virtuoso Universal Server

I think that it’s going to be a very very yummy machine!

Does anyone have any comments? More importantly, does anyone want to offer me funding for the outcome of the project (I only ask to cover the costs of tuition)?

Sitting in my RSS feed reader was an article by Paul Wlodarczyk titled “How XML Enables the Semantic Web”, which I think is incredibly wrong. I think XML is just one option for realising the overall idea behind the Semantic Web, not the only one.

I actually think that we should see the Semantic Web (and more specifically the Linked Data Web) as format agnostic. The truth of the matter is that the average web user (and business client) does not care at all about the underlying formats; they just want something that is useful and works (the more useful and the more stable, the better). Linked Data is all about interconnecting information, so as long as that information is objective, meaningful and processable, it can be turned into whatever format/framework you like (whether it’s Linked Data RDF, Topic Maps, DITA or even Microformats/RDFa-based XHTML) from whatever format/framework is available… this is what being format agnostic means!

So when Paul writes:

In fact, creating the Semantic Web might be as easy as authoring content in DITA.

I think that’s right, but I think it’s equally right to say one of the following:

  • In fact creating the Semantic Web might be as easy as authoring content in RDF/XML, RDF/N3 or RDF/Turtle
  • In fact creating the Semantic Web might be as easy as authoring content in (POSH) XHTML+RDFa
  • In fact creating the Semantic Web might be as easy as authoring content in (POSH) XHTML+Microformats
  • In fact creating the Semantic Web might be as easy as authoring content in XTM

The format is irrelevant; the key questions we have to ask ourselves (as developers) when choosing a format/framework are:

  1. Does the format express the full and true meaning of this data?
  2. Does the format show the data in an objective manner?
  3. Does it allow the data to be interconnected across the web using dereferenceable URIs? (i.e. is it capable of providing Linked Data)
  4. Can it be sponged/scraped/transformed into another format/framework? (this might be by using technologies such as GRDDL, Fresnel, XSLT or something like a Virtuoso Sponger)
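
To show how mechanical the format question really is, here’s a little sketch in Python using the rdflib library: one graph, one fact, three serialisations, identical meaning (this assumes a recent rdflib, where serialize() returns a string):

    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import FOAF

    g = Graph()
    g.bind("foaf", FOAF)
    g.add((URIRef("http://dbpedia.org/resource/Opera_Software"),
           FOAF.name,
           Literal("Opera Software")))

    # The same data, three serialisations -- the meaning never changes.
    for fmt in ("xml", "n3", "turtle"):
        print(f"--- {fmt} ---")
        print(g.serialize(format=fmt))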

Let’s be pro-format-agnostic!

I invite your comments (especially from Paul, if you’re reading)

Good news for all who are following the “Bristol Knowledge Unconference” (which I have mentioned on here before: “Bristol Knowledge Unconference” and “Bristol Knowledge Unconference: A Small Update“)!

We have set a date and time:

Friday 5th September 2008 between 14:00 and 18:00.

And I can release details of the location too:

eOffice Bristol, 1st Floor Prudential Buildings, 11-19 Wine Street Bristol, BS1 2PH.

See whereabouts it is on Google Maps, Yahoo! Maps, Microsoft Live Maps, MapQuest or MultiMap.

We have even set up the event on eventWax, so to sign up please visit:

“Sign Up through Bristol Knowledge Unconference on eventWax”

A bit of light Semantic Web humour:

[image: funny pictures, powered by icanhascheezburger.com]

