Books & Articles I wrote.

Sunday, April 30, 2006


Intelligent Software - "you probably want"

Right, i finally managed to get 40 MB of pics and videos off my mobile phone. That's great, but you know what - it was only by chance i found the software. Prior to that i was doing it individually and it was taking forever, so i never got them off. I had a USB bluetooth adapter and a phone - no instructions (it was from a friend). I would send my pics from the phone to the PC individually as i had no idea what software was available for the device. In actual fact, some free software from Belkin allows you to copy things across like Windows Explorer, so i left it copying and within 30 mins it was all done. I wish i had the time to go look for all the software that may be useful for my devices but to be honest i barely have the time to keep on top of the drivers, never mind software on top of those.

My next point. All of these things know who they are. The adapter knows it is a Belkin USB v2 Bluetooth adapter, so WHY not just suggest the software i should have? Windows has a "Use Web Service" option when it finds an unrecognized file type, but when you install a device, i'd like it to say, ok, this device has given me this URL to get data about it. That is ALL YOU NEED. A URL from the device, simple in comparison to the host of other data they exchange. Then it can go to the site and as much info as desired can be given to the user. It would make my life much easier as at the moment i am having to use a CD that stores all of the software i have "discovered" for my laptop and PCs over the last 4 years. Sure, my favourites should be used, but surely software such as this is inherently a favourite of mine as it makes my devices work better for me.


The largest stadium in the world

I was pointed to this by Baris Karadogan at ComVentures.

You choose your seat in the stadium, select your strip and the supporters you want to sit next to. It's a social network idea around the World Cup. It's a neat idea, but like many social networking sites, it lacks an "action" - something that will actually get you networking (kinda like how do you actually *get started* using LinkedIn?). Maybe they should have a few things to get people talking, like game predictions, half time penalty shoot outs and so on.

But still, for those of us not going (i'm from Scotland and we're letting everyone else have a chance just now) it's a neat idea and if they can give me a reason to go back to it during the WC, i probably would.

Saturday, April 29, 2006



Scolari gave me a laugh, Wayne Rooney got injured which means they won't win the World Cup (i heard they were going to hand it to them anyway), the weather in Glasgow was nice, people still jump spiky metal fences (no link... luckily), i got bluetooth working, i updated Vidyo, i kicked the wall and almost broke my toe playing with Xavi. On that note.

Hunterian Museum, BlogLines, Intentions

Today we were up early and out before 9 AM. We got back at 4.30 PM with everyone's feet killing them. A cracking day meant the wee guy was out and about in the Botanic Gardens and then into Kelvingrove Park. He half walked, half cycled.

At about 2 we went into the newly refreshed Hunterian Museum in Glasgow University which just opened today. It was very impressive and even Xavier at just 3 was asking a ton of questions (which of course he expected me to know the answers to). Thankfully they were all labelled, so i still have some smart points :) Get your kids along - it's a good few hours and the dinosaurs are cool (and make sure you do the treasure hunt).

Here is a picture of one of the scary monsters we managed to capture - you can also see the skeleton of a dinosaur behind him.

When I got back i could barely move, so i got a Corona and whilst Xavier watched some TV i decided to sort out my longstanding issues with RSS (and listened to Jim Traynor on the radio with the usual phone in nutters). I am useless and have a zillion feeds on my random PCs and lose them when i re-build. Well, i signed up to BlogLines - a service i looked at long ago, but now need. I really want them centralized so i can get at them anywhere and they are backed up. I am now using it, but the signing up, UI and search are a bit of a disaster.

First, i was asked to choose things i wanted to subscribe to (i spent a fair amount of time on this). I did, but it then came up with an error. On the panel on the left i was signed up to some of those i had chosen. I then decided to do it again just in case my connection had got screwed half way through. This time however, it said i was already subscribed to a blog and stopped. Obviously it found something it had subscribed to in the first place and flamin' stopped! I would rather it just did them all and then told me at the end. Anyway, i then went in and set up my feeds. It's ok, but keeps taking me to some flamin' instruction page every time i add a new feed - when you are adding a load this is a pain. The search is also pretty hopeless. I typed in "Google" and "Microsoft" and basically got a Google search list rather than actual blogs about these companies - "Google search results for Microsoft" - which was 99% useless.

However, it IS a nice product to use and hopefully they will clean this stuff up. More importantly however it has solved my problem and that's me hooked. They also have a notifier which will be very useful.

Finally, the Intention Cloud .. "is a web application supported by Perl and AJAX technologies that gathers and displays the queries performed on search engines in an intuitive output using weighted lists, also known as tag clouds."

Yes, the idea is that it can grab the intentions of the zillions of Google queries and so you can get an idea of what people are wanting when they type "i want", "i am", "playing football" and so on - all presented as a cloud of tags.

I tried "visiting glasgow" and it said back to me "visiting glasgow scotland" (in really blue text against a blue backdrop which makes it hard to see) and the link took me to a Google search for "visiting glasgow scotland". So i guess it takes a query and works back to find where people have used your query as part of theirs - i.e. their intention behind that query.

Whether i will ever use it again is another story.

The science of women

.... my wife had a somewhat different solution to this.

Friday, April 28, 2006


Interfacing with Universities

Today i was pointed to Interface, a new project to try and link industry with the Universities better.

It is a great idea and similar in concept to something i discussed last year.

I’d like to be able to tap into this expertise when I have a specific problem or have one of my “research buddies” I can talk with.

For example, I have read a lot on algorithms related to a specific task I am working on just now. Now, I can understand most of what is being said, but my maths is not good enough to really go into detail and develop specific algorithms for my tasks (nor do I have the time). It would be ideal to buddy up with someone in universities with similar interests who you can work with. I know this happens more formally in graduate summer projects and post grad projects within industry (that’s what I did), but how I, as a start-up, could tap into people is beyond me.

I'll be keeping an eye on this and hopefully it will help startups as well as bigger industry.

So we need IUniversity with IStartUp :: IUniversity and IEstablished :: IUniversity.
Start coding.

RDF/A Primer

Karl Dubost pointed me at the RDF/A Primer 1.0 which is one of the more exciting documents i have seen in recent times.

Embedding RDF in XHTML has been around for a number of years, but it has always been left to the person adding the data to decide how they want to specify it. It has always been hard to get your average joe to add their contact information in some format that is semantically readable, yet easy to do. This primer is a good start to narrowing that gap.

In effect it does what Microformats do, but leverages existing technologies such as FOAF - cementing the DRY principle in a global context.

You can specify data about yourself in the following way:

<li id="andrew" about="#andrew">
<link rev="foaf:member" href="" />
<span property="foaf:firstname">Andrew</span>
<span property="foaf:surname">Smith</span> can be contacted on
<span property="foaf:phone">+1 777 888 9999</span>
</li>

This is open and uses existing FOAF information which is nice.
The question now is whether to use this or Microformats. Microformats have the advantage of things having a short well-defined meaning - hCard, hCalendar and so on.

An example Microformat contact card is shown below.

<div class="vcard">
<a class="url fn" href="http://tantek.com/">Tantek Çelik</a>
<div class="org">Technorati</div>
</div>

The RDF/A version is much more flexible in that it doesn't really care what kind of data you are adding or how you specify specific types of data - it leaves this to the creator. It can be interpreted by parsers who grok the namespaces used.
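To give a feel for what "a parser that groks the namespaces" might do with markup like the Andrew Smith fragment above, here is a minimal sketch using only Python's standard library. It is not a real RDF/A processor: it simply pairs each `property` attribute with the text inside the element and attaches it to the nearest `about` value. The names `PropertyCollector` and `extract_triples` are made up for illustration.

```python
from html.parser import HTMLParser

# Trimmed copy of the fragment from the post.
SNIPPET = """
<li id="andrew" about="#andrew">
<span property="foaf:firstname">Andrew</span>
<span property="foaf:surname">Smith</span> can be contacted on
<span property="foaf:phone">+1 777 888 9999</span>
</li>
"""


class PropertyCollector(HTMLParser):
    """Collects (subject, property, value) tuples from simple markup."""

    def __init__(self):
        super().__init__()
        self.subject = None    # value of the most recent "about" attribute
        self._current = None   # property name of the element being read
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        self._current = attrs.get("property")

    def handle_data(self, data):
        # Only keep text that sits directly inside a property-bearing element.
        if self._current and data.strip():
            self.triples.append((self.subject, self._current, data.strip()))

    def handle_endtag(self, tag):
        self._current = None


def extract_triples(markup):
    parser = PropertyCollector()
    parser.feed(markup)
    return parser.triples


for triple in extract_triples(SNIPPET):
    print(triple)
```

A real consumer would also resolve the `foaf:` prefix to its namespace URI and handle nesting, which this sketch deliberately ignores.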

They certainly overlap - i just wonder whether Microformats that use RDF/A would work best. Or in reverse, if the RDF/A created a set of definitions that were well defined semantically.

e.g. "to create a contact card, use FOAF", "to create a calendar, use rdf calendar" and so on. It's not about limiting the flexibility - it's more about telling people who intend to create the data where to start and what to use, as the combination of XHTML and the various schemas out there (foaf, ical and so on) is very flexible.

This is what Microformats do well just now - they basically say, to create a contact card, use these and do this.

I have to say the RDF/A document is the most interesting to me as it is very extensible and uses existing tools. I'd just like to see "cut and paste" fragments for things such as "email and name", "home address", "education", "an event today" and so on - things that could be defined, populated and saved in no time at all.

Thursday, April 27, 2006


Jonathan Schwartz, CEO of Sun

This week saw Jonathan Schwartz become the new CEO of Sun.

Now, he doesn't look like a typical CEO (he's the guy with the ponytail) and he doesn't act like one either - for a start he has a Weblog that he really does put effort into (and EVEN has comments turned on!). He also seems to get pretty hands on and involved with the project teams.

It may be a good thing for Sun to get this kind of guy at the helm, and although i've not used Sun servers since my Uni days, i'll be keeping an eye on his blog. It's cool to have some visibility to these guys, so best of luck to him!

Microformats, Xml and RDF

There have been quite a few emails back and forth today between myself and others on the Microformats mailing list. I like the concept of Microformats, mainly because they are likely to narrow the gap between those of us living in the purist Semantic Web world and those who actually create the data (typically not the same thing).

Almost anyone can create a Microformat. Creating an Xml Schema or RDF/OWL can be quite tricky and it's one explanation as to why markup rests almost exclusively on the shoulders of XHTML and RSS on the web.

I do believe that some of the problems of Xml Schemas and related formats in the past have been down to over engineering of the requirements, so much so that it becomes too difficult to just get started. The W3C did a good job in creating Primers for Xml Schema - but the Primer is 77 pages long and so before you can even start creating your own Xml formats, you have to read 77 pages.

HTML on the other hand is basic and fairly well known and adopted. It's also pretty flexible and supported well by numerous tools. So you have no real learning curve there. Then populating Microformats is pretty easy. Putting them in HTML documents is very easy too.

They really are a good start for building a real Semantic Web, so fingers crossed that the community and tools see the potential for making information more accessible across the board.

Wednesday, April 26, 2006


Three Search Questions

1. What is currently missing from search?

2. Who is closest to a solution?

3. What would you love to see (it may be years down the road).

Vision is a strange old thing

I often wonder how people originally get the vision of something that turns out to work really well.

Take Google. For all the talk of algorithms, scientific research, complex IQ interview tests and more, the reason Google worked fundamentally boiled down to PageRank (originally BackRub) which in simplistic terms was the idea that you could get an idea of the importance of a web page from those who referenced it, in a backwardly recursive manner.

In other words, say you think my blog is great. This ranks better than me just saying it's great. Someone else also thinks my blog is great (maybe i've now reached my readership base). That improves my PageRank. If people who think it is great are also globally regarded as great, then their weighting is higher. Simple, isn't it?
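That recursive idea can be sketched as a small power iteration. This is the simplified textbook form, not whatever Google actually runs; the damping factor of 0.85, the iteration count and the page names are all assumptions picked for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict of page -> score for a dict of page -> outgoing links."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its own importance on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Two readers link to the blog; the blog links to nobody.
example_links = {"you": ["my_blog"], "someone_else": ["my_blog"], "my_blog": []}
scores = pagerank(example_links)
print(sorted(scores, key=scores.get, reverse=True))
```

Because each page's score feeds into the pages it references, a link from a well-regarded page counts for more than a link from an obscure one, which is the "globally regarded as great" weighting described above.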

After quite a lot of effort I may be starting to get an idea of how you get to the one sentence that makes a great product. Well, it's both hard work and luck. I may now have some idea on what i can do different to everyone else and the hard part is that for all the months of work i put in, the core part is pretty simple. It's like i can filter 30% of all the work i have done and from that create something pretty powerful. It's both frustrating and enlightening.

Now, i'm not saying i will create one of the next big things. But recent events at some of the major search companies tell me i'm on the right track. The next 6 to 8 weeks are pretty important. I want to get something out and find out whether i have nailed it, or whether it is just another learning step on the road to something that will work. I'm confident i will get to something that will be pretty huge - i'm just not sure what that will be!

But then that's the lucky part.

Thursday, April 20, 2006


GData - another company creates a distributed protocol

Update : I was maybe a little hypercritical of what Google are doing. Don't post after a long day and a screaming 3yr old! I do still wonder though whether a Google + Community driven effort would have been a better fit. I still don't see their competitors going for this, but if they let the community go wild with it, then it may have a chance of impacting outside of just Google.

I don't believe companies can now single handedly create successful web service protocols that are to be operated in an openly distributed fashion. RSS and Atom were community driven and so no-one owned them - the result is that most major companies supported them ... after most of the community had decided to long beforehand (sure, i'll gladly give Microsoft as a good example as they've done great in their support after a slow start).

But now Google has GData.

My general view is that company specific protocols will nowadays find it very hard. RSS works well because no-one owned it. I find it hard to imagine Google's competitors adopting GData; they are more likely to create something else. It will work well for Google, but as their desire is to organize information globally, how they can do that with a closed protocol is beyond me.

They also have a "Google-specific authentication system" when there are already a load of emerging protocols out there too (not to mention 8000 ways of logging on to various services as it is).

If Microsoft had done this they would be hammered in the media.

War of the syndication formats? I think this may be the start of Collaborative v Company Driven web service protocols. Expect to see a backlash from the people creating data and schemas against Google trying to control it from the centre outwards.

The dead web - Google + Archive.org?

This was something i posted on an old blog some years back, but recent discussions have made me re-post just in case there is some new opinion!

Over the last 2 months I have been conducting research almost exclusively on the web.
What has really become obvious to me is the amount of dead material out there.

From web pages that contain out of date information, to whole sites that stopped running years ago with no indication, to projects that seem to have been in flux for years, businesses that stopped trading years back and left their site on, and even stuff written by people whom i'm almost ready to email, only to find they passed away a couple of years back.

So is a new web needed to get us out of this? Can we see Google work with Archive.org and create a diary of the web? A time-aware searchable web which allows some kind of time scale on the information out there, without requiring everyone to annotate their documents! Could i say "Only search content added/updated in the last year"?
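As a toy illustration of the "last year" filter, assume a crawler or archive has already recorded when each page last changed. The `Page` record, the `search_recent` function and the example.org URLs are all hypothetical, and the matching is a plain substring test rather than real search.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Page:
    url: str
    text: str
    last_updated: date  # assumed to come from a crawler or archive snapshot


def search_recent(pages, query, today, max_age_days=365):
    """Return pages that mention the query and changed within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    needle = query.lower()
    return [
        page for page in pages
        if needle in page.text.lower() and page.last_updated >= cutoff
    ]


pages = [
    Page("http://example.org/old-paper", "Notes on solitons", date(1996, 5, 1)),
    Page("http://example.org/new-paper", "Solitons revisited", date(2006, 3, 1)),
]
for hit in search_recent(pages, "solitons", today=date(2006, 4, 20)):
    print(hit.url)
```

The hard part is not the filter but getting a trustworthy `last_updated` for every page without asking authors to annotate anything, which is where an archive of snapshots would come in.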

I hope so, because frankly it's getting ridiculous. 10 years ago i did some research on solitons for a Physics paper i wrote. Today some of that material returns seemingly as relevant as ever despite things continuing to evolve over the last decade. My File Exists article on 15 seconds at http://www.15seconds.com/issue/990401.htm is now over 5 years old, but still comes 8th in Google when i type "FileExists".

I don't know how many replies i have had indicating some academic moved on 3 years ago, or some project research was finished, or even links to other sites that closed their doors, re-organized or just changed their content to make it completely useless.

Could a hyped up archive.org challenge something like Google? I think so. Could we "Diff The Web" to make the content more relevant - noting that getting Dublin Core on everything is highly unlikely?
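One very small reading of "Diff The Web": compare two archived snapshots of the same page and report the lines that are new in the later one. The sketch below uses Python's standard `difflib`; the snapshot text is invented and `added_lines` is a hypothetical helper, not part of any archive's API.

```python
import difflib


def added_lines(old_snapshot, new_snapshot):
    """Return the lines present in new_snapshot but not in old_snapshot."""
    diff = difflib.ndiff(old_snapshot.splitlines(), new_snapshot.splitlines())
    # ndiff marks lines that only exist in the newer text with a "+ " prefix.
    return [line[2:] for line in diff if line.startswith("+ ")]


old = "Project status: active\nContact: Dr Smith"
new = "Project status: finished in 2003\nContact: Dr Smith"
print(added_lines(old, new))
```

A page whose snapshots never produce any added lines over several years is a reasonable candidate for "dead", with no metadata needed from the author.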

Anyone got answers?

Thursday, April 13, 2006


Google in China

Google's upcoming Chinese web site ...


had to change their name as the translation does not fit well... explanations on a postcard please!

What does GuGe mean? Wikipedia tells us ...

Friday, April 07, 2006


.eu domains live

.eu domains have just gone live - buy at http://gandi.net

Almost everything i buy goes through them.

osullivan.eu is gone :)

Thursday, April 06, 2006


Live Clipboard

Today i discovered "Live Clipboard".

I'm still reading up, but looks like a nice idea to me.
I'd rather it were all Xml, but i can see the need to support some non-xml formats.

Why can you not just put a serialized object in a web page and have it deserialized at the other end? It may be useful in certain cases. I'll keep reading and listening.
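As a rough sketch of that question, and not how Live Clipboard itself works, an object could be serialized to JSON, tucked into an attribute in the page, and read back out by whoever receives the page. The `data-object` attribute name and the `embed` / `extract` helpers are assumptions made up for this example.

```python
import html
import json
from html.parser import HTMLParser


def embed(obj):
    """Serialize obj to JSON and wrap it in an HTML element."""
    payload = html.escape(json.dumps(obj), quote=True)
    return f'<span class="clip" data-object="{payload}"></span>'


class _ClipFinder(HTMLParser):
    """Collects every object stored in a data-object attribute."""

    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "data-object" in attrs:
            # The parser has already unescaped the attribute value.
            self.objects.append(json.loads(attrs["data-object"]))


def extract(page):
    """Deserialize every embedded object found in the page."""
    finder = _ClipFinder()
    finder.feed(page)
    return finder.objects


contact = {"name": "Andrew Smith", "phone": "+1 777 888 9999"}
page = "<p>My details:</p>" + embed(contact)
print(extract(page))
```

JSON keeps this to plain data; deserializing arbitrary objects from an untrusted page would be a security problem, which may be one reason to prefer a well-defined format.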

Celtic Champions, Puzil getting there

Well tonight my team, Celtic, again won the league title here in Scotland. It was a hard game and we've had a rollercoaster season, but the team is slowly shaping up to be very good next year. We have some of the best young players in Scotland in our team combined with some foreign players and of course John Hartson who may be away in the summer.

Well done Bhoys - this time we finished the job.

In work world, i've had a load of fun setting up an email server for Puzil - it is now all set up and working, with a little bit of Xml hacking required in the middle to get the config files for the mail server working.

I see the blogger sign above reads "Scheduled outage at 5:00 PM" which should be interesting considering it's midnight here - if Google are to do local search they may want to get localization of their services right first ;)

Sunday, April 02, 2006


Ice Age 2

As Ice Age 2 is released today, and having watched Ice Age 1 about 400 times with my son, we checked out the site. After a bit of playing "Scrat Jump" we managed to get 237.23 metres.

It's a little addictive once you get your first good jump - can you beat 237.23?

