Steve Flinter’s blog

Musing on science and technology in Ireland


Alpha Release of SemanticTweet

June 25th, 2009 by steve

Recently, I’ve been playing a little with the Twitter REST API, and with Sinatra, the new Ruby web framework that all the cool kids are into.

On the back of said playing, I’ve just released a pre-alpha (if there is such a thing) version of SemanticTweet.

Basically, SemanticTweet is a simple web service that generates a FOAF RDF document for you from your list of Twitter friends and followers. It does this using the Twitter REST API. This service uses public Twitter data only, and so doesn’t need your Twitter username or password.

FOAF, which stands for friend-of-a-friend, if you’re unfamiliar with it, is a semantic web vocabulary for describing people and the links between them, such as your list of friends. It’s typically expressed in a semantic web format known as RDF, the Resource Description Framework. To give you an idea of what a FOAF document looks like, here’s my one, as generated by SemanticTweet.

One of the benefits of this approach is that you don’t have to build and maintain your FOAF file by hand (or using a service like FOAF-a-matic), which is a real pain. This service will dynamically generate the FOAF file each time it’s queried. The second big benefit is that it turns your friends’ Twitter pages into dereferenceable URIs, which means that a semantic web browser or search engine can traverse from link to link, just as it would on a standard web page, and all without having to explicitly call the Twitter API.
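To make that a bit more concrete, here’s a minimal sketch of the kind of document such a service might produce. This is not the SemanticTweet code (which is written in Ruby on Sinatra and talks to the Twitter API); it’s an illustration in Python using only the standard library. The friend list is passed in by hand, the `build_foaf` and `foaf_uri` names are made up for this example, and the `#me` fragment on the person URI is my assumption rather than something the service is known to use.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Use the conventional prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("rdfs", RDFS_NS)
ET.register_namespace("foaf", FOAF_NS)


def foaf_uri(screen_name):
    # Assumption: each FOAF document lives at semantictweet.com/<screen-name>,
    # following the URL pattern used in the <link> example in this post.
    return "http://semantictweet.com/" + screen_name


def build_foaf(screen_name, friends):
    """Return an RDF/XML string describing screen_name and who they know."""
    root = ET.Element("{%s}RDF" % RDF_NS)
    me = ET.SubElement(
        root,
        "{%s}Person" % FOAF_NS,
        # Hypothetical '#me' fragment to name the person rather than the document.
        {"{%s}about" % RDF_NS: foaf_uri(screen_name) + "#me"},
    )
    ET.SubElement(me, "{%s}nick" % FOAF_NS).text = screen_name
    for friend in friends:
        knows = ET.SubElement(me, "{%s}knows" % FOAF_NS)
        person = ET.SubElement(knows, "{%s}Person" % FOAF_NS)
        ET.SubElement(person, "{%s}nick" % FOAF_NS).text = friend
        # rdfs:seeAlso points at the friend's own FOAF document, which is
        # what lets a crawler hop from one document to the next.
        ET.SubElement(
            person,
            "{%s}seeAlso" % RDFS_NS,
            {"{%s}resource" % RDF_NS: foaf_uri(friend)},
        )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up screen names, purely for illustration.
    print(build_foaf("alice", ["bob", "carol"]))
```

The `rdfs:seeAlso` links are the interesting bit: each one points at another FOAF document, so a client can follow them without ever touching the Twitter API itself.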

One way you can use this service/document is by embedding it in your blog/website. Just add a line to the <head> section of your template which reads:


<link rel="meta"
  type="application/rdf+xml"
  title="FOAF"
  href="http://semantictweet.com/your-twitter-screen-name" />

This approach is what Tim Berners-Lee refers to as Linked Data. Check out his excellent talk at TED to get a better idea of this movement.
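If you’re wondering how a crawler or semantic web browser might actually pick up that link, here’s a rough sketch in Python using the standard library’s `html.parser`. It’s an illustration only: the `FoafLinkFinder` class and `find_foaf_links` function are names I’ve invented for the example, and a real client would fetch the page over HTTP and probably check the `rel` and `title` attributes too.

```python
from html.parser import HTMLParser


class FoafLinkFinder(HTMLParser):
    """Collect the href of every <link> that advertises an RDF/XML document."""

    def __init__(self):
        super().__init__()
        self.foaf_urls = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here
        # by HTMLParser's default handle_startendtag.
        if tag != "link":
            return
        attr = dict(attrs)
        if attr.get("type") == "application/rdf+xml" and attr.get("href"):
            self.foaf_urls.append(attr["href"])


def find_foaf_links(html):
    """Return the RDF/XML link targets found in an HTML string."""
    finder = FoafLinkFinder()
    finder.feed(html)
    return finder.foaf_urls


if __name__ == "__main__":
    page = """
    <html><head>
    <link rel="meta"
      type="application/rdf+xml"
      title="FOAF"
      href="http://semantictweet.com/your-twitter-screen-name" />
    </head><body></body></html>
    """
    print(find_foaf_links(page))
```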

There’s plenty more to do, and plenty of ways in which Twitter data can be presented in a semantic webby way, to allow more interesting documents to be produced, so watch this space.

So run, don’t walk, over to semantictweet.com, and check it out. You too can have that FOAF document you’ve always wanted but were afraid to ask for. Let me know if you have any comments or observations.

You can follow developments on blog.semantictweet.com and @semantictweet.

Tags: No Comments.

Google’s play in the translation space

June 11th, 2009 by steve

Over the past few years, Google have been moving more and more into the machine translation (MT) space – see, for example, their language tools page, which allows you to translate an arbitrary webpage, or a snippet of text, from one language to another.

Google’s approach to machine translation is what’s called statistical machine translation (SMT). Essentially, they take the millions of human-translated webpages that their search engine has already indexed, and align them – that is, they match sentences in one language (let’s say English) with their counterpart sentences in the second language (let’s say Spanish).

By repeating this process across millions and millions of webpages, they can build up pretty robust statistical methods of guessing a particular phrase’s correct translation.
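To give a flavour of how statistics can fall out of aligned sentences, here’s a toy sketch in Python. To be clear about what it is and isn’t: it implements IBM Model 1, a textbook word-alignment model trained with expectation-maximisation, not Google’s actual system, and it works on single words rather than phrases. The three English/Spanish sentence pairs are made up, and the function names are my own.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) from aligned sentence pairs.

    pairs is a list of (source_words, target_words) tuples.
    Returns a dict mapping (source_word, target_word) to a probability.
    """
    src_vocab = {w for src, _ in pairs for w in src}
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    # Start with no opinion: every target word is equally likely.
    t = {(s, g): 1.0 / len(tgt_vocab) for s in src_vocab for g in tgt_vocab}

    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word out among the source words in the
        # same sentence, in proportion to the current estimates.
        for src, tgt in pairs:
            for g in tgt:
                norm = sum(t[(s, g)] for s in src)
                for s in src:
                    frac = t[(s, g)] / norm
                    count[(s, g)] += frac
                    total[s] += frac
        # M-step: re-estimate the probabilities from the fractional counts.
        for (s, g) in t:
            t[(s, g)] = count[(s, g)] / total[s]
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {g: p for (s, g), p in t.items() if s == source_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Hypothetical, hand-made "parallel corpus" of three sentence pairs.
    corpus = [
        ("the house".split(), "la casa".split()),
        ("the flower".split(), "la flor".split()),
        ("a flower".split(), "una flor".split()),
    ]
    table = train_ibm_model1(corpus)
    for word in ["the", "house", "flower", "a"]:
        print(word, "->", best_translation(table, word))
```

Nobody tells the model that “house” means “casa”. It works that out because “la” is better explained by “the”, which turns up in two of the three sentences. With three sentences that’s a party trick; the argument in this post is that with millions of pages it starts to become useful.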

This approach had been proposed relatively early on in the development of machine translation – as far back as the 1940s or early ’50s, indeed – but until recently, it could not compete with the other major school of thought in the area: rule-based machine translation. Google’s innovation, of course, was that because of their enormous web index they could bring several orders of magnitude more data (translated web pages) to the party than any previous approach to SMT. In so doing, they showed that an abundance of data can lead to a significant improvement in the quality of the resulting translation.

Why is all of this relevant now? For two reasons: first, SFI is funding the Centre for Next Generation Localisation CSET (one of the grants in my portfolio), part of whose work includes machine translation; second, by way of TechCrunch, I learned of the newly released Google Translator Toolkit. This toolkit is designed to work with the existing Google translation system, but also to allow human translators to add or correct the translations as they see fit.

Of course, there are many professional software tools to support human translation of software packages, websites, documents, etc., but the new Google Translator Toolkit appears to be aimed more at crowd-sourced translations. This is the latest development in website localisation (in particular), led by companies such as Facebook, where the casual (as opposed to professional) translator can translate some of the content of a site into another language. Indeed, crowd-sourced translation is also one of the areas of particular interest to CNGL.

This is a very hot area, and with the release of this toolkit, it looks likely to get hotter. It’ll be interesting to see what impact this has on the translation research community, the amateur/enthusiast translator, and indeed the professional translation business.

Tags: 2 Comments

No iPhone 3GS for O2 yet

June 9th, 2009 by steve

There’s no mention of the new Apple iPhone 3G S on O2’s iPhone page yet. Let’s see how long it will take for the 3G S to make it to these shores.

Hopefully, O2 will fully support the new tethering option in the next rev of the iPhone OS.

Tags: No Comments.

Reasons why Ireland rocks for telecoms innovation

April 22nd, 2009 by steve

The Enterprise Ireland Silicon Valley blog has a post on why Ireland rocks for telecoms innovation.  Number 5 is:

Irish universities are active in telecoms research. A number have linked with industry to develop novel products. For example, UCC are working on encryption techniques; Maynooth are working on wireless Antenna control.

There are a bunch of world-class research centres in Ireland investigating different aspects of networks and telecommunications, including:

Another reason I would add to the list is ComReg, the Commission for Communications Regulation. While ComReg regularly comes in for criticism from the Irish blogging community for not pushing hard enough to get broadband rollout, it does deserve credit for its enlightened view over allocation of spectrum for telecoms research. Because we’re an island, Ireland is an ideal test-bed for research and development of wireless technologies which can be tested and developed without polluting the spectrum of our neighbours.

Tags: 1 Comment

Categories of Turing Award winners

April 15th, 2009 by steve

Recently, I was looking through the list of Turing Award winners on Wikipedia. I knocked up a quick mindmap of the winners, categorising them by sub-discipline. To paraphrase George Box, all categorisations are wrong, but some are useful.

What stands out for me in this map is the preponderance of award winners from the areas of programming languages and theoretical computer science.  This is understandable in the context of laying the foundations for computing and computer science.

There are very few awards, however, in what could be called application areas: for example, one in graphics for Sutherland, and one in networking for Cerf & Kahn.

There are no awards (yet) in hugely interesting and important areas of computing, such as social networking/web (Tim Berners-Lee a prime candidate, perhaps?), mobile, data-mining, geo-technologies, machine vision, machine translation, massively parallel computing, sensors…

Tags: No Comments.

Implications for SFI in the recent budget

April 15th, 2009 by steve

Minister Jimmy Devins recently issued a statement detailing the implications of the recent budget for the science & technology research sectors funded by the Government.

A quote as it pertains to SFI:

The Science Foundation Ireland funding will be €170.5 million. In addition a sum of €5.5 million in capital carry over will be available, giving a total budget of €176 millions which is a 3.2% increase over last year’s spend year. The allocation of €176m will enable SFI to continue to build a critical mass of internationally competitive research teams in the sciences and engineering underpinning Biotechnology, Information and Communication Technologies, and Sustainable Energy and Energy Efficient Technologies.

Read the full statement for details.

Tags: No Comments.

Kawasaki on commercialisation

April 8th, 2009 by steve

Check out what the inimitable Guy Kawasaki has to say about The Art of Commercialisation – one of the things that we at SFI continually grapple with…

Tags: 2 Comments

Guardian to switch to publishing all articles on Twitter

April 1st, 2009 by steve

On the Guardian website: Twitter switch for Guardian, after 188 years of ink.

A mammoth project is also under way to rewrite the whole of the newspaper’s archive, stretching back to 1821, in the form of tweets. Major stories already completed include “1832 Reform Act gives voting rights to one in five adult males yay!!!”; “OMG Hitler invades Poland, allies declare war see tinyurl.com/b5x6e for more”; and “JFK assassin8d @ Dallas, def. heard second gunshot from grassy knoll WTF?”

Classic.  Hat tip to Brien for the pointer.

Tags: No Comments.

Steve Collins

March 18th, 2009 by steve

Steve Collins was quoted in a recent Sunday Business Post article as “slamming universities for being ‘afraid’”.

Interesting reading.

Tags: No Comments.

Echodio taking wings

March 18th, 2009 by steve

Back in November, I blogged about my mate Niall Smart looking for co-founders for his new startup.

Well, things have been going very well for Niall. Not only did he secure the Y Combinator funding that he was looking for, but he recently launched his company, now called Echodio, at the SXSW Festival.

As the icing on the cake, Echodio was featured on TechCrunch yesterday.

Tags: No Comments.