The annual British Computer Society Roger Needham Lecture, given by Ian Horrocks at the Royal Society on 7 December 2005, offered some key insights into the reasoning behind - and challenges of - the semantic web, reports Deirdre Molloy...
Some of this talk – the annual British Computer Society
Roger Needham Lecture delivered by Professor Ian Horrocks at the
Royal Society on 7 December 2005 – went a little over my head,
but other parts percolated the grey matter and I learned a lot
more than I usually do at public events...
By Deirdre Molloy
By way of introduction, Dr Peter Kay, Assistant Director of
Microsoft Research (UK), mentioned that the late Professor Roger
Needham (1935-2003), in whose honour the lecture is held, created
the concept of “clumps”, the “clusters” of today.
Horrocks began by asking what the semantic web is, and
approached a definition by first looking at some of the
problems and limitations of the current web.
The current web isn’t much more than distributed hypermedia. The
information on it is designed to be consumed by human beings but
is not accessible to automated processes.
Tim Berners-Lee’s original vision of the web was more ambitious,
Horrocks explained, proffering this TBL quote about the WWW to
support his assertion: “A set of connected applications forming
a consistent, logical web of data… given well-defined
meaning.”
The messy, syntactic web
It’s hard work using this “syntactic web”, as Horrocks termed
it. A search for his colleague Alan Rector on Google Images turns up images
of someone called ‘Reverend Alan’. What else is impossible to
find using the syntactic web? Complex queries involving
background knowledge; locating information in data repositories
– eg. travel enquiries, prices, goods and services; the results
of human genome experiments.
It’s also hard to find and use web services. Eg. given a DNA
sequence, identify its genes, determine the proteins, etc… it’s
very difficult if not impossible to get a coherent pathway to
this information.
The problem here lies in the fact that mark-up is all about
presenting the information, not about the semantic content. The
web page is very accessible to us but very difficult for a
machine to understand. Even worse, text is sometimes buried
inaccessibly inside images and graphics.
Delegating complex tasks to web agents
So what is the proposed solution? To add semantic annotation to
web resources. In the case of the pictures, this means attaching
semantic information along the lines of: “Dr Alan Rector is a
Professor of Computer Science at the University of Manchester”.
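As a rough illustration of what such an annotation could look
like to a machine, the sketch below stores the statement about
Alan Rector as subject-predicate-object triples and uses them to
filter image results. The `ex:` identifiers, the photo names and
the helper functions are all hypothetical, invented for this
example; they are not taken from the lecture or from any real
vocabulary.

```python
# Minimal sketch: semantic annotations as (subject, predicate, object) triples.
# Every identifier below is a made-up placeholder, not a real vocabulary term.
ALAN = "ex:AlanRector"

triples = {
    (ALAN, "rdf:type", "ex:Professor"),
    (ALAN, "ex:field", "ex:ComputerScience"),
    (ALAN, "ex:worksAt", "ex:UniversityOfManchester"),
    ("ex:photo42.jpg", "ex:depicts", ALAN),
    ("ex:photo17.jpg", "ex:depicts", "ex:ReverendAlan"),
}


def match(pattern, store):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


def images_of_professors(store):
    """Find images whose depicted subject is annotated as a professor."""
    professors = {s for s, _, _ in match((None, "rdf:type", "ex:Professor"), store)}
    return sorted(
        s for s, _, o in match((None, "ex:depicts", None), store)
        if o in professors
    )


print(images_of_professors(triples))  # prints ['ex:photo42.jpg']
```

A keyword search cannot tell the two Alans apart, whereas the
annotated version can, because the query is made against stated
facts rather than against matching strings.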
What does giving semantic annotation to web resources mean?
Horrocks explained that one approach is external agreement on
the meaning of annotations, for instance the Dublin Core
Metadata Initiative. But this offers limited flexibility and
extensibility, and only a limited number of things can be
expressed. The alternative is to use ontologies to specify the
meaning of annotations.
He then elaborated on the role of ontology in information
science. An ontology is an engineering artefact: a vocabulary
used to describe (a particular view of) some domain, together
with an explicit specification of the intended meaning of
that vocabulary.
Horrocks listed some applications of ontologies currently in
operation, such as e-Science (bioinformatics), and medicine, where
they are used to build and maintain terminologies such as SNOMED,
NCI and GALEN. Ontologies are also used for organising complex
and semi-structured information, such as that held by the UN,
NASA, Ordnance Survey, General Motors and Lockheed
Martin.
What are ontology languages?
Next he turned to the Semantic Web itself, and began by considering
‘ontology languages’. If ontology languages are to serve their
purpose for the web, they need to be agreed. One of the first to
be agreed was RDF Schema, but it’s very weak and doesn’t
allow us to describe many things. It’s also very high order,
difficult to explain and difficult to provide support for.
Two languages were developed to address the deficiencies
and problems of RDF. OIL was developed by a group of largely
European researchers, and DAML-ONT was developed by a group of
largely US-based researchers. They merged to produce DAML+OIL.
This was submitted to the W3C, where it became the basis for OWL.
Both were based on the same underlying description logics, a
family of logic-based knowledge representation
formalisms. These are descendants of semantic networks and
KL-ONE. They describe the domain in terms of concepts (classes),
roles (properties, relationships) and individuals. The operators
allow for composition of complex concepts. And names can be
given to complex concepts.
Eg. a “happy parent” can be defined by combining the concepts
Parent, Smart and Fit with the role child.
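To make the idea of composing concepts concrete, here is a small
sketch that builds a complex concept from atomic ones and works
out which individuals belong to it in a tiny hand-made world. The
exact way the happy-parent ingredients are combined (a Parent all
of whose children are Smart or Fit) is an assumption made for
illustration, as are the class names, the `extension` function
and the sample individuals. It evaluates a concept against one
fixed set of facts; it is not a description logic reasoner.

```python
from dataclasses import dataclass


# Concept constructors (hypothetical names for this sketch).
@dataclass(frozen=True)
class Atom:
    name: str          # a named class, eg. "Parent"


@dataclass(frozen=True)
class And:
    left: object       # intersection of two concepts
    right: object


@dataclass(frozen=True)
class Or:
    left: object       # union of two concepts
    right: object


@dataclass(frozen=True)
class All:
    role: str          # universal restriction: every role filler
    filler: object     # must belong to the filler concept


def extension(concept, domain, concepts, roles):
    """Return the set of individuals in `domain` that belong to `concept`."""
    if isinstance(concept, Atom):
        return concepts.get(concept.name, set()) & domain
    if isinstance(concept, And):
        return (extension(concept.left, domain, concepts, roles)
                & extension(concept.right, domain, concepts, roles))
    if isinstance(concept, Or):
        return (extension(concept.left, domain, concepts, roles)
                | extension(concept.right, domain, concepts, roles))
    if isinstance(concept, All):
        filler = extension(concept.filler, domain, concepts, roles)
        pairs = roles.get(concept.role, set())
        return {x for x in domain
                if all(y in filler for (a, y) in pairs if a == x)}
    raise TypeError(f"unknown concept: {concept!r}")


# Assumed reading: a Parent all of whose children are Smart or Fit.
HAPPY_PARENT = And(Atom("Parent"), All("child", Or(Atom("Smart"), Atom("Fit"))))

# A made-up world of five individuals.
DOMAIN = {"ann", "bob", "cat", "dan", "eve"}
CONCEPTS = {"Parent": {"ann", "bob"}, "Smart": {"cat"}, "Fit": {"dan"}}
ROLES = {"child": {("ann", "cat"), ("ann", "dan"), ("bob", "eve")}}

print(sorted(extension(HAPPY_PARENT, DOMAIN, CONCEPTS, ROLES)))  # prints ['ann']
```

Here `HAPPY_PARENT` plays the part of the name given to a complex
concept: once defined, it can be reused wherever a class is
expected.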
Reasoning & logic
Description logics are distinguished by formal semantics
(which are typically model theoretic) and by being decidable
fragments of first-order logic, or FOL (often contained in C2). [note: at this point I was totally
lost and I must apologise if my recap contains errors! But
Horrocks gradually began to talk in less specialist language].
The provision of inference services is guided by decision
procedures for key problems (satisfiability, subsumption, etc)
and by highly optimised implemented systems.
But why description logic? OWL exploits the results of 15
years of DL research and is based on well-defined (model
theoretic) semantics. Its formal properties are well understood
(complexity; decidability). We know the reasoning algorithms,
and there is an implemented system – Cerebra.
Why all the strange names? Description logics are a family of
knowledge representation (KR) formalisms – mainly distinguished
by available operators. The available operators are indicated by
letters in the name, eg. S, H, O, I, N. OWL had to come up with a
web-friendly syntax.
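The letter-naming scheme can be sketched as a simple lookup. The
meanings below follow the usual description logic naming
convention rather than anything quoted from the lecture, and the
`describe` helper is a hypothetical name used only here.

```python
# Conventional meanings of description logic name letters (not from the lecture).
DL_LETTERS = {
    "S": "ALC plus transitive roles",
    "H": "role hierarchy",
    "O": "nominals",
    "I": "inverse roles",
    "N": "unqualified number restrictions",
}


def describe(name):
    """Spell out the operators indicated by each letter of a DL name."""
    return [DL_LETTERS.get(letter, "unknown letter") for letter in name]


for letter, meaning in zip("SHOIN", describe("SHOIN")):
    print(f"{letter}: {meaning}")
```

Read this way, a name such as SHOIN is less a word than a
checklist of the operators the logic supports.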
Ontologies for the working web
In turn Horrocks explained why they chose ontology reasoning.
Given the key role of ontologies in many applications, it’s
essential to provide tools and services that help users
design and maintain high quality ontologies that
are meaningful – ie. all named classes can have instances.
What were the research challenges? Firstly increasing expressive
power, which involved complex role inclusion axioms; concrete
domains / data types; database style keys; rule language
extensions.
The second challenge was improving scalability, requiring
optimisation techniques, reduction to disjunctive datalog, and
hybrid DL-DB systems. Tools and infrastructure were another
challenge, as were design methodologies (based on foundational
ontologies).
Q&A with the audience:
The question of the commercial barriers to asking high-quality
questions on the WWW was raised, and the delegate wondered if
the commercial cost of annotating the source info would be met.
In turn, he continued, some people and organisations won’t want
the source information so annotated to then be given away for
free, suggesting that it won’t find its way onto the open
web.
Horrocks responded that he hoped the process will bootstrap
itself, and interest in ontology annotation will grow and become
a groundswell.
Elsewhere in the audience David Holsworth noted that Google’s
translation service from French into English had recently
improved, and said this indicated improving semantic
algorithms.
[NB:Notes on the lecture can also be downloaded here]
Further thoughts on this lecture can be found on the
Beers & Innovation blog
About Professor Ian Horrocks:
Ian is Professor in the School of Computer Science at the
University of Manchester. His primary research interest is
Knowledge Representation; in particular ontologies and ontology
languages, tableaux algorithms for Description Logics (DLs),
optimisation techniques for such algorithms, and the application
of all of the above to e-Science and the Semantic Web. He was a
member of the W3C Web Ontology Working Group that developed the
OWL language (now a W3C recommendation), and is an author
of the Semantic Web Rule Language (SWRL) proposal. For more
information visit his web page.
Further reading:
The British Computer Society
The Semantic Web Community portal
The Semantic Web: An Introduction – article by Sean B Palmer
What Is the Semantic Web? – article from Altova
Tim Berners-Lee started blogging in January 2006 on the DIG research journal group blog