The Cleverest Thing That Never Existed

By: NMK Created on: August 11th, 2008

Semantic search is poorly understood, and this is leading to claims for its powers that lie beyond the bounds of what computers are able to do, says Charlie Hull, MD of Lemur Consulting.

As detailed in NMK's article 'Microsoft buys into Semantic Web', companies such as Microsoft are investing massive amounts of money in semantic technology. The hope is that they can give the web an 'understanding' of its content, allowing New Media to be searched more effectively.

Fine; but there are some very fundamental flaws in the thinking here which affect how applicable semantic search really is for the new media sector.

Traditionally, when people talk about 'semantics' they mean the interrelation of 'knowledge', 'meaning', 'signifiers' and 'understanding'.

OK, that sounds a little dry. But the point is that no-one has really managed to agree on what these terms actually mean. Doesn't it therefore seem a bit precipitate to apply such terms to a search engine?

The idea behind the semantic web is that websites will 'understand' their content and attribute 'meaning' to it. Semantic search engines (whether web or intra-site) will then use this 'meaning' information, rather than just keywords, to search more effectively. The same principle of attaching ‘meaning’ to documents could, in theory, be applied to Enterprise Search systems as well, for intra-site or intranet search.
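The difference between keyword matching and matching on attached 'meaning' can be illustrated with a minimal sketch. The two-document corpus, the `topics` field and both function names are assumptions invented for illustration, not part of any real semantic web standard or product.

```python
# Hypothetical toy corpus: each document carries free text plus
# hand-assigned 'meaning' metadata (the topic tags are assumptions).
DOCS = [
    {"id": 1, "text": "Jaguar unveils a new saloon car", "topics": {"car"}},
    {"id": 2, "text": "The jaguar is the largest cat in the Americas", "topics": {"animal"}},
]


def keyword_search(query, docs):
    """Return ids of documents whose text contains every query word."""
    words = query.lower().split()
    return [
        d["id"]
        for d in docs
        if all(w in d["text"].lower().split() for w in words)
    ]


def metadata_search(query, topic, docs):
    """Keyword match first, then keep only documents tagged with the topic."""
    hits = set(keyword_search(query, docs))
    return [d["id"] for d in docs if d["id"] in hits and topic in d["topics"]]


print(keyword_search("jaguar", DOCS))             # matches both documents
print(metadata_search("jaguar", "animal", DOCS))  # matches only the animal
```

The metadata search is only as good as the tags: somebody has to supply them, and nothing in the code checks that they are accurate.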

Often 'semantic' search engines will also use natural language processing to handle normally phrased requests in a way that appears intelligent. In comparison to 'semantic search', natural language processing is now reasonably well-advanced.

Quite a number of attempts have been made to produce computers that 'know' or 'understand' things, and they've all had serious flaws. From a practical point of view a computer might be said to be 'intelligent' if it gives useful answers. But this is very different from what has come to define 'semantics' today: genuine intelligence in the human sense. We haven't quite got a 'Commander Data' to read and understand our New Media content yet.

But does it really matter if all these efforts do not genuinely cause computers to 'understand'? If really pressed, Microsoft and other companies trying to apply 'semantic' principles to any form of search might confess that what they have is computers which appear to understand relationships between data from an outside world they will never experience. And they do this through intelligent pattern recognition.

Hang on, though: isn't this what the better search businesses have been doing for ages? And this is really the point. Many of the better search systems, whether web or enterprise, can already be modified to search specific types of documents quickly and cost-effectively. And they do this, not by claiming any form of universal ‘understanding’, but by being tailored to handle specific types of data.

A 'semantic web' is also impractical by virtue of its 'web 3.0', user-generated philosophy. Metadata will attribute 'meaning' to content, and this metadata will be generated by users. But there is no immediate advantage to users that would motivate them to do this. And if Wikipedia gets spammed now, imagine how many people will have fun spamming the semantic web, convincing it that 'up' means 'down' and 'left' is 'right'. There is an interesting article on this topic at http://www.shirky.com/writings/semantic_syllogism.html.

One further problem is that semantic computing imposes rigid structures on definitions and meanings. This does not reflect the way language is used in reality. Two people can read or type the same words and understand or mean very different things. This is slowly being addressed, but the fact remains that the early optimism of the artificial intelligence boom of the 1960s and 1970s has faded as the true difficulties have become apparent.

Often, 'semantic search' is simply used as a form of marketing spin. In most cases, vendors use forms of known technologies such as probabilistic search, natural language processing, latent semantic indexing and a whole bunch of heuristics that have little theory behind them. Usually, these products also come at a higher price, with more rigid licensing and worse performance than other solutions available to New Media practitioners.
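To make 'probabilistic search' concrete, here is a minimal sketch of one well-known probabilistic ranking function, Okapi BM25. It is chosen here as a representative example; the article does not say which formula any particular vendor uses. The function name, the whitespace tokenisation and the default parameter values `k1` and `b` are assumptions for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the Okapi BM25 formula.

    Tokenisation is a plain lowercase whitespace split (an assumption);
    the corpus must contain at least one document.
    """
    tokenised = [d.lower().split() for d in docs]
    n_docs = len(tokenised)
    avg_len = sum(len(t) for t in tokenised) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))

    scores = []
    for tokens in tokenised:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalised by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "semantic search engine",
    "video advertising market",
    "search engine marketing spin",
]
print(bm25_scores("semantic search", docs))
```

Nothing here 'understands' the documents: the ranking comes purely from counting words and weighting rare terms more heavily.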

Comments

fran said:

I was very sceptical about the semantic web too, but there are a number of academic initiatives that are supporting a lot of work in specialised domains (e.g. http://alimanfoo.wordpress.com/). It may never come to be of much help in the New Media sector generally, but there seems to be a lot of support within highly specialised, particularly scientific, communities, who are willing to invest time and thought into their metadata, who tend to use a limited and well-defined vocabulary, and who are very keen to make their knowledge resources interoperable via the semantic web as a kind of indexing language. Perhaps once the well-funded scientists have sorted it all out, there will be cheaper applications that will have more widespread benefits for the rest of us!
