Semantic Web: How would it happen?

Yesterday at DevCamp, one interesting session was about the Semantic Web and the skepticism around it. I listened to different perspectives, and I realized much of the discussion revolved around:

  • How do we make the web more semantic?
  • Does annotating all data (existing or future) using microformats, RDF, etc. make sense?
  • Do we really need to annotate everything ourselves?
  • What about non-technical users who use and contribute to the web?
  • What about smart algorithms/tools that can make meaning out of the existing web without us doing much?

I have been advocating and using microformats for some time, which is genuinely useful, but the questions above are worth asking too. Yesterday’s discussion made me think more about it, and I started looking into different perspectives and realities. I found an informative article series on ReadWriteWeb.

Let’s help make the web more structured, as the ReadWriteWeb article above suggests, by tagging, annotating, etc. as much as we can. We can also spread the word. Most importantly, we developers can write smart tools that capture and expose data in a structured way. Every bit helps, and that’s what I said in yesterday’s discussion: the idea is to make things better. Sooner or later we would have a perfect web. Well, nothing is perfect, but you get what I mean :-)
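Here is a minimal sketch of what I mean, my own illustration rather than any shipping tool: a page fragment annotated with the hCard microformat, and a few lines of Python (using the BeautifulSoup library, which you would need installed) that read it back as structured data. The person and URL are made-up sample data.

```python
from bs4 import BeautifulSoup

# A fragment annotated with the hCard microformat: vcard/fn/role/url
# are the microformat's class vocabulary; the content is sample data.
PAGE = """
<div class="vcard">
  <span class="fn">Michael Bay</span>
  <span class="role">Director</span>
  <a class="url" href="https://example.com/michael-bay">profile</a>
</div>
"""

def extract_hcards(doc):
    """Pull name/role/url out of every hCard block in the document."""
    soup = BeautifulSoup(doc, "html.parser")
    people = []
    for card in soup.select("div.vcard"):
        people.append({
            "fn": card.select_one(".fn").get_text(strip=True),
            "role": card.select_one(".role").get_text(strip=True),
            "url": card.select_one("a.url")["href"],
        })
    return people

print(extract_hcards(PAGE))
# [{'fn': 'Michael Bay', 'role': 'Director', 'url': 'https://example.com/michael-bay'}]
```

A crawler or browser extension doing this at web scale is exactly the kind of “smart tool” I have in mind.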
BTW! If you want to get a feel for how tools and user experience would improve with the Semantic Web, check out the Adaptive Blue browser extensions (BlueOrganizer and SmartLinks). These tools recognize data/objects/information in a web page and provide very contextualized choices (action items). They are doing this on the existing, mostly unstructured web; imagine how much better things would be on a structured web, i.e. the Semantic Web.
You can explore it yourself, but let me give a simple example. Suppose you are reading a review of the movie Transformers on IMDb. The Adaptive Blue tool would deduce the director, movie name, cast, etc., and provide relevant choices/options like:

  • add to Twitter, bookmark, del.icio.us, Facebook, etc.
  • comparisons from various sources
  • Transformers pages on Wikipedia, Amazon, Netflix, etc.
  • rent from Netflix, etc.
  • movies by Michael Bay (the director of the movie) on Amazon, his images on Flickr, blogs on Google, etc.
  • list of stars and related links

There are so many contextualized choices that I don’t want to list them all. However, you can see the screenshot and check out the extensions.
I am wondering: could this raise privacy concerns, if they are collecting page info on their servers?
Update (Feb 11, 2008): Reuters has announced the OpenCalais service API, which analyzes data (text, HTML, XML) and processes it into semantic content by returning RDF/XML. It also stores the processed data on its servers for future searches, though that can be disabled via API params. You can check out this sample app.
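For the curious, here is roughly what a call looks like, as a minimal Python sketch. The endpoint URL, the parameter names, and the allowSearch/allowDistribution storage flags reflect my reading of the OpenCalais docs, so treat them as assumptions and check the current documentation before relying on them.

```python
import urllib.parse
import urllib.request

CALAIS_URL = "http://api.opencalais.com/enlighten/rest/"  # assumed endpoint
LICENSE_ID = "your-license-id"  # placeholder; register with OpenCalais

# As I understand it, the userDirectives flags below are what disable
# OpenCalais storing/indexing your processed data on its servers.
PARAMS_XML = """<c:params xmlns:c="http://s.opencalais.com/1/pred/">
  <c:processingDirectives c:contentType="text/txt" c:outputFormat="xml/rdf"/>
  <c:userDirectives c:allowDistribution="false" c:allowSearch="false"/>
</c:params>"""

def enlighten(text):
    """POST raw text to OpenCalais and return the RDF/XML response."""
    data = urllib.parse.urlencode({
        "licenseID": LICENSE_ID,
        "content": text,
        "paramsXML": PARAMS_XML,
    }).encode("utf-8")
    with urllib.request.urlopen(CALAIS_URL, data) as resp:
        return resp.read().decode("utf-8")

print(enlighten("Michael Bay directed Transformers."))
```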

  • barry.b

    these are good questions.
    I suppose the thing to keep in mind is who the end users are, both systems (eg: Google) and people (eg: your grandmother).
    I won’t call myself a Google power user, but I’ve already hit some limits when searching for information, most of which are caused by not being able to inject enough context into the search.
    one thing is for sure, we’ll only get out what we put in…

  • @Barry: Yeah, both would be end users. Google is trying a top-down approach, whereas W3C/microformats etc. are trying a bottom-up approach…
    The Reuters OpenCalais API is trying a middle ground. It’s pretty useful; hopefully more and more people/systems will use it, which means that after a couple of years there would be enough data indexed there… One can actually write a semantic processor for one’s site using the OpenCalais API; that’s how semantics would be added to all old/new data automatically…
    I can also see how OpenCalais could be used with some adapters/transformers written to output microformatted content (a toy sketch after this reply)…
    Thanks for your comment…
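
    PS: a toy sketch of the adapter idea. The entity list shape and the class names here are placeholders of my own, not real OpenCalais output, and a real hCard/hReview would need proper wrapper markup around the spans.

    ```python
    import html

    def to_microformat(text, entities):
        """Wrap each known entity mention in a classed <span> (toy version)."""
        out = html.escape(text)
        for name, css_class in entities:
            marked = html.escape(name)
            out = out.replace(marked, '<span class="%s">%s</span>' % (css_class, marked))
        return out

    # Placeholder entities; a real adapter would read them from Calais RDF.
    entities = [("Michael Bay", "fn"), ("Transformers", "summary")]
    print(to_microformat("Michael Bay directed Transformers.", entities))
    ```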