The billions of pages that make up the WWW are multiplying faster and faster. Over the past few months, this phenomenon has accelerated as the democratisation of social media (e.g. blogs) allows anyone to produce and broadcast content.
This vast amount of information makes it more and more difficult to identify specific content, because today’s search engines take a general, exhaustive approach. Content identification on Google or MSN is based on keyword matching.
Moreover, the top positions in their organic result lists are now occupied by commercial content. Slowly but surely, Search Engine Marketing (SEM) and Search Engine Optimisation (SEO) strategies have done their job, making it harder for users to find information that nobody has promoted.


1. At first, there was the Semantic Web

Remedies exist. Tim Berners-Lee imagined how "sense" could be added to the search engine index: he recommended using semantics. This was in 1999, another century in WWW terms. He wanted to add a semantic layer on top of the search engine to make it easier to create links between documents using concept tags (based on meaning), rather than relying only on hyperlinks, as Google does (the famous PageRank).

“We are going from a Web of connected documents to a Web of connected data.” Nova Spivack, RADAR NETWORKS

This new approach also adds another objective: reinforcing the contextual dimension of information searching.
Documents are classified and placed in contexts. Those contexts are themselves linked to other documents. This approach became known as “The Semantic Web”. Behind this idea, Berners-Lee expected that publishers and content producers would manually create tags, allowing internet users to surf a structured semantic universe. Later, social media and the Web 2.0 revolution did the job: bloggers create tags for the content they publish, and Digg users manually describe, in their own words, each website they find interesting enough to share with others.
But this approach raises a number of issues:
1/ To work perfectly, every publisher needs to share and use the same universal, structured vision of the world, so that the same words are always used for the same concept.
2/ Each publisher must be able and willing (that is, take the time) to describe its content in each of its dimensions.
3/ The temptation to use tags for SEO and SEM purposes will remain.

For all of these reasons, the semantic web remains a concept, and its industrialisation a utopia. Eventually, start-ups reinvented this approach under the Web 3.0 banner.
The principle is based on automatic information processing, using semantic search engines to extract concepts and link them together. Because the approach is automatic, it can be applied to a large volume of content: the task no longer rests solely on publishers or readers, as it did before.
Here are some examples:

- Extract the date from a document to place it on a time axis
- Extract company names
- Extract executive names and details
- Extract places...
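
As an illustration of this kind of automatic extraction, here is a minimal Python sketch. It is only a sketch under loud assumptions: the two regular expressions, the list of legal suffixes and the sample sentence (with its invented company names) are hypothetical, and a real semantic engine would rely on much richer linguistic processing than pattern matching.

```python
import re
from datetime import date

# Hypothetical patterns, for illustration only: ISO dates (YYYY-MM-DD) and
# capitalised words followed by a legal suffix such as "Inc" or "Corp".
DATE_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
COMPANY_RE = re.compile(r"\b((?:[A-Z][A-Za-z0-9&]*\s)+(?:Inc|Corp|Ltd|SA|GmbH))\b")


def extract_dates(text: str) -> list:
    """Return the valid calendar dates found in the text, oldest first."""
    found = []
    for year, month, day in DATE_RE.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            continue  # looks like a date but is not one (e.g. month 13)
    return sorted(found)


def extract_concepts(text: str) -> dict:
    """Return the dates and company names detected in a piece of text."""
    return {
        "dates": extract_dates(text),
        "companies": COMPANY_RE.findall(text),
    }


# Invented sample sentence; the companies do not exist.
sample = (
    "On 2007-03-12, Acme Widgets Inc. announced a deal "
    "with Globex Corp, effective 2007-01-01."
)
concepts = extract_concepts(sample)
print(concepts["dates"])      # dates that can be placed on a time axis
print(concepts["companies"])  # candidate company names
```

Once each document carries such extracted concepts, documents can be linked by shared dates, companies or places rather than by hyperlinks alone.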


But to be truly effective, this approach has to be applied to a bounded domain, a closed world in which the set of concepts is known in advance. Web 3.0 will therefore need to be vertical.


2. The birth of vertical search engines

A new generation of vertical search engines has emerged in recent months. Each of them follows the Web 3.0 trend. They address homogeneous user needs (airline ticket search, industry report search...) and create value-added services from public, and therefore free, information gathered from the open web.

1/ Vertical index
The specialised approach of these engines allows them to build a specialised index and to discard peripheral content that is not directly related to their topic. In doing so, they eliminate the noise experienced in general search engines.


EXAMPLE : Their specialisation allows them to index documents from the deep web, documents that do not appear in general search engine results.


EXAMPLE : In the travel industry, recent search engines such as Sprice.com or Farechase.com automatically and simultaneously query a number of airline websites to offer users the best prices, sparing them from searching these websites one by one.
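
The idea of querying several sources at once and merging the answers can be sketched in Python as follows. This is not how Sprice.com or Farechase.com are implemented, which the article does not describe; the three `query_airline_*` functions are hypothetical stand-ins that return canned fares instead of calling real airline websites.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import NamedTuple


class Fare(NamedTuple):
    airline: str
    price: float


# Hypothetical connectors: a real engine would query each airline's
# website here. They return canned data so the sketch is runnable.
def query_airline_a(origin: str, destination: str) -> list:
    return [Fare("Airline A", 420.0), Fare("Airline A", 389.0)]


def query_airline_b(origin: str, destination: str) -> list:
    return [Fare("Airline B", 405.0)]


def query_airline_down(origin: str, destination: str) -> list:
    raise TimeoutError("source did not answer")


def search_fares(origin: str, destination: str, sources: list) -> list:
    """Query every source in parallel and return all fares, cheapest first."""

    def safe_call(source):
        try:
            return source(origin, destination)
        except Exception:
            return []  # one failing source must not break the whole search

    # One worker per source, so all sources are queried simultaneously.
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        batches = list(pool.map(safe_call, sources))

    fares = [fare for batch in batches for fare in batch]
    return sorted(fares, key=lambda fare: fare.price)


results = search_fares(
    "PAR", "NYC", [query_airline_a, query_airline_b, query_airline_down]
)
for fare in results:
    print(fare.airline, fare.price)
```

Isolating each source behind `safe_call` is a design choice: the user still gets the fares from the sources that answered, even when one website is unavailable.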

2/ Vertical search features
These new search engines also offer users search and filtering features dedicated to the specific needs they address.
Travel Industry : Sprice.com allows users to compare airline fares by price, duration, departure and arrival time, type of flight...
Executive search : Zoominfo allows users to search among company executives by industry, geographic area, company revenue...
Consumer Electronics : Retrevo.com helps users find, for each product (MP3 players...), the relevant web resources by type (documentation, consumer reviews...)
Market Research : Reportlinker.com allows users to find open-access market research, dynamically organises it by industry, geographic area, date..., and lets users preview a document before downloading it.
Specific information processing is needed to build these vertical features:
- Indexing the visible and invisible web
- Analysing each document and extracting its concepts along each of the dimensions that users can search through the tool's key features
- Standardising information from heterogeneous sources
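
The last two processing steps, standardising heterogeneous records and then filtering them along searchable dimensions, can be sketched in Python. Everything specific here is an assumption made for illustration: the field names, the alias table and the sample records are invented and do not describe the data model of any engine named above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Report:
    title: str
    industry: str
    region: str
    published: date


# Hypothetical synonym table mapping source-specific labels to one vocabulary.
INDUSTRY_ALIASES = {
    "telco": "telecommunications",
    "telecoms": "telecommunications",
}


def normalise(raw: dict) -> Report:
    """Map one raw record (hypothetical field names) onto the common schema."""
    industry = raw["industry"].strip().lower()
    return Report(
        title=raw["title"].strip(),
        industry=INDUSTRY_ALIASES.get(industry, industry),
        region=raw["region"].strip().lower(),
        published=date.fromisoformat(raw["published"]),
    )


def filter_reports(reports: list, **facets) -> list:
    """Keep the reports whose fields equal every requested facet value."""
    return [
        report
        for report in reports
        if all(getattr(report, name) == value for name, value in facets.items())
    ]


# Invented records, as three different sources might publish them.
raw_records = [
    {"title": "Mobile outlook", "industry": "Telco",
     "region": "Europe", "published": "2007-02-01"},
    {"title": "Broadband trends ", "industry": "telecoms",
     "region": "europe", "published": "2006-11-15"},
    {"title": "Wine market", "industry": "Beverages",
     "region": "Europe", "published": "2007-01-10"},
]

index = [normalise(record) for record in raw_records]
hits = filter_reports(index, industry="telecommunications", region="europe")
print([report.title for report in hits])
```

Filtering only becomes reliable after standardisation: without the alias table, "Telco" and "telecoms" would be treated as two different industries.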

3/ Vertical context
Information contextualisation is the third axis of these value-added vertical search services. To generate new information, the idea is to extract each piece of information and cross-reference it with other pieces classified under the same concept.
EXAMPLE : Using press releases, company websites and other free publications, Zoominfo automatically generates an original company profile.
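
Cross-referencing facts that fall under the same concept can be illustrated with a short Python sketch. It does not reproduce Zoominfo's actual method, which the article does not detail; the facts, the companies and the naive name-matching rule below are all invented assumptions.

```python
from collections import defaultdict

# Hypothetical facts, as an extraction step might emit them:
# (company name as written, attribute, value, type of source)
facts = [
    ("Acme Widgets Inc.", "ceo", "Jane Doe", "press release"),
    ("ACME Widgets", "headquarters", "Lyon", "company website"),
    ("Acme Widgets Inc", "revenue", "12M EUR", "annual report"),
    ("Globex Corp", "ceo", "John Roe", "press release"),
]

# Assumed list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "corp", "ltd"}


def concept_key(name: str) -> str:
    """Reduce spelling variants of a company name to one key (naive rule)."""
    words = name.lower().replace(".", "").split()
    return " ".join(word for word in words if word not in LEGAL_SUFFIXES)


def build_profiles(facts: list) -> dict:
    """Group scattered facts into one profile per company."""
    profiles = defaultdict(dict)
    for name, attribute, value, source in facts:
        # Keep the source so every field of the profile stays traceable.
        profiles[concept_key(name)][attribute] = {"value": value, "source": source}
    return dict(profiles)


profiles = build_profiles(facts)
for attribute, detail in profiles["acme widgets"].items():
    print(attribute, detail["value"], "from", detail["source"])
```

Each source contributes a single field, but the merged profile contains information that no individual document provided.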
The amount of information published on the internet keeps growing, but it is also more and more fragmented. Each piece of information taken alone has limited value compared with the value of all of them linked together. The understanding that each Web 3.0 search engine has of its own domain therefore allows it to make sharp use of semantic technologies and to offer users original, innovative value-added services.