Reportlinker's blog : Web 3.0, Vertical Search, Information Industry and Market Research News


Tag - search engines


Friday 2 November 2007

How do new Web 3.0 search services help information professionals mine the web for valuable open access information? 1/2

We are publishing for our readers chapters extracted from a conference at the 2007 Online Information Show. The last chapter is on its way: How do new Web 3.0 search services help information professionals mine the web for valuable open access information?

The billions of pages that make up the WWW are multiplying faster and faster. Over the past few months, this growth has accelerated as the democratisation of social media (e.g. blogs) allows anyone to produce and broadcast content.

This great amount of information makes identifying specific content more and more difficult, because today's search engines take a general and exhaustive approach. Content identification on Google or MSN is based on keyword matching.

Moreover, the top ranks of their organic results lists are filled with commercial content. Slowly but surely, Search Engine Marketing (SEM) and Search Engine Optimisation (SEO) strategies have done their job, making it harder for users to find information that nobody has promoted.

In five years, the average length of business information searches on the Internet has grown by 40%, according to Outsell Inc. This increase, which comes at the expense of analysis time, has an estimated cost to companies of €300 M worldwide.

The emergence of tools for identifying and disseminating business information is a natural consequence of today's situation, and such tools are necessary for every company, whatever its size (multinational or SME).

Specialised search engines, such as vertical ones, were the first to simplify their index. They chose to cover a restricted scope of information, but with much better search features than any generalist search engine can offer.

Today we will present their main features and identify the technologies behind them. Next week we will present two of these new search engines in the business information market (Zoominfo and Reportlinker).

A/ Why are vertical search engines emerging ?

While the amount of information published on the Internet keeps growing, it is also more and more fragmented. Each piece of information taken alone has limited value compared with the value of the pieces linked together.

Because a vertical search engine understands its environment, it can make precise use of semantic technologies to offer users genuine and innovative value-added services.

Information is processed specifically to fit the features of the vertical:

  • Indexing the visible and invisible web
  • Analyzing and extracting each concept specific to the vertical (company names, market segments, executive names…)
  • Making information from heterogeneous sources uniform

Vertical search engines aim to address homogeneous user needs (business executive search, industry report search...) and create value-added services from their knowledge of a restricted range of user expectations.

That is the best formula for reducing the search time of Internet users. In doing so, they provide user-friendly applications that let users mine the web more efficiently for value-added information.

B/ Vertical search engine features and technologies: one vertical axis, three features

Vertical search engines offer three main features: a vertical index, vertical search features and vertical contextualisation tools.

[Figure: vertical search engine]

C/ Semantics is back at the heart of the Internet

Three technologies characterize Web 3.0 search engines: semantic search, thesauri and concept extraction.

Semantic Search Engine :

“A semantic search engine is a search engine that takes the sense of a word as a factor in its ranking algorithm or offers the user a choice as to the sense of a word or phrase.”

By taking the meaning of a text corpus into account, semantic analysis enables the pre-processing and filtering of search results:

  • Clustering search results into thematic categories (categorization)
  • Automatically adding tags to document descriptions
  • Displaying additional stories linked to the document, even when the same keywords are not present

Semantic technologies play a very important role in vertical search engines, as they make it possible to organise information precisely along a finite number of dimensions.

For instance, Farechase.com, the travel search engine, organises its results along the following dimensions:

  • Prices
  • Airlines
  • Departure Time
  • Flight duration
  • Direct flight or not
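To illustrate what organising results along a finite number of dimensions can look like, here is a minimal sketch in Python. The `Flight` class, the `filter_flights` function and the sample data are assumptions made up for this example; nothing here describes how Farechase.com actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flight:
    """One search result described along the five dimensions listed above."""
    price_eur: float
    airline: str
    departure_hour: int      # local departure time, hour of day (0-23)
    duration_minutes: int
    direct: bool


def filter_flights(flights, max_price=None, airline=None, direct_only=False):
    """Keep only the results matching the selected dimension values, cheapest first."""
    kept = []
    for flight in flights:
        if max_price is not None and flight.price_eur > max_price:
            continue
        if airline is not None and flight.airline != airline:
            continue
        if direct_only and not flight.direct:
            continue
        kept.append(flight)
    return sorted(kept, key=lambda flight: flight.price_eur)


# Made-up sample data, for illustration only.
SAMPLE_FLIGHTS = [
    Flight(120.0, "Airline A", 7, 95, True),
    Flight(85.0, "Airline B", 13, 210, False),
    Flight(150.0, "Airline A", 18, 100, True),
]

cheap_direct = filter_flights(SAMPLE_FLIGHTS, max_price=130, direct_only=True)
print(cheap_direct)
```

Because every result carries the same small set of fields, each dimension can become a filter or a sort key, which is hard to do when documents have no shared structure.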

Fine-grained management of document context is almost impossible in a general search engine, as it would be necessary to create as many indexes as there are specific points of view from which users would like to analyze data (example: WebFountain technology from IBM).

Thesaurus

Semantic analyses are based on a thesaurus, a structured organisation of keywords. Building and managing a thesaurus makes it possible to define the semantic dimension of a document. This structured organisation is hierarchical, but also transversal: links between concepts are established in the thesaurus. That is how the general sense of a document is understood by artificial intelligence.
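As a rough illustration of a thesaurus with both hierarchical and transversal links, here is a minimal Python sketch. The `THESAURUS` entries and the `expand` function are hypothetical and chosen only for this example; a real thesaurus would be far larger and maintained by specialists.

```python
# Hypothetical mini-thesaurus: "broader" and "narrower" are the hierarchical
# links, "related" holds the transversal links between concepts.
THESAURUS = {
    "automotive": {
        "broader": [],
        "narrower": ["passenger car", "light commercial vehicle"],
        "related": ["car sales"],
    },
    "passenger car": {
        "broader": ["automotive"],
        "narrower": [],
        "related": ["new car market"],
    },
    "light commercial vehicle": {
        "broader": ["automotive"],
        "narrower": [],
        "related": [],
    },
    "car sales": {"broader": [], "narrower": [], "related": ["automotive"]},
    "new car market": {"broader": [], "narrower": [], "related": ["passenger car"]},
}


def expand(term, thesaurus=THESAURUS):
    """Return the term plus every concept directly linked to it, sorted."""
    entry = thesaurus.get(term)
    if entry is None:
        return [term]  # unknown terms are returned unchanged
    linked = set(entry["broader"]) | set(entry["narrower"]) | set(entry["related"])
    linked.add(term)
    return sorted(linked)


print(expand("passenger car"))
```

Expanding a keyword this way is one simple means of relating a document to concepts it never mentions word for word.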

Concept extraction

Semantic technologies are able to automatically recognize and extract concepts, based on different elements of a sentence: syntax, grammar, meaning, context... Thus, they are able to recognize specific entities, such as:

  • date
  • place
  • people
  • company
  • ...
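A very simplified sketch of entity recognition, in Python, is given below. It relies only on a date pattern and on small hand-made name lists, whereas the technologies described above also use syntax, grammar and context. The `COMPANIES` and `PLACES` lists, the `extract_entities` function and the sample sentence are all assumptions made up for this example.

```python
import re

# Hypothetical name lists; a real system would use much richer resources.
COMPANIES = {"Volkswagen", "Deutsche Bank"}
PLACES = {"Detroit", "Belgium", "Western Europe"}

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Matches dates written like "8 January 2007".
DATE_PATTERN = re.compile(r"\b\d{1,2} (?:%s) \d{4}\b" % MONTHS)


def extract_entities(text):
    """Return the dates, companies and places found in the text."""
    return {
        "date": DATE_PATTERN.findall(text),
        "company": sorted(name for name in COMPANIES if name in text),
        "place": sorted(name for name in PLACES if name in text),
    }


# Invented sentence, for illustration only.
sentence = "Volkswagen presented its strategy in Detroit on 8 January 2007."
print(extract_entities(sentence))
```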

Thursday 25 October 2007

Step-by-step methodology: how to find automotive market reports on the Internet without Reportlinker.com

As explained, we are publishing for our readers chapters extracted from a conference at the 2007 Online Information Show in December. We will present a step-by-step methodology for identifying automotive market research on the European market, without using Reportlinker.com.

1/ Identify who can produce information you are looking for

You have two different ways to begin your search :

  • if you know the type of information you are looking for (company, country or industry), just refer to chapter 1C of this paper
  • if you prefer to search directly by country, then I advise you to use the Statbel website (http://statbel.fgov.be/port/cou_en.asp). Statbel is the main official statistical institution in Belgium and offers a detailed directory of websites giving facts and figures about every country in the world.


2/ Let the web inspire your industry keywords list

Where can you find market research vocabulary ? Inside market publications of course !

You can use any news aggregator, such as Google News, run the query “Automotive market”, and read the first articles. From the first article I found, I extracted the following keywords:

  • car sales
  • new passenger cars
  • Volkswagen
  • passenger market
  • Light Commercial Vehicle segment
  • new car market

You can also use any market research aggregator, such as Allbizreport, and look at the tables of contents, which are made up of keywords you can use.

3/ The three most important components of your query to identify Market Research

Three elements make up the semantics of any market research:

  1. the industry keywords
  2. the geographical keywords
  3. the dates

Thus, using the first keyword we identified previously, here is the query you will be launching : “car sales” + “Western Europe” + 2007
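Combining the three components can be automated once you have a list of keywords. The Python sketch below generates one query per combination of industry keyword, geographical keyword and date, in the same form as the query above. The `build_queries` function is an assumption introduced for this example.

```python
from itertools import product


def build_queries(industry_keywords, regions, years):
    """Build one query per (industry keyword, region, year) combination."""
    queries = []
    for keyword, region, year in product(industry_keywords, regions, years):
        # Quoted phrases keep multi-word keywords together.
        queries.append(" + ".join([f'"{keyword}"', f'"{region}"', str(year)]))
    return queries


queries = build_queries(
    ["car sales", "new passenger cars"],  # keywords collected in step 2
    ["Western Europe"],
    [2007],
)
for query in queries:
    print(query)
```

Generating the combinations up front makes it easy to rerun the search with each keyword of the list in turn.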

4/ Type a query and limit it to file format

The best thing to do is to begin by limiting the search to a specific file format. That can help you easily identify who is publishing documents that answer your question.

In our case, we submitted the following query to Google (05/09/07):

“car sales” + Western Europe + 2007 europe filetype:pdf

[Figure: Free market research 1]


5/ Type a query and limit it to one domain

Clearly, the first results come from a website that regularly publishes information on the European automotive market. To be more efficient, you can limit your searches to this website by typing the following query:

“car sales” + Western Europe + 2007 europe filetype:pdf site:www.acea.be

You will get 5 results here. Among them you will find the 2006 European Automotive Industry Report and 2007 first semester statistics by market and manufacturer.
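The file format and domain restrictions of steps 4 and 5 can be sketched in Python as follows. The `restrict` and `search_url` helpers are assumptions introduced for this example, and the URL layout (`https://www.google.com/search?q=...`) is assumed to be the usual Google results address.

```python
from urllib.parse import urlencode


def restrict(query, filetype=None, site=None):
    """Append the filetype: and site: operators to a query when requested."""
    parts = [query]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")  # domain only, without "http://"
    return " ".join(parts)


def search_url(query):
    """Encode the query so that it can be pasted into a browser address bar."""
    return "https://www.google.com/search?" + urlencode({"q": query})


base = '"car sales" + Western Europe + 2007 europe'
step4 = restrict(base, filetype="pdf")
step5 = restrict(base, filetype="pdf", site="www.acea.be")

print(step4)
print(step5)
print(search_url(step5))
```

Keeping the base query separate from the restrictions makes it easy to reuse the same keywords on another domain or file format.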


6/ Relaunch your query with a new keyword

Now that you have information on sales, you can begin to search for information on the players' strategies.

The best source for that is definitely the annual report (See chapter 1C).

To quickly get the best page, just type a query such as

"VOLKSWAGEN AG" "investor relations"

You are redirected to : http://www.volkswagenag.com/vwag/vwcorp/content/en/investor_relations.html

Now you can try limiting your queries to this specific domain, for example:

site:www.volkswagenag.com strategy 2007 filetype:pdf

In that case, the first two results are presentations made during “The Detroit Auto Show” and a Deutsche Bank conference. Each of them gives insight into the group's global strategy, the roadmap for the group's objectives and the outlook by market.

[Figure: Free Market research 2]


Advantages and limits of business information research through general search engines

[Figure: Free Market research 3]

Information professionals can save time at several steps of the methodology we presented in this chapter:

  • By querying a single source of data, and using filters dedicated to the industry they are studying
  • By assessing the content they identify, through sources that certify the type of content they will find

The very latest vertical search engines, also called “Web 3.0”, can help them save time while ensuring the same level of information. We will now present how these search engines query the web to bring users the very best public content available.

Sunday 22 April 2007

Search engines add news to their results

Two news items were published last week, reporting that search engines are mixing "classical" results with news results in their results pages.

The start-up Hakia (founded in 2004, based in New York City) has developed a new meaning-based web search engine that uses a semantic approach to deliver search results. Hakia has been running its search engine for several months, and announced this week a 100% increase in news content volume. Thus, Hakia is the first search engine to mix news with web search results!

Google also announced its intention to do so. At the moment, as the Google News bot crawls sites faster than Googlebot, Google will add news results in a top box. But news and other results will be merged in the near future.

So why are search engines adopting this strategy today? The answer is detailed on the Hakia blog: this strategy is the "only proper way of handling long tail searches". That also gives a clue to bloggers who are wondering whether Google uses semantic technology in its results, because matching web pages with news without over-ranking either type of content can only be done with a semantic search engine...

Saturday 21 April 2007

Search Engine Meeting 2007 begins on Monday

The Search Engine Meetings bring together commercial search engine developers, academics and corporate professionals to learn from each other.

This annual meeting provides a forum and point-of-reference for all those interested in the intricacies of Search and Retrieval. The meeting draws those with a professional interest in search engines (such as search engine designers and developers) and those interested in applying search engines in their own professional environments. Search is at the heart of information retrieval; and the Search Engine Meeting provides an annual point of reference as to what is happening in this fast-moving and exciting field.