Reportlinker's blog : Web 3.0, Vertical Search, Information Industry and Market Research News


Tag - business information


Friday 2 November 2007

How do new Web 3.0 search services help information professionals mine the web for valuable open access information? 1/2

We are publishing for our readers chapters extracted from a conference presentation at the 2007 Online Information Show. The last chapter is on its way: How do new Web 3.0 search services help information professionals mine the web for valuable open access information?

The billions of pages that make up the WWW are growing in number faster and faster. Over the past few months, this phenomenon has accelerated as the democratisation of social media (e.g. blogs) allows anyone to produce and broadcast content.

This great amount of information makes identifying specific content more and more difficult, as today’s search engines take a general and exhaustive approach. Content identification on Google or MSN is based on keyword matching.

Moreover, the top ranks of their organic results lists are filled with merchant content. Slowly but surely, Search Engine Marketing (SEM) and Search Engine Optimisation (SEO) strategies have done their job, making it more difficult for users to find information that nobody has promoted.

Over 5 years, the average length of business information searches on the Internet has grown by 40 %, Outsell Inc says. This evolution, which comes at the expense of analysis time, has an estimated cost for companies of € 300 M worldwide.

The birth of tools allowing the identification and diffusion of business information is a natural consequence of today’s situation, and is necessary for every company, whatever its size (multinational or SME).

Specialised search engines, such as vertical ones, were the first to simplify their index. They chose to cover a restricted scope of information, but with much better search features than any generalist search engine can offer.

Today we will present their main features and identify the technologies behind them. Next week we will present two of these new search engines in the business information market (Zoominfo and Reportlinker).

A/ Why are vertical search engines emerging?

While the amount of information published on the Internet keeps growing, it is also more and more fragmented. Each piece of information taken alone has limited value compared to the value of the pieces linked together.

A vertical search engine’s understanding of its environment allows it to make sharp use of semantic technologies and to offer users genuine, innovative value-added services.

Specific information processing is carried out to fit their vertical:

  • Indexing the visible and invisible web
  • Analysing and extracting each vertical-specific concept (company names, market segments, executives’ names…)
  • Making information from heterogeneous sources uniform
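
To illustrate the last point, here is a minimal sketch of how document descriptions coming from heterogeneous sources could be mapped onto one uniform schema. The field names, the aliases and the `Record` structure are our own assumptions for the example, not the schema of any real search engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Uniform description of a document (hypothetical schema)."""
    title: str
    source: str
    year: Optional[int]


# Hypothetical aliases: each source names the same field differently.
FIELD_ALIASES = {
    "title": ("title", "name", "headline"),
    "source": ("source", "publisher", "author"),
    "date": ("date", "published", "year"),
}


def pick(raw, canonical):
    """Return the first non-empty value found under any alias of a field."""
    lowered = {key.lower(): value for key, value in raw.items()}
    for alias in FIELD_ALIASES[canonical]:
        value = lowered.get(alias)
        if value not in (None, ""):
            return str(value).strip()
    return ""


def normalise(raw):
    """Map one raw document description onto the uniform Record schema."""
    date = pick(raw, "date")
    # Keep only a leading four-digit year; anything else is left unknown.
    year = int(date[:4]) if date[:4].isdigit() else None
    return Record(title=pick(raw, "title"), source=pick(raw, "source"), year=year)
```

Once every document is described with the same fields, it can be indexed and filtered in the same way whatever its origin.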

Vertical search engines aim to address homogeneous user needs (business executive search, industry report search…) and create value-added services from their knowledge of their users’ restricted scope of expectations.

That’s the best formula to reduce Internet users’ search time. In doing so, they provide user-friendly applications that let users mine the web more efficiently for value-added information.

B/ Vertical search engine features and technologies: one vertical axis, three features

Vertical search engines offer 3 main features: a vertical index, vertical search features and vertical contextualisation tools.

[Figure: vertical search engine]

C/ Semantics is back at the heart of the Internet

Three technologies characterise Web 3.0 search engines: semantic search, thesauri and concept extraction.

Semantic search engine

“A semantic search engine is a search engine that takes the sense of a word as a factor in its ranking algorithm or offers the user a choice as to the sense of a word or phrase.”

By taking the meaning of a text corpus into account, semantic analysis enables the pre-processing and filtering of search results:

  • Clustering search results into thematic categories (categorisation)
  • Automatically adding tags to document descriptions
  • Displaying additional stories linked to the document, even when the same keywords are not present
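
The last point can be sketched as follows: two documents are considered related when they share a concept, even if they share no keyword. The small word-to-concept dictionary is a hypothetical stand-in for a real semantic resource, and the matching is deliberately naive.

```python
# Hypothetical word-to-concept mapping; a real engine would rely on a much
# richer semantic resource than this hand-written dictionary.
CONCEPTS = {
    "car": "automotive",
    "automobile": "automotive",
    "vehicle": "automotive",
    "wine": "beverages",
    "beer": "beverages",
}


def concepts_of(text):
    """Return the set of concepts mentioned in a text."""
    found = set()
    for word in text.lower().split():
        word = word.strip(".,;:!?")
        if word in CONCEPTS:
            found.add(CONCEPTS[word])
    return found


def related(document, candidates):
    """Return the candidates that share at least one concept with the document."""
    target = concepts_of(document)
    return [other for other in candidates if target & concepts_of(other)]
```

For example, "Car sales rise" and "Automobile exports fall" have no keyword in common, yet both map to the same "automotive" concept and would be displayed together.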

Semantic technologies play a very important role in vertical search engines, as they make it possible to organise information precisely along a finite number of dimensions.

For instance, Farechase.com, the travel search engine, organises its results along the following dimensions:

  • Prices
  • Airlines
  • Departure Time
  • Flight duration
  • Direct flight or not
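
Organising results along a finite number of dimensions can be sketched as a simple faceted filter. This is only an illustration under our own assumptions: the `Flight` structure and the function names are hypothetical and do not describe how Farechase.com is actually built.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    """One search result, described along the dimensions listed above."""
    airline: str
    price: float
    departure: str      # departure time as "HH:MM"
    duration_min: int   # flight duration in minutes
    direct: bool


def filter_flights(flights, airline=None, max_price=None, direct_only=False):
    """Keep the flights matching the chosen dimensions, cheapest first."""
    kept = []
    for flight in flights:
        if airline is not None and flight.airline != airline:
            continue
        if max_price is not None and flight.price > max_price:
            continue
        if direct_only and not flight.direct:
            continue
        kept.append(flight)
    return sorted(kept, key=lambda flight: flight.price)


def facet_counts(flights, dimension):
    """Count how many results fall under each value of one dimension."""
    counts = {}
    for flight in flights:
        value = getattr(flight, dimension)
        counts[value] = counts.get(value, 0) + 1
    return counts
```

Because every result carries the same few dimensions, the engine can both filter on them and show the user how many results sit behind each value.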

Such fine-grained management of document context is almost impossible in a general search engine, as it would be necessary to create as many indexes as there are specific points of view from which users would like to analyse the data (example: Webfountain technology from IBM).

Thesaurus

Semantic analysis is based on a thesaurus, a structured organisation of keywords. Building and managing the thesaurus makes it possible to define the semantic dimension of a document. This structured organisation is hierarchical, but also transversal: links between concepts are established in the thesaurus. That is how artificial intelligence understands the general sense of a document.
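
A thesaurus that is both hierarchical and transversal can be sketched with two kinds of links: a "broader term" link for the hierarchy and a "related term" link across branches. The class and the terms below are hypothetical and only meant to show the structure.

```python
class Thesaurus:
    """Minimal thesaurus with hierarchical and transversal links (a sketch)."""

    def __init__(self):
        self.broader = {}   # term -> its broader (parent) term, or None
        self.related = {}   # term -> set of transversally related terms

    def add_term(self, term, broader=None):
        """Register a term, optionally under a broader term."""
        self.broader[term] = broader
        self.related.setdefault(term, set())

    def relate(self, first, second):
        """Create a transversal (non-hierarchical) link between two terms."""
        self.related.setdefault(first, set()).add(second)
        self.related.setdefault(second, set()).add(first)

    def ancestors(self, term):
        """Walk up the hierarchy from a term to its most general concept."""
        chain = []
        parent = self.broader.get(term)
        # The second condition guards against accidental cycles.
        while parent is not None and parent not in chain:
            chain.append(parent)
            parent = self.broader.get(parent)
        return chain

    def expand(self, term):
        """Return the term with its broader and related terms."""
        return {term, *self.ancestors(term), *self.related.get(term, set())}
```

Expanding a query term this way is one simple means of giving a document or a search a semantic dimension beyond the exact keyword.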

Concept extraction

Semantic technologies are able to automatically recognise and extract concepts, based on different elements of a sentence: syntax, grammar, meaning, context... They are thus able to recognise specific entities, such as:

  • dates
  • places
  • people
  • companies
  • ...
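
As a very rough illustration, entity recognition can be approximated with a date pattern and lists of known names. This is a simplification we assume for the example: real concept extraction also relies on syntax, grammar and context, which this sketch ignores, and the name lists are hypothetical.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Matches dates written like "2 November 2007".
DATE_PATTERN = re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b")

# Hypothetical name lists; a real engine would use far larger resources.
KNOWN_COMPANIES = ("Datamonitor", "Euromonitor", "Factiva")
KNOWN_PLACES = ("Europe", "USA")


def find_names(text, names):
    """Return the known names that appear as whole words in the text."""
    return [
        name for name in names
        if re.search(r"\b" + re.escape(name) + r"\b", text)
    ]


def extract_entities(text):
    """Extract dates, companies and places from a sentence."""
    return {
        "date": DATE_PATTERN.findall(text),
        "company": find_names(text, KNOWN_COMPANIES),
        "place": find_names(text, KNOWN_PLACES),
    }
```

Each extracted entity can then become a dimension of the vertical index, for instance to list every report mentioning a given company.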

Wednesday 19 September 2007

Who produces and publishes open access content?

As explained last week, we are publishing for our readers chapters extracted from a conference presentation at the 2007 Online Show in December. We will present what the so-called “open access market research reports” are and who publishes them.

Nowadays, large Anglo-Saxon and American groups such as Dun and Bradstreet, Factiva or Thomson Corp dominate the business information market. These players were born during the first wave of industry information providers. Launched in the early 80’s, they focus on the aggregation of financial and economic information. Their databases, accessible by subscription, mainly target Fortune 1000 and multinational corporations.

Beside these giants, around which the information industry is structured, the market is completely fragmented. Its peculiarity is the large number of private publishers. This is especially true in the market research segment. The best known are Datamonitor and Euromonitor. These distribute reports sold for € 1500 to 3000. Outsell estimates that companies will spend more than € 12 billion on purchasing market research reports in 2010. This market grows by 11% a year on average.

These publishers’ global offering is estimated at 150 000 documents. Specialised in distance selling, they make great use of the opportunities offered by Internet search engines to promote their documents. They also distribute their content through market research aggregators, such as Reportlinker.com in Europe or Marketresearch.com in the USA. The latter, the worldwide leader, references approximately 100 000 documents.

But the Internet is full of treasures. Apart from these reports, many free market research reports are published on the Internet by public organisations such as embassies, ministries or trade unions. We estimate that 10 million of these free reports are available online today.



Definition: what do we call Open Access Content dedicated to business research?

Wikipedia defines open access (OA) as “immediate, free and unrestricted online access to digital (…) material, primarily peer-reviewed research articles in journals.”

OA content for business research can be described as:

  • Information published for free on the Internet
  • Information giving insight into markets or industries, countries or companies
  • Information that has been structured and documented: source, date, author…

We can divide Open Access Business Information into two categories:

1/ Public Domain Information

  • Public domain information is the so-called Public Sector Information (PSI).
  • The Mepsir report (June 2006) takes its definition of “public sector body” from Directive 92/50/EEC: “State, regional or local authorities, bodies governed by public law, associations formed by one or more of such authorities or bodies governed by public law” where “Body governed by public law” means “Any body that is established for the specific purpose of meeting needs in the general interest, not having an industrial or commercial character, and having legal personality and financed, for the most part, by the State, or regional or local authorities, or other bodies governed by public law; or subject to management supervision by those bodies; or having an administrative, managerial or supervisory board, more than half of whose members are appointed by the State, regional or local authorities or by other bodies governed by public law”.

2/ Private Papers

  • Private papers can also be defined as… all the other documents!
  • Business information published by private companies to promote their services or know-how (analyst research reports, annual reports…)

Which kinds of open access content can you find on the Internet?

Ubiquick estimates that more than 10 million free industry reports are available on the Internet. Obviously, and for commercial reasons, most of them are published by organisations financed by States. But publishing information on a sector can also be a promotional tool for many private companies.

Types and sources of Open Access Industry Information

1/ Type

  • Annual Reports
  • Press Releases (?)
  • White Papers
  • Investment reports
  • Country reports
  • Economic indicators
  • Trade Statistics
  • Patent and trademarks information

2/ Source

  • Embassy
  • National Institute
  • Competition Commission
  • Ministry
  • Local representation of state / federation
  • Trade Union
  • Customs
  • National and Central Bank
  • International Institutions
  • Stock exchange
  • Consulting firm
  • Banks
  • Chamber of Commerce



Summary: types of public domain information by source

Table OA