Federated search

Federated search can be used to integrate disparate information resources within a single large organization ("enterprise") or for the entire web.

As described by Peter Jacso (2004[3]), federated searching consists of (1) transforming a query and broadcasting it to a group of disparate databases or other web resources, with the appropriate syntax, (2) merging the results collected from the databases, (3) presenting them in a succinct and unified format with minimal duplication, and (4) providing a means, performed either automatically or by the portal user, to sort the merged result set.
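The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not any particular product's implementation: the two sources, their query syntaxes, the canned results, and the choice to de-duplicate by URL and sort by a source-reported score are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    title: str
    url: str
    score: float  # relevance reported by the source; assumed comparable across sources
    source: str


# Hypothetical sources. Each returns canned results regardless of the query text.
def search_catalog(query: str) -> list[Result]:
    return [
        Result("Federated Search Basics", "http://example.org/a", 0.9, "catalog"),
        Result("Metasearch Engines", "http://example.org/b", 0.4, "catalog"),
    ]


def search_archive(query: str) -> list[Result]:
    return [
        Result("Federated search basics", "http://example.org/a", 0.7, "archive"),
        Result("Distributed Retrieval", "http://example.org/c", 0.6, "archive"),
    ]


# Each source pairs a query translator (its assumed syntax) with a search function.
SOURCES = {
    "catalog": (lambda q: f"title:({q})", search_catalog),
    "archive": (lambda q: "+".join(q.split()), search_archive),
}


def federated_search(query: str) -> list[Result]:
    merged: dict[str, Result] = {}
    for translate, search in SOURCES.values():
        # (1) transform the query into the source's syntax and send it
        for hit in search(translate(query)):
            # (2) merge and (3) de-duplicate by URL, keeping the best-scored copy
            best = merged.get(hit.url)
            if best is None or hit.score > best.score:
                merged[hit.url] = hit
    # (4) sort the merged result set, here by descending score
    return sorted(merged.values(), key=lambda r: r.score, reverse=True)
```

In practice, scores from different sources are rarely directly comparable, so a real portal would normalize or re-rank them before sorting.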

Each individual information source sends a list of results for the search query back to the portal's interface.

Some portals merely screen-scrape the database results and do not let the user enter the information source's application directly.

Federated search need not place any requirements or burdens on owners of the individual information sources, other than handling increased traffic.

The metasearch approach, like the underlying search engine technology, works only with information sources stored in electronic form.

It is difficult to maintain the performance (that is, the response speed) of a federated search engine as it combines more and more information sources.
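One common way to keep response time bounded as sources are added is to query them in parallel and drop any source that misses a deadline. The sketch below illustrates that idea with Python's standard `concurrent.futures` module; the source functions, the timeout value, and the policy of silently skipping slow sources are assumptions for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fast_source(query: str) -> list[str]:
    return [f"fast:{query}"]


def slow_source(query: str) -> list[str]:
    time.sleep(1.0)  # simulates an overloaded or distant source
    return [f"slow:{query}"]


def search_all(query, sources, timeout=0.2):
    """Query every source in parallel; keep only answers that arrive in time."""
    pool = ThreadPoolExecutor(max_workers=len(sources))
    futures = {pool.submit(fn, query): name for name, fn in sources.items()}
    done, _ = wait(futures, timeout=timeout)
    results, skipped = [], []
    for future, name in futures.items():
        if future in done and future.exception() is None:
            results.extend(future.result())
        else:
            skipped.append(name)  # too slow or failed; report it rather than block
    # Do not wait for stragglers; cancel anything that has not started yet.
    pool.shutdown(wait=False, cancel_futures=True)
    return results, skipped
```

With this policy the total latency is bounded by the timeout rather than by the slowest source, at the cost of occasionally incomplete result sets.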

One implementation of federated search that has begun to address this issue is WorldWideScience, hosted by the U.S. Department of Energy's Office of Scientific and Technical Information.

Another application, Sesam, which runs in both Norway and Sweden, has been built on top of an open-source platform specialised for federated search solutions.

To personalize vertical orders in federated search, the LinkedIn search engine[2] exploits the searcher's profile and recent activities to infer his or her intent, such as hiring, job seeking, or content consumption. It then uses that intent, along with many other signals, to rank the verticals in an order that is personally relevant to the individual searcher.
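As a rough illustration of intent-driven vertical ordering, the sketch below infers an intent from recent activity and then orders verticals by intent-specific weights. It is not LinkedIn's method: the activity names, intents, verticals, weights, and the majority-vote inference are all hypothetical, and a production system would combine many more signals.

```python
from collections import Counter

# Hypothetical mapping from activity types to intents.
ACTIVITY_INTENT = {
    "viewed_job": "job_seeking",
    "applied_job": "job_seeking",
    "viewed_candidate": "hiring",
    "read_article": "content_consuming",
}

# Hypothetical per-intent weights for each vertical.
INTENT_VERTICAL_WEIGHTS = {
    "job_seeking": {"jobs": 1.0, "companies": 0.5, "people": 0.2, "posts": 0.1},
    "hiring": {"people": 1.0, "companies": 0.3, "jobs": 0.2, "posts": 0.1},
    "content_consuming": {"posts": 1.0, "people": 0.3, "companies": 0.2, "jobs": 0.1},
}


def infer_intent(activities: list[str]) -> str:
    """Pick the intent suggested by the most recent activities (simple majority)."""
    counts = Counter(ACTIVITY_INTENT[a] for a in activities if a in ACTIVITY_INTENT)
    # Falling back to content consumption when nothing is known is an assumption.
    return counts.most_common(1)[0][0] if counts else "content_consuming"


def order_verticals(activities: list[str]) -> list[str]:
    """Return verticals from most to least relevant for the inferred intent."""
    weights = INTENT_VERTICAL_WEIGHTS[infer_intent(activities)]
    return sorted(weights, key=weights.get, reverse=True)
```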

It includes pre-built connectors to popular open source search engines, and re-ranks results using cosine vector similarity.
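Re-ranking by cosine similarity can be sketched as follows. The text does not say which vectors are compared, so this example assumes simple term-count vectors built from whitespace-split words; a real system might well use learned embeddings instead. The function names are illustrative.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Term-count vector over lowercase, whitespace-split tokens (an assumption)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query: str, results: list[str]) -> list[str]:
    """Order results from all sources by similarity to the query, best first."""
    q = vectorize(query)
    return sorted(results, key=lambda r: cosine(q, vectorize(r)), reverse=True)
```

Because every result is scored against the same query vector, results from different engines become comparable regardless of each engine's native scoring.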

Some of this challenge of mapping to a common form can be solved if the federated resources support linked open data via RDF.
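The mapping problem can be illustrated with a small sketch: each source uses its own field names, and the federator translates them into one shared schema. The source names, field names, and the target schema (loosely modeled on Dublin Core property names such as title, creator, and date) are assumptions for the example. If every source already published its records against a shared RDF vocabulary, the per-source maps below would be largely unnecessary.

```python
# Hypothetical per-source field mappings onto a shared schema.
FIELD_MAPS = {
    "library": {"ti": "title", "au": "creator", "yr": "date"},
    "repository": {"name": "title", "author": "creator", "published": "date"},
}


def to_common(source: str, record: dict) -> dict:
    """Rename a source-specific record's fields to the shared schema.

    Fields without a mapping are dropped; the originating source is recorded.
    """
    mapping = FIELD_MAPS[source]
    common = {mapping[key]: value for key, value in record.items() if key in mapping}
    common["source"] = source
    return common
```

For example, `to_common("library", {"ti": "Search", "au": "Doe"})` and `to_common("repository", {"name": "Search", "author": "Doe"})` both yield records with `title` and `creator` keys, which can then be merged and de-duplicated uniformly.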

Development groups should typically not hit live production systems during regular work, much less for intensive load testing.

Figure: Federating across three search engines