Cranfield experiments

The Cranfield experiments were extremely influential in the information retrieval field, itself a subject of considerable interest in the post-World War II era when the quantity of scientific research was exploding.

Their influence was considerable over a forty-year period, before natural language indexes like those of modern web search engines became commonplace.

The now-famous July 1945 article "As We May Think" by Vannevar Bush is often pointed to as the first complete description of the field that became information retrieval.

The article describes a hypothetical machine known as "memex" that would hold all of mankind's knowledge in an indexed form that would allow it to be retrieved by anyone.

It was at this meeting that Cyril W. Cleverdon "got the bit between his teeth" and managed to arrange for funding from the US National Science Foundation to start what would later be known as Cranfield 1.

Four indexing systems were compared. In an early series of experiments, participants were asked to create indexes for a collection of aerospace-related documents.

The outcome of this approach was revolutionary at the time; it suggested that the search terms be left in their original format, in what would today be known as a natural language query.

However, this was not typical of an actual query; a user looking for information on aircraft landing gear might be happy with any of the collection's many papers on the topic, but Cranfield 1 would consider such a result a failure in spite of returning relevant materials.

In the second series, the results were judged by third parties who gave a qualitative answer on whether the query generated a relevant set of papers, as opposed to returning a specified original document.
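The difference between the two judging schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reconstruction of the original procedure: the document identifiers, the landing-gear query, and the function names are hypothetical, and precision and recall are used here as a standard way to score a result set against relevance judgments, not as measures named in the text above.

```python
def known_item_success(retrieved, source_doc):
    """First-series style: success only if the one specified document is returned."""
    return source_doc in retrieved


def relevance_scores(retrieved, judged_relevant):
    """Second-series style: score the whole result set against relevance judgments."""
    retrieved = set(retrieved)
    judged_relevant = set(judged_relevant)
    hits = retrieved & judged_relevant
    # Guard against empty sets so the sketch never divides by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(judged_relevant) if judged_relevant else 0.0
    return precision, recall


# Hypothetical landing-gear query: three relevant papers come back,
# but not the specific paper the query was derived from.
retrieved = ["doc_12", "doc_47", "doc_80"]
source_doc = "doc_03"
judged_relevant = {"doc_03", "doc_12", "doc_47", "doc_80"}

print(known_item_success(retrieved, source_doc))     # False
print(relevance_scores(retrieved, judged_relevant))  # (1.0, 0.75)
```

Under the first scheme this search counts as a failure, while under the second every returned paper is judged relevant and three of the four relevant papers are found.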

In particular, Cranfield 2's methodology, starting with natural language terms and judging the results by relevance, not exact matches, became almost universal in subsequent experiments in spite of many objections.

For instance, the mid-range IBM System/360 Model 50 shipped with 64 to 512 kB of core memory[12] (tending toward the lower end) and its typical hard drive stored just over 80 MB.[13]

As the capabilities of systems grew through the 1960s and 1970s, the Cranfield document collection became a major testbed corpus that was used repeatedly for many years.