Question answering

More commonly, question-answering systems can pull answers from an unstructured collection of natural language documents.
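The document-retrieval step such systems rely on can be illustrated with a toy bag-of-words ranker over a small collection (the scoring scheme and sample documents are invented for this sketch; production systems use far more sophisticated retrieval):

```python
from collections import Counter
import math

def tokenize(text):
    return text.lower().split()

def score(query, doc, corpus_size, doc_freq):
    """TF-IDF-style overlap score between a query and one document."""
    doc_counts = Counter(tokenize(doc))
    s = 0.0
    for term in tokenize(query):
        if term in doc_counts:
            # Terms that occur in many documents contribute less.
            idf = math.log(corpus_size / (1 + doc_freq[term]))
            s += doc_counts[term] * idf
    return s

def retrieve(query, docs):
    """Return the documents ranked by relevance to the query."""
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(set(tokenize(d)))
    return sorted(docs, key=lambda d: score(query, d, len(docs), doc_freq),
                  reverse=True)

docs = [
    "The Apollo 11 mission returned lunar rock samples in 1969.",
    "Paris is the capital of France.",
    "Basalt is a common volcanic rock.",
]
print(retrieve("Which mission returned lunar rock samples?", docs)[0])
```

A real open-domain system would follow this retrieval stage with a reading-comprehension step that extracts the answer span from the top-ranked documents.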

LUNAR answered questions about the geological analysis of rocks returned by the Apollo Moon missions.

The strength of this system was the choice of a very specific domain and a very simple world with rules of physics that were easy to encode in a computer program.

It had a comprehensive, hand-crafted knowledge base of its domain, and it aimed at phrasing the answer to accommodate various types of users.

Specialized natural-language question answering systems have been developed, such as EAGLi for health and life scientists.[13]

The system finds answers by using a combination of techniques from computational linguistics, information retrieval, and knowledge representation.

The system takes a natural language question as input rather than a set of keywords, for example: "When is the national day of China?"

In the example above, the subject is "Chinese National Day", the predicate is "is", and the adverbial modifier is "when"; the expected answer type is therefore "Date".
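This kind of wh-word analysis can be sketched with a toy classifier (the cue words and type labels here are illustrative, not taken from any particular system):

```python
# Toy answer-type classifier: map the question's wh-word to an expected
# answer type, as in the "When ... ?" -> "Date" analysis above.
ANSWER_TYPES = {
    "when": "Date",
    "who": "Person",
    "where": "Location",
    "how many": "Number",
}

def answer_type(question):
    q = question.lower()
    for cue, ans_type in ANSWER_TYPES.items():
        if q.startswith(cue):
            return ans_type
    return "Unknown"

print(answer_type("When is the national day of China?"))  # Date
```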

A tagger and NP/Verb Group chunker can verify whether the correct entities and relations are mentioned in the found documents.
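In practice this verification uses a trained tagger and recogniser; as a crude stand-in, the sketch below treats runs of capitalized words as candidate entities and checks whether a retrieved sentence mentions any:

```python
import re

def candidate_entities(text):
    # Naive stand-in for a trained named-entity recogniser:
    # runs of capitalized words count as candidate entities.
    return re.findall(r"(?:[A-Z][a-z]+)(?: [A-Z][a-z]+)*", text)

def mentions_entity(sentence, required=1):
    """Verify that a retrieved sentence mentions at least `required`
    candidate entities before accepting it as answer evidence."""
    # Drop the sentence-initial word, which is capitalized regardless.
    body = sentence.split(" ", 1)[1] if " " in sentence else ""
    return len(candidate_entities(body)) >= required

print(mentions_entity("The telescope was built by Galileo Galilei."))  # True
```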

For questions such as "Who" or "Where", a named-entity recogniser finds relevant "Person" and "Location" names from the retrieved documents.[15]

MathQA takes an English or Hindi natural language question as input and returns a mathematical formula retrieved from Wikidata as a succinct answer, translated into a computable form that allows the user to insert values for the variables.
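The "computable form" idea can be sketched as follows (the formula string and variable names are invented for this example; the actual system retrieves them from Wikidata):

```python
import math

# Illustrative retrieved formula (hypothetical; a real system would pull
# this from Wikidata): kinetic energy E = 1/2 * m * v^2.
formula = "0.5 * m * v**2"

def evaluate(formula, **values):
    """Substitute user-supplied values for the formula's variables."""
    # Restrict eval to the supplied variables plus the math module.
    return eval(formula, {"__builtins__": {}, "math": math}, values)

print(evaluate(formula, m=2.0, v=3.0))  # 9.0
```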

It is claimed that the system outperforms a commercial computational mathematical knowledge engine on a test set.

The "ARQMath Task" at CLEF 2020[17] was launched to address the problem of linking newly posted questions from the platform Math Stack Exchange to existing ones that were already answered by the community.[18]

The lab was motivated by the fact that 20% of mathematical queries in general-purpose search engines are expressed as well-formed questions.

The PhysWikiQuiz physics question generation and test engine retrieves mathematical formulae from Wikidata together with semantic information about their constituent identifiers (names and values of variables).

Subsequently, the variables are substituted with random values to generate a large number of different questions suitable for individual student tests.[2]

The open-source framework Haystack by deepset combines open-domain question answering with generative question answering and supports adapting the underlying language models to specific domains for industry use cases.[34][35]

Large language models (LLMs)[36] such as GPT-4[37] and Gemini[38] are examples of successful QA systems that enable more sophisticated understanding and generation of text.
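The random-value substitution step that PhysWikiQuiz performs can be sketched like this (the formula v = s / t and the identifier metadata are invented for the example; the real engine retrieves them from Wikidata):

```python
import random

# Illustrative formula metadata (hypothetical; the real engine pulls it
# from Wikidata): speed v = s / t, from distance s and time t.
identifiers = {"s": "distance (m)", "t": "time (s)"}

def generate_question(seed=None):
    """Substitute random values for the identifiers to produce one
    quiz question together with its expected numeric answer."""
    rng = random.Random(seed)
    values = {name: rng.randint(1, 20) for name in identifiers}
    question = (f"A body travels {values['s']} m in {values['t']} s. "
                "What is its speed v = s / t?")
    return question, values["s"] / values["t"]

question, answer = generate_question(seed=42)
print(question)
print(answer)
```

Seeding the generator makes individual student tests reproducible while still varying the numbers between students.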