Tree kernel

Tree kernels find applications in natural language processing, where they can be used for machine-learned parsing or classification of sentences.

Comparisons between trees can be performed by computing dot products of feature vectors of the trees, but these vectors tend to be very large: NLP techniques have reached a point where a simple dependency relation over two words is encoded with a vector of several million features.[1]

It can be impractical to represent complex structures such as trees with feature vectors.
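
A kernel can return the same value as a dot product over all tree fragments without ever building the feature vectors. The following is a minimal sketch, assuming one well-known formulation, the subset tree kernel of Collins and Duffy, which the text above does not name. The nested-tuple encoding of trees, the function names, and the `decay` parameter are illustrative assumptions, not a fixed API.

```python
from itertools import product

# Assumed encoding: a tree is a nested tuple (label, child, child, ...),
# and a leaf word is a plain string.


def production(node):
    """Return the rule at a node: its label plus a tagged label for each child."""
    return (node[0],) + tuple(
        ("word", c) if isinstance(c, str) else ("node", c[0]) for c in node[1:]
    )


def internal_nodes(tree):
    """Yield every non-leaf node of the tree."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from internal_nodes(child)


def common_fragments(n1, n2, decay=1.0):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    # Equal productions guarantee the children line up one to one.
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):
            score *= 1.0 + common_fragments(c1, c2, decay)
    return score


def tree_kernel(t1, t2, decay=1.0):
    """Sum the shared-fragment counts over all pairs of internal nodes."""
    return sum(
        common_fragments(a, b, decay)
        for a, b in product(internal_nodes(t1), internal_nodes(t2))
    )


t1 = ("S", ("NP", ("D", "a"), ("N", "dog")), ("VP", ("V", "barks")))
t2 = ("S", ("NP", ("D", "a"), ("N", "cat")), ("VP", ("V", "barks")))
print(tree_kernel(t1, t2))
```

With `decay=1.0` the result is the number of fragments the two trees have in common, which equals the dot product of their (never materialized) fragment-count vectors. Values of `decay` below 1 down-weight larger fragments. This naive recursion recomputes shared subproblems; a practical version would memoize `common_fragments`.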

An example application is the classification of sentences, such as distinguishing different types of questions.

In this example "A" and "a" are the same word, and in most NLP applications they would be represented with the same token.