These networks were first introduced to learn distributed representations of structure (such as logical terms),[1] but have been successful in multiple applications, for instance in learning sequence and tree structures in natural language processing (mainly continuous representations of phrases and sentences based on word embeddings).
In the simplest architecture, nodes are combined into parents using a weight matrix (which is shared across the whole network) and a non-linearity such as tanh. If c1 and c2 are n-dimensional vector representations of child nodes, their parent is also an n-dimensional vector, computed as p = tanh(W[c1; c2]), where W is a learned n × 2n weight matrix.
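As a rough sketch of this composition step (illustrative names only, using numpy), the same weight matrix W and non-linearity are applied at every internal node, so an entire binary tree can be encoded with one recursive function:

import numpy as np

def compose(c1, c2, W, b):
    # Combine two n-dimensional child vectors into one n-dimensional parent.
    # W is the shared n x 2n weight matrix, b is a bias vector.
    return np.tanh(W @ np.concatenate([c1, c2]) + b)

def encode(tree, W, b):
    # Recursively encode a binary tree given as nested tuples of leaf vectors.
    if isinstance(tree, np.ndarray):          # leaf: already an embedding
        return tree
    left, right = tree
    return compose(encode(left, W, b), encode(right, W, b), W, b)

# Toy usage with n = 4: the root vector has the same dimensionality as the leaves,
# which is what lets the same function apply at every level of the tree.
rng = np.random.default_rng(0)
n = 4
W = rng.normal(scale=0.1, size=(n, 2 * n))    # shared across all nodes
b = np.zeros(n)
leaves = [rng.normal(size=n) for _ in range(3)]
root = encode((leaves[0], (leaves[1], leaves[2])), W, b)
print(root.shape)                             # (4,)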
This architecture, with a few improvements, has been used for successfully parsing natural scenes, syntactic parsing of natural language sentences,[2] and recursive autoencoding and generative modeling of 3D shape structures in the form of cuboid abstractions.[3]
Recursive cascade correlation (RecCC) is a constructive neural network approach to dealing with tree domains,[4] with pioneering applications to chemistry[5] and an extension to directed acyclic graphs.[7][8]
Recursive neural tensor networks use a single tensor-based composition function for all nodes in the tree.[9]
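A minimal sketch of such a tensor-based composition (illustrative only, not the exact formulation of the cited work): with h = [c1; c2], the parent is p = tanh(hᵀV h + W h), where V is an n × 2n × 2n tensor whose k-th slice produces the k-th coordinate of the bilinear term.

import numpy as np

def tensor_compose(c1, c2, V, W):
    # Each slice V[k] adds a bilinear interaction between the two children.
    h = np.concatenate([c1, c2])                  # shape (2n,)
    bilinear = np.einsum('i,kij,j->k', h, V, h)   # shape (n,)
    return np.tanh(bilinear + W @ h)

n = 4
rng = np.random.default_rng(1)
V = rng.normal(scale=0.01, size=(n, 2 * n, 2 * n))  # shared tensor, one slice per output dimension
W = rng.normal(scale=0.1, size=(n, 2 * n))          # shared matrix term
p = tensor_compose(rng.normal(size=n), rng.normal(size=n), V, W)
print(p.shape)                                      # (4,)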
Typically, stochastic gradient descent (SGD) is used to train the network; the gradient is computed using backpropagation through structure (BPTS), a variant of backpropagation through time used for recurrent neural networks.
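As an illustrative sketch (PyTorch, with hypothetical names and tree format), an automatic-differentiation framework performs the backward pass through the tree, so a single SGD step looks like ordinary supervised training:

import torch

n = 4
W = torch.nn.Parameter(0.1 * torch.randn(n, 2 * n))  # shared composition matrix
U = torch.nn.Parameter(0.1 * torch.randn(1, n))       # toy scoring layer on the root vector
opt = torch.optim.SGD([W, U], lr=0.01)

def encode(tree):
    # Encode a binary tree given as nested tuples of leaf tensors.
    if torch.is_tensor(tree):
        return tree
    left, right = tree
    return torch.tanh(W @ torch.cat([encode(left), encode(right)]))

# One SGD step on a toy example: regress a scalar target from the root representation.
leaves = [torch.randn(n) for _ in range(3)]
tree = (leaves[0], (leaves[1], leaves[2]))
target = torch.tensor([1.0])

opt.zero_grad()
loss = (U @ encode(tree) - target).pow(2).mean()
loss.backward()   # gradients flow back through every composition in the tree
opt.step()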
The universal approximation capability of RNNs over trees has been proved in the literature.