[1] Applications include language translation, image captioning, conversational models, and text summarization.
[1][6] Tomáš Mikolov claims to have developed, before joining Google Brain, the idea of using a "neural language model on pairs of sentences... and then [generating] translation after seeing the first sentence", which he equates with seq2seq machine translation. He states that he mentioned the idea to Ilya Sutskever and Quoc Le while at Google Brain, and that they failed to acknowledge him in their paper.
[7] Mikolov had worked on RNNLM (using an RNN for language modelling) for his PhD thesis,[8] and is better known for developing word2vec.
The encoder processes the input sequence and captures its essential information, which is stored as the hidden state of the network and, in a model with an attention mechanism, as a context vector. The context vector is a weighted sum of the encoder's hidden states and is computed anew for every time step of the output sequence.
The decoder takes the context vector and hidden states from the encoder and generates the final output sequence.
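The following is a minimal sketch of such an encoder-decoder pair, written here in PyTorch with a GRU for brevity (an LSTM works the same way); the class names, layer sizes, and toy tensors are illustrative assumptions rather than the architecture of any particular published system.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):                      # src: (batch, src_len) token ids
        outputs, hidden = self.rnn(self.embed(src))
        return outputs, hidden                   # outputs: all hidden states; hidden: final state

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt, hidden):              # condition on the encoder's final hidden state
        outputs, hidden = self.rnn(self.embed(tgt), hidden)
        return self.out(outputs), hidden         # per-step vocabulary logits

# Toy usage: encode a source sequence, then decode conditioned on its final state.
enc, dec = Encoder(vocab_size=100), Decoder(vocab_size=100)
src = torch.randint(0, 100, (1, 7))              # one source sequence of length 7
tgt = torch.randint(0, 100, (1, 5))              # teacher-forced target tokens
enc_outputs, enc_hidden = enc(src)
logits, _ = dec(tgt, enc_hidden)                 # logits: (1, 5, 100)
```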
The attention mechanism is an enhancement introduced by Bahdanau et al. in 2014 to address a limitation of the basic Seq2Seq architecture, in which compressing a long input sequence into the encoder's single fixed-size hidden state causes information needed by the decoder to be lost.
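As a hedged sketch of how such a mechanism can produce the context vector described above, the module below implements Bahdanau-style additive attention: every encoder hidden state is scored against the current decoder state, the scores are normalized with a softmax, and the context vector is the resulting weighted sum. The name AdditiveAttention, the layer names W_enc, W_dec, and v, and the tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, hidden_dim=64):
        super().__init__()
        self.W_enc = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.W_dec = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, dec_hidden, enc_outputs):
        # dec_hidden: (batch, hidden_dim); enc_outputs: (batch, src_len, hidden_dim)
        scores = self.v(torch.tanh(
            self.W_enc(enc_outputs) + self.W_dec(dec_hidden).unsqueeze(1)
        )).squeeze(-1)                                     # (batch, src_len)
        weights = torch.softmax(scores, dim=-1)            # attention weights, sum to 1
        # Context vector: weighted sum of encoder hidden states, recomputed at each output step.
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights

# Toy usage with the shapes above.
attn = AdditiveAttention()
context, weights = attn(torch.randn(1, 64), torch.randn(1, 7, 64))
```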
The company claimed that it could solve complex equations more rapidly and with greater accuracy than commercial solutions such as Mathematica, MATLAB and Maple.
An LSTM neural network then applies its standard pattern-recognition capabilities to process the tree.
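One common way to hand such a tree to a sequence model is to flatten it into prefix (Polish) notation first; the short sketch below shows this serialization on a toy expression. The to_prefix helper and the tuple-based tree encoding are illustrative assumptions rather than details of the system described above.

```python
def to_prefix(node):
    """Flatten a nested (operator, operands...) tuple into prefix-order tokens."""
    if isinstance(node, tuple):
        op, *args = node
        tokens = [op]
        for arg in args:
            tokens.extend(to_prefix(arg))
        return tokens
    return [str(node)]

# Example: the expression tree for 2*x + cos(x) becomes a flat token sequence
# that an LSTM can read one token at a time.
tree = ("+", ("*", 2, "x"), ("cos", "x"))
print(to_prefix(tree))   # ['+', '*', '2', 'x', 'cos', 'x']
```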