Progress in Machine Translation - Pt. 1

Category: NLP

Before NMT Systems

As can be seen from the post (link word embedding blog), it is possible to represent words as mathematical vectors that capture their semantic meaning. Machine Translation, however, is a task that extends far beyond the understanding of individual words. The model needs to learn not only the grammar and syntax of multiple languages, but also how to convert one into the other. As expected, this task is highly demanding from a computational perspective. Owing to this, translation systems were for a long time mostly phrase-based. This means that translation operated over small units called ‘phrases’: the model would translate a phrase from one language to another using simple probabilistic rules estimated from similar aligned examples.
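To make the phrase-based idea concrete, here is a minimal, purely illustrative sketch. The phrase table, its entries, and the probabilities are all made up for illustration; real systems learn them from large aligned corpora and also use reordering and language models.

```python
# Toy phrase-based translation: the "model" is just a phrase table mapping
# source phrases to candidate translations with probabilities. All entries
# below are invented purely for illustration.
phrase_table = {
    "guten morgen": [("good morning", 0.9), ("good day", 0.1)],
    "wie geht es dir": [("how are you", 0.8), ("how is it going", 0.2)],
}

def translate(sentence: str) -> str:
    """Greedily translate known phrases, copying unknown words through."""
    words = sentence.lower().split()
    output, i = [], 0
    while i < len(words):
        # Try the longest matching phrase first.
        for length in range(len(words) - i, 0, -1):
            phrase = " ".join(words[i:i + length])
            if phrase in phrase_table:
                # Pick the highest-probability candidate translation.
                best, _ = max(phrase_table[phrase], key=lambda t: t[1])
                output.append(best)
                i += length
                break
        else:
            output.append(words[i])  # no phrase match: keep the word as-is
            i += 1
    return " ".join(output)

print(translate("Guten Morgen"))  # -> "good morning"
```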

As you would expect though, these systems, while reliable enough for basic translation, were not syntactically very accurate. There are several reasons for this.

To address these shortcomings of phrase-based systems came NMT, or Neural Machine Translation, systems. These systems centre around RNNs, or Recurrent Neural Networks. The RNN Encoder-Decoder architecture has been central to the development of not only translation but a significant amount of language and sequence modelling work over the years.

Understanding the RNN Encoder-Decoder

Any Encoder-Decoder architecture centres around the idea that the Encoder can learn an intermediate representation that captures all the information presented to it, and that the Decoder can convert this representation back into some meaningful form.

An RNN-based Encoder-Decoder works along similar lines. One RNN encodes a sequence of symbols into a fixed-length vector representation, and another RNN decodes that representation into another sequence of symbols. Essentially, the encoder and decoder are trained jointly to maximize the conditional probability of the target sequence given the source sequence.
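The sketch below shows this shape in code. It assumes PyTorch, and the vocabulary sizes, embedding and hidden dimensions, and the random dummy batch are placeholders, not values from any particular system. One GRU compresses the source sentence into a single fixed-length vector, another GRU generates the target conditioned on it, and training minimizes the token-level cross-entropy, which is the same as maximizing log p(target | source).

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Minimal RNN encoder-decoder: encode the source into one fixed-length
    vector, then condition the target-side RNN on that vector."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=64, hid_dim=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt_in):
        # Encoder: the final hidden state is the fixed-length summary vector.
        _, context = self.encoder(self.src_emb(src))
        # Decoder: generate target tokens conditioned on that summary.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), context)
        return self.out(dec_out)  # logits over the target vocabulary

# Dummy batch, just to show the training objective: minimizing cross-entropy
# between the predicted distributions and the next target tokens is
# equivalent to maximizing log p(target | source).
model = EncoderDecoder(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))      # batch of 2 source sentences
tgt_in = torch.randint(0, 1000, (2, 5))   # decoder inputs (shifted right)
tgt_out = torch.randint(0, 1000, (2, 5))  # expected next tokens
logits = model(src, tgt_in)
loss = nn.functional.cross_entropy(logits.reshape(-1, 1000), tgt_out.reshape(-1))
```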