Hi!
I am currently reimplementing a sequence-to-sequence learning model with an attention mechanism for neural machine translation. I previously implemented it in Theano, and I usually use GRUs.
Now I am very interested in Torch and would like to try something similar with it. However, I found that memory usage is too high with an nngraph-based model. SeqLSTM is very fast and well suited to my purpose, but I am curious whether anyone has implemented a similar module for GRU, or whether I need to write one myself based on SeqLSTM.
Thank you very much!