GRU cells #20
Wow, this looks amazing - thanks a bunch! There's even a unit test! I want to look through it in a bit more detail before merging, and I probably won't have time to do so today.

Thanks. It could certainly be optimized further, but at least it seems to work fine.

Any update on this?
For those interested, I also added a gridgru adapted from http://arxiv.org/abs/1507.01526 in the Dev branch.
Running a small benchmark of 1000 iterations on tiny Shakespeare (epoch 3.8), I got the following validation losses:

- LSTM (wordvec_size=64): val_loss 1.6292053406889
- GRU (wordvec_size=64): val_loss 1.4681989658963
- GRIDGRU (wordvec_size=800): val_loss 1.4313773946329

All runs otherwise shared the same settings: num_layers=3, rnn_size=800, seq_length=50, batch_size=50, learning_rate=0.0005, lr_decay_every=5, lr_decay_factor=0.5, grad_clip=5, dropout=0, batchnorm=0, max_epochs=50, checkpoint_every=1000, gpu_backend=cuda, trained on data/tiny-shakespeare.h5.

NB: for GRIDGRU, wordvec_size is the size of the network along the depth dimension, so it should be about the same as rnn_size.
```lua
cur_gates[{{}, {2 * H + 1, 3 * H}}]:addmm(next_h, Wh[{{}, {2 * H + 1, 3 * H}}]) -- hc += Wh * (r . prev_h)
local hc = cur_gates[{{}, {2 * H + 1, 3 * H}}]:tanh() -- hidden candidate: hc = tanh(Wx * x + Wh * (r . prev_h) + b)
next_h:addcmul(prev_h, -1, u, prev_h) -- next_h = (1 - u) . prev_h
next_h:addcmul(u, hc)                 -- next_h = (1 - u) . prev_h + u . hc
```
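As a rough cross-check of the hunk above, here is a hypothetical NumPy sketch of the same forward step. The gate layout `[u, r, hc]` and all shapes are assumptions inferred from the snippet, not code from the patch:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_forward(x, prev_h, Wx, Wh, b):
    """One GRU step, char-rnn convention: next_h = (1 - u) * prev_h + u * hc.

    Assumed shapes: x (N, D), prev_h (N, H), Wx (D, 3H), Wh (H, 3H), b (3H,).
    Gate order [update u, reset r, candidate hc] is an assumption.
    """
    H = prev_h.shape[1]
    gates = x @ Wx + b                                 # (N, 3H) pre-activations
    gates[:, :2 * H] += prev_h @ Wh[:, :2 * H]         # u and r see prev_h directly
    u = sigmoid(gates[:, :H])                          # update gate
    r = sigmoid(gates[:, H:2 * H])                     # reset gate
    gates[:, 2 * H:] += (r * prev_h) @ Wh[:, 2 * H:]   # candidate sees r . prev_h
    hc = np.tanh(gates[:, 2 * H:])                     # hidden candidate
    return (1.0 - u) * prev_h + u * hc
```

With all-zero weights and biases, both gates sit at 0.5 and the candidate at 0, so the step returns `0.5 * prev_h` — a quick way to smoke-test the wiring.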
A small note: the original paper http://arxiv.org/pdf/1406.1078v3.pdf has it the other way around, see Equation 7.
That's true.
As always, there are many small variations of the same algorithm.
For the definition of the GRU, I followed the code in Karpathy's char-rnn and didn't check the original article.
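The two conventions differ only in which side of the gate multiplies the candidate, so they are equivalent up to negating the update gate's pre-activation (since sigmoid(-a) = 1 - sigmoid(a)). A small illustrative numeric check (not code from the patch):

```python
import numpy as np

rng = np.random.default_rng(42)
a = rng.standard_normal(5)       # update-gate pre-activation
prev_h = rng.standard_normal(5)  # previous hidden state
hc = rng.standard_normal(5)      # hidden candidate

u = 1.0 / (1.0 + np.exp(-a))
char_rnn_h = (1.0 - u) * prev_h + u * hc        # convention used in this PR

u_neg = 1.0 / (1.0 + np.exp(a))                 # sigmoid(-a) = 1 - u
paper_h = u_neg * prev_h + (1.0 - u_neg) * hc   # Eq. 7 convention of the paper

assert np.allclose(char_rnn_h, paper_h)         # same function, relabeled gate
```

So a network trained under one convention corresponds to one under the other with the update gate's weights and bias negated.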
@guillitte I wonder how fair this comparison is. GRIDGRU has about twice as many parameters as the LSTM, and 2.5 times more than the GRU. A 3x800 GRIDGRU has roughly the same number of parameters as, say, a 3x1070 LSTM or a 3x1250 GRU. So in this comparison, GRU wins hands down.
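As a rough sanity check of those counts, here are the standard per-cell weight formulas (ignoring the embedding and the output softmax; a GridGRU count is omitted because it depends on how the depth and time GRUs are laid out in this implementation):

```python
def lstm_params(D, H):
    # Wx (D x 4H) + Wh (H x 4H) + bias (4H) for the four LSTM gates
    return 4 * H * (D + H) + 4 * H

def gru_params(D, H):
    # Wx (D x 3H) + Wh (H x 3H) + bias (3H) for the three GRU gates
    return 3 * H * (D + H) + 3 * H

def stack_params(cell, D, H, layers=3):
    # first layer reads the wordvec (size D); deeper layers read H
    return cell(D, H) + (layers - 1) * cell(H, H)

lstm_800 = stack_params(lstm_params, 64, 800)    # ~13.0M
gru_800 = stack_params(gru_params, 64, 800)      # ~9.8M
lstm_1070 = stack_params(lstm_params, 64, 1070)  # ~23.2M
gru_1250 = stack_params(gru_params, 64, 1250)    # ~23.7M
```

Under these formulas, a 3x1070 LSTM and a 3x1250 GRU do land within a few percent of each other, and both carry roughly 1.8x to 2.4x the parameters of the 3x800 stacks, consistent with the comment.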
This has been open for a while; mind if one of the contributors merges this?
@scheng123 An equivalent implementation has also been merged into https://github.com/torch/rnn/ under the name SeqGRU.
I added the possibility to use GRU cells.