Hi, thank you for the quality code. I'm wondering why the critic loss on the walker_stand task gets so high (up to 1e+3) in my experiment. I used your conda.yaml and changed `env: walker_stand`, `action_repeat: 2`, and `batch_size: 512`, as mentioned in the paper. How can I get a stable critic loss (for example, via reward scaling)?
Thank you for reading.
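To clarify what I mean by reward scaling: a minimal sketch of what I am considering, with a hypothetical `scale_rewards` helper and a `scale` value I picked for illustration (not from your repo):

```python
import numpy as np

def scale_rewards(rewards, scale=0.1):
    """Multiply raw environment rewards by a constant factor before they
    enter the Bellman target, shrinking the critic's regression targets
    (and hence the critic loss) without changing the optimal policy."""
    return np.asarray(rewards, dtype=np.float32) * scale

# DMC rewards are in [0, 1] per step; with action_repeat=2 the summed
# per-transition reward can reach ~2, so a small scale keeps targets modest.
batch_rewards = np.array([2.0, 1.5, 0.0])
scaled = scale_rewards(batch_rewards)
```

Would applying something like this to the sampled rewards in the critic update be the right approach, or is there a better way?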