oscillations of eval score #6

@WangJinCheng1998

Hi nik!

I have trained walker2d and other environments several times. The settings and hyperparameters all follow the original DT. I noticed something strange: in most environments, DT reaches its best score between 10000 and 20000 training steps, with no obvious improvement after 20000 steps. Sometimes a trough also appears during that period. Would you mind giving me some clues about this?
[Figure: walker2d_50]
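
One way to check whether a peak between 10000 and 20000 steps is a real optimum or just evaluation noise is to smooth the logged scores and look at where the smoothed curve peaks. The following is only a minimal sketch under stated assumptions: the helper names `moving_average` and `best_step` are hypothetical, it assumes the eval scores are available as a plain list aligned with a list of training steps, and the numbers are made-up placeholders, not results from these runs.

```python
from statistics import mean


def moving_average(scores, window=3):
    """Trailing moving average; early entries average over the points seen so far."""
    smoothed = []
    for i in range(len(scores)):
        start = max(0, i - window + 1)
        smoothed.append(mean(scores[start:i + 1]))
    return smoothed


def best_step(steps, scores, window=3):
    """Return (step, smoothed_score) where the smoothed eval score is highest."""
    smoothed = moving_average(scores, window)
    idx = max(range(len(smoothed)), key=smoothed.__getitem__)
    return steps[idx], smoothed[idx]


# Hypothetical placeholder values, NOT measurements from this issue.
steps = [5000, 10000, 15000, 20000, 25000, 30000]
scores = [40.0, 70.0, 75.0, 60.0, 72.0, 71.0]

step, score = best_step(steps, scores, window=2)
print(step, score)
```

If the peak of the smoothed curve moves a lot when the window or the random seed changes, the oscillation is more likely evaluation variance than a genuine drop in policy quality.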
