add instructions for distributed training
@@ -112,6 +112,8 @@ $ python train.py task=walker-walk obs=rgb
We recommend using default hyperparameters for single-task online RL, including the default model size of 5M parameters (`model_size=5`). Multi-task offline RL benefits from a larger model size, but larger models are also increasingly costly to train and evaluate. Available model sizes are `model_size={1, 5, 19, 48, 317}`. See `config.yaml` for a full list of arguments.
**As of Jan 7, 2024, the TD-MPC2 codebase also supports multi-GPU training for multi-task offline RL experiments**; use branch `distributed` and argument `world_size=N` to train on `N` GPUs. We cannot guarantee that distributed training yields the same results as single-GPU training, but the results appear similar in our limited testing.
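
As a rough sketch of what a multi-GPU launch might look like, the snippet below assembles a command from the arguments mentioned above. Only `model_size` and `world_size` come from the text; the task name `mt30`, the GPU count, and the chosen model size are assumptions for illustration, so check `config.yaml` on the `distributed` branch before running anything.

```shell
# Hypothetical launch sketch: values below are illustrative, not recommended settings.
N_GPUS=2        # assumed number of GPUs, passed as world_size=N
MODEL_SIZE=19   # one of the documented sizes: 1, 5, 19, 48, 317
TASK=mt30       # assumed multi-task dataset name; verify in config.yaml

# Switch to the multi-GPU branch first (left commented out so the sketch has no side effects):
# git checkout distributed

# Assemble the training command and print it for inspection.
CMD="python train.py task=${TASK} model_size=${MODEL_SIZE} world_size=${N_GPUS}"
echo "${CMD}"
```

Once the printed command looks right, it can be run directly from the repository root on the `distributed` branch.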
----
## Citation