dreamerv3-torch

Author	SHA1	Message	Date
Unai Ruiz	a8456e95bc	Fix missing advantage computation when reward_EMA is disabled. This PR fixes an issue where reward_EMA=False caused adv to be undefined in _compute_actor_loss. Previously, adv was only computed inside the reward_EMA branch, which resulted in a runtime error when the option was disabled.	2026-03-03 16:34:52 +01:00
NM512	be5e5ecf40	apply mean to log items for consistency	2026-02-21 20:58:43 +09:00
NM512	7433d1e877	avoid ".to(device)"	2024-09-28 07:58:15 +09:00
NM512	a4fdfad938	bug fix for onehot distribution	2024-01-14 21:55:34 +09:00
NM512	7f66ed5333	erased unused options	2024-01-05 23:23:09 +09:00
NM512	a27711ab96	limit action values in sampling stage	2024-01-05 11:42:45 +09:00
NM512	a9e85e8b7c	modified weight initialization	2024-01-05 10:46:54 +09:00
NM512	78e86703f4	modified loss calculation	2024-01-05 10:44:04 +09:00
NM512	e0487f8206	merged action head into MLP and modified configs	2024-01-05 10:26:48 +09:00
NM512	e0f2017e28	unified the place to initialize the latents	2024-01-05 10:09:13 +09:00
NM512	16635df3e4	removed scheduling function	2023-09-26 20:58:55 +09:00
NM512	3f6659d365	changed treatment of obs shape in minecraft	2023-08-03 08:12:44 +09:00
NM512	9c58ab62c0	introduced return used in author's code	2023-06-17 16:59:40 +09:00
NM512	f7c505579c	erased unnecessary lines	2023-06-17 15:27:09 +09:00
NM512	02c3d45fcf	modification of expl.	2023-05-21 08:17:47 +09:00
NM512	b984e69b6e	added state input capability	2023-05-14 23:38:46 +09:00
NM512	0eb66997fb	learnable initial state options for RSSM	2023-04-29 07:54:03 +09:00
NM512	2a8b44eb0c	erased unnecessary code	2023-04-27 07:42:08 +09:00
NM512	628b856c63	changed the discount head to predict terminal	2023-04-22 09:34:23 +09:00
NM512	942eae10a9	updated result, requirements and torch version	2023-03-24 07:51:57 +09:00
NM512	6273444394	modified based on author's implementation	2023-03-18 08:38:23 +09:00
NM512	fb5c21557a	Initial Commit	2023-02-12 22:35:25 +09:00

22 Commits
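
The most recent commit (a8456e95bc) describes a common bug pattern: a variable assigned in only one branch of a conditional and then used unconditionally. The following is a minimal sketch of that pattern and one plausible fix, not the repository's actual code. The function name `compute_advantage`, its parameters, and the plain `target - value` fallback are assumptions made for illustration; only the names `adv` and `reward_EMA` come from the commit message. Plain floats stand in for tensors.

```python
def compute_advantage(target, value, use_reward_ema, ema_offset=0.0, ema_scale=1.0):
    """Hypothetical helper: return an advantage whether or not EMA normalization is on.

    All names except `adv` are illustrative assumptions, not the repository's API.
    """
    if use_reward_ema:
        # Normalize both terms with the running statistics before subtracting.
        normed_target = (target - ema_offset) / ema_scale
        normed_value = (value - ema_offset) / ema_scale
        adv = normed_target - normed_value
    else:
        # Assumed fallback: without this branch, `adv` would be unbound here,
        # and the return below would raise UnboundLocalError.
        adv = target - value
    return adv


if __name__ == "__main__":
    print(compute_advantage(3.0, 1.0, use_reward_ema=False))
    print(compute_advantage(3.0, 1.0, use_reward_ema=True, ema_offset=1.0, ema_scale=2.0))
```

Assigning `adv` in both branches keeps the later loss computation independent of the configuration flag, which is the behavior the commit message says was restored.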