Fix missing advantage computation when reward_EMA is disabled

This PR fixes a bug in _compute_actor_loss: with reward_EMA=False, adv was never assigned, because it was computed only inside the reward_EMA branch. Disabling the option therefore caused a runtime error as soon as adv was used. The fix adds an else branch that sets adv = target - base, the unnormalized advantage.
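The failure mode can be reproduced outside the model with a minimal, standalone sketch. The function names, the scalar inputs, and the `offset`/`scale` parameters are hypothetical stand-ins and not the code in models.py. The normalized form in the reward_EMA branch is an assumption about what that branch computes; only the else branch is taken from the diff.

```python
def advantage_buggy(target, base, reward_ema, offset=0.0, scale=1.0):
    """Pre-fix control flow: adv is bound only when reward_ema is True."""
    if reward_ema:
        # Assumed normalization; offset and scale are illustrative only.
        adv = (target - offset) / scale - (base - offset) / scale
    return adv  # raises UnboundLocalError when reward_ema is False


def advantage_fixed(target, base, reward_ema, offset=0.0, scale=1.0):
    """Post-fix control flow: the else branch also binds adv."""
    if reward_ema:
        adv = (target - offset) / scale - (base - offset) / scale
    else:
        # Mirrors the two lines added by this PR.
        adv = target - base
    return adv


if __name__ == "__main__":
    try:
        advantage_buggy(3.0, 1.0, reward_ema=False)
    except UnboundLocalError as err:
        print("buggy path failed:", type(err).__name__)
    print("fixed path:", advantage_fixed(3.0, 1.0, reward_ema=False))
```

In Python, reading a local variable that was never assigned on the executed path raises UnboundLocalError, which matches the kind of runtime error described above.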
2026-03-03 16:34:52 +01:00
parent 6253f988fe
commit a8456e95bc
1 changed files with 2 additions and 0 deletions
--- a/models.py
+++ b/models.py
@@ -409,6 +409,8 @@ class ImagBehavior(nn.Module):
             metrics.update(tools.tensorstats(normed_target, "normed_target"))
             metrics["EMA_005"] = to_np(self.ema_vals[0])
             metrics["EMA_095"] = to_np(self.ema_vals[1])
+        else:
+            adv = target - base
         if self._config.imag_gradient == "dynamics":
             actor_target = adv