fix: average reward over agents in compute_system_rewards by Dreamstick9 · Pull Request #7 · SakanaAI/LanguageEvolution

Dreamstick9 · 2026-06-22T15:47:51Z

Fixes: #4

Bug: `compute_system_rewards` returns sum instead of average

Population.compute_system_rewards is supposed to return the average of per-agent morphology statistics, but always returns the raw sum, inflating every value by a factor of N (number of agents).

Root Cause

The averaging loop has two bugs:

- Wrong variable: it divides agent_reward (the local per-agent dict) instead of reward (the accumulated total).
- Wrong scope: it sits inside the for agent loop, so the result is overwritten and discarded on the next iteration.

# before (buggy): core/population.py
reward = defaultdict(float)
for agent in self.agents:
    agent_reward = agent.compute_morphology_statistics()
    for k, v in agent_reward.items():
        reward[k] += v
    # average
    for k in agent_reward.keys():
        agent_reward[k] = agent_reward[k] / len(self.agents)  # wrong variable, wrong scope

Fix

Move the averaging loop outside the for agent loop and apply it to reward:

# after (fixed)
reward = defaultdict(float)
for agent in self.agents:
    agent_reward = agent.compute_morphology_statistics()
    for k, v in agent_reward.items():
        reward[k] += v
# average
for k in reward.keys():
    reward[k] = reward[k] / len(self.agents)
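
The difference between the two versions can be reproduced outside the repository with a small standalone sketch. `StubAgent`, `system_rewards_buggy`, and `system_rewards_fixed` are hypothetical names introduced only for this illustration; only `compute_morphology_statistics` and the loop bodies mirror the snippets above.

```python
from collections import defaultdict


class StubAgent:
    """Hypothetical stand-in for a real agent; returns fixed statistics."""

    def __init__(self, stats):
        self._stats = stats

    def compute_morphology_statistics(self):
        # Return a copy so the caller cannot mutate the stored values.
        return dict(self._stats)


def system_rewards_buggy(agents):
    """Mirrors the 'before' snippet: the division never touches `reward`."""
    reward = defaultdict(float)
    for agent in agents:
        agent_reward = agent.compute_morphology_statistics()
        for k, v in agent_reward.items():
            reward[k] += v
        # Divides the per-agent dict, which is discarded on the next iteration.
        for k in agent_reward.keys():
            agent_reward[k] = agent_reward[k] / len(agents)
    return dict(reward)


def system_rewards_fixed(agents):
    """Mirrors the 'after' snippet: accumulate first, then divide once."""
    reward = defaultdict(float)
    for agent in agents:
        for k, v in agent.compute_morphology_statistics().items():
            reward[k] += v
    for k in reward:
        reward[k] = reward[k] / len(agents)
    return dict(reward)


# Three identical agents: the average should equal a single agent's stats.
agents = [StubAgent({"complexity": 2.0, "transfers": 4.0}) for _ in range(3)]
buggy = system_rewards_buggy(agents)
fixed = system_rewards_fixed(agents)
print(buggy)  # {'complexity': 6.0, 'transfers': 12.0}
print(fixed)  # {'complexity': 2.0, 'transfers': 4.0}
```

With three identical agents the buggy version reports 3x the per-agent value, matching the factor-of-N inflation described in this PR.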

Impact

All keys returned by compute_system_rewards — paradigms, stem_alternate_patterns, phonetic_non_confusability, stem_alternation_entropy, complexity, transfers, and the derived total — are N× too large in any run with more than one agent.

The existing test (test_compute_system_rewards_presence_and_bounds) only checks key presence and that values are finite, so this was not caught automatically.
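
A value-level regression test would catch this class of bug. The following is only a sketch of what such a test might look like: `FixedStatsAgent` and `MiniPopulation` are hypothetical stand-ins, not the repository's classes, and a real test would construct the actual Population with controlled agents instead.

```python
import math
from collections import defaultdict


class FixedStatsAgent:
    """Hypothetical agent that reports predetermined statistics."""

    def __init__(self, stats):
        self._stats = stats

    def compute_morphology_statistics(self):
        return dict(self._stats)


class MiniPopulation:
    """Hypothetical stand-in that applies the fixed averaging logic."""

    def __init__(self, agents):
        self.agents = agents

    def compute_system_rewards(self):
        reward = defaultdict(float)
        for agent in self.agents:
            for k, v in agent.compute_morphology_statistics().items():
                reward[k] += v
        for k in reward:
            reward[k] = reward[k] / len(self.agents)
        return dict(reward)


def test_compute_system_rewards_is_average():
    a = {"paradigms": 1.0, "complexity": 3.0}
    b = {"paradigms": 3.0, "complexity": 5.0}
    pop = MiniPopulation([FixedStatsAgent(a), FixedStatsAgent(b)])
    rewards = pop.compute_system_rewards()
    # Each key must equal the mean over agents, not the sum.
    for k in a:
        assert math.isclose(rewards[k], (a[k] + b[k]) / 2)


test_compute_system_rewards_is_average()
```

Asserting on exact expected means, rather than only on key presence and finiteness, means a sum-versus-average mix-up fails as soon as the population has more than one agent.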

The averaging loop was inside the for-agent loop and modified agent_reward (the local per-agent dict) instead of reward (the accumulated total). This meant reward always held the raw sum of all agents' stats, inflating all values by N (num agents). Fix: move the averaging loop outside the for-agent loop and apply it to reward instead of agent_reward.

Dreamstick9 wants to merge 1 commit into SakanaAI:main from Dreamstick9:fix/compute-system-rewards-averaging
