Intrinsic Motivation — Curiosity & Empowerment
Two complementary drives for self-motivated learning, independent of external reward.
Curiosity (Prediction Error)
The curiosity reward is the normalized prediction error (PE): the creature is rewarded for encountering surprising states. Normalizing by a running mean and variance prevents habituation, so only unusually large errors register as surprising. Boredom detection triggers a forced exploration burst after 200 consecutive steps without novelty.
```
normalized_PE = (PE - running_mean) / sqrt(running_var)

if normalized_PE > 0.1:
    intrinsic_reward = min(PE, 1.0)

if boredom_counter > 200:
    intrinsic_reward = 0.5  # forced exploration
```
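The pseudocode above can be fleshed out into a runnable sketch. The exponential-moving-average update rate (0.01), the variance floor (1e-8), and the 0.1 novelty threshold are assumptions; only the normalization formula, the reward cap, and the 200-step boredom rule come from the text.

```python
import math

class CuriosityDrive:
    """Sketch of the normalized-prediction-error curiosity reward."""

    def __init__(self, boredom_steps=200, max_reward=1.0, threshold=0.1):
        self.running_mean = 0.0
        self.running_var = 1.0
        self.boredom_counter = 0
        self.boredom_steps = boredom_steps
        self.max_reward = max_reward
        self.threshold = threshold  # assumed novelty cutoff from the 0.1 above

    def compute_intrinsic_reward(self, pe: float) -> float:
        # Normalize against running statistics so persistent surprise habituates.
        normalized_pe = (pe - self.running_mean) / math.sqrt(self.running_var + 1e-8)
        # Exponential moving average keeps the statistics fresh (assumed scheme).
        self.running_mean += 0.01 * (pe - self.running_mean)
        self.running_var += 0.01 * ((pe - self.running_mean) ** 2 - self.running_var)
        if normalized_pe > self.threshold:
            self.boredom_counter = 0  # novelty found: reset boredom
            return min(pe, self.max_reward)
        self.boredom_counter += 1
        if self.boredom_counter > self.boredom_steps:
            return 0.5  # forced exploration burst
        return 0.0
```

A surprising first observation earns the capped reward; a long run of unsurprising ones eventually triggers the boredom burst.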
Empowerment (Action→State MI)
Empowerment measures how much the creature’s actions influence the world. High empowerment means “my actions have predictable consequences.” It is approximated here by comparing the variance of resulting states under high versus low action magnitudes.
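One way to realize this approximation is sketched below. Splitting the recorded actions at the median magnitude and scoring the clipped variance gap are assumptions; the text only specifies comparing state variance under high versus low action magnitudes.

```python
import statistics

class EmpowermentDrive:
    """Sketch of the variance-comparison empowerment approximation."""

    def __init__(self):
        self.pairs = []  # (action_magnitude, next_state) history

    def record(self, action_magnitude: float, next_state: float) -> None:
        """Store one action-state pair for later scoring."""
        self.pairs.append((action_magnitude, next_state))

    def compute_reward(self) -> float:
        if len(self.pairs) < 4:
            return 0.0  # not enough history to compare groups
        mags = sorted(m for m, _ in self.pairs)
        median = mags[len(mags) // 2]  # assumed split point
        high = [s for m, s in self.pairs if m >= median]
        low = [s for m, s in self.pairs if m < median]
        if len(high) < 2 or len(low) < 2:
            return 0.0
        # If strong actions spread the resulting states more than weak ones
        # do, the creature's actions are shaping the world: high empowerment.
        gap = statistics.pvariance(high) - statistics.pvariance(low)
        return max(0.0, min(gap, 1.0))  # clip into [0, 1]
```

When high-magnitude actions scatter the next states while low-magnitude actions leave them unchanged, the score saturates at 1.0.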
Combined Reward
```
total = (1 - α) × extrinsic + α × (curiosity + empowerment)
```

α = 0.3 by default (30% intrinsic).
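The blend is a one-liner; the function name here is illustrative, but the formula and the 0.3 default follow the text.

```python
def combined_reward(extrinsic: float, curiosity: float,
                    empowerment: float, alpha: float = 0.3) -> float:
    """Blend extrinsic and intrinsic reward; alpha weights the intrinsic share."""
    return (1 - alpha) * extrinsic + alpha * (curiosity + empowerment)

# With extrinsic=1.0, curiosity=0.5, empowerment=0.2 and the default alpha:
# 0.7 * 1.0 + 0.3 * 0.7, i.e. roughly 0.91
```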
References
- Schmidhuber (1991). Curious model-building control systems. IJCNN
- Oudeyer & Kaplan (2007). Intrinsic motivation systems. Frontiers in Neurorobotics
- Klyubin, Polani & Nehaniv (2005). Empowerment. Adaptive Behavior
API Reference
CuriosityDrive(config: CuriosityConfig)
compute_intrinsic_reward(prediction_error) → float
Normalized PE reward with boredom detection.
get_neuromodulator_signals() → dict
Returns novelty (0–1), boredom (0–1) for NE/DA modulation.
EmpowermentDrive(config: EmpowermentConfig)
record(action, next_state)
Store action-state pair.
compute_reward() → float
Weighted empowerment score.
CuriosityConfig
```
alpha: 0.3
boredom_steps: 200
max_reward: 1.0
```
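A minimal sketch of how these defaults could be carried as a config object; the field types and the use of a dataclass are assumptions, while the names and values mirror the list above.

```python
from dataclasses import dataclass

@dataclass
class CuriosityConfig:
    """Defaults for the curiosity drive (types assumed)."""
    alpha: float = 0.3        # intrinsic weight in the combined reward
    boredom_steps: int = 200  # steps without novelty before forced exploration
    max_reward: float = 1.0   # cap on the curiosity reward
```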