Task Prediction Error — How Loss Aversion Teaches Navigation
The global world model prediction error (~0.004) tells the SNN “you predicted the world correctly” — but correctly predicting that you walk straight past the ball is not the goal. MH-FLOCKE introduces a task-specific prediction error inspired by Cortical Labs’ DishBrain principle and Kahneman’s loss aversion.
The Problem
In early training runs (v034_1773952032, 50k steps), the Go2 consistently approached the ball but walked past it every time. The approach-overshoot-walkaway pattern repeated across all 7 ball episodes. The SNN had no mechanism to stay near the ball — R-STDP reinforced distance reduction but not proximity maintenance.
State-Based Task PE
Instead of step-delta PE (per-step distance change is ~0.001m, indistinguishable from noise), we use absolute state PE:
```python
dist_PE = (ball_dist - ref_dist) / ref_dist
```

- Ball at 1m → PE = −0.67 (very good)
- Ball at 3m → PE = 0.0 (neutral, starting distance)
- Ball at 6m → PE = +1.0 (very bad)
- Ball at 8m → PE = +1.67 (terrible)
This gives strong contrast every single step. The SNN feels: “I’m close to the ball = calm = consolidate” vs “I’m far from ball = chaos = change behavior.”
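As a runnable sanity check of the formula, a minimal sketch (the function name and the 3 m reference distance are inferred from the worked values above, not the project's actual API):

```python
REF_DIST = 3.0  # assumed episode starting distance, from the examples above

def state_task_pe(ball_dist: float, ref_dist: float = REF_DIST) -> float:
    """Absolute-state task PE: negative when closer than the reference
    distance (good), positive when farther (bad)."""
    return (ball_dist - ref_dist) / ref_dist

print(round(state_task_pe(1.0), 2))  # -0.67
print(round(state_task_pe(8.0), 2))  # 1.67
```

Because the PE is a function of absolute distance rather than the per-step delta, its magnitude stays well above the noise floor on every step.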
Asymmetric PE (Issue #79b)
Walking away from the ball generates 2× the PE of approaching. This creates a ratchet effect — biological loss aversion (Kahneman & Tversky 1979):
```python
if ball_dist > prev_ball_dist:  # walking away from the ball
    dist_PE *= 2.0              # losses hurt more than gains help
```
Proximity Brake PE (Issue #79c)
When the creature is within 0.5m of the ball, any departure generates a strong penalty:
```python
if ball_dist < 0.5:                        # inside the brake zone (metres)
    dist_PE = max(dist_PE, 0.0)            # don't reward further approach
    departure = ball_dist - prev_ball_dist
    if departure > 0:                      # leaving the zone
        dist_PE = departure * 10.0         # STRONG penalty for leaving
```
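Combining the base PE with both modifiers, a minimal sketch (function and variable names are assumptions for illustration; the real logic is computed inline in `CognitiveBrain.process()`):

```python
def shaped_task_pe(ball_dist: float, prev_ball_dist: float,
                   ref_dist: float = 3.0) -> float:
    """State PE, then loss-aversion doubling, then the proximity brake."""
    pe = (ball_dist - ref_dist) / ref_dist
    departure = ball_dist - prev_ball_dist   # > 0 when walking away
    if departure > 0:
        pe *= 2.0                            # losses hurt 2x (Issue #79b)
    if ball_dist < 0.5:                      # brake zone (Issue #79c)
        pe = max(pe, 0.0)                    # no reward for closing further
        if departure > 0:
            pe = departure * 10.0            # strong penalty for leaving
    return pe
```

Note the ordering: the brake overrides the doubled state PE inside 0.5 m, so a creature sitting near the ball feels zero PE (calm), while any step backward produces a sharp positive spike.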
DishBrain Vision Boost
When task PE is positive (failing), the last 16 input neurons (8 heading + 8 distance = vision channels) receive extra current proportional to PE. This forces the SNN to process the ball signal — it cannot ignore its "eyes":
```python
if task_PE > 0.05:
    boost = task_PE * 0.5
    snn.V[vision_neurons] += boost  # extra current into the vision inputs
```
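A toy, self-contained version of the boost on plain Python lists (`N_VISION` and the membrane-potential list stand in for the SNN's actual state):

```python
N_VISION = 16  # 8 heading + 8 distance input neurons

def apply_vision_boost(v: list, task_pe: float,
                       threshold: float = 0.05, gain: float = 0.5) -> list:
    """Inject extra current into the last N_VISION neurons when failing."""
    if task_pe > threshold:
        boost = task_pe * gain
        for i in range(len(v) - N_VISION, len(v)):
            v[i] += boost
    return v
```

Only positive (failing) PE triggers the boost; when the creature is near the ball, the vision channels run at their normal drive.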
Results (Before vs After)
| Metric | Before (50k) | After (50k) | After (150k) |
|---|---|---|---|
| Min ball distance | 31.67cm | 24.49cm | 32.37cm |
| Frames < 1.0m | ~100 | 881 (17.6%) | 1,904 (19.0%) |
| Longest streak < 1.0m | ~5-10 | 457 frames | 673 frames |
| Streaks > 100 frames | 0 | 0 | 6 |
| Falls | 0 | 0 | 0 |
| Behavior | approach-overshoot-walkaway | sniff/walk/play | chase/sniff/trot |
References
- Kagan, B.J., et al. (2022). In vitro neurons learn and exhibit sentience when embodied in a simulated game-world. *Neuron*, 110(23). (DishBrain)
- Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. *Econometrica*, 47(2).
- Friston, K. (2010). The free-energy principle: a unified brain theory? *Nature Reviews Neuroscience*, 11.
API Reference
Location: CognitiveBrain.process() in cognitive_brain.py
Task PE is computed inline during the cognitive cycle (step 3, after World Model). Not a separate class.
Key Variables
| Variable | Type | Description |
|---|---|---|
| `_task_prediction_error` | float | Current task PE (−2 to +2) |
| `_prev_ball_dist` | float | Ball distance at previous step |
Input (from extra_sensor_data dict)
| Key | Type | Description |
|---|---|---|
| `ball_distance` | float | Center-to-center distance (meters) |
| `ball_heading` | float | Normalized heading (−1 to +1) |
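A hypothetical example of the dict shape (values invented for illustration; the sign convention for heading is an assumption, as the source only specifies the −1 to +1 range):

```python
# Stand-in for the extra_sensor_data dict passed to CognitiveBrain.process()
extra_sensor_data = {
    "ball_distance": 2.41,   # metres, center-to-center
    "ball_heading": -0.12,   # normalized, within [-1, +1]
}

ball_dist = extra_sensor_data["ball_distance"]
```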
Output
Replaces the global World Model PE as the learning signal for R-STDP. Also boosts vision neuron membrane potential when PE > 0.05.
Integration with R-STDP
```python
# In apply_rstdp():
if abs(prediction_error) > 0.05:
    combined = 0.1 * reward + 0.9 * (-prediction_error)
    # prediction_error here is task_PE, not the world model PE
```
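The combined signal can be sketched as a standalone function (names assumed; only the 0.1/0.9 weighting and the 0.05 dead zone come from the snippet above):

```python
def rstdp_signal(reward: float, task_pe: float, threshold: float = 0.05):
    """Neuromodulatory signal for R-STDP: mostly negated task PE, plus a
    small reward term. Returns None inside the dead zone (no update)."""
    if abs(task_pe) <= threshold:
        return None
    return 0.1 * reward + 0.9 * (-task_pe)
```

Negating the PE means "closer than the starting distance" (negative PE) produces a positive plasticity signal, so synapses active during approach are strengthened.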
CPG Proximity Brake (train_v032.py, line ~1196)
```python
if ball_dist < 1.0:
    proximity_amp_scale = max(0.1, 0.3 + 0.7 * (ball_dist / 1.0))
    # passed to CPG compute() as the amp_scale parameter
```
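The brake's amplitude curve as a standalone function (wrapper name is an assumption; in train_v032.py this is an inline expression):

```python
def cpg_amp_scale(ball_dist: float) -> float:
    """Full stride beyond 1 m; shrinks linearly toward 0.3 at contact so
    the gait slows near the ball instead of overshooting it."""
    if ball_dist >= 1.0:
        return 1.0
    return max(0.1, 0.3 + 0.7 * (ball_dist / 1.0))

print(cpg_amp_scale(0.0))  # 0.3
```

The curve is continuous at 1 m (0.3 + 0.7 = 1.0), so there is no step change in stride amplitude when the creature crosses the brake boundary.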