
Task PE & Ball Interaction

Task Prediction Error — How Loss Aversion Teaches Navigation

The global world model prediction error (~0.004) tells the SNN “you predicted the world correctly” — but correctly predicting that you walk straight past the ball is not the goal. MH-FLOCKE introduces a task-specific prediction error inspired by Cortical Labs’ DishBrain principle and Kahneman’s loss aversion.

The Problem

In early training runs (v034_1773952032, 50k steps), the Go2 consistently approached the ball but walked past it every time. The approach-overshoot-walkaway pattern repeated across all 7 ball episodes. The SNN had no mechanism to stay near the ball — R-STDP reinforced distance reduction but not proximity maintenance.

State-Based Task PE

Instead of a step-delta PE (the per-step distance change is ~0.001 m, indistinguishable from noise), we use an absolute state PE relative to a reference distance ref_dist (the 3 m starting distance):

dist_PE = (ball_dist - ref_dist) / ref_dist

  Ball at 1m → PE = -0.67 (very good)
  Ball at 3m → PE =  0.0  (neutral, starting distance)
  Ball at 6m → PE = +1.0  (very bad)
  Ball at 8m → PE = +1.67 (terrible)

This gives strong contrast every single step. The SNN feels: “I’m close to the ball = calm = consolidate” vs “I’m far from ball = chaos = change behavior.”
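The mapping above can be sketched as a one-liner (ref_dist defaults to 3.0 m, the starting distance quoted in the examples; the function name is illustrative):

```python
def state_task_pe(ball_dist: float, ref_dist: float = 3.0) -> float:
    """Absolute-state task PE: negative = closer than reference (good),
    positive = farther than reference (bad)."""
    return (ball_dist - ref_dist) / ref_dist

# Reproduces the worked examples:
# state_task_pe(1.0) -> -0.67, state_task_pe(3.0) -> 0.0,
# state_task_pe(6.0) -> +1.0,  state_task_pe(8.0) -> +1.67
```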

Asymmetric PE (Issue #79b)

Walking away from the ball generates 2× the PE of approaching. This creates a ratchet effect — biological loss aversion (Kahneman & Tversky 1979):

if ball_dist > prev_ball_dist:  # Walking away
    dist_PE *= 2.0              # Losses hurt more than gains help

Proximity Brake PE (Issue #79c)

When the creature is within 0.5m of the ball, any departure generates a strong penalty:

if ball_dist < 0.5:                       # within 0.5 m of the ball
    dist_PE = max(dist_PE, 0.0)           # don't reward further approach
    departure = ball_dist - prev_ball_dist
    if departure > 0:                     # departing
        dist_PE = departure * 10.0        # STRONG penalty for leaving
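Putting the state PE, the loss-aversion asymmetry, and the proximity brake together, a minimal sketch (the function name, the departure variable, and the ref_dist default are assumptions; distances in meters):

```python
def shaped_task_pe(ball_dist: float, prev_ball_dist: float,
                   ref_dist: float = 3.0) -> float:
    """Combine absolute-state PE (#base), asymmetric loss aversion (#79b),
    and the proximity brake (#79c) into one per-step task PE."""
    dist_pe = (ball_dist - ref_dist) / ref_dist
    departure = ball_dist - prev_ball_dist   # > 0 means walking away
    if departure > 0:                        # Issue #79b: losses hurt 2x
        dist_pe *= 2.0
    if ball_dist < 0.5:                      # Issue #79c: proximity brake
        dist_pe = max(dist_pe, 0.0)          # don't reward further approach
        if departure > 0:
            dist_pe = departure * 10.0       # strong penalty for leaving
    return dist_pe
```

Approaching at 1 m yields the plain state PE (-0.67); departing outside the brake radius doubles it; departing inside 0.5 m switches to the 10x departure penalty.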

DishBrain Vision Boost

When task PE is positive (failing), the last 16 input neurons (8 heading + 8 distance = vision channels) receive extra current proportional to PE. This forces the SNN to process the ball signal — it cannot ignore its "eyes":

if task_PE > 0.05:
    boost = task_PE * 0.5
    snn.V[vision_neurons] += boost
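A minimal sketch of the boost, assuming the membrane potentials live in a NumPy vector whose last 16 entries are the vision channels (the array layout and function name are assumptions):

```python
import numpy as np

VISION = slice(-16, None)  # last 16 inputs: 8 heading + 8 distance

def boost_vision(V: np.ndarray, task_pe: float) -> np.ndarray:
    """Return membrane potentials with extra current injected into the
    vision neurons whenever the task PE signals failure (> 0.05)."""
    if task_pe > 0.05:
        V = V.copy()
        V[VISION] += task_pe * 0.5
    return V
```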

Results (Before vs After)

Metric                   Before (50k)                 After 50k         After 150k
Min ball distance        31.67 cm                     24.49 cm          32.37 cm
Frames < 1.0 m           ~100                         881 (17.6%)       1,904 (19.0%)
Longest streak < 1.0 m   ~5-10                        457 frames        673 frames
Streaks > 100 frames     0                            0                 6
Falls                    0                            0                 0
Behavior                 approach-overshoot-walkaway  sniff/walk/play   chase/sniff/trot

References

  • Kagan, B.J. et al. (2022). In vitro neurons learn and exhibit sentience. Neuron (DishBrain)
  • Kahneman, D. & Tversky, A. (1979). Prospect theory. Econometrica
  • Friston, K. (2010). The free-energy principle. Nature Reviews Neuroscience

API Reference

Location: CognitiveBrain.process() in cognitive_brain.py

Task PE is computed inline during the cognitive cycle (step 3, after World Model). Not a separate class.

Key Variables

_task_prediction_error   float   Current task PE (-2 to +2)
_prev_ball_dist          float   Ball distance at previous step

Input (from extra_sensor_data dict)

ball_distance    float   Center-to-center distance (meters)
ball_heading     float   Normalized heading (-1 to +1)

Output

Replaces the global World Model PE as the learning signal for R-STDP. Also boosts vision neuron membrane potential when PE > 0.05.

Integration with R-STDP

# In apply_rstdp():
if abs(prediction_error) > 0.05:
    combined = 0.1 * reward + 0.9 * (-prediction_error)
    # prediction_error here is the task PE, not the world model PE
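The modulated learning signal can be sketched as follows (function name and the below-threshold return value are assumptions; the task PE is negated so a positive, failing PE suppresses reinforcement):

```python
def rstdp_learning_signal(reward: float, task_pe: float,
                          threshold: float = 0.05):
    """Blend extrinsic reward (10%) with negated task PE (90%) as the
    R-STDP modulator; below threshold, skip the PE-driven update."""
    if abs(task_pe) > threshold:
        return 0.1 * reward + 0.9 * (-task_pe)
    return None  # assumed: no task-PE-driven update below threshold
```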

CPG Proximity Brake (train_v032.py, line ~1196)

if ball_dist < 1.0:
    proximity_amp_scale = max(0.1, 0.3 + 0.7 * (ball_dist / 1.0))
    # Applied to CPG compute() as amp_scale parameter
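As a sketch (function name assumed), the brake shrinks gait amplitude linearly from 1.0 at the 1 m radius down to 0.3 at contact, so the creature slows instead of overshooting:

```python
def proximity_amp_scale(ball_dist: float) -> float:
    """CPG amplitude scale: full gait beyond 1 m, linearly damped inside.
    The max(0.1, ...) floor mirrors the quoted snippet."""
    if ball_dist < 1.0:
        return max(0.1, 0.3 + 0.7 * (ball_dist / 1.0))
    return 1.0
```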