Cerebellar Learning

Marr-Albus-Ito Cerebellar Architecture

The cerebellum is a forward model that predicts the sensory consequences of motor commands. When the prediction is wrong (a “climbing fiber error”), the cerebellar circuit adjusts its weights to improve future predictions. In MH-FLOCKE, this translates to real-time balance corrections during locomotion.

Biological Architecture

The cerebellar circuit follows the Marr-Albus-Ito theory (Marr 1969, Albus 1971, Ito 1984):

Mossy Fibers (sensory input)
  → Granule Cells (expansion, sparse coding)
    → Parallel Fibers → Purkinje Cells (learned weights)
  ← Golgi Cells (inhibitory feedback, enforces sparseness)
  
Climbing Fibers (error signal from inferior olive)
  → Purkinje Cells (triggers LTD at active PF-PkC synapses)
  
Purkinje Cells (inhibitory output)
  → Deep Cerebellar Nuclei (motor correction)
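
The signal flow above can be sketched as a small rate-based toy circuit. This is an illustration only, not the MH-FLOCKE implementation: the function name forward_pass, the random wiring, the top-k threshold standing in for Golgi inhibition, the number of Purkinje cells, and the rectified "tonic drive minus inhibition" DCN are all assumptions made for the example. Only the population sizes (150 mossy fibers, 4,000 granule cells) come from this page.

```python
import numpy as np

def forward_pass(mf, w_mf_grc, w_pf_pkc, active_fraction=0.1, dcn_tonic=0.5):
    """One pass MF -> GrC -> PkC -> DCN through a rate-based toy circuit."""
    # Granule layer: expand the mossy fiber input into a larger population.
    grc_drive = w_mf_grc @ mf
    # Stand-in for Golgi inhibition (assumption): keep only the k most
    # strongly driven granule cells active.
    k = max(1, int(active_fraction * grc_drive.size))
    threshold = np.partition(grc_drive, -k)[-k]
    grc = (grc_drive >= threshold).astype(float)
    # Purkinje layer: weighted average of the active parallel fibers.
    pkc = w_pf_pkc @ grc / k
    # DCN (assumption): tonic drive minus Purkinje inhibition, rectified.
    dcn = np.maximum(0.0, dcn_tonic - pkc)
    return grc, pkc, dcn

rng = np.random.default_rng(0)
n_mf, n_grc, n_pkc = 150, 4000, 12  # n_pkc is a hypothetical value

# Hypothetical wiring: each granule cell samples 4 random mossy fibers.
w_mf_grc = np.zeros((n_grc, n_mf))
for g in range(n_grc):
    w_mf_grc[g, rng.choice(n_mf, size=4, replace=False)] = 1.0

w_pf_pkc = rng.uniform(0.0, 1.0, size=(n_pkc, n_grc))
grc, pkc, dcn = forward_pass(rng.random(n_mf), w_mf_grc, w_pf_pkc)
print(f"active GrC fraction: {grc.mean():.3f}, DCN shape: {dcn.shape}")
```

The point of the sketch is the ordering of the stages: expansion, sparsification, a learned linear readout, and an inhibitory projection onto the output nuclei.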

Mathematical Formulation

The core learning rule is Long-Term Depression (LTD) at the parallel fiber → Purkinje cell synapse:

Δw_PF→PkC = −lr × CF_activity × PF_activity

where:
  CF_activity = climbing fiber magnitude (balance error)
  PF_activity = parallel fiber spikes (GrC output)
  lr = cerebellar learning rate

This is a supervised learning rule: the climbing fiber provides the teaching signal (what went wrong), and the parallel fibers provide the context (what was the motor state when it went wrong).
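
As a minimal NumPy sketch of the rule above, the update for a whole weight matrix is an outer product of the climbing fiber vector and the parallel fiber vector. The function name ltd_update, the array shapes, and the lower clipping bound w_min are assumptions for illustration; the rule as stated on this page does not specify weight bounds.

```python
import numpy as np

def ltd_update(w, pf_activity, cf_activity, lr=0.001, w_min=0.0):
    """Apply dw = -lr * CF_activity * PF_activity to PF->PkC weights.

    w           : (n_pkc, n_grc) weight matrix
    pf_activity : (n_grc,) parallel fiber activity (GrC output)
    cf_activity : (n_pkc,) climbing fiber magnitude per Purkinje cell
    """
    dw = -lr * np.outer(cf_activity, pf_activity)
    # Clipping at w_min is an assumption, not part of the stated rule.
    return np.maximum(w + dw, w_min)

w = np.full((2, 4), 0.5)
pf = np.array([1.0, 0.0, 1.0, 0.0])   # two of four parallel fibers active
cf = np.array([0.2, 0.0])             # only the first PkC receives an error
w_new = ltd_update(w, pf, cf)
print(w_new)
```

Only synapses where both the parallel fiber and the climbing fiber are active change: here the first Purkinje cell's weights for fibers 0 and 2 drop by lr × 0.2, and everything else is untouched.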

Granule Cell Sparseness

The expansion from ~150 mossy fibers to 4,000 granule cells creates a sparse, high-dimensional code. Sparseness is enforced by Golgi cell inhibition:

sparseness = 1 − (mean(GrC_activity)² / mean(GrC_activity²))

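Target: only 5-20% of GrC active at any time. For binary spikes, the ratio mean(GrC_activity)² / mean(GrC_activity²) equals the fraction of active cells, so this target corresponds to an active fraction of 0.05-0.20, i.e. a sparseness of 0.80-0.95 under the formula above. This ensures the PkC weights learn specific motor contexts rather than average responses.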
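
The sparseness formula can be checked numerically with a short sketch. The function name grc_sparseness and the convention of returning 1.0 for a completely silent population are assumptions made here.

```python
import numpy as np

def grc_sparseness(grc_activity):
    """sparseness = 1 - mean(a)^2 / mean(a^2) for a vector of GrC activity."""
    a = np.asarray(grc_activity, dtype=float)
    mean_sq = np.mean(a ** 2)
    if mean_sq == 0.0:
        # Convention chosen for this sketch: a silent population is
        # treated as maximally sparse.
        return 1.0
    return 1.0 - np.mean(a) ** 2 / mean_sq

spikes = np.zeros(4000)
spikes[:400] = 1.0            # 10% of 4,000 granule cells fire
s = grc_sparseness(spikes)
print(round(s, 3))
```

With 10% of the cells firing, mean(a) = 0.1 and mean(a²) = 0.1, so the ratio is 0.1 and the function returns approximately 0.9.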

Balance Error (Climbing Fiber)

The climbing fiber signal is computed from vestibular feedback — the difference between the expected upright orientation and the actual orientation:

CF = |expected_upright − actual_upright|

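where actual_upright is derived from the body orientation quaternion (qx and qy are its x and y components):
  actual_upright = 1 − 2(qx² + qy²)

For a unit quaternion this is the vertical component of the body's up axis: 1 when the body is level and 0 when it is tipped by 90°.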
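
A small sketch of the two formulas above follows. The (qw, qx, qy, qz) component order, the default expected_upright of 1.0, and the function names are assumptions; check the quaternion convention of the simulator in use before reusing this.

```python
import numpy as np

def upright_from_quaternion(q):
    """Return 1 - 2(qx^2 + qy^2) for q = (qw, qx, qy, qz) (assumed order)."""
    # Normalize first so the formula holds even for slightly drifted input.
    qw, qx, qy, qz = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return 1.0 - 2.0 * (qx ** 2 + qy ** 2)

def climbing_fiber_error(q, expected_upright=1.0):
    """CF = |expected_upright - actual_upright|."""
    return abs(expected_upright - upright_from_quaternion(q))

level = (1.0, 0.0, 0.0, 0.0)
# 60 degree roll about the x axis: half-angle is 30 degrees.
rolled = (np.cos(np.pi / 6), np.sin(np.pi / 6), 0.0, 0.0)

cf_level = climbing_fiber_error(level)
cf_rolled = climbing_fiber_error(rolled)
print(cf_level, round(cf_rolled, 3))
```

For a pure roll by an angle θ the uprightness reduces to cos θ, so a 60° roll gives an uprightness of 0.5 and a climbing fiber error of about 0.5.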

DCN Motor Correction

The Deep Cerebellar Nuclei (DCN) receive inhibitory input from Purkinje cells. Their output is the motor correction, which is blended with the CPG baseline using a mixing weight cpg_weight:

motor_output = CPG × cpg_weight + cerebellum_correction × (1 − cpg_weight)
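
The mixing formula above is a convex combination of the two signals, as the following sketch shows. The function name blend_motor_output, the range check on cpg_weight, and the example values are illustrative assumptions.

```python
import numpy as np

def blend_motor_output(cpg, cerebellum_correction, cpg_weight):
    """motor_output = CPG * cpg_weight + correction * (1 - cpg_weight)."""
    if not 0.0 <= cpg_weight <= 1.0:
        raise ValueError("cpg_weight must lie in [0, 1]")
    cpg = np.asarray(cpg, dtype=float)
    correction = np.asarray(cerebellum_correction, dtype=float)
    return cpg * cpg_weight + correction * (1.0 - cpg_weight)

cpg = [0.4, -0.2]             # hypothetical CPG commands for two actuators
correction = [0.02, 0.02]     # hypothetical cerebellar corrections
motor = blend_motor_output(cpg, correction, cpg_weight=0.65)
print(motor)
```

With cpg_weight = 1 the output is the pure CPG pattern; lowering it shifts authority toward the cerebellar correction.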

Results

The cerebellar forward model achieves PF→PkC weight convergence at ~0.55 within 50k steps, with correction magnitudes stabilizing at ~0.02. In the 10-seed ablation study, the full SNN+Cerebellum condition (B1) achieved 45.15 ± 0.67 m versus 12.83 ± 7.78 m for the PPO baseline: a 3.5× improvement in the mean, with a much smaller spread across seeds (±0.67 m versus ±7.78 m).

References

Marr, D. (1969). A theory of cerebellar cortex. Journal of Physiology
Albus, J.S. (1971). A theory of cerebellar function. Mathematical Biosciences
Ito, M. (1984). The Cerebellum and Neural Control. Raven Press
Wolpert, D.M. & Ghahramani, Z. (2000). Computational principles of movement neuroscience. Nature Neuroscience

API Reference

CerebellarLearning(snn, n_actuators, config, device)

Marr-Albus-Ito forward model for real-time motor correction.

set_populations(mf_ids, grc_ids, goc_ids, pkc_ids, dcn_ids)

Call after the SNN builder has defined the populations. Creates the PF→PkC weight matrix (Xavier init), the eligibility traces, and the Purkinje multi-compartment layer.

update(creature, sensor_data: dict) → dict

One learning step: read GrC spikes, adapt Golgi threshold, compute PkC via PF→PkC, compute CF from InferiorOlive, apply LTD/LTP with DA modulation, update DCN. Returns dict with loss, grc_sparseness, dcn_activity.

compute_corrections(snn_controls: list, upright: float) → np.ndarray

DCN → motor corrections. Vestibular gate scales by uprightness. Ramps over snn_ramp_steps. Returns [n_actuators].
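
The gating and ramping described above might look roughly like the sketch below. This is a guess at the general shape, not the actual compute_corrections code: the function name gated_corrections, the explicit step argument, the linear ramp, and the clipping of upright to [0, 1] are all assumptions.

```python
import numpy as np

def gated_corrections(dcn_output, upright, step, snn_ramp_steps=3000):
    """Scale DCN output by a vestibular gate and a warm-up ramp (hypothetical)."""
    # Assumed linear ramp from 0 to 1 over snn_ramp_steps.
    ramp = min(1.0, step / snn_ramp_steps)
    # Assumed gate: corrections shrink as the body tips over.
    gate = float(np.clip(upright, 0.0, 1.0))
    return np.asarray(dcn_output, dtype=float) * gate * ramp

corr = gated_corrections([0.02, -0.02], upright=0.5, step=1500)
print(corr)
```

Under these assumptions, halfway through the ramp and at an uprightness of 0.5, the corrections are scaled to a quarter of the raw DCN output.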

InferiorOlive(n_actuators, config)

Climbing fiber error generator: 7 error channels (roll, pitch, height, yaw, lateral, velocity, vestibular) + forward-model prediction error (PE) + navigation VOR gain.

compute_cf_signal(sensor_data) → np.ndarray[n_purkinje]

Per-PkC climbing fiber activation. Pulsed at ~4 Hz (balance) and ~2 Hz (velocity) to allow LTP recovery between pulses.

get_steering_gain_correction() → float

Cerebellar VOR gain modifier (-0.5 to +0.5). Positive = under-steering.

CerebellarConfig (key parameters)

n_granule:       4000    n_golgi:      200
mf_per_granule:  4       pf_pkc_prob:  0.4
ltd_rate:        0.001   ltp_rate:     0.001
cf_threshold:    0.05    snn_mix_end:  0.35
snn_ramp_steps:  3000    dcn_tonic:    0.5
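
For reference, the key parameters above can be mirrored in a small dataclass. The class name CerebellarConfigSketch and the helper expected_pf_per_pkc are hypothetical and not part of the MH-FLOCKE API; the helper also assumes that pf_pkc_prob is a per-synapse connection probability, which this page does not state explicitly. The default values are copied from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CerebellarConfigSketch:
    """Hypothetical mirror of the key CerebellarConfig parameters."""
    n_granule: int = 4000
    n_golgi: int = 200
    mf_per_granule: int = 4
    pf_pkc_prob: float = 0.4
    ltd_rate: float = 0.001
    ltp_rate: float = 0.001
    cf_threshold: float = 0.05
    snn_mix_end: float = 0.35
    snn_ramp_steps: int = 3000
    dcn_tonic: float = 0.5

    def expected_pf_per_pkc(self) -> float:
        # Assumes pf_pkc_prob is an independent connection probability.
        return self.n_granule * self.pf_pkc_prob

cfg = CerebellarConfigSketch()
print(cfg.expected_pf_per_pkc())
```

Under that reading, each Purkinje cell would receive on the order of 1,600 parallel fiber synapses with the default values.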