A 560-Neuron Spiking Network Steers a Quadruped Toward Light
For the first time, MH-FLOCKE’s robot dog actively navigates toward a light source in simulation — driven by hardwired reflexes and learned neural adaptations. No external reward signal. No reinforcement learning. Just body signals.
What You See in the Video
A Freenove Robot Dog (€100, Raspberry Pi, 12 servos) walks across a flat surface in MuJoCo simulation. A light source (yellow dot on the mini-map) is placed ahead and to the side. The dog detects the light gradient and steers toward it using its VOR (Vestibulo-Ocular Reflex), a hardwired brainstem circuit that turns the body toward a visual target.
The mini-map in the bottom-left corner shows the robot’s trail (green line) and the light waypoint (yellow glow). You can see the dog arcing toward the light rather than walking straight. This arc is not a software limitation: it reflects the physical turning radius of the Freenove’s CPG gait, which produces roughly 12% amplitude asymmetry between left and right legs when steering.
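The steering path described above can be sketched in a few lines. This is a minimal illustration, not the MH-FLOCKE implementation: the function names, the `tanh` saturation, and the `gain` parameter are assumptions, and only the roughly 12% asymmetry limit comes from the text.

```python
import math

MAX_ASYMMETRY = 0.12  # ~12% left/right amplitude difference, from the text


def bearing_to_light(robot_xy, robot_yaw, light_xy):
    """Signed angle (rad) from the robot's heading to the light, in [-pi, pi]."""
    dx = light_xy[0] - robot_xy[0]
    dy = light_xy[1] - robot_xy[1]
    angle = math.atan2(dy, dx) - robot_yaw
    # Wrap the difference so a target slightly to the right is a small negative angle.
    return math.atan2(math.sin(angle), math.cos(angle))


def steering_signal(bearing, gain=1.0):
    """Hypothetical reflex: a saturating steering command in (-1, 1)."""
    return math.tanh(gain * bearing)


def leg_amplitudes(base_amplitude, steer):
    """Split a total asymmetry of MAX_ASYMMETRY * steer across both sides.

    Positive steer (light to the left) shortens the left stride and
    lengthens the right one, which turns the body to the left.
    """
    delta = 0.5 * MAX_ASYMMETRY * steer
    left = base_amplitude * (1.0 - delta)
    right = base_amplitude * (1.0 + delta)
    return left, right


# Light placed ahead and to the left of a robot facing along +x.
bearing = bearing_to_light((0.0, 0.0), 0.0, (2.0, 1.0))
steer = steering_signal(bearing)
left, right = leg_amplitudes(1.0, steer)
print(f"bearing={bearing:.3f} rad, steer={steer:+.2f}, left={left:.3f}, right={right:.3f}")
```

Because the amplitude difference is capped at 12% even for a saturated steering command, the heading can only change slowly, which is why the trail on the mini-map is an arc rather than a sharp turn.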
Hardwired vs. Learned: The Biological Design
MH-FLOCKE follows the same principle as biological motor development. A newborn puppy doesn’t learn to walk from scratch — its spinal cord has CPG circuits that produce rhythmic leg movements from birth. The cerebellum calibrates these movements through experience. The brainstem provides reflexes like the VOR. Learning refines what reflexes provide.
Hardwired components (present from “birth”):
- CPG (Central Pattern Generator): A mathematical oscillator that produces the rhythmic gait. The SNN does not generate the gait pattern; the CPG provides the baseline.
- VOR (Vestibulo-Ocular Reflex): Reflexive steering toward the light target. Hardwired, as in a real animal’s superior colliculus.
- Run-and-Tumble: A bacteria-inspired navigation state machine (Berg & Brown, 1972). It alternates between running straight and turning (tumbling) when the gradient changes.
- Spinal reflexes: Righting reflex, crossed-extensor reflex, terrain compensation.
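To make the Run-and-Tumble idea concrete, here is a minimal sketch of such a state machine. It assumes a single scalar light reading per step and tumbles whenever that reading drops; the class name, the fixed tumble duration, and the random turn direction are illustrative choices, not details taken from the project.

```python
import random

RUN, TUMBLE = "run", "tumble"


class RunAndTumble:
    """Minimal run-and-tumble controller driven by a scalar light intensity."""

    def __init__(self, tumble_steps=5, rng=None):
        self.state = RUN
        self.tumble_steps = tumble_steps  # hypothetical tumble duration
        self.steps_left = 0
        self.turn_direction = 0.0
        self.last_intensity = None
        self.tumble_count = 0
        self.rng = rng or random.Random(0)

    def step(self, intensity):
        """Return a turn command (-1, 0, or +1) for the current light reading."""
        getting_worse = (
            self.last_intensity is not None and intensity < self.last_intensity
        )
        self.last_intensity = intensity

        if self.state == RUN and getting_worse:
            # The gradient turned against us: start a tumble in a random direction.
            self.state = TUMBLE
            self.steps_left = self.tumble_steps
            self.turn_direction = self.rng.choice([-1.0, 1.0])
            self.tumble_count += 1

        if self.state == TUMBLE:
            self.steps_left -= 1
            if self.steps_left <= 0:
                self.state = RUN  # resume running on the next step
            return self.turn_direction

        return 0.0  # running: keep the current heading


controller = RunAndTumble(tumble_steps=2)
readings = [1.0, 1.1, 1.2, 1.1, 1.0, 0.9, 1.0]
commands = [controller.step(r) for r in readings]
print(commands, controller.tumble_count)
```

While the light gets brighter the controller keeps running straight; each drop in brightness during a run triggers one tumble, so the number of tumbles reflects how often the robot lost the gradient.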
Learned through training (emerges from experience):
- SNN weights (R-STDP): Reward-modulated spike-timing-dependent plasticity adapts the connections among the 560 neurons based on intrinsic reward (vestibular comfort, prediction error, curiosity).
- Cerebellar correction (Marr-Albus-Ito): The cerebellum learns forward-model corrections. Correction magnitude grows from 0.0006 to 0.034 over training, the strongest cerebellar signal measured in MH-FLOCKE so far.
- CPG-to-SNN handoff: The CPG starts at 90% control and fades to ~45% as the SNN proves it can maintain stable locomotion. The SNN earns control through competence, not through a timer.
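Two of the learned mechanisms above, the R-STDP weight update and the CPG-to-SNN handoff, can be sketched as follows. This is a simplified textbook-style formulation under stated assumptions: the trace constants, learning rate, weight bounds, and the boolean `stable` flag are hypothetical, and only the 90% and 45% control shares come from the text.

```python
import numpy as np


def rstdp_step(w, elig, pre_trace, post_trace, pre, post, reward,
               lr=0.05, a_plus=1.0, a_minus=1.0, tau=0.8, elig_decay=0.9):
    """One reward-modulated STDP update; w and elig have shape (n_post, n_pre)."""
    # Pre-before-post pairs potentiate, post-before-pre pairs depress.
    # The traces hold earlier spikes only, so same-step spikes do not pair.
    stdp = a_plus * np.outer(post, pre_trace) - a_minus * np.outer(post_trace, pre)
    elig = elig_decay * elig + stdp
    # Update the spike traces for the next step.
    pre_trace = tau * pre_trace + pre
    post_trace = tau * post_trace + post
    # Without reward nothing changes; the sign of the reward sets the direction.
    w = np.clip(w + lr * reward * elig, 0.0, 1.0)
    return w, elig, pre_trace, post_trace


def blend_motor_command(cpg_cmd, snn_cmd, cpg_share):
    """Mix CPG and SNN outputs; cpg_share is the CPG's fraction of control."""
    return cpg_share * np.asarray(cpg_cmd) + (1.0 - cpg_share) * np.asarray(snn_cmd)


def update_cpg_share(cpg_share, stable, step=0.01, floor=0.45, ceiling=0.90):
    """Hand control to the SNN while the gait is stable, take it back otherwise."""
    cpg_share += -step if stable else step
    return float(np.clip(cpg_share, floor, ceiling))


# One synapse: the pre neuron fires at step 0, the post neuron at step 1,
# and the intrinsic reward arrives together with the post spike.
w = np.full((1, 1), 0.5)
elig, pre_t, post_t = np.zeros((1, 1)), np.zeros(1), np.zeros(1)
w, elig, pre_t, post_t = rstdp_step(
    w, elig, pre_t, post_t, np.array([1.0]), np.array([0.0]), reward=0.0)
w, elig, pre_t, post_t = rstdp_step(
    w, elig, pre_t, post_t, np.array([0.0]), np.array([1.0]), reward=1.0)

# Handoff: 100 consecutive stable steps move the CPG share from 0.90 to the floor.
share = 0.90
for _ in range(100):
    share = update_cpg_share(share, stable=True)
print(w[0, 0], share)
```

The handoff here is gated on stability rather than elapsed time: a stumble (`stable=False`) moves control back toward the CPG, which is one simple way to express "earning control through competence".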
The Numbers
- 33,000 steps, 9.7 minutes training time (57 steps per second, sps, on CPU)
- 0 falls, perfect upright streak
- 2 light targets reached (sf:2) through active VOR-guided steering
- VOR signal up to +0.54 — strong, sustained steering toward the light
- 4 Run-and-Tumble events — the navigation state machine triggered naturally
- Cerebellar correction: 0.008 — real Marr-Albus-Ito learning
Why the Dog Arcs Around the Light
You’ll notice the dog doesn’t walk straight to the light — it takes a wide arc. This is not a bug. The Freenove’s CPG produces approximately 12% amplitude asymmetry between left and right legs when steering. This gives the robot a turning radius of roughly 5 meters. The VOR reflex fires correctly, and the CPG responds — but the body can only turn as fast as the legs allow.
Real quadrupeds face the same constraint. A horse can’t make the same tight turns as a cat. The steering intention is there; the biomechanics set the limit.
Performance Breakthrough: 6× Speedup
This session also resolved a critical performance bug. Step time was growing from 20 ms to 800 ms over 100k steps, making long training runs impractical. The root cause: an O(N²) clustering operation in the Synaptogenesis module that processed 5,000 accumulated experience patterns because the buffer was never cleared.
The fix (buffer.clear() after consolidation, plus a smaller max_size) brought step time back to a stable 18 ms across 100k steps. Training speed went from 7 sps to 54 sps, a more than 6× improvement that makes all future development viable.
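The shape of that fix can be illustrated with a small stand-in. This sketch is not the Synaptogenesis module: the class, the `consolidate_every` trigger, and the pair counter are hypothetical, and the pairwise-comparison count merely stands in for the real O(N²) clustering cost.

```python
from collections import deque


class ExperienceBuffer:
    """Bounded pattern buffer that is emptied after each consolidation."""

    def __init__(self, max_size=500, consolidate_every=100):
        self.patterns = deque(maxlen=max_size)  # hard cap on stored patterns
        self.consolidate_every = consolidate_every
        self.pairs_compared = 0

    def add(self, pattern):
        self.patterns.append(pattern)
        if len(self.patterns) >= self.consolidate_every:
            self.consolidate()

    def consolidate(self):
        n = len(self.patterns)
        # Stand-in for the pairwise clustering pass: cost grows as n * (n - 1) / 2.
        self.pairs_compared += n * (n - 1) // 2
        # The fix described above: drop consolidated patterns so n stays small.
        self.patterns.clear()


buf = ExperienceBuffer()
for i in range(1000):
    buf.add(i)
print(len(buf.patterns), buf.pairs_compared)
```

With the buffer cleared, every consolidation works on at most 100 patterns (4,950 pairs). Without the `clear()` call, each pass would rescan everything accumulated so far, and a single pass over 5,000 patterns would already need about 12.5 million comparisons.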
What’s Next
- Hardware phototaxis — The same VOR steering with a real camera (cv2) on the Freenove, following a flashlight on the floor.
- Autonomous loop — Instead of pre-placed waypoints, the dog chooses its own targets based on curiosity, exploration drive, and episodic memory. All the modules exist; they need a conductor.
- Paper 2 — Sim-to-real transfer + phototaxis results for Frontiers in Neurorobotics or CoRL workshop.
MH-FLOCKE is an open-source project by Marc Hesse — independent researcher, Potsdam, Germany. Named after Flocke, my late dog.
Code: github.com/MarcHesse/mhflocke (Apache 2.0)
Paper: aiXiv Preprint