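FCES is an evolutionary optimizer intended as a replacement for AdamW in neural network training. It keeps no per-parameter optimizer state, so the VRAM that AdamW spends on momentum and variance buffers is not needed. This repository provides a native C++ implementation with libtorch integration, aimed at lower per-step overhead than the original Python implementation.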

Features

Zero-State Overhead: No per-parameter momentum/variance buffers (unlike Adam)
Neuro-Evolutionary Search: Population of fuzzy controllers evolved via genetic algorithms
Spectral Sensing: Grokking-aware rank minimization
Born Quantized: Quantization-aware evolutionary training
libtorch Integration: Follows the PyTorch optimizer interface (step(), zero_grad()), with one extra update_fitness() call per step
Python Extension: pip-installable via pybind11 for seamless PyTorch interop
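
To put the zero-state item in numbers: AdamW keeps two buffers per parameter (first- and second-moment estimates), so its optimizer state is twice the size of the parameters at the same precision. The sketch below computes that figure. The function name and the 7-billion-parameter example are illustrative assumptions, not taken from this repository, and the calculation ignores whatever small state the controller population itself needs, which does not scale with the parameter count.

```python
def adamw_state_bytes(num_params: int, bytes_per_element: int = 4) -> int:
    """Bytes AdamW holds for its two per-parameter buffers (exp_avg, exp_avg_sq)."""
    buffers_per_param = 2  # first and second moment estimates
    return buffers_per_param * num_params * bytes_per_element


if __name__ == "__main__":
    n = 7_000_000_000  # hypothetical model size, for illustration only
    gib = adamw_state_bytes(n) / 2**30
    print(f"AdamW fp32 state for {n:,} parameters: {gib:.1f} GiB")
```

An optimizer without per-parameter buffers avoids this allocation entirely; the parameters and gradients themselves still occupy memory.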

Architecture

┌──────────────────────────────────────────────────────┐
│                  FCESOptimizer                        │
│  ┌────────────────┐  ┌─────────────────────────────┐ │
│  │ EvolutionMgr   │  │ Parameter Update (libtorch) │ │
│  │ ┌────────────┐ │  │  • sign(grad)               │ │
│  │ │ Population │ │  │  • trust region clipping     │ │
│  │ │ ┌────────┐ │ │  │  • weight decay              │ │
│  │ │ │Ctrl[0] │ │ │  └─────────────────────────────┘ │
│  │ │ │Ctrl[1] │ │ │  ┌─────────────────────────────┐ │
│  │ │ │  ...   │ │ │  │ SpectralSensor              │ │
│  │ │ │Ctrl[N] │ │ │  │  • rank tracking             │ │
│  │ │ └────────┘ │ │  │  • grokking detection        │ │
│  │ └────────────┘ │  └─────────────────────────────┘ │
│  │ • crossover    │  ┌─────────────────────────────┐ │
│  │ • mutation     │  │ OscillationDetector (FFT)   │ │
│  │ • selection    │  └─────────────────────────────┘ │
│  └────────────────┘                                  │
└──────────────────────────────────────────────────────┘
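
The parameter update box in the diagram lists three ingredients: sign(grad), trust region clipping, and weight decay. A minimal sketch of how such a stateless step could look is shown here in plain Python. The order of operations, the hyperparameter names (`step_size`, `trust_radius`, `weight_decay`), and the use of decoupled decay are assumptions made for illustration; the native code operates on libtorch tensors and may combine these steps differently.

```python
from typing import List


def sign(x: float) -> float:
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def stateless_update(
    params: List[float],
    grads: List[float],
    step_size: float,     # assumed to be chosen by the active controller
    trust_radius: float,  # hypothetical cap on the per-parameter change
    weight_decay: float,  # hypothetical decoupled decay coefficient
) -> List[float]:
    """Apply one sign-based step; nothing is stored between calls."""
    updated = []
    for p, g in zip(params, grads):
        delta = -step_size * sign(g)            # direction from sign(grad) only
        delta -= step_size * weight_decay * p   # decoupled weight decay
        # Trust region clipping: bound the change to [-trust_radius, trust_radius].
        delta = max(-trust_radius, min(trust_radius, delta))
        updated.append(p + delta)
    return updated
```

Because the step depends only on the current parameters and gradients, no momentum or variance buffers need to persist between iterations.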

Building

Prerequisites

C++17 compiler (GCC 9+, Clang 10+, MSVC 2019+)
CMake 3.18+
libtorch (PyTorch C++ distribution)

Build Steps

# Download libtorch (Linux, CUDA 12.1)
wget https://download.pytorch.org/libtorch/cu121/libtorch-cxx11-abi-shared-with-deps-2.3.0%2Bcu121.zip
unzip libtorch-*.zip

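# Configure & Build
# Single-config generators (Makefiles, Ninja) take the build type at configure time;
# --config only affects multi-config generators such as Visual Studio.
mkdir build && cd build
cmake .. -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release -j$(nproc)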

# Run tests
ctest --output-on-failure

Windows (MSVC)

mkdir build; cd build
cmake .. -DCMAKE_PREFIX_PATH="C:/path/to/libtorch" -G "Visual Studio 17 2022"
cmake --build . --config Release
ctest -C Release --output-on-failure

Python Extension

cd python
pip install .

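import torch
from fces_native import FCESOptimizer

model = MyModel()  # placeholder: any torch.nn.Module whose forward pass returns a scalar loss
optimizer = FCESOptimizer(model.parameters(), lr=1.6e-3, population_size=200)

for batch in dataloader:  # placeholder: any iterable of training batches
    loss = model(batch)
    loss.backward()
    optimizer.step()
    optimizer.update_fitness(loss.item())  # report this step's loss to the evolutionary search
    optimizer.zero_grad()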
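
The loop above feeds each step's loss back through `update_fitness()`, which is what the evolutionary search needs in order to rank controllers. As a rough illustration of the selection, crossover, and mutation operators named in the architecture diagram, here is one generation of a generic genetic algorithm. Representing a controller as a flat list of floats, ranking by raw loss, keeping the better half, and the `mutation_scale` value are all assumptions for this sketch; how FCES encodes its fuzzy controllers and converts loss into fitness is not specified here.

```python
import random
from typing import List, Sequence

Genome = List[float]


def evolve(
    population: Sequence[Genome],
    losses: Sequence[float],       # one recorded loss per controller; lower is better
    rng: random.Random,
    mutation_scale: float = 0.05,  # hypothetical Gaussian mutation std-dev
) -> List[Genome]:
    """One generation: truncation selection, uniform crossover, Gaussian mutation.

    Assumes at least two controllers in the population.
    """
    ranked = [g for _, g in sorted(zip(losses, population), key=lambda pair: pair[0])]
    parents = ranked[: max(2, len(ranked) // 2)]   # selection: keep the better half
    children: List[Genome] = [list(parents[0])]    # elitism: best genome survives unchanged
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)
        # Uniform crossover: each gene comes from either parent with equal probability.
        child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
        # Mutation: small Gaussian perturbation on every gene.
        child = [x + rng.gauss(0.0, mutation_scale) for x in child]
        children.append(child)
    return children
```

Passing a seeded `random.Random` keeps a run reproducible, which helps when comparing evolutionary settings across training runs.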

Performance

Expected speedup over the Python implementation:

Component	Python	C++	Speedup
`evolve()` (200 controllers)	~2ms	~20μs	~100x
`decide_update()`	~0.5ms	~5μs	~100x
End-to-end `step()` overhead	~4ms	~0.15ms	~25x
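
The figures in the table are expectations, so it can be worth measuring the overhead on your own hardware. A minimal timing sketch follows; the helper names are hypothetical and not part of this repository. When the timed callable launches CUDA work, it would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before returning, otherwise only the kernel launch is timed.

```python
import statistics
import time
from typing import Callable


def median_overhead_seconds(
    fn: Callable[[], None], repeats: int = 200, warmup: int = 20
) -> float:
    """Median wall-clock time of fn() over `repeats` calls, after `warmup` untimed calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds
```

For example, `speedup(4e-3, 0.15e-3)` gives roughly 27, in line with the ~25x listed for end-to-end `step()` overhead.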

Scientific Background

See FCES SCIENCE.md for the full scientific log documenting the evolution from V1.0 (hardcoded fuzzy rules) through V49.0 (born-quantized training).

License

MIT License — See LICENSE for details.

FCES (Python) — Original Python implementation
