- CMakeLists.txt with libtorch, GoogleTest, GoogleBenchmark, OpenMP, pybind11
- Header files: config, controller, population, fitness, evolution, spectral, oscillation, telemetry, optimizer
- Source implementations: controller (full micro-MLP forward pass, mutation, crossover), fitness (Welford's algorithm), oscillation (DFT), spectral (SVD rank), optimizer (sign-SGD stub)
- Tests: controller, population, fitness, optimizer (Google Test)
- Benchmarks: evolve throughput, optimizer step (Google Benchmark)
- Examples: simple optimization, PyTorch/libtorch integration
- Python extension: pybind11 bindings with setup.py
- README with architecture diagram and build instructions

# FCES-native

**High-performance C++ reimplementation of the Fuzzy Controlled Evolutionary Search (FCES) optimizer.**

FCES is an evolutionary optimizer intended to replace AdamW for neural network training. It keeps no per-parameter optimizer state, so the VRAM that AdamW spends on momentum and variance buffers is saved entirely. This repository provides a native C++ implementation with libtorch integration for maximum performance.

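
To put a number on the memory claim, the sketch below does the arithmetic for the state AdamW keeps: two buffers per parameter (first and second moment estimates). The function name, the fp32 storage assumption, and the 7-billion-parameter example are illustrative and do not come from this repository. The FCES side is taken as zero per-parameter state; the controller population is ignored on the assumption that its size does not grow with the model.

```python
def adamw_state_bytes(num_params: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for AdamW's two per-parameter buffers (exp_avg, exp_avg_sq).

    bytes_per_value=4 assumes the buffers are stored in fp32 (an assumption).
    """
    return 2 * num_params * bytes_per_value


# Hypothetical model size, chosen only to illustrate the scale.
num_params = 7_000_000_000
saved = adamw_state_bytes(num_params)
print(f"AdamW state for {num_params:,} parameters: {saved / 2**30:.1f} GiB")
# prints: AdamW state for 7,000,000,000 parameters: 52.2 GiB
```

An optimizer without per-parameter buffers avoids this allocation, whatever the model size.
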
## Features

- **Zero-State Overhead**: No per-parameter momentum/variance buffers (unlike Adam)
- **Neuro-Evolutionary Search**: Population of fuzzy controllers evolved via genetic algorithms
- **Spectral Sensing**: Grokking-aware rank minimization
- **Born Quantized**: Quantization-aware evolutionary training
- **libtorch Integration**: Drop-in replacement for PyTorch optimizers
- **Python Extension**: pip-installable via pybind11 for seamless PyTorch interop

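
As a rough picture of what "a population of controllers evolved via genetic algorithms" can look like, here is a minimal sketch of one generation: truncation selection, uniform crossover, and Gaussian mutation over flat weight vectors. Everything in it is an assumption for illustration (the `evolve` signature, the elite fraction, the mutation scale, and "higher fitness is better"); the native implementation may use different operators.

```python
import random


def evolve(population, fitness, rng, elite_frac=0.25, mutation_std=0.05):
    """Return the next generation (hypothetical sketch, not the native API).

    population: list of genomes, each a list of floats (controller weights).
    fitness:    one score per genome; higher is assumed to be better.
    """
    # Selection: rank genomes by fitness and keep the top fraction unchanged.
    ranked = [g for _, g in sorted(zip(fitness, population),
                                   key=lambda pair: pair[0], reverse=True)]
    n_elite = max(2, int(len(population) * elite_frac))
    elites = ranked[:n_elite]

    children = []
    while len(elites) + len(children) < len(population):
        a, b = rng.sample(elites, 2)
        # Crossover: take each weight from one of the two parents at random.
        child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
        # Mutation: add small Gaussian noise to every weight.
        child = [w + rng.gauss(0.0, mutation_std) for w in child]
        children.append(child)
    return elites + children


rng = random.Random(0)
population = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
# Toy fitness: genomes closer to the origin score higher.
fitness = [-sum(w * w for w in g) for g in population]
next_gen = evolve(population, fitness, rng)
```

Keeping the elites unmodified means the best controller found so far is never lost between generations.
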
## Architecture

```
┌──────────────────────────────────────────────────────┐
│                    FCESOptimizer                     │
│  ┌────────────────┐  ┌─────────────────────────────┐ │
│  │ EvolutionMgr   │  │ Parameter Update (libtorch) │ │
│  │ ┌────────────┐ │  │ • sign(grad)                │ │
│  │ │ Population │ │  │ • trust region clipping     │ │
│  │ │ ┌────────┐ │ │  │ • weight decay              │ │
│  │ │ │Ctrl[0] │ │ │  └─────────────────────────────┘ │
│  │ │ │Ctrl[1] │ │ │  ┌─────────────────────────────┐ │
│  │ │ │  ...   │ │ │  │ SpectralSensor              │ │
│  │ │ │Ctrl[N] │ │ │  │ • rank tracking             │ │
│  │ │ └────────┘ │ │  │ • grokking detection        │ │
│  │ └────────────┘ │  └─────────────────────────────┘ │
│  │ • crossover    │  ┌─────────────────────────────┐ │
│  │ • mutation     │  │ OscillationDetector (FFT)   │ │
│  │ • selection    │  └─────────────────────────────┘ │
│  └────────────────┘                                  │
└──────────────────────────────────────────────────────┘
```
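
The three bullets in the Parameter Update box (sign of the gradient, trust region clipping, weight decay) can be combined in several ways. The sketch below shows one plausible ordering on plain Python lists. The function name, the `scale` multiplier standing in for a controller's decision, the `trust_radius` parameter, and the decoupled form of weight decay are all assumptions, not the repository's actual update rule.

```python
def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_step(params, grads, lr, scale=1.0, weight_decay=0.0, trust_radius=None):
    """Apply one in-place sign-based update (hypothetical sketch).

    scale:        step multiplier, imagined here as chosen by a controller.
    trust_radius: if given, the per-parameter step is clipped to this magnitude.
    """
    for i, (p, g) in enumerate(zip(params, grads)):
        step = lr * scale * sign(g)
        if trust_radius is not None:
            # Trust region clipping: never move a parameter further than this.
            step = max(-trust_radius, min(trust_radius, step))
        # Decoupled weight decay shrinks the parameter, then the step is applied.
        params[i] = p * (1.0 - lr * weight_decay) - step
    return params


params = sign_step([1.0, -2.0, 0.5], [0.3, -0.7, 0.0], lr=0.1)
```

Because only the sign of each gradient is used, the update needs no running averages, which fits the zero-state design described above.
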
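
The OscillationDetector box is labelled FFT. One way such a detector could work is to look at how much of a recent loss window's spectral energy sits in the high-frequency bins. The sketch below uses a direct O(n²) DFT for brevity (an FFT would produce the same coefficients); the function name, the "upper half of the bins" threshold, and the `eps` cutoff are illustrative assumptions rather than the repository's method.

```python
import cmath
import math


def high_frequency_ratio(signal, eps=1e-12):
    """Fraction of non-DC spectral energy in bins above a quarter of the sample rate.

    signal must be a non-empty sequence of floats (e.g. recent loss values).
    Returns 0.0 when the signal is (numerically) constant.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC component

    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):
        # k-th DFT coefficient of the centered signal.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        energy = abs(coeff) ** 2
        total += energy
        if k > n / 4:
            high += energy
    return high / total if total > eps else 0.0


zigzag = [1.0, 0.0] * 8                 # loss bouncing back and forth every step
ramp = [float(16 - t) for t in range(16)]  # loss decreasing steadily
print(high_frequency_ratio(zigzag) > high_frequency_ratio(ramp))  # prints: True
```

A ratio near 1 indicates step-to-step bouncing, which a controller could treat as a signal to shrink the step size.
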
## Building

### Prerequisites

- C++17 compiler (GCC 9+, Clang 10+, MSVC 2019+)
- CMake 3.18+
- libtorch (PyTorch C++ distribution)

### Build Steps

```bash
# Download libtorch (Linux, CUDA 12.1)
wget https://download.pytorch.org/libtorch/cu121/libtorch-cxx11-abi-shared-with-deps-2.3.0%2Bcu121.zip
unzip libtorch-*.zip

# Configure & Build
# CMAKE_BUILD_TYPE selects Release for single-config generators (Makefiles, Ninja),
# which ignore the --config flag passed to the build step.
mkdir build && cd build
cmake .. -DCMAKE_PREFIX_PATH=/path/to/libtorch -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release -j$(nproc)

# Run tests
ctest --output-on-failure
```

### Windows (MSVC)

```powershell
mkdir build; cd build
cmake .. -DCMAKE_PREFIX_PATH="C:/path/to/libtorch" -G "Visual Studio 17 2022"
cmake --build . --config Release
ctest -C Release --output-on-failure
```

## Python Extension

```bash
cd python
pip install .
```

```python
import torch
from fces_native import FCESOptimizer

# MyModel and dataloader are placeholders for your own model and data pipeline.
model = MyModel()
optimizer = FCESOptimizer(model.parameters(), lr=1.6e-3, population_size=200)

for batch in dataloader:
    loss = model(batch)                    # the model is assumed to return a scalar loss
    loss.backward()                        # compute gradients before calling step()
    optimizer.step()
    optimizer.update_fitness(loss.item())  # report this step's loss to the optimizer
    optimizer.zero_grad()
```

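
The loop above reports one scalar loss per step through `update_fitness`. A constant-memory way to summarize such a stream is Welford's algorithm, which tracks a running mean and variance without storing the history. The class below is a minimal sketch of that algorithm; the class and attribute names are hypothetical, and it is not a description of what `update_fitness` does internally.

```python
class RunningFitness:
    """Running mean and sample variance via Welford's algorithm (illustrative)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the deviation from the old mean and from the updated mean.
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two values have been seen."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningFitness()
for loss_value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(loss_value)
print(stats.mean)  # prints: 5.0
```

Only three numbers are stored per tracked quantity, so this kind of bookkeeping stays independent of both the training length and the model size.
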
## Performance

Expected speedup over the Python implementation:

| Component | Python | C++ | Speedup |
|-----------|--------|-----|---------|
| `evolve()` (200 controllers) | ~2 ms | ~20 μs | ~100x |
| `decide_update()` | ~0.5 ms | ~5 μs | ~100x |
| End-to-end `step()` overhead | ~4 ms | ~0.15 ms | ~25x |

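
The figures in the table are expectations, so it can be useful to measure on your own hardware. The helper below is a small sketch for timing the mean cost of a Python callable and turning two timings into a speedup ratio. Both function names are hypothetical, and the numbers fed to `speedup` are simply the table's values, not new measurements.

```python
import time


def mean_call_seconds(fn, repeats=1000):
    """Average wall-clock seconds per call of fn over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Ratios implied by the table's approximate timings.
evolve_ratio = speedup(2e-3, 20e-6)     # about 100
decide_ratio = speedup(0.5e-3, 5e-6)    # about 100
step_ratio = speedup(4e-3, 0.15e-3)     # about 26.7, listed above as ~25x
```

To time the real optimizer, pass a zero-argument callable (for example a lambda wrapping `optimizer.step()`) to `mean_call_seconds` after a few warm-up iterations.
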
## Scientific Background

See [FCES SCIENCE.md](https://git.zky.de/sven/FCES/src/branch/main/SCIENCE.md) for the full scientific log documenting the evolution from V1.0 (hardcoded fuzzy rules) through V49.0 (born-quantized training).

## License

MIT License. See [LICENSE](LICENSE) for details.

## Related

- [FCES (Python)](https://git.zky.de/sven/FCES): original Python implementation