
Ternary Neural Networks for Efficient Reinforcement Learning

03/15/2025

Investigate whether Ternary Neural Networks, which restrict weights to just {−1, 0, +1}, can make evolutionary algorithms a practical alternative or complement to gradient-based reinforcement learning by drastically shrinking the search space and enabling faster, more efficient convergence.


Motivation

Ternary Neural Networks (TNNs), in which weights and biases take values in {−1, 0, +1}, have gained increasing attention for their potential to reduce computational load and improve energy efficiency while preserving good performance in deep learning applications. Recent developments indicate that such low-precision networks scale to large models: 1-bit LLM research achieves near-parity with standard 16-bit models while requiring significantly less memory and energy [Ma et al., 2024].
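
To make the parameter restriction concrete, the following is a minimal sketch of the absmean ternarization scheme described in the 1-bit LLM work [Ma et al., 2024]: weights are scaled by their mean absolute value, then rounded and clipped to {−1, 0, +1}. The function name and the epsilon constant are illustrative choices, not part of any fixed API.

    import torch

    def ternarize_absmean(w: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
        """Map full-precision weights to {-1, 0, +1} via absmean scaling."""
        gamma = w.abs().mean()                      # per-tensor scale
        return (w / (gamma + eps)).round().clamp(-1, 1)

    # Example: a random weight matrix collapses to three values.
    print(ternarize_absmean(torch.randn(4, 4)))     # entries in {-1., 0., 1.}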

TNNs drastically reduce the search space by restricting parameters to a finite set: a network with n weights has at most 3^n distinct configurations rather than a continuum, which can accelerate the convergence of evolutionary algorithms compared to optimizing traditional full-precision networks. In the context of neuroevolution, where methods like NEAT [Stanley and Miikkulainen, 2002] have shown promise, this reduced complexity allows for more efficient exploration and refinement of network architectures. Work by OpenAI [Salimans et al., 2017] shows that evolution strategies can serve as a scalable alternative to standard RL techniques, suggesting that combining TNNs with such optimization methods may yield faster solutions for complex RL tasks.
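
As a minimal illustration of how the discrete parameter set simplifies the variation operator, the sketch below mutates a ternary genome by resampling a small fraction of its entries from {−1, 0, +1} and keeps the fittest offspring. The population loop and the toy fitness function are hypothetical scaffolding, not an algorithm taken from the cited papers.

    import numpy as np

    rng = np.random.default_rng(0)

    def mutate(genome: np.ndarray, rate: float = 0.05) -> np.ndarray:
        """Resample a fraction `rate` of ternary genes uniformly from {-1, 0, +1}."""
        child = genome.copy()
        mask = rng.random(genome.shape) < rate
        child[mask] = rng.integers(-1, 2, size=mask.sum())
        return child

    def evolve(fitness, n_params, pop_size=64, generations=100):
        """Simple (1, lambda)-style search over ternary genomes.

        `fitness` maps a genome to a scalar score, e.g. the episodic
        reward of a TNN policy.
        """
        parent = rng.integers(-1, 2, size=n_params)
        for _ in range(generations):
            children = [mutate(parent) for _ in range(pop_size)]
            parent = max(children, key=fitness)
        return parent

    # Toy fitness: prefer genomes that match a hidden ternary target.
    target = rng.integers(-1, 2, size=128)
    best = evolve(lambda g: -np.abs(g - target).sum(), n_params=128)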

Goal

The main objective is to design, implement, and evaluate neuroevolution methods for Ternary Neural Networks on reinforcement learning tasks of varying complexity. This includes:

  1. Investigating the use of TNNs as function approximators or policy representations in RL.
  2. Exploring evolutionary algorithms as an alternative or complement to gradient-based optimizers for training TNNs.
  3. Measuring training efficiency, training stability, and task performance to identify conditions under which TNNs excel compared to conventional full-precision networks and to traditional RL methods.

Tasks

  • Literature Review: Survey existing work on ternary and other low-precision neural networks, with special attention to recent breakthroughs in 1-bit LLMs, as well as neuroevolution methods and their use on RL problems.
  • Implementation: Develop a modular framework (e.g., using PyTorch or TensorFlow) that supports ternary network layers and integrates with standard RL toolkits; a minimal ternary layer is sketched after this list.
  • Experimentation:
    • Compare different training procedures (e.g., policy gradient vs. evolutionary search) on RL benchmarks (e.g., Classic Control, MuJoCo, Atari, or other environments); a sketch of wiring an environment rollout into an evolutionary fitness function also follows this list.
    • Investigate trade-offs in speed, sample efficiency, and accuracy/performance.
  • Analysis: Document experimental outcomes, pinpointing conditions (such as network size or task complexity) where TNN-based RL and evolutionary algorithms show the most promise.
  • Thesis Documentation: Present findings and propose guidelines for deploying TNN-based RL algorithms in future research or practical applications.
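
As a starting point for the implementation task above, the following is a minimal sketch of a ternary linear layer in PyTorch. The straight-through estimator used in the backward pass is one common approach for training quantized layers with gradients, not the only option, and the module name TernaryLinear is an illustrative assumption.

    import torch
    import torch.nn as nn

    class TernaryLinear(nn.Module):
        """Linear layer whose weights are ternarized on the fly.

        Full-precision "shadow" weights are kept for the optimizer; the
        straight-through estimator passes gradients through the
        non-differentiable rounding step.
        """

        def __init__(self, in_features: int, out_features: int):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
            self.bias = nn.Parameter(torch.zeros(out_features))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            gamma = self.weight.abs().mean()
            w_t = (self.weight / (gamma + 1e-8)).round().clamp(-1, 1)
            # Straight-through estimator: forward uses w_t, backward sees identity.
            w = self.weight + (w_t - self.weight).detach()
            return nn.functional.linear(x, w, self.bias)

    # Example: a tiny ternary policy head for a 4-dimensional observation.
    policy = nn.Sequential(TernaryLinear(4, 32), nn.ReLU(), TernaryLinear(32, 2))
    logits = policy(torch.randn(1, 4))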
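
For the experimentation task, the sketch below shows how an environment rollout could serve as the fitness function for evolutionary search. The Gymnasium CartPole environment, the linear ternary policy, and the single-episode fitness are simplifying assumptions; in practice one would average over several episodes and seeds.

    import gymnasium as gym
    import numpy as np

    def episode_return(genome: np.ndarray, env_id: str = "CartPole-v1") -> float:
        """Use one episode's return as the fitness of a ternary linear policy."""
        env = gym.make(env_id)
        obs_dim = env.observation_space.shape[0]
        n_actions = env.action_space.n
        W = genome.reshape(n_actions, obs_dim)      # ternary policy weights
        obs, _ = env.reset(seed=0)
        total, done = 0.0, False
        while not done:
            action = int(np.argmax(W @ obs))
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        env.close()
        return total

Plugging episode_return into the evolve sketch above with n_params = 8 searches the 3^8 = 6561 candidate linear CartPole policies directly, which illustrates how sharply the ternary restriction bounds the search space.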

Prerequisites

  • Proficiency in Python and familiarity with a deep learning framework (e.g., PyTorch, TensorFlow).
  • Solid understanding of reinforcement learning concepts (e.g., Q-learning, policy gradients) and basic knowledge of evolutionary algorithms.

Literature

  • Ma, Shuming, et al. "The era of 1-bit LLMs: All large language models are in 1.58 bits." arXiv preprint arXiv:2402.17764 (2024).
  • Salimans, Tim, et al. "Evolution strategies as a scalable alternative to reinforcement learning." arXiv preprint arXiv:1703.03864 (2017).
  • Stanley, Kenneth O., and Risto Miikkulainen. "Evolving neural networks through augmenting topologies." Evolutionary computation 10.2 (2002): 99-127.

Contact Person

Johannes Büttner