https://doi.org/10.1007/s100510050889
Statistical physics and practical training of soft-committee machines
1 Institut für Theoretische Physik, Julius-Maximilians-Universität Würzburg, Am Hubland, 97074 Würzburg, Germany
2 Neural Computing Research Group, Aston University, Aston Triangle, Birmingham B4 7ET, UK
Corresponding author: ahr@physik.uni-wuerzburg.de
Received: 16 December 1998
Published online: 15 August 1999
Equilibrium states of large layered neural networks with differentiable activation function and a single, linear output unit are investigated using the replica formalism. The quenched free energy of a student network with a very large number of hidden units learning a rule of perfectly matching complexity is calculated analytically. The system undergoes a first order phase transition from unspecialized to specialized student configurations at a critical size of the training set. Computer simulations of learning by stochastic gradient descent from a fixed training set demonstrate that the equilibrium results quantitatively describe the plateau states that occur in practical training procedures at sufficiently small but finite learning rates.
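The architecture and training procedure referred to in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the tanh activation, the 1/sqrt(N) scaling of the hidden fields, and all parameter values (N, K, P, eta, epochs) are assumptions chosen for readability; the paper's specific activation function and learning-rate regime may differ.

```python
import numpy as np

# Sketch of a soft-committee machine: K hidden units with a differentiable
# activation g and a single linear output that sums the hidden activations.
# The student learns, by stochastic gradient descent on a fixed training set,
# a rule defined by a teacher of perfectly matching complexity (also K units).

rng = np.random.default_rng(0)

N, K, P = 100, 5, 500            # input dimension, hidden units, training-set size
eta, epochs = 0.05, 100          # learning rate, passes through the fixed set

g = np.tanh                      # illustrative differentiable activation
def g_prime(h):
    return 1.0 - np.tanh(h) ** 2

def output(J, x):
    """Soft-committee output: sum_k g(J_k . x / sqrt(N))."""
    return g(J @ x / np.sqrt(N)).sum()

# Teacher weights define the rule; the training set is fixed once and for all.
B = rng.normal(size=(K, N))
X = rng.normal(size=(P, N))                       # fixed training inputs
y = np.array([output(B, x) for x in X])           # teacher labels

# Student weights, trained by stochastic gradient descent on the quadratic error.
J = rng.normal(size=(K, N)) * 0.1
for epoch in range(epochs):
    for mu in rng.permutation(P):                 # one randomly chosen example per step
        x, t = X[mu], y[mu]
        h = J @ x / np.sqrt(N)                    # student hidden fields
        delta = output(J, x) - t                  # output error
        # gradient of (1/2) * delta^2 with respect to each hidden weight vector
        J -= eta * delta * np.outer(g_prime(h), x) / np.sqrt(N)

train_err = 0.5 * np.mean([(output(J, x) - t) ** 2 for x, t in zip(X, y)])
print(f"training error after {epochs} epochs: {train_err:.4f}")
```

In the scenario studied in the paper, the plateau corresponds to an unspecialized configuration in which the student's hidden units play nearly identical roles before specialization sets in; monitoring the overlaps between student and teacher hidden-unit weight vectors during such a run would make the plateau visible.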
PACS: 05.90.+m – Other topics in statistical physics, thermodynamics and nonlinear dynamical systems / 07.05.Mh – Neural networks, fuzzy logic, artificial intelligence / 87.10.+e – General theory and mathematical aspects
© EDP Sciences, Società Italiana di Fisica, Springer-Verlag, 1999