Broken symmetries in multilayered perceptrons

The statistical mechanics of two-layered perceptrons with N input units, K hidden units, and a single output unit that makes a decision based on a majority rule (Committee Machine) are studied. Two architectures are considered. In the nonoverlapping case the hidden units do not share common inputs. In the fully connected case each hidden unit is connected to the entire input layer. In both cases the network realizes a random dichotomy of P inputs. The statistical properties of the space of solutions as a function of P are studied, using the replica method and numerical simulations, in the regime where N≫K. In the nonoverlapping architecture with continuously varying weights, the capacity αc, defined as the maximal value of P per weight, is calculated under a replica-symmetric (RS) ansatz. At large K, αc diverges as K^(1/2), in contradiction with the rigorous upper bound αc<C lnK, where C is a proportionality constant, derived by Mitchison and Durbin [Biol. Cybern. 60, 345 (1989)]. This suggests a strong replica-symmetry-breaking effect.

The instability of the RS solution is shown to occur at a value of α which remains finite in the large-K limit. A one-step replica-symmetry-breaking (RSB) ansatz is studied for K=3 and in the limit K→∞. The results indicate that αc(K) diverges with K, probably logarithmically. The occurrence of RSB far below the capacity limit is confirmed by comparison of the theoretical results with numerical simulations for K=3. This symmetry breaking implies that, unlike the single-layer perceptron case, the space of solutions of the two-layer perceptron breaks, beyond a critical value of α, into many disjoint subregions. The entropies of the connected subregions are almost degenerate, their relative difference being of order 1/N. In the case of a nonoverlapping Committee Machine with binary, i.e., ±1 weights, αc≤1 is an upper bound for all K. The RS theory predicts αc=0.92 for K=3 and αc=0.95 for the large-K limit.

The theoretical prediction (for K=3) is in excellent agreement with the numerical estimate based on an exhaustive search in the space of solutions for small N. These results indicate that in the binary case there is no RSB in the space of solutions below the maximal capacity. In the fully connected architecture, the solution’s phase space has a global permutation symmetry (PS) reflecting the invariance under permuting the hidden units. The order parameters that signal the spontaneous breaking of this symmetry are defined. The RS theory shows that for small α the PS is maintained. For larger values of α<αc the symmetry is broken, implying the breaking of the solution space into disjoint regions. These regions are related by permutation symmetry, hence they are fully degenerate with respect to their entropies and statistical properties. This prediction has been tested by simulations of the K=3 case, calculating the order parameters by random walks in the space of solutions. They yield good evidence for the existence of a phase with broken permutation symmetry at values of α≥2. Finally, both theory and simulations show that for a typical fully connected network the connections joining the same input to a pair of hidden units are negatively correlated.
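
As an illustration only, the two architectures and the permutation symmetry described in the abstract can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code: the function names, the Gaussian weights, the ±1 inputs, and the sizes K=3, N=12 are all assumptions chosen for the example.

```python
import numpy as np

def committee_output(hidden_fields):
    """Majority vote over the signs of the K hidden-unit fields (K odd)."""
    return int(np.sign(np.sum(np.sign(hidden_fields))))

def fully_connected(weights, x):
    # weights has shape (K, N): every hidden unit sees the entire input layer.
    return committee_output(weights @ x)

def nonoverlapping(weights, x):
    # weights has shape (K, N // K): hidden unit k sees only its own input block.
    K, block = weights.shape
    return committee_output(np.sum(weights * x.reshape(K, block), axis=1))

# Illustrative sizes and random data (assumptions, not values from the paper).
rng = np.random.default_rng(0)
K, N = 3, 12
x = rng.choice([-1.0, 1.0], size=N)
w_full = rng.standard_normal((K, N))
w_tree = rng.standard_normal((K, N // K))

# Permuting the hidden units of the fully connected machine relabels the
# votes but leaves the majority decision unchanged.
perm = rng.permutation(K)
same = fully_connected(w_full, x) == fully_connected(w_full[perm], x)
```

Here `same` is true for any permutation, which is the invariance behind the global permutation symmetry of the fully connected solution space.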

Authors: Barkai E, Hansel D, Sompolinsky H.
Year of publication: 1992
Journal: Phys Rev A. 1992 Mar 15;45(6):4146-4161.


