Recent numerical studies of neural networks have revealed that the solutions typically found by modern machine-learning algorithms lie in large, complex regions of the loss landscape. Within these regions, zero-energy paths can be traced between pairs of distant solutions.
Researchers from Bocconi University, Politecnico di Torino and the Bocconi Institute for Data Science and Analytics recently set out to explore these regions using one of the simplest non-convex neural network models, known as the spherical negative perceptron. Their paper, published in Physical Review Letters, found that the solutions of this model are arranged in a star-shaped geometry.
“Recent research on neural network landscapes has shown that independent stochastic gradient descent (SGD) trajectories often land in the same low-loss basin, and that often no barriers are found along the linear interpolation between them,” Luca Saglietti, co-author of the paper, told Tech Xplore. “We started to wonder whether we could analytically reproduce this phenomenology in a simple neural network model.”
In their paper, Saglietti and colleagues presented an analytical method for computing the energy barriers encountered when linearly interpolating between pairs of solutions. They applied this method to the spherical negative perceptron, a paradigmatic toy model of a neural network.
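To make the quantity concrete, the short Python/NumPy sketch below measures the loss along the straight line between two solutions and reports the height of any barrier above the endpoints. This is only an illustrative numerical check, not the authors' analytical replica computation; loss_fn, w_a and w_b are placeholders for a model's loss function and two independently found solutions.

import numpy as np

def interpolation_barrier(loss_fn, w_a, w_b, steps=51):
    # Evaluate the loss at evenly spaced points on the segment between w_a and w_b.
    ts = np.linspace(0.0, 1.0, steps)
    losses = np.array([loss_fn((1 - t) * w_a + t * w_b) for t in ts])
    # The barrier is the highest loss on the path, measured above the worse endpoint.
    return losses.max() - max(losses[0], losses[-1])

A barrier close to zero along such a path is the signature of the phenomenology, often called linear mode connectivity, that the authors set out to reproduce analytically.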
“This model is particularly interesting because it is both continuous and non-convex,” said Clarissa Lauditi, co-author of the paper. “It involves a set of parameters (called weights) that must be tuned so as to satisfy a training set of input-output associations. In the negative perceptron, these constraints can be relaxed, and in this overparameterized regime the geometry of the solution space becomes surprisingly rich.”
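In concrete terms, the spherical negative perceptron constrains N weights to the sphere |w|^2 = N and asks them to satisfy P random constraints with a negative margin kappa. The snippet below is a hedged sketch of that setup, assuming Gaussian random patterns; the variable names and numbers are illustrative and are not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
N, P, kappa = 1000, 500, -0.5              # number of weights, number of patterns, negative margin

patterns = rng.standard_normal((P, N))     # random input patterns
w = rng.standard_normal(N)
w *= np.sqrt(N) / np.linalg.norm(w)        # enforce the spherical constraint |w|^2 = N

margins = patterns @ w / np.sqrt(N)        # one stability per pattern
violations = int(np.sum(margins < kappa))  # a solution is a w with zero violations
print(f"{violations} of {P} constraints violated by a random w")

Because kappa is negative, each individual constraint is easy to satisfy, yet the intersection of all of them is a non-convex region of the sphere, which is what makes the geometry of the solution space non-trivial.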
The researchers studied this model and its rich solution landscape by characterizing the energy barriers along linear interpolations between pairs of solutions using the so-called replica method, a well-established technique commonly applied in statistical physics.
“Unexpectedly, we found that the solutions are arranged in a star-shaped geometry,” said Enrico Malatesta, co-author of the paper.
“Most of them are located at the tips of the star, but there is a subset of solutions, located in the core of the star, that are connected by a straight line to almost every other solution. We found that this shape actually affects the behavior of training algorithms: common algorithms used in deep learning are biased towards solutions located at the core of the star, and these solutions have desirable properties, such as better robustness and generalization capabilities.”
This recent work offers interesting new insights into the geometry of the solution space of the spherical negative perceptron. The team found that the way solutions are arranged affects the performance of algorithms: training algorithms often preferentially select solutions located in the core of the star-shaped geometry, and these solutions tend to have advantageous characteristics.
“Our future research aims to understand to what extent star-shaped geometry may be a universal property of overparameterized neural networks and weakly constrained optimization problems,” added Gabriele Perugini, co-author of the paper. “The possibility that completely different non-convex optimization problems could develop simple connectivity properties when overparameterized is certainly intriguing.”
More information:
Brandon Livio Annesi et al, Star-shaped space of solutions of the spherical negative perceptron, Physical Review Letters (2023). DOI: 10.1103/PhysRevLett.131.227301.
© 2024 Science X Network