Conference Proceedings
Machine Learning Techniques

Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures

Despite being the workhorse of deep learning, the backpropagation algorithm is no panacea. It enforces sequential layer updates, thus preventing efficient parallelization of the training process. Furthermore, its biological plausibility is being challenged. Alternative schemes have been devised; yet, under the constraint of synaptic asymmetry, none have scaled to modern deep learning tasks and architectures. Here, we challenge this perspective and study the applicability of Direct Feedback Alignment to neural view synthesis, recommender systems, geometric learning, and natural language processing.
Authors: Julien Launay, Iacopo Poli, François Boniface, Florent Krzakala
In: Proceedings of NeurIPS 2020
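For concreteness, here is a minimal NumPy sketch of a DFA update for a small fully-connected network; the layer sizes, ReLU nonlinearity and learning rate are illustrative choices of ours, not taken from the paper:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a 784-256-256-10 fully-connected network.
sizes = [784, 256, 256, 10]
W = [rng.normal(0, np.sqrt(2 / m), (n, m)) for m, n in zip(sizes[:-1], sizes[1:])]
# Fixed random feedback matrices, one per hidden layer: the hallmark of DFA.
B = [rng.normal(0, 1 / np.sqrt(sizes[-1]), (n, sizes[-1])) for n in sizes[1:-1]]

def dfa_step(x, target, lr=1e-3):
    # Forward pass, keeping pre-activations for the ReLU derivatives.
    a1 = W[0] @ x;  h1 = np.maximum(a1, 0)
    a2 = W[1] @ h1; h2 = np.maximum(a2, 0)
    e = W[2] @ h2 - target                 # output error (squared loss)
    # DFA projects the global error straight to each hidden layer through
    # a fixed random matrix, instead of backpropagating through W[2], W[1]:
    d2 = (B[1] @ e) * (a2 > 0)
    d1 = (B[0] @ e) * (a1 > 0)
    W[2] -= lr * np.outer(e, h2)
    W[1] -= lr * np.outer(d2, h1)
    W[0] -= lr * np.outer(d1, x)

Because each hidden layer receives the output error directly through its own fixed random matrix, the updates no longer wait on one another, which is what enables the parallelization the abstract alludes to.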
Machine Learning Techniques

Reservoir Computing meets Recurrent Kernels and Structured Transforms

Reservoir Computing is a class of simple yet efficient Recurrent Neural Networks where internal weights are fixed at random and only a linear output layer is trained. In the large size limit, such random neural networks have a deep connection with kernel methods. Our contributions are threefold: a) We rigorously establish the recurrent kernel limit of Reservoir Computing and prove its convergence. b) We test our models on chaotic time series prediction, a classic but challenging benchmark in Reservoir Computing, and show how the Recurrent Kernel is competitive and computationally efficient when the number of data points remains moderate. c) When the number of samples is too large, we leverage the success of structured Random Features for kernel approximation by introducing Structured Reservoir Computing. The two proposed methods, Recurrent Kernel and Structured Reservoir Computing, turn out to be much faster and more memory-efficient than conventional Reservoir Computing.
Authors: Jonathan Dong, Ruben Ohana, Mushegh Rafayelyan, Florent Krzakala
In: Proceedings of NeurIPS 2020
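To illustrate the kernel limit of point a), the toy check below (our own choice of scalings and sizes) drives random tanh reservoirs of growing size with two fixed input sequences; the normalized inner product of the final states concentrates around a deterministic value, the recurrent kernel:

import numpy as np

rng = np.random.default_rng(0)
u1, u2 = rng.normal(size=(20, 3)), rng.normal(size=(20, 3))  # two input sequences

def final_state(u, Wr, Wi):
    # Standard reservoir recursion with fixed random weights.
    x = np.zeros(Wr.shape[0])
    for ut in u:
        x = np.tanh(Wr @ x + Wi @ ut)
    return x

for N in (100, 500, 2500):
    Wr = rng.normal(0, 0.9 / np.sqrt(N), (N, N))   # internal weights, fixed
    Wi = rng.normal(0, 1.0, (N, 3))                # input weights, fixed
    print(N, final_state(u1, Wr, Wi) @ final_state(u2, Wr, Wi) / N)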
Hardware

Light-in-the-loop: using a photonics co-processor for scalable training of neural networks

As neural networks grow larger, more complex, and more data-hungry, training costs are skyrocketing. Especially when lifelong learning is necessary, such as in recommender systems or self-driving cars, this might soon become unsustainable. In this study, we present the first optical co-processor able to accelerate the training phase of digitally-implemented neural networks.
Authors: Julien Launay, Iacopo Poli, Kilian Müller, Igor Carron, Laurent Daudet, Florent Krzakala, Sylvain Gigan
In: Proceedings of IEEE Hot Chips 2020 (Stanford, USA)
Computer Vision

Kernel computations from large-scale random features obtained by Optical Processing Units

Approximating kernel functions with random features (RFs) has been a successful application of random projections for nonparametric estimation. However, performing random projections presents computational challenges for large-scale problems. Recently, a new type of optical hardware called the Optical Processing Unit (OPU) has been developed for fast and energy-efficient computation of large-scale RFs in the analog domain.
Authors: Ruben Ohana, Jonas Wacker, Jonathan Dong, Sébastien Marmin, Florent Krzakala, Maurizio Filippone, Laurent Daudet
In: Proceedings of ICASSP 2020
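As context, the computation an OPU accelerates can be sketched digitally with classical random Fourier features (Rahimi and Recht); the sizes and kernel bandwidth below are illustrative:

import numpy as np

rng = np.random.default_rng(0)
n, d, D = 5, 10, 2000            # samples, input dim, number of random features
X = rng.normal(size=(n, d))
gamma = 0.1                      # target kernel: exp(-gamma * ||x - y||^2)

# Random Fourier features: the large projection X @ W.T is the step
# the OPU performs optically instead of digitally.
W = rng.normal(0, np.sqrt(2 * gamma), (D, d))
b = rng.uniform(0, 2 * np.pi, D)
Z = np.sqrt(2 / D) * np.cos(X @ W.T + b)

K_exact = np.exp(-gamma * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
print(np.abs(Z @ Z.T - K_exact).max())   # shrinks as D grows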
Signal Processing

Fast Optical System Identification by Numerical Interferometry

We propose a numerical interferometry method for identification of optical multiply-scattering systems when only intensity can be measured. Our method simplifies the calibration of optical transmission matrices from a quadratic to a linear inverse problem by first recovering the phase of the measurements. We show that by carefully designing the probing signals, measurement phase retrieval amounts to a distance geometry problem—a multilateration—in the complex plane.
Authors: Sidharth Gupta, Rémi Gribonval, Laurent Daudet, Ivan Dokmanić
In: Proceedings of ICASSP 2020
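To make the multilateration concrete, here is a toy sketch of its geometric core: given the magnitudes of an unknown complex value and of its offsets from two known anchors (in the paper such anchors arise from carefully designed probing signals), the real and imaginary parts follow from a linear system:

import numpy as np

rng = np.random.default_rng(0)
y = rng.normal() + 1j * rng.normal()          # unknown complex measurement
anchors = np.array([1.0 + 0.0j, 0.0 + 1.0j])  # known offsets (illustrative)

r0 = np.abs(y)                  # measured magnitude of y itself
r = np.abs(y - anchors)         # measured magnitudes of the offset signals

# |y - a|^2 = |y|^2 - 2 Re(conj(a) y) + |a|^2 is linear in (Re y, Im y):
A = np.column_stack([2 * anchors.real, 2 * anchors.imag])
b = r0**2 - r**2 + np.abs(anchors) ** 2
re_y, im_y = np.linalg.solve(A, b)
print(y, re_y + 1j * im_y)      # recovered up to numerical error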
Signal Processing

Don’t take it lightly: Phasing optical random projections with unknown operators

In this paper we tackle the problem of recovering the phase of complex linear measurements when only magnitude information is available and we control the input. We are motivated by the recent development of dedicated optics-based hardware for rapid random projections which leverages the propagation of light in random media.
Authors: Sidharth Gupta, Rémi Gribonval, Laurent Daudet, Ivan Dokmanić
In: Proceedings of NeurIPS 2019
Time Series 

Scaling up Echo-State Networks with multiple light scattering

Echo-State Networks and Reservoir Computing have been studied for more than a decade. They provide a simpler yet powerful alternative to Recurrent Neural Networks: every internal weight is fixed and only the last linear layer is trained. They involve many multiplications by dense random matrices, so very large networks are difficult to obtain: the complexity scales quadratically in both time and memory.
Authors: Jonathan Dong, Sylvain Gigan, Florent Krzakala, Gilles Wainrib
In: Proceedings of the IEEE Statistical Signal Processing Workshop (SSP) 2018
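A minimal digital Echo-State Network (our own toy task and sizes) makes the bottleneck explicit: the dense product Wr @ x inside the loop costs O(N^2) time and memory per step, which is what caps the reachable network size:

import numpy as np

rng = np.random.default_rng(0)
T, N = 2000, 300
u = np.sin(0.2 * np.arange(T)) + 0.1 * rng.normal(size=T)  # toy scalar series

Wr = rng.normal(size=(N, N))
Wr *= 0.9 / np.max(np.abs(np.linalg.eigvals(Wr)))  # spectral radius below 1
Wi = rng.normal(size=N)

X = np.zeros((T, N)); x = np.zeros(N)
for t in range(T):
    x = np.tanh(Wr @ x + Wi * u[t])   # the O(N^2) step that limits scaling
    X[t] = x

# Only the linear readout is trained, here by ridge regression:
Wout = np.linalg.solve(X[:-1].T @ X[:-1] + 1e-6 * np.eye(N), X[:-1].T @ u[1:])
print(np.mean((X[:-1] @ Wout - u[1:]) ** 2))  # one-step-ahead prediction error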
Hardware

Random Projections through multiple optical scattering: Approximating kernels at the speed of light

Random projections have proven extremely useful in many signal processing and machine learning applications. However, they often require either storing a very large random matrix or using a different, structured matrix to reduce the computational and memory costs. Here, we overcome this difficulty by proposing an analog optical device that performs the random projections literally at the speed of light, without having to store any matrix in memory.
Authors: Alaa Saade, Francesco Caltagirone, Igor Carron, Laurent Daudet, Angélique Drémeau, Sylvain Gigan, Florent Krzakala
In: Proceedings of ICASSP 2016
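A digital stand-in clarifies the device model: the scattering medium acts as a fixed complex Gaussian matrix that is never stored, and the camera records squared moduli. The toy check below (our own, valid for this idealized model) also verifies the closed-form kernel these features induce:

import numpy as np

rng = np.random.default_rng(0)
d, D = 100, 20000
x1, x2 = rng.normal(size=d), rng.normal(size=d)

# The medium acts as a fixed complex Gaussian matrix; the camera squares moduli.
W = (rng.normal(size=(D, d)) + 1j * rng.normal(size=(D, d))) / np.sqrt(2)
y1, y2 = np.abs(W @ x1) ** 2, np.abs(W @ x2) ** 2

# For complex Gaussian rows, E[y1 * y2] = ||x1||^2 ||x2||^2 + (x1 . x2)^2,
# so the two printed values agree up to Monte-Carlo error:
print(y1 @ y2 / D)
print((x1 @ x1) * (x2 @ x2) + (x1 @ x2) ** 2)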
Journal Papers
Time Series

NEWMA: a new method for scalable model-free online change-point detection

We consider the problem of detecting abrupt changes in the distribution of a multi-dimensional time series, with limited computing power and memory. In this paper, we propose a new, simple method for model-free online change-point detection that relies only on fast and light recursive statistics, inspired by the classical Exponentially Weighted Moving Average (EWMA) algorithm.
Authors: Nicolas Keriven, Damien Garreau, Iacopo Poli
In: IEEE Transactions on Signal Processing Vol. 68, pp. 3515–3528 (2020)
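The method fits in a few lines: maintain two EWMAs of a random-feature sketch of the stream with different forgetting factors, and monitor their distance. Below is a minimal sketch on a synthetic stream; all parameter choices are ours:

import numpy as np

rng = np.random.default_rng(0)
# Synthetic stream with a mean shift at t = 500:
stream = np.concatenate([rng.normal(0.0, 1, (500, 5)), rng.normal(0.8, 1, (500, 5))])

Wf = rng.normal(size=(64, 5)); bf = rng.uniform(0, 2 * np.pi, 64)
def phi(x):                        # random-feature sketch of the distribution
    return np.cos(Wf @ x + bf)

lam_fast, lam_slow = 0.10, 0.02    # two forgetting factors: the core of NEWMA
z_fast, z_slow, stats = np.zeros(64), np.zeros(64), []
for x in stream:
    f = phi(x)
    z_fast = (1 - lam_fast) * z_fast + lam_fast * f
    z_slow = (1 - lam_slow) * z_slow + lam_slow * f
    stats.append(np.linalg.norm(z_fast - z_slow))  # detection statistic
print(int(np.argmax(stats)))       # spikes shortly after the true change point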
Time Series

Optical Reservoir Computing using multiple light scattering for chaotic systems prediction

Reservoir Computing is a relatively recent computational framework based on a large Recurrent Neural Network with fixed weights. Many physical implementations of Reservoir Computing have been proposed to improve speed and energy efficiency. In this study, we report new advances in Optical Reservoir Computing using multiple light scattering to accelerate the recursive computation of the reservoir states.
Authors: Jonathan Dong, Mushegh Rafayelyan, Florent Krzakala, Sylvain Gigan
In: IEEE Journal of Selected Topics in Quantum Electronics Vol. 26(1) (2020)
Quantum Physics

Machine learning and the physical sciences

Machine learning (ML) encompasses a broad range of algorithms and modeling tools used for a vast array of data processing tasks, and it has entered most scientific disciplines in recent years. This article reviews in a selective way the recent research on the interface between machine learning and the physical sciences.
Authors: Giuseppe Carleo, Ignacio Cirac, Kyle Cranmer, Laurent Daudet, Maria Schuld, Naftali Tishby, Leslie Vogt-Maranto, Lenka Zdeborová
In: Reviews of Modern Physics Vol. 91(4) (2019)
Technical Reports & Preprints
Machine Learning Techniques

The dynamics of learning with feedback alignment

Direct Feedback Alignment (DFA) is emerging as an efficient and biologically plausible alternative to the ubiquitous backpropagation algorithm for training deep neural networks. Despite relying on random feedback weights for the backward pass, DFA successfully trains state-of-the-art models such as Transformers. On the other hand, it notoriously fails to train convolutional networks. An understanding of the inner workings of DFA to explain these diverging results remains elusive. Here, we propose a theory for the success of DFA. We first show that learning in shallow networks proceeds in two steps: an alignment phase, where the model adapts its weights to align the approximate gradient with the true gradient of the loss function, is followed by a memorisation phase, where the model focuses on fitting the data. This two-step process has a degeneracy breaking effect: out of all the low-loss solutions in the landscape, a network trained with DFA naturally converges to the solution which maximises gradient alignment. We also identify a key quantity underlying alignment in deep linear networks: the conditioning of the alignment matrices. The latter enables a detailed understanding of the impact of data structure on alignment and suggests a simple explanation for the well-known failure of DFA to train convolutional neural networks. Numerical experiments on MNIST and CIFAR10 clearly demonstrate degeneracy breaking in deep non-linear networks and show that the align-then-memorize process occurs sequentially from the bottom layers of the network to the top.
Authors: Maria Refinetti, Stéphane d’Ascoli, Ruben Ohana, Sebastian Goldt
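The alignment phase is easy to observe numerically. The sketch below (our own toy teacher-student setup, not the paper's experiments) trains a one-hidden-layer network with DFA and prints the cosine similarity between the DFA update and the true backpropagation gradient for the first layer; in our runs it grows over training:

import numpy as np

rng = np.random.default_rng(0)
n_in, n_h, n_out, n = 20, 100, 5, 200
W1 = rng.normal(0, 1 / np.sqrt(n_in), (n_h, n_in))
W2 = rng.normal(0, 1 / np.sqrt(n_h), (n_out, n_h))
B = rng.normal(0, 1 / np.sqrt(n_out), (n_h, n_out))  # fixed random feedback
X = rng.normal(size=(n, n_in))
Y = X @ rng.normal(size=(n_in, n_out))               # linear teacher

def cosine(a, b):
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b))

lr = 0.05
for step in range(1001):
    H = np.tanh(X @ W1.T)
    E = H @ W2.T - Y                                  # output error
    g_bp = ((E @ W2) * (1 - H**2)).T @ X / n          # true gradient of W1
    g_dfa = ((E @ B.T) * (1 - H**2)).T @ X / n        # DFA's random substitute
    if step % 250 == 0:
        print(step, cosine(g_dfa, g_bp))              # alignment increases
    W1 -= lr * g_dfa
    W2 -= lr * (E.T @ H) / n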
Time Series

Online Change Point Detection in Molecular Dynamics With Optical Random Features

Proteins are made of atoms in constant fluctuation, but they can occasionally undergo large-scale changes. Such transitions are of biological interest, linking the structure of a protein to its function within a cell. Atomic-level simulations, such as Molecular Dynamics (MD), are used to study these events. However, molecular dynamics simulations produce time series with multiple observables, while changes often only affect a few of them.
Authors: Amélie Chatelain, Giuseppe Luca Tommasone, Laurent Daudet, Iacopo Poli
Machine Learning Techniques

Principled Training of Neural Networks with Direct Feedback Alignment

The backpropagation algorithm has long been the canonical training method for neural networks. Modern paradigms are implicitly optimized for it, and numerous guidelines exist to ensure its proper use. Recently, synthetic gradient methods, where the error gradient is only roughly approximated, have garnered interest. These methods not only better reflect how biological brains learn, but also open new computational possibilities, such as updating layers asynchronously.
Authors: Julien Launay, Iacopo Poli, Florent Krzakala
Publications by the LightOn Community
Machine Learning Techniques

Fast Graph Kernel with Optical Random Features

The graphlet kernel is a classical method in graph classification. However, it suffers from a high computation cost due to the isomorphism test it includes. As a generic proxy, and at the cost of losing some information, this test can be efficiently replaced by a user-defined mapping that computes various graph characteristics. In this paper, we propose to leverage kernel random features within the graphlet framework and establish a theoretical link with a mean kernel metric. While this method can still be prohibitively costly with usual random features, we then incorporate optical random features that can be computed in constant time. Experiments show that the resulting algorithm is orders of magnitude faster than the graphlet kernel for the same, or better, accuracy.
Authors: Hashem Ghanem, Nicolas Keriven, Nicolas Tremblay
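A toy version of the pipeline, with one simplification of ours: each sampled graphlet is represented by its flattened adjacency matrix, a user-defined mapping that loses permutation information, exactly the trade-off the abstract mentions:

import numpy as np

rng = np.random.default_rng(0)
k, D = 4, 256
Wf = rng.normal(size=(D, k * k)); bf = rng.uniform(0, 2 * np.pi, D)

def embed(A, n_samples=500):
    # Sample k-node induced subgraphs and average their random features:
    # the mean embedding replaces explicit graphlet isomorphism counting.
    n, Z = A.shape[0], np.zeros(D)
    for _ in range(n_samples):
        idx = rng.choice(n, size=k, replace=False)
        g = A[np.ix_(idx, idx)].ravel()
        Z += np.sqrt(2 / D) * np.cos(Wf @ g + bf)
    return Z / n_samples

def random_graph(n, p):            # Erdos-Renyi adjacency matrix
    A = np.triu((rng.random((n, n)) < p).astype(float), 1)
    return A + A.T

A1, A2 = random_graph(30, 0.2), random_graph(30, 0.6)
print(embed(A1) @ embed(A1), embed(A1) @ embed(A2))  # mean-kernel similarities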
ICHEP 2020 | Prague

Using an Optical Processing Unit for tracking and calorimetry at the LHC

The High Luminosity Large Hadron Collider is expected to have a readout rate 10 times higher than the current one, significantly increasing the required computational load. It is therefore essential to explore new hardware paradigms. In this work we consider the Optical Processing Unit (OPU) from LightOn, which computes random matrix multiplications on large datasets in an analog, fast and economical way, enabling faster machine learning on data of reduced dimension. We consider two case studies.
1) “Event classification”: high energy proton collisions at the Large Hadron Collider have been simulated, each collision being recorded as an image representing the energy flux in the detector. The task is to train a classifier to separate a SUSY signal from the background. The OPU allows fast end-to-end classification without building intermediate objects (like jets). This technique is presented and compared with more classical particle physics approaches.
2) “Tracking”: high energy proton collisions at the LHC yield billions of records, each with typically 100,000 3D points corresponding to the trajectories of 10,000 particles. Using two datasets from previous tracking challenges, we investigate the OPU's potential to solve similar or related problems in high-energy physics, in terms of dimensionality reduction and data representation, and we report preliminary results.
Authors: Aishik Ghosh (Centre National de la Recherche Scientifique (FR)), Laurent Basara (LAL/LRI, Université Paris Saclay), Biswajit Biswas (Centre National de la Recherche Scientifique (FR)), David Rousseau (IJCLab-Orsay)
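A schematic of the “event classification” pipeline, with synthetic data standing in for the simulated collisions and a complex Gaussian projection plus intensity detection standing in for the OPU; all sizes and labels are illustrative:

import numpy as np

rng = np.random.default_rng(0)
d, D = 32 * 32, 500                      # flattened "detector image", projected dim
Xs = 1.2 * rng.random((500, d))          # "signal": slightly higher energy flux
Xb = rng.random((500, d))                # "background"
X = np.vstack([Xs, Xb])
y = np.array([1.0] * 500 + [-1.0] * 500)
perm = rng.permutation(1000); X, y = X[perm], y[perm]

# Digital stand-in for the OPU: random projection + intensity detection,
# reducing each event to D nonlinear features before a linear classifier.
W = (rng.normal(size=(D, d)) + 1j * rng.normal(size=(D, d))) / np.sqrt(2)
Z = np.abs(X @ W.T) ** 2

Ztr, Zte, ytr, yte = Z[:800], Z[800:], y[:800], y[800:]
w = np.linalg.solve(Ztr.T @ Ztr + 1e-3 * np.eye(D), Ztr.T @ ytr)  # ridge readout
print(np.mean(np.sign(Zte @ w) == yte))  # held-out accuracy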