The 8th LightOn AI Meetup rocked! This time, we had Claudio Gallicchio, Assistant Professor at the University of Pisa, talking about Deep Reservoir Computing and Beyond. You can check out the slides for his talk while we prepare the video of the meetup for upload. It will be posted soon on LightOn's YouTube channel: subscribe to get notified, and join our Meetup group to hear about the next events. We meet again on December 3rd to discuss the Hardware Lottery with Sara Hooker!
The presentation started with an introduction to Reservoir Computing: take a neural network with a random recurrent hidden layer, train only the readout, and voilà! You have a reservoir computer 🛢️, also called an Echo State Network (ESN). These models are much faster and more lightweight to train than fully trained recurrent neural networks. How do they work?
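The recipe above fits in a few lines of NumPy. Here is a minimal sketch (our own illustration, not Claudio's code): the input and recurrent matrices are random and stay fixed, and only the linear readout is trained, here with closed-form ridge regression on a toy next-step prediction task. All names and dimensions are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen for illustration
n_inputs, n_reservoir = 1, 100

# Random, UNTRAINED weights: these never change
W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs))
W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))  # keep the spectral radius below 1

def run_reservoir(inputs):
    """Collect reservoir states for an input sequence of shape (T, n_inputs)."""
    x = np.zeros(n_reservoir)
    states = []
    for u in inputs:
        x = np.tanh(W_in @ u + W @ x)
        states.append(x)
    return np.array(states)

# Toy task: predict the next value of a sine wave
t = np.linspace(0, 20 * np.pi, 2000)
u = np.sin(t)[:, None]
X = run_reservoir(u[:-1])   # states, shape (1999, n_reservoir)
y = u[1:]                   # next-step targets

# Train ONLY the readout, via ridge regression (closed form)
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_reservoir), X.T @ y)

pred = X @ W_out
print("train MSE:", np.mean((pred - y) ** 2))
```

Training reduces to one linear solve, which is why ESNs are so cheap compared to backpropagation through time.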
You may remember that not all neural network architectures are created equal: some encode better priors than others. Thanks to such a prior, the state space of these models groups input sequences with a common suffix together, even without training.
Echo State Networks have other advantages besides being lightweight and fast to train. They lend themselves to clean mathematical analysis and to unconventional hardware implementations, for example in photonics ⚡ (a definitely-not-biased collection of papers on this topic 😉: https://arxiv.org/abs/1609.05204, https://arxiv.org/abs/1907.00657, https://arxiv.org/abs/2001.09131).
So what is the next step? Make it deep! When deep learning meets reservoir computing, the recurrent component becomes a stack of multiple reservoirs.
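Concretely, each reservoir in the stack is driven by the states of the one below it, and the readout typically sees the states of all layers. A minimal sketch of this stacking, under our own naming and sizing assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def make_layer(n_in, n_res, scale=0.9):
    """One untrained reservoir layer: fixed random input and recurrent weights."""
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= scale / max(abs(np.linalg.eigvals(W)))
    return W_in, W

def run_deep_esn(inputs, layers):
    """Drive each reservoir with the state sequence of the layer below it."""
    signal = inputs  # shape (T, n_in)
    all_states = []
    for W_in, W in layers:
        x = np.zeros(W.shape[0])
        states = []
        for u in signal:
            x = np.tanh(W_in @ u + W @ x)
            states.append(x)
        signal = np.array(states)   # states become the next layer's input
        all_states.append(signal)
    # the readout usually sees the concatenation of all layers' states
    return np.concatenate(all_states, axis=1)

# Hypothetical setup: 3 stacked reservoirs of 50 units each
layers = [make_layer(1, 50)] + [make_layer(50, 50) for _ in range(2)]
u = np.sin(np.linspace(0, 8 * np.pi, 400))[:, None]
states = run_deep_esn(u, layers)
print(states.shape)  # (400, 150)
```

As in the shallow case, nothing recurrent is trained; only a linear readout on the concatenated states would be fitted.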
Depth allows the network to develop richer dynamics, even without training the recurrent connections, and to capture multiple time scales and frequencies. A guideline 👨🏫👩🏫 for designing Deep ESNs: since each reservoir layer filters out part of the frequency content of its input, stop adding layers when the effect of this filtering becomes negligible.
Finally, Claudio talked about an exciting new development: Reservoir Computing for Graphs 🕸️🧬
If we take a vertex, its embedding at a given layer is the input modulated by a random weight matrix, summed with the contribution of the neighboring vertices' embeddings modulated by a random recurrent weight matrix. The input of each layer is the vertex label for the first layer, and the embedding computed by the previous layer for the following ones.
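The update just described can be sketched as a fixed-point iteration over all vertices at once. In the toy example below (our own illustration; matrix names, sizes, and the number of iterations are all assumptions), the embedding of each vertex combines its input through a random matrix with the sum of its neighbors' embeddings through a random recurrent matrix, and layers are stacked by feeding one layer's embeddings to the next:

```python
import numpy as np

rng = np.random.default_rng(3)

def graph_reservoir_layer(features, adjacency, n_res, scale=0.3, iters=20):
    """One untrained graph reservoir layer: iterate the random update to stability.

    Each vertex embedding = tanh(input through W_in + sum of neighbor
    embeddings through W_hat), with W_in and W_hat random and fixed.
    """
    n_vertices, n_in = features.shape
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W_hat = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W_hat *= scale / max(abs(np.linalg.eigvals(W_hat)))  # keep dynamics stable
    X = np.zeros((n_vertices, n_res))
    for _ in range(iters):
        X = np.tanh(features @ W_in.T + adjacency @ X @ W_hat.T)
    return X

# Toy graph: a 4-cycle with 2-dimensional vertex labels
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
labels = rng.uniform(-1, 1, (4, 2))

# Stack layers: labels feed the first layer, embeddings feed the next
X1 = graph_reservoir_layer(labels, A, n_res=16)
X2 = graph_reservoir_layer(X1, A, n_res=16)
print(X2.shape)  # (4, 16)
```

A graph-level embedding can then be obtained by pooling the vertex embeddings, with only the readout on top being trained.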
Once again, deep graph neural network architectures with stable dynamics can inherently build rich neural embeddings for graphs, even without training.
The cherry on the cake: Claudio's code for Deep Reservoir Computing is available on GitHub.
At LightOn we compute dense random layers literally at the speed of light. If you want to try out your latest idea, you can register for the LightOn Cloud or apply to the LightOn Cloud for Research Program!