The deep neural network models that power today’s most demanding machine-learning applications have grown so large and complex that they’re pushing the limits of traditional electronic computing hardware.
Photonic hardware, which can perform machine-learning computations with light, offers a faster and more energy-efficient alternative. However, there are some types of neural network computations that a photonic device can’t perform, requiring the use of off-chip electronics or other techniques that hamper speed and efficiency.
Building on a decade of research, scientists from MIT and elsewhere have developed a new photonic chip that overcomes these roadblocks. They demonstrated a fully integrated photonic processor that can perform all the key computations of a deep neural network optically on the chip.
The optical device was able to complete the key computations for a machine-learning classification task in less than half a nanosecond while achieving more than 92 percent accuracy — performance that is on par with traditional hardware.
The chip, composed of interconnected modules that form an optical neural network, is fabricated using commercial foundry processes, which could enable the scaling of the technology and its integration into electronics.
In the long run, the photonic processor could lead to faster and more energy-efficient deep learning for computationally demanding applications like lidar, scientific research in astronomy and particle physics, or high-speed telecommunications.
“There are a lot of cases where how well the model performs isn’t the only thing that matters, but also how fast you can get an answer. Now that we have an end-to-end system that can run a neural network in optics, at a nanosecond time scale, we can start thinking at a higher level about applications and algorithms,” says Saumil Bandyopadhyay ’17, MEng ’18, PhD ’23, a visiting scientist in the Quantum Photonics and AI Group within the Research Laboratory of Electronics (RLE) and a postdoc at NTT Research, Inc., who is the lead author of a paper on the new chip.
Bandyopadhyay is joined on the paper by Alexander Sludds ’18, MEng ’19, PhD ’23; Nicholas Harris PhD ’17; Darius Bunandar PhD ’19; Stefan Krastanov, a former RLE research scientist who is now an assistant professor at the University of Massachusetts at Amherst; Ryan Hamerly, a visiting scientist at RLE and senior scientist at NTT Research; Matthew Streshinsky, a former silicon photonics lead at Nokia who is now co-founder and CEO of Enosemi; Michael Hochberg, president of Periplous, LLC; and Dirk Englund, a professor in the Department of Electrical Engineering and Computer Science, principal investigator of the Quantum Photonics and Artificial Intelligence Group and of RLE, and senior author of the paper. The research appears today in Nature Photonics.
Machine learning with light
Deep neural networks are composed of many interconnected layers of nodes, or neurons, that operate on input data to produce an output. One key operation in a deep neural network involves using linear algebra to perform matrix multiplication, which transforms data as it is passed from layer to layer.
But in addition to these linear operations, deep neural networks perform nonlinear operations that help the model learn more intricate patterns. Nonlinear operations, like activation functions, give deep neural networks the power to solve complex problems.
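As a rough illustration (this is not the researchers’ code, and the values are made up), a single network layer in NumPy is just a matrix multiplication followed by a nonlinear activation function:

```python
import numpy as np

def layer(x, W, b):
    """One deep-network layer: a linear transform (matrix multiplication)
    followed by a nonlinear activation function (here, ReLU)."""
    z = W @ x + b            # linear operation: matrix multiplication
    return np.maximum(z, 0)  # nonlinear operation: activation function

rng = np.random.default_rng(0)
x = rng.normal(size=4)       # input data
W = rng.normal(size=(3, 4))  # learned weights (illustrative values)
b = np.zeros(3)
print(layer(x, W, b))        # output passed on to the next layer
```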
In 2017, Englund’s group, together with researchers in the lab of Marin Soljačić, the Cecil and Ida Green Professor of Physics, demonstrated an optical neural network on a single photonic chip that could perform matrix multiplication with light.
But at the time, the device couldn’t perform nonlinear operations on the chip. Optical data had to be converted into electrical signals and sent to a digital processor to perform nonlinear operations.
“Nonlinearity in optics is quite difficult because photons don’t interact with each other very easily. That makes it very power-consuming to trigger optical nonlinearities, so it becomes difficult to build a system that can do it in a scalable way,” Bandyopadhyay explains.
They overcame that challenge by designing devices called nonlinear optical function units (NOFUs), which combine electronics and optics to implement nonlinear operations on the chip.
The researchers built an optical deep neural network on a photonic chip using three layers of devices that perform linear and nonlinear operations.
A fully integrated network
At the outset, their system encodes the parameters of a deep neural network into light. Then, an array of programmable beamsplitters, which was demonstrated in the 2017 paper, performs matrix multiplication on those inputs.
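A hedged sketch of how such a mesh works, with illustrative parameters rather than the device’s actual calibration: each programmable cell is a Mach-Zehnder interferometer that applies a 2x2 unitary matrix to a pair of optical amplitudes, and cascading many such cells builds up larger matrix multiplications.

```python
import numpy as np

def mzi(theta, phi):
    """2x2 transfer matrix of one programmable Mach-Zehnder interferometer:
    two 50:50 beamsplitters around a tunable internal phase shift (theta),
    plus an external phase shift (phi). Meshes of these cells can be
    programmed to implement larger matrix multiplications on light."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)  # 50:50 beamsplitter
    internal = np.diag([np.exp(1j * theta), 1.0])   # tunable phase shifter
    external = np.diag([np.exp(1j * phi), 1.0])     # tunable phase shifter
    return external @ bs @ internal @ bs

U = mzi(0.7, 1.2)                              # illustrative phase settings
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True: the transform is unitary
```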
The data then pass to programmable NOFUs, which implement nonlinear functions by siphoning off a small amount of light to photodiodes that convert optical signals to electric current. This process, which eliminates the need for an external amplifier, consumes very little energy.
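The following toy model (with hypothetical `tap` and `gain` parameters, not measured device values) sketches that idea: tap off a small fraction of the optical power, read it with a photodiode, and let the resulting electrical signal modulate the light that remains, so the output becomes a nonlinear function of the input.

```python
import numpy as np

def nofu(field, tap=0.1, gain=4.0):
    """Toy model of a nonlinear optical function unit (NOFU): a small
    fraction of the light is tapped to a photodiode, and the photocurrent
    modulates the transmission of the remaining light. Parameters are
    illustrative, not the actual device response."""
    tapped_power = tap * np.abs(field) ** 2           # power seen by the photodiode
    transmission = 1.0 / (1.0 + gain * tapped_power)  # current-controlled attenuation
    return np.sqrt(1.0 - tap) * field * transmission  # remaining light, modulated

amplitudes = np.linspace(0.0, 2.0, 5)
print(nofu(amplitudes))  # output is a nonlinear function of input amplitude
```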
“We stay in the optical domain the whole time, until the end when we want to read out the answer. This enables us to achieve ultra-low latency,” Bandyopadhyay says.
Achieving such low latency enabled them to efficiently train a deep neural network on the chip, a process known as in situ training that typically consumes a huge amount of energy in digital hardware.
“This is especially useful for systems where you are doing in-domain processing of optical signals, like navigation or telecommunications, but also in systems that you want to learn in real time,” he says.
The photonic system achieved more than 96 percent accuracy during training tests and more than 92 percent accuracy during inference, which is comparable to traditional hardware. In addition, the chip performs key computations in less than half a nanosecond.
“This work demonstrates that computing — at its essence, the mapping of inputs to outputs — can be compiled onto new architectures of linear and nonlinear physics that enable a fundamentally different scaling law of computation versus effort needed,” says Englund.
The entire circuit was fabricated using the same infrastructure and foundry processes that produce CMOS computer chips. This could enable the chip to be manufactured at scale, using tried-and-true techniques that introduce very little error into the fabrication process.
Scaling up their device and integrating it with real-world electronics like cameras or telecommunications systems will be a major focus of future work, Bandyopadhyay says. In addition, the researchers want to explore algorithms that can leverage the advantages of optics to train systems faster and with better energy efficiency.
This research was funded, in part, by the U.S. National Science Foundation, the U.S. Air Force Office of Scientific Research, and NTT Research.