Breaking the scaling limits of analog computing


As machine-learning models become larger and more complex, they require faster and more energy-efficient hardware to perform computations. Conventional digital computers are struggling to keep up.

An analog optical neural network could perform the same tasks as a digital one, such as image classification or speech recognition, but because computations are performed using light instead of electrical signals, optical neural networks can run many times faster while consuming less energy.

However, these analog devices are prone to hardware errors that can make computations less precise. Microscopic imperfections in hardware components are one cause of these errors. In an optical neural network that has many connected components, errors can quickly accumulate.

Even with error-correction techniques, some amount of error is unavoidable because of fundamental properties of the devices that make up an optical neural network. A network that is large enough to be implemented in the real world would be far too imprecise to be effective.

MIT researchers have overcome this hurdle and found a way to effectively scale an optical neural network. By adding a tiny hardware component to the optical switches that form the network’s architecture, they can reduce even the uncorrectable errors that would otherwise accumulate in the device.

Their work could enable a super-fast, energy-efficient, analog neural network that can function with the same accuracy as a digital one. With this technique, as an optical circuit becomes larger, the amount of error in its computations actually decreases.

“This is remarkable, as it runs counter to the intuition of analog systems, where larger circuits are supposed to have higher errors, so that errors set a limit on scalability. This present paper allows us to address the scalability question of these systems with an unambiguous ‘yes,’” says lead author Ryan Hamerly, a visiting scientist in the MIT Research Laboratory for Electronics (RLE) and Quantum Photonics Laboratory and senior scientist at NTT Research.

Hamerly’s co-authors are graduate student Saumil Bandyopadhyay and senior author Dirk Englund, an associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS), leader of the Quantum Photonics Laboratory, and member of the RLE. The research is published today in .

Multiplying with light

An optical neural network consists of many connected components that function like reprogrammable, tunable mirrors. These tunable mirrors are called Mach-Zehnder interferometers (MZIs). Neural network data are encoded into light, which is fired into the optical neural network from a laser.

A typical MZI contains two mirrors and two beam splitters. Light enters the top of an MZI, where it is split into two parts that interfere with each other before being recombined by the second beam splitter and then reflected out the bottom to the next MZI in the array. Researchers can leverage the interference of these optical signals to perform complex linear algebra operations, known as matrix multiplication, which is how neural networks process data.
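To make the link between interference and matrix multiplication concrete, here is a minimal sketch in plain Python. It assumes a common textbook model that the article does not spell out: each beam splitter is an ideal, lossless 50:50 splitter, and the MZI is tuned with two phase shifters, an external one (phi) and an internal one (theta). The function names are illustrative only.

```python
import cmath
import math

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def beam_splitter(eta=math.pi / 4):
    """Lossless splitter; eta = pi/4 is an ideal 50:50 split."""
    c, s = math.cos(eta), math.sin(eta)
    return [[c, 1j * s], [1j * s, c]]

def phase_shifter(angle):
    """Phase delay applied to the top arm only (assumed tuning mechanism)."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]

def mzi(theta, phi):
    """External phase phi, splitter, internal phase theta, splitter."""
    u = phase_shifter(phi)
    for stage in (beam_splitter(), phase_shifter(theta), beam_splitter()):
        u = matmul2(stage, u)
    return u

def apply(u, vec):
    """Matrix-vector product: what the light 'computes' in one MZI."""
    return [u[0][0] * vec[0] + u[0][1] * vec[1],
            u[1][0] * vec[0] + u[1][1] * vec[1]]

# Light enters the top port only; theta = 0 routes it to the bottom port.
out = apply(mzi(theta=0.0, phi=0.0), [1.0, 0.0])
powers = [abs(x) ** 2 for x in out]
print(powers)
```

In this model, each MZI multiplies the pair of incoming light amplitudes by a tunable 2x2 matrix, and an array of MZIs composes many such small matrices into a larger one.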

But errors that can occur in each MZI quickly accumulate as light moves from one device to the next. One can avoid some errors by identifying them in advance and tuning the MZIs so earlier errors are cancelled out by later devices in the array.

“It is a very simple algorithm if you know what the errors are. But these errors are notoriously difficult to determine because you only have access to the inputs and outputs of your chip,” says Hamerly. “This motivated us to look at whether it is possible to create calibration-free error correction.”

Hamerly and his collaborators previously demonstrated a mathematical technique that went a step further. They could successfully infer the errors and tune the MZIs accordingly, but even this didn’t remove all the error.

Due to the fundamental nature of an MZI, there are instances where it is impossible to tune a device so all light flows out the bottom port to the next MZI. If the device loses a fraction of light at each step and the array is very large, by the end there will only be a tiny bit of power left.
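The compounding effect is simple arithmetic. As a rough illustration, the sketch below assumes a hypothetical 1 percent of light misrouted at every stage (a figure chosen for the example, not taken from the research) and computes how much power survives a chain of stages.

```python
def remaining_power(loss_per_stage, stages):
    """Fraction of input power left after a chain of identical lossy stages."""
    return (1.0 - loss_per_stage) ** stages

# Hypothetical 1 percent loss per stage, for illustration only.
for stages in (10, 100, 1000):
    print(stages, remaining_power(0.01, stages))
```

Under that assumption, roughly 90 percent of the power survives 10 stages, about 37 percent survives 100 stages, and less than 0.01 percent survives 1,000 stages.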

“Even with error correction, there is a fundamental limit to how good a chip can be. MZIs are physically unable to realize certain settings they need to be configured to,” he says.
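A small numerical sketch can show where such a limit comes from. It assumes a simplified model that is not taken from the article: each beam splitter deviates from a perfect 50:50 split by a small angle (alpha for the first, beta for the second), and a single internal phase theta is the only tuning knob. The error values are hypothetical.

```python
import cmath
import math

def cross_power(theta, alpha, beta):
    """Fraction of top-port input that exits the bottom port."""
    eta1 = math.pi / 4 + alpha  # first splitter, with error alpha
    eta2 = math.pi / 4 + beta   # second splitter, with error beta
    # Two paths reach the bottom port: via the top arm (which picks up
    # the phase theta) and via the bottom arm. Their amplitudes add.
    via_top = 1j * math.sin(eta2) * cmath.exp(1j * theta) * math.cos(eta1)
    via_bottom = math.cos(eta2) * 1j * math.sin(eta1)
    return abs(via_top + via_bottom) ** 2

def best_cross_power(alpha, beta, steps=3600):
    """Sweep the phase and return the most light the bottom port can get."""
    return max(cross_power(2 * math.pi * k / steps, alpha, beta)
               for k in range(steps))

print(best_cross_power(0.0, 0.0))    # perfect splitters
print(best_cross_power(0.05, 0.05))  # hypothetical splitter errors
```

In this model the best achievable bottom-port power works out to cos²(alpha + beta), so with perfect splitters all the light can be routed to the bottom port, while with the example errors about 1 percent is left in the wrong port no matter how the phase is tuned.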

So, the team developed a new type of MZI. The researchers added an additional beam splitter to the end of the device, calling it a 3-MZI because it has three beam splitters instead of two. Due to the way this additional beam splitter mixes the light, it becomes much easier for an MZI to reach the setting it needs to send all light out through its bottom port.
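The sketch below illustrates the idea numerically under loudly stated assumptions. The article says only that an extra beam splitter is added at the end; the layout used here (a second tunable phase shifter, phi, sitting between the second and third splitters) is an assumption made for this example, as are the splitter error values and all function names. It considers light entering the top port only and is not the authors' model.

```python
import cmath
import math

def splitter(state, eta):
    """Mix the two arms; eta = pi/4 is an ideal 50:50 splitter."""
    top, bottom = state
    c, s = math.cos(eta), math.sin(eta)
    return (c * top + 1j * s * bottom, 1j * s * top + c * bottom)

def phase(state, angle):
    """Delay the top arm by the given phase."""
    top, bottom = state
    return (cmath.exp(1j * angle) * top, bottom)

def bottom_power(theta, phi, errors):
    """Bottom-port power for top-port input; errors has 2 or 3 entries."""
    etas = [math.pi / 4 + e for e in errors]
    state = splitter((1.0, 0.0), etas[0])
    state = splitter(phase(state, theta), etas[1])
    if len(etas) == 3:
        # Assumed 3-MZI layout: second phase shifter, then extra splitter.
        state = splitter(phase(state, phi), etas[2])
    return abs(state[1]) ** 2

def best_bottom_power(errors, steps=180):
    """Grid-search the phase settings for the most bottom-port light."""
    angles = [2 * math.pi * k / steps for k in range(steps)]
    phis = angles if len(errors) == 3 else [0.0]
    return max(bottom_power(t, p, errors) for t in angles for p in phis)

two = best_bottom_power([0.05, 0.05])          # standard MZI
three = best_bottom_power([0.05, 0.05, 0.05])  # assumed 3-MZI layout
print(f"2 splitters: {two:.4f}, 3 splitters: {three:.4f}")
```

With the same hypothetical splitter errors, the two-splitter device tops out at about 99 percent of the light in the bottom port, while the three-splitter layout gets within the resolution of the grid search of 100 percent. In this toy model, the third splitter moves the unreachable settings away from the "all light out the bottom" case.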

Importantly, the additional beam splitter is only a few micrometers in size and is a passive component, so it doesn’t require any extra wiring. Adding additional beam splitters doesn’t significantly change the size of the chip.

Bigger chip, fewer errors

When the researchers conducted simulations to test their architecture, they found that it can eliminate much of the uncorrectable error that hampers accuracy. And as the optical neural network becomes larger, the amount of error in the device actually drops, the opposite of what happens in a device with standard MZIs.

Using 3-MZIs, they could potentially create a device big enough for commercial uses with error that has been reduced by a factor of 20, Hamerly says.

The researchers also developed a variant of the MZI design specifically for correlated errors. These occur because of manufacturing imperfections: if the thickness of a chip is slightly wrong, the MZIs may all be off by about the same amount, so the errors are all about the same. They found a way to change the configuration of an MZI to make it robust to these types of errors. This technique also increased the bandwidth of the optical neural network so it can run three times faster.

Now that they have showcased these techniques using simulations, Hamerly and his collaborators plan to test these approaches on physical hardware and continue driving toward an optical neural network they can effectively deploy in the real world.

This research is funded, in part, by a National Science Foundation graduate research fellowship and the U.S. Air Force Office of Scientific Research.
