Acceleration Method for Learning Fine-Layered Optical Neural Networks

Aoyama, Kazuo; Sawada, Hiroshi

doi:10.48550/arXiv.2109.01731

preprint

posted on 2023-01-12, 13:56 authored by Kazuo Aoyama, Hiroshi Sawada

An optical neural network (ONN) is a promising system owing to its high-speed, low-power operation. Its linear unit multiplies an input vector by a weight matrix in optical analog circuits. Among such circuits, one with a multi-layered structure of programmable Mach-Zehnder interferometers (MZIs) can realize a specific class of unitary matrices as its weight matrix using a limited number of MZIs. This circuit is effective for balancing the number of programmable MZIs against ONN performance. However, learning the MZI parameters of the circuit takes a long time with the conventional automatic differentiation (AD) that machine learning platforms provide. To address this problem, we propose an acceleration method for learning MZI parameters. We create customized complex-valued derivatives for an MZI by exploiting Wirtinger derivatives and the chain rule. These derivatives are incorporated into our newly developed function module, implemented in C++, which collectively calculates their values across the multi-layered structure. Our method is simple, fast, and versatile, and it is compatible with the conventional AD. We demonstrate that our method runs 20 times faster than the conventional AD when a pixel-by-pixel MNIST task is performed in a complex-valued recurrent neural network with an MZI-based hidden unit.
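As a rough illustration of what a hand-written complex-valued derivative for a single MZI can look like, the sketch below uses only the Python standard library. It is not the authors' C++ module, and the abstract does not specify the MZI parametrization or the loss. The 2x2 transfer matrix used here (two 50:50 beam splitters `B` with phase shifters `theta` and `phi`), the squared-error loss, and all function names are assumptions made for this example. For a real-valued loss L and a real parameter p, the Wirtinger chain rule gives dL/dp = 2 Re( sum_k (dL/d conj(y_k)) * conj(dy_k/dp) ), which is what `grad` evaluates.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(a, x):
    """Product of a 2x2 complex matrix and a length-2 vector."""
    return [a[i][0] * x[0] + a[i][1] * x[1] for i in range(2)]

# Assumed 50:50 beam splitter (one common convention, not taken from the source).
S = 1 / math.sqrt(2)
B = [[S, 1j * S], [1j * S, S]]

def phase(a):
    """Phase shifter on the upper arm: diag(exp(i a), 1)."""
    return [[cmath.exp(1j * a), 0], [0, 1]]

def dphase(a):
    """Derivative of phase(a) with respect to a."""
    return [[1j * cmath.exp(1j * a), 0], [0, 0]]

def mzi(theta, phi):
    """Assumed MZI transfer matrix U = B P(theta) B P(phi)."""
    return matmul(B, matmul(phase(theta), matmul(B, phase(phi))))

def loss(theta, phi, x, t):
    """Illustrative real-valued loss: squared error between U x and target t."""
    y = matvec(mzi(theta, phi), x)
    return sum(abs(yk - tk) ** 2 for yk, tk in zip(y, t))

def grad(theta, phi, x, t):
    """Hand-written (dL/dtheta, dL/dphi) via Wirtinger derivatives."""
    y = matvec(mzi(theta, phi), x)
    g = [yk - tk for yk, tk in zip(y, t)]  # dL/d conj(y_k) for this loss
    dU_theta = matmul(B, matmul(dphase(theta), matmul(B, phase(phi))))
    dU_phi = matmul(B, matmul(phase(theta), matmul(B, dphase(phi))))
    out = []
    for dU in (dU_theta, dU_phi):
        dy = matvec(dU, x)  # dy/dp for the real parameter p
        out.append(2 * sum(gk * dk.conjugate() for gk, dk in zip(g, dy)).real)
    return tuple(out)
```

Comparing `grad` against a central finite difference of `loss` is a simple way to check such a derivative before placing it in a custom backward pass.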

Disclaimer

This arXiv metadata record was not reviewed or approved by, nor does it necessarily express or reflect the policies or opinions of, arXiv.
