8.3 Training QNN with Gradient Descent
Since we are not only interested in building QNNs as standalone QML tools but also in comparing and contrasting them with classical neural networks, we start our review of QNN training methods with gradient descent – a ubiquitous classical ML algorithm.
8.3.1 The finite difference scheme
Training QNNs consists of specifying and executing a procedure that finds an optimal configuration of the adjustable rotation parameters $\theta$. Assume that a QNN is specified on $n$ quantum registers with $l$ layers of adjustable quantum gates, where each adjustable gate is controlled by a single parameter $\theta_{ij}$, $i = 1, \dots, n$, $j = 1, \dots, l$. In this case, $\theta \in M_{n,l}$ is an $n \times l$ matrix of adjustable network parameters:

$$
\theta =
\begin{pmatrix}
\theta_{11} & \theta_{12} & \cdots & \theta_{1l} \\
\theta_{21} & \theta_{22} & \cdots & \theta_{2l} \\
\vdots & \vdots & \ddots & \vdots \\
\theta_{n1} & \theta_{n2} & \cdots & \theta_{nl}
\end{pmatrix}.
$$
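To make the parameter layout concrete, below is a minimal, self-contained sketch in plain NumPy (not the library code used elsewhere in this book) of such an $n \times l$ parameterization, together with the central finite-difference gradient estimate that gives this subsection its name. The specific circuit architecture (per-register $R_y$ rotations followed by a CNOT chain), the squared-error loss, the $|0\dots 0\rangle$ input, and the learning rate are illustrative assumptions, not taken from the text.

import numpy as np

n, l = 3, 4                                      # quantum registers and layers
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, size=(n, l))   # theta in M_{n,l}: one angle per (register, layer)

def ry(angle):
    """Single-qubit Y-rotation R_y(angle)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

def apply_single(state, gate, qubit):
    """Apply a one-qubit gate to the given qubit of an n-qubit statevector."""
    ops = [np.eye(2)] * n
    ops[qubit] = gate
    full = ops[0]
    for op in ops[1:]:
        full = np.kron(full, op)
    return full @ state

def apply_cnot(state, control, target):
    """Apply a CNOT by permuting the computational-basis amplitudes."""
    new = state.copy()
    for idx in range(2 ** n):
        bits = [(idx >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[control] == 1:
            bits[target] ^= 1
            new_idx = sum(b << (n - 1 - q) for q, b in enumerate(bits))
            new[new_idx] = state[idx]
    return new

def qnn(theta, input_state):
    """Layer j applies R_y(theta[i, j]) on each register i, then a chain of CNOTs."""
    state = input_state
    for j in range(l):
        for i in range(n):
            state = apply_single(state, ry(theta[i, j]), i)
        for i in range(n - 1):
            state = apply_cnot(state, i, i + 1)
    return state

def predict(theta, input_state):
    """Classifier output: probability of measuring register 0 in state |1>."""
    probs = np.abs(qnn(theta, input_state)) ** 2
    return sum(p for idx, p in enumerate(probs) if (idx >> (n - 1)) & 1)

def loss(theta, input_state, label):
    """Squared error between the predicted probability and the binary label."""
    return (predict(theta, input_state) - label) ** 2

def finite_difference_grad(theta, input_state, label, eps=1e-4):
    """Central-difference estimate of dL/d(theta_ij): two circuit runs per entry."""
    grad = np.zeros_like(theta)
    for i in range(n):
        for j in range(l):
            shift = np.zeros_like(theta)
            shift[i, j] = eps
            grad[i, j] = (loss(theta + shift, input_state, label)
                          - loss(theta - shift, input_state, label)) / (2 * eps)
    return grad

# One gradient-descent step on a single (hypothetical) training sample.
x = np.zeros(2 ** n); x[0] = 1.0   # placeholder encoded input: the |0...0> state
theta -= 0.1 * finite_difference_grad(theta, x, label=1)

Note that this estimator requires two circuit evaluations per parameter, i.e. $2nl$ runs per gradient, which is the main practical cost of the finite difference scheme.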
Without loss of generality, we assume that we work with a binary classifier. The latter takes an input (a quantum state that encodes a sample from the dataset), applies a sequence...