Variational Quantum Algorithm Landscape Reconstruction by Low-Rank Tensor Completion

Tianyi Hao1, Zichang He2, Ruslan Shaydulin2, Marco Pistoia2, Swamit Tannu1 1Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706 USA 2Global Technology Applied Research, JPMorganChase, New York, NY 10017 USA
Abstract

Variational quantum algorithms (VQAs) are a broad class of algorithms with many applications in science and industry. Applying a VQA to a problem involves optimizing a parameterized quantum circuit by maximizing or minimizing a cost function. A particular challenge associated with VQAs is understanding the properties of associated cost functions. Having the landscapes of VQA cost functions can greatly assist in developing and testing new variational quantum algorithms, but they are extremely expensive to compute. Reconstructing the landscape of a VQA using existing techniques requires a large number of cost function evaluations, especially when the dimension or the resolution of the landscape is high. To address this challenge, we propose a low-rank tensor-completion-based approach for local landscape reconstruction. By leveraging compact low-rank representations of tensors, our technique can overcome the curse of dimensionality and handle high-resolution landscapes. We demonstrate the power of landscapes in VQA development by showcasing practical applications of analyzing penalty terms for constrained optimization problems and examining the probability landscapes of certain basis states.

Index Terms:
Variational Quantum Algorithms, Landscape Reconstruction, Tensor Networks, Quantum Optimization

I Introduction

Quantum computers have the potential to accelerate a wide range of scientific and commercial applications  [1, 2]. Variational quantum algorithms (VQAs) have attracted a lot of attention due to their broad applicability and low resource requirements, enabling their execution on near-term noisy devices [3]. Applying a VQA to a given problem requires implementing multiple components, namely a quantum operator representing the problem of interest (Hamiltonian), a parameterized quantum circuit (ansatz), a classical optimizer, a cost function to minimize, an initialization strategy for the parameters, and oftentimes a strategy for taking and postprocessing measurements to mitigate noise or handle problem constraints. In a VQA, a classical optimizer iteratively updates the circuit parameters to minimize a cost function value. Then the quantum state prepared by the parameterized circuit with optimized parameters correspond to high-quality solutions to the target problem.

The performance of a VQA depends crucially on the choice and configuration of the components described above [4, 5, 6, 7, 8, 9, 10]. A poor choice of any one component can drastically decrease the algorithm’s efficacy. Unfortunately, VQAs are difficult to analyze, tune, and debug due to their heuristic nature and complex structure. The need to obtain sufficient statistics on the circuit output and the presence of hardware noise contribute to the high resource requirements associated with VQA debugging. As a consequence, researchers typically try a few configurations in an ad-hoc manner to identify the best-performing setup, which often leads to suboptimal performance.

The landscape of cost function values given by a range of parameter values can provide much more information than just the cost values from a single optimization [11, 12, 13]. In machine learning, loss function landscapes play a pivotal role in the development and fine-tuning of models [14]. Similarly, VQAs can benefit from analyzing landscapes as they also have an optimizer-driven outer loop. By analyzing and visualizing landscapes in conjunction with the optimizer trace, researchers can gain insights into the optimization process. They can identify sources of misconfiguration and suboptimal behavior of VQAs and adjust the components accordingly to improve performance. Furthermore, these landscapes can reveal the presence of local minima, saddle points, and flat regions, which are critical for understanding the behavior of VQAs.

However, obtaining VQA landscapes can be extremely expensive. The naive way of generating a landscape is by performing a grid search, which involves computing the value of every point on a parameter grid. The number of points in the grid is exponential in the number of parameters of the landscape, rendering this method intractable for all but the smallest instances.

Recently, studies have proposed using compressed sensing to reconstruct VQA landscapes using only a small number of evaluations [11, 15, 16, 17]. This class of methods leverages the observation that the VQA landscapes are sparse in the frequency domain under Fourier transform. While such techniques have been shown to be highly accurate and stable, their applicability is limited. Since compressed sensing relies on data exhibiting symmetry in the time domain, the landscape range needs to be reasonably global to exploit periodicities in VQAs. Specifically, when we want to look at details in local regions, the zoomed-in resolution is often inadequate while local regions around the optimal points are always of more interest in the practical usage of VQAs. More critically, the landscape is represented as a dense tensor of unknowns or values during or after compressed sensing [11], which incurs an exponential memory and computational cost. Thus, existing techniques can only be performed on relatively low-dimensional and low-resolution landscapes, restricting the domain of applicability.

In this work, we introduce a technique for VQA landscape reconstruction from a small number of samples using low-rank tensor completion. To motivate our technique, we show extensive evidence that local VQA landscapes can be well-approximated by low-rank tensors. We implement the proposed workflow as a user-friendly, highly configurable, open-source Python package, available at https://github.com/QUEST-UWMadison/OSCAR. We demonstrate the power and broad applicability of our technique by performing numerical experiments with different VQA constructions applied to optimization and chemistry problems. We identify novel applications of landscape reconstruction, including the analysis of penalty terms for constrained problems and insights with the probability landscapes of basis states.

II Background

II-A Variational Quantum Algorithms

Variational quantum algorithms (VQAs) are a class of algorithms that leverage classical optimization techniques to train parameterized quantum circuits |Ψ(𝜽)ketΨ𝜽\ket{\Psi(\bm{\theta})}| start_ARG roman_Ψ ( bold_italic_θ ) end_ARG ⟩ (ansatz) such that the quantum state obtained by the circuit optimized parameters 𝜽𝜽\bm{\theta}bold_italic_θ corresponds to high-quality solutions to the target problem. The dimensionality of 𝜽𝜽\bm{\theta}bold_italic_θ can be generally high in order to enable the expressivity of the ansatz. As the VQAs are known to suffer from barren plateau issues and contain lots of local optimum [18, 19], a good initialization of 𝜽𝜽\bm{\theta}bold_italic_θ and a carefully chosen range for parameter search are necessary to enable high-quality solutions. This provides the need for high-resolution cost landscapes under a local range of parameters.

The variational quantum eigensolver (VQE) is a VQA designed for finding the ground state of a given molecule [20], and has been generalized with various ansatzes to solve a wide range of problems [3]. The unitary coupled-cluster singles and doubles (UCCSD) ansatz [21] is a chemistry-inspired VQE ansatz suitable for solving quantum chemistry problems.

The quantum approximate optimization algorithm (QAOA) [22, 23] can be viewed as a VQA with a problem-dependent ansatz. Specifically, QAOA ansatz is given by

|Ψ(𝜽)QAOA=j=1p[eiβj𝑯𝑩eiγj𝑯𝑪]|+,subscriptketΨ𝜽QAOAsuperscriptsubscriptproduct𝑗1𝑝delimited-[]superscript𝑒𝑖subscript𝛽𝑗subscript𝑯𝑩superscript𝑒𝑖subscript𝛾𝑗subscript𝑯𝑪ket\ket{\Psi(\bm{\theta})}_{\text{QAOA}}=\prod_{j=1}^{p}\left[e^{-i\beta_{j}\bm{H% _{B}}}e^{-i\gamma_{j}\bm{H_{C}}}\right]\ket{+},| start_ARG roman_Ψ ( bold_italic_θ ) end_ARG ⟩ start_POSTSUBSCRIPT QAOA end_POSTSUBSCRIPT = ∏ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT [ italic_e start_POSTSUPERSCRIPT - italic_i italic_β start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT bold_italic_H start_POSTSUBSCRIPT bold_italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT - italic_i italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT bold_italic_H start_POSTSUBSCRIPT bold_italic_C end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ] | start_ARG + end_ARG ⟩ , (1)

where p𝑝pitalic_p is the number of layers, β1,,βpsubscript𝛽1subscript𝛽𝑝\beta_{1},\ldots,\beta_{p}italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_β start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT and γ1,,γjsubscript𝛾1subscript𝛾𝑗\gamma_{1},\ldots,\gamma_{j}italic_γ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT are free parameters, 𝑯𝑩subscript𝑯𝑩\bm{H_{B}}bold_italic_H start_POSTSUBSCRIPT bold_italic_B end_POSTSUBSCRIPT is the mixing Hamiltonian and 𝑯𝑪subscript𝑯𝑪\bm{H_{C}}bold_italic_H start_POSTSUBSCRIPT bold_italic_C end_POSTSUBSCRIPT is the Hamiltonian encoding the objective function to be optimized. QAOA has been shown to provide an asymptotic algorithmic speedup on some problems [24, 25], motivating its study.

II-B Tensor Networks

Tensors are multi-dimensional arrays that generalize vectors and matrices. An index or bond of a tensor is one of its dimensions, and the term bond dimension refers to the size of that dimension. For example, a vector 𝒗n𝒗superscript𝑛\bm{v}\in\mathcal{R}^{n}bold_italic_v ∈ caligraphic_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT (order-1 tensor) has one index with bond dimension n𝑛nitalic_n, and a matrix 𝑴m×n𝑴superscript𝑚𝑛\bm{M}\in\mathcal{R}^{m\times n}bold_italic_M ∈ caligraphic_R start_POSTSUPERSCRIPT italic_m × italic_n end_POSTSUPERSCRIPT (order-2 tensor) has two indices with bond dimensions m𝑚mitalic_m and n𝑛nitalic_n. The multiplication between vectors and matrices can be viewed as a summation over the shared index: 𝑴𝒗=iM:,ivi𝑴𝒗subscript𝑖subscript𝑀:𝑖subscript𝑣𝑖\bm{M}\bm{v}=\sum_{i}M_{:,i}v_{i}bold_italic_M bold_italic_v = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_M start_POSTSUBSCRIPT : , italic_i end_POSTSUBSCRIPT italic_v start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT. The generalization of multiplication is tensor contraction, which sums over the shared tensor indices.

A tensor network [26, 27, 28, 29, 30, 31] is a generalized graph (with dangling edges) of tensors connected by shared indices. In particular, the 1D chain tensor networks, known as Tensor Train (TT) [32] in Mathematics, or Matrix Product States (MPS) [33] in Physics and Chemistry, are very successful in various applications. A d𝑑ditalic_d-site MPS consists of 2 order-2 tensors 𝕄(1),𝕄(d)superscript𝕄1superscript𝕄𝑑\mathbb{M}^{(1)},\mathbb{M}^{(d)}blackboard_M start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT , blackboard_M start_POSTSUPERSCRIPT ( italic_d ) end_POSTSUPERSCRIPT and d2𝑑2d-2italic_d - 2 order-3 tensors 𝕄(2),𝕄(d1)superscript𝕄2superscript𝕄𝑑1\mathbb{M}^{(2)},\cdots\mathbb{M}^{(d-1)}blackboard_M start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT , ⋯ blackboard_M start_POSTSUPERSCRIPT ( italic_d - 1 ) end_POSTSUPERSCRIPT, such that the full contraction

r1=1R1r2=1R2rd1=1Rd1𝕄(1)[i1,r1]𝕄(2)[r1,i2,r2]superscriptsubscriptsubscript𝑟11subscript𝑅1superscriptsubscriptsubscript𝑟21subscript𝑅2superscriptsubscriptsubscript𝑟𝑑11subscript𝑅𝑑1superscript𝕄1subscript𝑖1subscript𝑟1superscript𝕄2subscript𝑟1subscript𝑖2subscript𝑟2\displaystyle\sum_{{r_{1}}=1}^{R_{1}}\sum_{{r_{2}}=1}^{R_{2}}\ldots\sum_{{r_{d% -1}}=1}^{R_{d-1}}\mathbb{M}^{(1)}\left[i_{1},r_{1}\right]\mathbb{M}^{(2)}\left% [r_{1},i_{2},r_{2}\right]\cdots∑ start_POSTSUBSCRIPT italic_r start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_R start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_r start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_R start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT … ∑ start_POSTSUBSCRIPT italic_r start_POSTSUBSCRIPT italic_d - 1 end_POSTSUBSCRIPT = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_R start_POSTSUBSCRIPT italic_d - 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT blackboard_M start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT [ italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_r start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ] blackboard_M start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT [ italic_r start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_i start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_r start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ] ⋯ (2)
𝕄(d1)[rd2,id1,rd1]𝕄(d)[rd1,id]superscript𝕄𝑑1subscript𝑟𝑑2subscript𝑖𝑑1subscript𝑟𝑑1superscript𝕄𝑑subscript𝑟𝑑1subscript𝑖𝑑\displaystyle\mathbb{M}^{(d-1)}\left[r_{d-2},i_{d-1},r_{d-1}\right]\mathbb{M}^% {(d)}\left[r_{d-1},i_{d}\right]blackboard_M start_POSTSUPERSCRIPT ( italic_d - 1 ) end_POSTSUPERSCRIPT [ italic_r start_POSTSUBSCRIPT italic_d - 2 end_POSTSUBSCRIPT , italic_i start_POSTSUBSCRIPT italic_d - 1 end_POSTSUBSCRIPT , italic_r start_POSTSUBSCRIPT italic_d - 1 end_POSTSUBSCRIPT ] blackboard_M start_POSTSUPERSCRIPT ( italic_d ) end_POSTSUPERSCRIPT [ italic_r start_POSTSUBSCRIPT italic_d - 1 end_POSTSUBSCRIPT , italic_i start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ]

gives the dense tensor it represents. The bond dimensions R1,R2,,Rd1subscript𝑅1subscript𝑅2subscript𝑅𝑑1R_{1},R_{2},\cdots,R_{d-1}italic_R start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_R start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , italic_R start_POSTSUBSCRIPT italic_d - 1 end_POSTSUBSCRIPT of the interconnected indices r1,r2,,rd1subscript𝑟1subscript𝑟2subscript𝑟𝑑1r_{1},r_{2},\cdots,r_{d-1}italic_r start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_r start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , italic_r start_POSTSUBSCRIPT italic_d - 1 end_POSTSUBSCRIPT are referred to as ranks.

Refer to caption
Figure 1: Overview of the proposed landscape reconstruction method by low-rank tensor completion. A small number of quantum circuit outputs are used to perform tensor completion on the low-rank representation of the landscape.

III Landscape Reconstruction by Low-rank Tensor Completion

Without loss of generality, the cost function of a VQA f(𝜽)𝑓𝜽f(\bm{\theta})italic_f ( bold_italic_θ ) can be defined as Ψ(𝜽)|𝑶|Ψ(𝜽)quantum-operator-productΨ𝜽𝑶Ψ𝜽\braket{\Psi(\bm{\theta})}{\bm{O}}{\Psi(\bm{\theta})}⟨ start_ARG roman_Ψ ( bold_italic_θ ) end_ARG | start_ARG bold_italic_O end_ARG | start_ARG roman_Ψ ( bold_italic_θ ) end_ARG ⟩, where 𝑶𝑶\bm{O}bold_italic_O is some observable of interest. Generally, 𝜽𝜽\bm{\theta}bold_italic_θ is a d𝑑ditalic_d-dimensional continuous variable. To enable landscape reconstruction, we first discretize 𝜽𝜽\bm{\theta}bold_italic_θ using a d𝑑ditalic_d-dimensional grid, which discretizes the i𝑖iitalic_i-th parameter into Nisubscript𝑁𝑖N_{i}italic_N start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT points. As a consequence of the curse of dimensionality, the space (memory) cost of landscape grows exponentially with d𝑑ditalic_d as i=1dNisuperscriptsubscriptproduct𝑖1𝑑subscript𝑁𝑖\prod_{i=1}^{d}N_{i}∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT Storing the grid explicitly is impossible for even moderate d𝑑ditalic_d.

Given a d𝑑ditalic_d-dimensional VQA parameter 𝜽=(θ1,,θd)𝜽subscript𝜃1subscript𝜃𝑑\bm{\theta}=\left(\theta_{1},\ldots,\theta_{d}\right)bold_italic_θ = ( italic_θ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_θ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ) with indices (i1,,id)subscript𝑖1subscript𝑖𝑑\left(i_{1},\ldots,i_{d}\right)( italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ), f(𝜽)𝑓𝜽f(\bm{\theta})italic_f ( bold_italic_θ ) can be approximated by Eq. (2). The space complexity of a TT representation is R1N1+Rd1Nd+i=2d1Ri1RiNisubscript𝑅1subscript𝑁1subscript𝑅𝑑1subscript𝑁𝑑superscriptsubscript𝑖2𝑑1subscript𝑅𝑖1subscript𝑅𝑖subscript𝑁𝑖R_{1}N_{1}+R_{d-1}N_{d}+\sum_{i=2}^{d-1}R_{i-1}R_{i}N_{i}italic_R start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_R start_POSTSUBSCRIPT italic_d - 1 end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT + ∑ start_POSTSUBSCRIPT italic_i = 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d - 1 end_POSTSUPERSCRIPT italic_R start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT italic_R start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, reducing the memory requirements from exponential to linear with respect to d𝑑ditalic_d and Nisubscript𝑁𝑖N_{i}italic_N start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT. The compact TT representation accurately expresses the landscape as long as it is low rank. We show evidence that the landscape is low-rank in Section III-B. Here, we use the TT format, but other tensor network and tensor decomposition formats are generally applicable.

The proposed landscape reconstruction process is illustrated in Fig. 1. To begin with, we need N𝑁Nitalic_N samples from the (N1,N2,,Nd)subscript𝑁1subscript𝑁2subscript𝑁𝑑(N_{1},N_{2},\ldots,N_{d})( italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_N start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_N start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT )-sized full landscape tensor (Ni=1dNimuch-less-than𝑁superscriptsubscriptproduct𝑖1𝑑subscript𝑁𝑖N\ll\prod_{i=1}^{d}N_{i}italic_N ≪ ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT) and execute the associated circuits in a quantum computer to obtain 𝒇(𝜽)={f(𝜽i)}i=1N𝒇𝜽superscriptsubscript𝑓superscript𝜽𝑖𝑖1𝑁\bm{f}(\bm{\theta})=\{f(\bm{\theta}^{i})\}_{i=1}^{N}bold_italic_f ( bold_italic_θ ) = { italic_f ( bold_italic_θ start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) } start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT. Next, we create the data structure for low-rank representation 𝒇^(𝜽)={f^(𝜽i)}i=1Nbold-^𝒇𝜽superscriptsubscript^𝑓superscript𝜽𝑖𝑖1𝑁\bm{\hat{f}}(\bm{\theta})=\{\hat{f}(\bm{\theta}^{i})\}_{i=1}^{N}overbold_^ start_ARG bold_italic_f end_ARG ( bold_italic_θ ) = { over^ start_ARG italic_f end_ARG ( bold_italic_θ start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) } start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT and solve a tensor completion problem through optimizing low-rank factors {𝕄(i)}i=1dsuperscriptsubscriptsuperscript𝕄𝑖𝑖1𝑑\{\mathbb{M}^{(i)}\}_{i=1}^{d}{ blackboard_M start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT } start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT. After the tensor completion, given any unsampled parameter of the ansatz, we can predict its output through the full contraction given by Eq. 2.

III-A Tensor Completion Formulation

The tensor completion problem for the tensor-train format tensor is formulated as [34]:

min{𝕄(i)}i=1d𝒇(𝜽)𝒇^(𝜽)Fsubscriptsuperscriptsubscriptsuperscript𝕄𝑖𝑖1𝑑subscriptnorm𝒇𝜽^𝒇𝜽𝐹~{}\min_{\{\mathbb{M}^{(i)}\}_{i=1}^{d}}\|\bm{f}(\bm{\theta})-\hat{\bm{f}}(\bm% {\theta})\|_{F}roman_min start_POSTSUBSCRIPT { blackboard_M start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT } start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ∥ bold_italic_f ( bold_italic_θ ) - over^ start_ARG bold_italic_f end_ARG ( bold_italic_θ ) ∥ start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT (3)

Here, we mainly use the algorithm in Ref. [35] to solve the low-rank tensor completion problem, where the initial approximation of the factor matrices {𝕄(i)}i=1dsuperscriptsubscriptsuperscript𝕄𝑖𝑖1𝑑\{\mathbb{M}^{(i)}\}_{i=1}^{d}{ blackboard_M start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT } start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT are set based on an ANOVA (analysis of variance) approach and then optimized in an ALS (alternating least square) algorithm. We note that alternative algorithms [36, 37] for solving tensor completion problems can also be used in our landscape reconstruction procedure.

III-B Evidence that Local VQA Landscapes are Low-rank

To verify that the local VQA landscapes are low-rank, we obtain the landscapes of various configurations in the dense tensor format by numerical simulation. With the dense landscapes, we perform sequential reshaping and singular value decompositions (SVD) along each dimension of the dense tensor to transform it into the TT format. This process is known as the TT decomposition, where the approximation error can be upper bounded by the truncation threshold of the singular values during SVDs. The number of remaining singular values after truncation is the aforementioned rank of the TT.

VQA Ansatz QAOA p=1𝑝1p=1italic_p = 1 QAOA p=2𝑝2p=2italic_p = 2 UCCSD H2
Truncation threshold 102superscript10210^{-2}10 start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT 105superscript10510^{-5}10 start_POSTSUPERSCRIPT - 5 end_POSTSUPERSCRIPT 102superscript10210^{-2}10 start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT 105superscript10510^{-5}10 start_POSTSUPERSCRIPT - 5 end_POSTSUPERSCRIPT 102superscript10210^{-2}10 start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT 105superscript10510^{-5}10 start_POSTSUPERSCRIPT - 5 end_POSTSUPERSCRIPT
Range Resolution Rank Space Rank Space Rank Space Rank Space Rank Space Rank Space
π4𝜋4\frac{\pi}{4}divide start_ARG italic_π end_ARG start_ARG 4 end_ARG 16 2 4×\times× 2 4×4\times4 × 4 6 3 84×84\times84 × 7 13 6 23×\times× 2 2 32×\times× 3 3 17×\times×
32 2 8×\times× 2 8×8\times8 × 4 5 3 780×780\times780 × 7 12 6 194×\times× 2 2 128×\times× 3 3 68×\times×
64 2 16×\times× 2 16×16\times16 × 4 6 3 5350×5350\times5350 × 7 13 6 1440×\times× 2 2 512×\times× 3 3 273×\times×
π8𝜋8\frac{\pi}{8}divide start_ARG italic_π end_ARG start_ARG 8 end_ARG 16 1 8×\times× 2 4×4\times4 × 2 3 3 205×205\times205 × 5 9 4 46×\times× 1 1 85×\times× 3 3 17×\times×
32 1 16×\times× 2 8×8\times8 × 2 3 2 2048×2048\times2048 × 5 9 4 364×\times× 1 1 341×\times× 3 3 68×\times×
64 1 32×\times× 2 16×16\times16 × 2 3 2 16384×16384\times16384 × 5 9 4 2913×2913\times2913 × 1 1 1365×\times× 3 3 273×\times×
π16𝜋16\frac{\pi}{16}divide start_ARG italic_π end_ARG start_ARG 16 end_ARG 16 1 8×\times× 2 4×4\times4 × 2 2 2 341×\times× 4 6 4 73×\times× 1 1 85×\times× 3 2 23×\times×
32 1 16×\times× 2 8×8\times8 × 2 2 1 3641×\times× 4 6 4 585×\times× 1 1 341×\times× 3 2 93×\times×
64 1 32×\times× 2 16×16\times16 × 2 2 1 29127×29127\times29127 × 4 6 4 4681×4681\times4681 × 1 1 1365×\times× 3 2 372×\times×
TABLE I: Rank and space reduction of representing VQA landscapes in the TT format with specified truncation thresholds. Each row shows a different landscape range and resolution combination.

Table I summarizes the rank and space reduction results by performing the TT decomposition of a fully sampled landscape tensor. We performed one-and-two-layer QAOA solving the Maximum Cut (MaxCut) problem and VQE with the UCCSD ansatz solving the Hydrogen molecule.

For each VQA configuration, we show the ranks and space reduction after truncating singular values that are less than 102superscript10210^{-2}10 start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT and 105superscript10510^{-5}10 start_POSTSUPERSCRIPT - 5 end_POSTSUPERSCRIPT of the 2-norm of all singular values, respectively. For simplicity, we let each dimension share the same resolution. The interested parameter space is defined as [θcenterl2,θcenter+l2]subscript𝜃center𝑙2subscript𝜃center𝑙2\left[\theta_{\text{center}}-\frac{l}{2},\theta_{\text{center}}+\frac{l}{2}\right][ italic_θ start_POSTSUBSCRIPT center end_POSTSUBSCRIPT - divide start_ARG italic_l end_ARG start_ARG 2 end_ARG , italic_θ start_POSTSUBSCRIPT center end_POSTSUBSCRIPT + divide start_ARG italic_l end_ARG start_ARG 2 end_ARG ], where θcentersubscript𝜃center\theta_{\text{center}}italic_θ start_POSTSUBSCRIPT center end_POSTSUBSCRIPT is the center of parameter space and l𝑙litalic_l is the range of a parameter. We test landscape ranges l{π4,π8,π16}𝑙𝜋4𝜋8𝜋16l\in\{\frac{\pi}{4},\frac{\pi}{8},\frac{\pi}{16}\}italic_l ∈ { divide start_ARG italic_π end_ARG start_ARG 4 end_ARG , divide start_ARG italic_π end_ARG start_ARG 8 end_ARG , divide start_ARG italic_π end_ARG start_ARG 16 end_ARG } and resolutions 16161616, 32323232, and 64646464 along each dimension. The landscape is centered at informed initial points of respective VQAs. For QAOA with MaxCut, we set 𝜽centersubscript𝜽center\bm{\theta}_{\text{center}}bold_italic_θ start_POSTSUBSCRIPT center end_POSTSUBSCRIPT based on QAOA parameter concentration in [38]. For the Hydrogen molecule, we use the state given by the Hartree-Fock method. These initial points have been shown to be reasonably close to the optimal points. From practical perspectives, researchers often use them as initializations in actual VQA experiments, and local optimizations in a small surrounding region are typically enough to find the optimal point.

We observe that generally, the rank remains low compared to the resolution of the landscape, resulting in substantial space reduction. Notably, it is independent of the landscape resolution, enabling the sparse representation of very dense landscapes. As expected, higher-dimensional landscapes have higher ranks, and more local landscapes have lower ranks.

III-C Evaluation

We implement the TT representation of landscapes and the tensor completion reconstruction workflow as part of the VQA helper package OSCAR (https://github.com/QUEST-UWMadison/OSCAR). We employ the teneva package [39, 35] for the tensor completion functionalities and provide a unified interface with the compressed-sensing-based reconstruction method in [11]. Note that due to the discrepancy in the applicable scope of the two methods, it is not possible to directly compare their performance.

Refer to caption
(a) QAOA p=1𝑝1p=1italic_p = 1
Refer to caption
(b) QAOA p=2𝑝2p=2italic_p = 2
Refer to caption
(c) UCCSD H2
Figure 2: Validation error as a function of sampling fraction for resolutions 16, 32, and 64. The reconstruction error drops steadily as the sampling fraction increases. Note that at the same error level, the required sampling fraction declines exponentially as the resolution doubles.

We perform numerical simulation of QAOA of various depths, solving the MaxCut problem and VQE with the UCCSD ansatz solving the Hydrogen molecule. We fix the landscape range to be π16𝜋16\frac{\pi}{16}divide start_ARG italic_π end_ARG start_ARG 16 end_ARG and vary the resolution and sampling fraction. We use second-order ANOVA decomposition, followed by 1000100010001000 iterations of ALS with regularization of 0.010.010.010.01. We set the rank for both ANOVA and ALS to be 2222, 6666, and 3333 for p=1𝑝1p=1italic_p = 1, p=2𝑝2p=2italic_p = 2 QAOA, and H2, respectively. We take 5%percent55\%5 % of the samples as the validation set and calculate the relative error between the true values and the reconstructed values.

Fig. 2 shows the validation error as a function of sampling fraction for resolutions 16161616, 32323232, and 64646464. We see that the error drops steadily as the sampling fraction increases. Since the three problem configurations have two, four, and three dimensions respectively, doubling the resolution increases the total number of points on the landscape by 4444, 16161616, and 8888 times. Importantly, note that the sampling fraction for achieving the same error also drops exponentially as the resolution doubles. Thus, the tensor-completion-based reconstruction allows us to greatly enhance the resolution of the discretization of the landscape with a remarkably small cost in comparison. Approximately, doubling the resolution only requires twice the amount of samples, instead of 2dsuperscript2𝑑2^{d}2 start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT times, to keep the same level of accuracy.

Refer to caption
Figure 3: 2D slices of the 3D landscape of UCCSD ansatz solving the Hydrogen molecule. Each figure has one of the three parameters fixed at the value given by the Hartree-Fock method. The optimization trace by the COBYLA method is projected and overlaid on each landscape slice, showing that only θ3subscript𝜃3\theta_{3}italic_θ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT needs to be predominantly optimized.

For a reference of the error level, Fig. 3 shows an example landscape reconstruction of VQE with the UCCSD ansatz solving the Hydrogen molecule. Since the landscape is three-dimensional, we present three 2D slices where one of the parameters is fixed at the center. The landscape range is π8𝜋8\frac{\pi}{8}divide start_ARG italic_π end_ARG start_ARG 8 end_ARG and the resolution is 64. The validation error is 5.4×1045.4superscript1045.4\times 10^{-4}5.4 × 10 start_POSTSUPERSCRIPT - 4 end_POSTSUPERSCRIPT. Visually, there is little discrepancy between the reconstructed landscape and the exactly evaluated landscape.

With the optimizer trace overlaid, Fig. 3 also demonstrates a simple use case of the landscape. We use COBYLA as the optimizer, which has one initial query in each direction, forming the triangles we see in the projected trace. The center of the landscape is the initial point of the optimization, which is derived by the Hartree-Fock method. We can clearly see that only θ3subscript𝜃3\theta_{3}italic_θ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT of the three parameters needs to be optimized. For θ1subscript𝜃1\theta_{1}italic_θ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and θ2subscript𝜃2\theta_{2}italic_θ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, COBYLA decides that the Hartree-Fock values are close enough to the optimal values and do not need much adjustment. Thus, in actual experiments when the number of quantum processor queries is very limited, we can fix θ1subscript𝜃1\theta_{1}italic_θ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and θ2subscript𝜃2\theta_{2}italic_θ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and only optimize for θ3subscript𝜃3\theta_{3}italic_θ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT. Doing so is especially helpful for optimizers assuming a complex model and requiring initial queries that are superlinear in the number of parameters.

IV Landscape Applications

In this section, we showcase novel examples of how landscape information can be used in VQA research and development. For additional applications such as efficient benchmarking optimizers, configuring noise mitigation methods, and informed initialization, the reader is referred to Ref. [11].

IV-A Understanding Penalty Terms

One of the central challenges in applying VQAs to many real-world optimization problems is to take into consideration their constraints. VQAs can address these constraints by enforcing them in the ansatz, but this approach often leads to deep circuits that behave poorly on near-term noisy hardware. The simplest and most widely used approach is to introduce a penalty term 𝑯Psubscript𝑯𝑃\bm{H}_{P}bold_italic_H start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT in the Hamiltonian, responsible for enforcing the constraint. VQAs then work with the penalized Hamiltonian 𝑯=𝑯𝑪+λ𝑯P𝑯subscript𝑯𝑪𝜆subscript𝑯𝑃\bm{H}=\bm{H_{C}}+\lambda\bm{H}_{P}bold_italic_H = bold_italic_H start_POSTSUBSCRIPT bold_italic_C end_POSTSUBSCRIPT + italic_λ bold_italic_H start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT to balance between solution quality and solution feasibility, where 𝑯Csubscript𝑯𝐶\bm{H}_{C}bold_italic_H start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT is the cost Hamiltonian and λ𝜆\lambdaitalic_λ is a penalty factor that controls the effectiveness of the penalty. VQAs rely on a carefully selected penalty factor to ensure high-quality in-constraint solutions can be found with high probability [40, 41, 6]. Unfortunately, like configuring other VQA components, tuning the effect of the penalty term is typically carried out in an empirical fashion, presenting a nontrivial challenge.

We can employ VQA landscapes to help analyze, visualize, and configure penalty terms. Each point of the landscape is the expectation value of the circuit and the Hamiltonian Ψ(𝜽)|𝑯|Ψ(𝜽)quantum-operator-productΨ𝜽𝑯Ψ𝜽\braket{\Psi(\bm{\theta})}{\bm{H}}{\Psi(\bm{\theta})}⟨ start_ARG roman_Ψ ( bold_italic_θ ) end_ARG | start_ARG bold_italic_H end_ARG | start_ARG roman_Ψ ( bold_italic_θ ) end_ARG ⟩. By the linearity of expectation, it can be decomposed to linear combinations of expectations of subterms in the Hamiltonian:

Ψ(𝜽)|𝑯𝑪+λ𝑯P|Ψ(𝜽)quantum-operator-productΨ𝜽subscript𝑯𝑪𝜆subscript𝑯𝑃Ψ𝜽\displaystyle\braket{\Psi(\bm{\theta})}{\bm{H_{C}}+\lambda\bm{H}_{P}}{\Psi(\bm% {\theta})}⟨ start_ARG roman_Ψ ( bold_italic_θ ) end_ARG | start_ARG bold_italic_H start_POSTSUBSCRIPT bold_italic_C end_POSTSUBSCRIPT + italic_λ bold_italic_H start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT end_ARG | start_ARG roman_Ψ ( bold_italic_θ ) end_ARG ⟩
=\displaystyle== Ψ(𝜽)|𝑯𝑪|Ψ(𝜽)+λΨ(𝜽)|𝑯𝑷|Ψ(𝜽).quantum-operator-productΨ𝜽subscript𝑯𝑪Ψ𝜽𝜆quantum-operator-productΨ𝜽subscript𝑯𝑷Ψ𝜽\displaystyle\braket{\Psi(\bm{\theta})}{\bm{H_{C}}}{\Psi(\bm{\theta})}+\lambda% \braket{\Psi(\bm{\theta})}{\bm{H_{P}}}{\Psi(\bm{\theta})}.⟨ start_ARG roman_Ψ ( bold_italic_θ ) end_ARG | start_ARG bold_italic_H start_POSTSUBSCRIPT bold_italic_C end_POSTSUBSCRIPT end_ARG | start_ARG roman_Ψ ( bold_italic_θ ) end_ARG ⟩ + italic_λ ⟨ start_ARG roman_Ψ ( bold_italic_θ ) end_ARG | start_ARG bold_italic_H start_POSTSUBSCRIPT bold_italic_P end_POSTSUBSCRIPT end_ARG | start_ARG roman_Ψ ( bold_italic_θ ) end_ARG ⟩ .

Therefore, with the cost landscape and the penalty landscape, we can trivially get arbitrary linear combinations of them. The effect of the penalty term can thus be understood by comparing landscapes with varying penalty factors.

Refer to caption
(a)   
Refer to caption
(b)   
Refer to caption
(c)   
Refer to caption
(d)   
Refer to caption
(e)   
Refer to caption
(f)   
Refer to caption
(g)   
Refer to caption
(h)   
Refer to caption
(i)   
Figure 4: Different combinations of the cost, penalty, and penalized Hamiltonians used in ansatz construction (rows) and as observable in the expectation (columns). That is, each row shares the same ansatz, constructed with the cost, penalty, and penalized Hamiltonians, from top to bottom. From left to right, each column uses the cost, penalty, and penalized Hamiltonians as the observable. So as an example, Fig. 4 (g) shows the landscape of the penalized ansatz with the cost Hamiltonian Ψ𝑯(𝜽)|𝑯C|Ψ𝑯(𝜽)quantum-operator-productsubscriptΨ𝑯𝜽subscript𝑯𝐶subscriptΨ𝑯𝜽\braket{\Psi_{\bm{H}}(\bm{\theta})}{\bm{H}_{C}}{\Psi_{\bm{H}}(\bm{\theta})}⟨ start_ARG roman_Ψ start_POSTSUBSCRIPT bold_italic_H end_POSTSUBSCRIPT ( bold_italic_θ ) end_ARG | start_ARG bold_italic_H start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT end_ARG | start_ARG roman_Ψ start_POSTSUBSCRIPT bold_italic_H end_POSTSUBSCRIPT ( bold_italic_θ ) end_ARG ⟩.

The above method applies to cases where the ansatz does not change with the Hamiltonian. Here we show a more complicated example. For the original QAOA, whose ansatz is constructed with the knowledge of the Hamiltonian, we need to take additional considerations. Fig. 4 shows different combinations of the cost, penalty, and penalized Hamiltonians of the Portfolio Optimization problem used in ansatz construction and as observable in the expectation. From top to bottom, each row corresponds to the ansatz constructed with the cost, penalty, and penalized Hamiltonians, respectively. Similarly, the columns correspond to the Hamiltonians used as observable in the expectation. So as an example, Fig. 4 (g) shows the landscape of the penalized ansatz with the cost Hamiltonian Ψ𝑯(𝜽)|𝑯C|Ψ𝑯(𝜽)quantum-operator-productsubscriptΨ𝑯𝜽subscript𝑯𝐶subscriptΨ𝑯𝜽\braket{\Psi_{\bm{H}}(\bm{\theta})}{\bm{H}_{C}}{\Psi_{\bm{H}}(\bm{\theta})}⟨ start_ARG roman_Ψ start_POSTSUBSCRIPT bold_italic_H end_POSTSUBSCRIPT ( bold_italic_θ ) end_ARG | start_ARG bold_italic_H start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT end_ARG | start_ARG roman_Ψ start_POSTSUBSCRIPT bold_italic_H end_POSTSUBSCRIPT ( bold_italic_θ ) end_ARG ⟩. The range of landscapes is chosen to cover exactly one cycle (a unique region) in the periodic parameter space.

We observe that cost ansatz with penalty Hamiltonian (Fig. 4 (b)) and penalty ansatz with cost Hamiltonian (Fig. 4 (d)) are random noise-like signals, making the cost and penalty ansatzes with the penalized Hamiltonian (Fig. 4 (c) and (f)) just a noisy version of the cost landscape and the penalty landscape. This is due to the relative value of the penalty or the cost being too small, making the landscape impossible to resolve. Similar observations have motivated the rescaling of QAOA cost Hamiltonian in Refs. [7, 9]. Thus, for constrained problems, the cost Hamiltonian or the penalty Hamiltonian cannot be used alone when constructing the QAOA ansatz, even if the expectation observable contains both terms.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 5: Examples showing the effect of poorly chosen penalty factors. Each row uses the same penalty factor. From the first to the third row, the penalty factors are 0.01, 0.5, and 1. From left to right, the columns show cost, penalty, and penalized Hamiltonians as the observable in expectation values.

But how does an ansatz constructed with a penalized Hamiltonian achieve a balance between the two terms? Based on the magnitude of the cost and penalty landscapes, we select a penalty factor of 0.1 to bring them to the same scale. We see that the cost landscape (Fig. 4 (g)) remains approximately the same. In particular, the lowest value achievable, which corresponds to the highest solution quality possible, is not sacrificed too much. On the other hand, the penalty landscape (Fig. 4 (h)) becomes completely different, where the feasible region aligns considerably better with the cost landscape. As a result, the optimal point of the combined landscape (Fig. 4 (i)) is reasonably close to the optimal point of the cost landscape to produce high-quality solutions while being highly feasible according to the penalty landscape.

Fig. 5 shows three examples where the penalty factor is poorly chosen. From the first to the third row, the penalty factors are 0.01, 0.5, and 1. From left to right, the columns are cost, penalty, and penalized landscapes. We notice that decreasing the penalty factor too much transforms the penalty landscape into a random noise-like signal while increasing the penalty factor over the balance noticeably deforms the cost landscape. The results of both are not desirable.

IV-B Probability Landscape of Basis State

Refer to caption
(a)   
Refer to caption
(b)   
Refer to caption
(c)   
Figure 6: Three unique probability landscapes of basis states for a 4444-qubit 3333-regular graph MaxCut instance. The repetition of these landscapes is related to the Hamming weight of the basis state bitstring. (a) Hamming weight 00 or 4444; (b) Hamming weight 1111 or 3333; (c) Hamming weight 2222.

A quantum state can be described by the amplitudes of the basis states:

|Ψ=x{0,1}nαx|x,ketΨsubscript𝑥superscript01𝑛subscript𝛼𝑥ket𝑥\displaystyle\ket{\Psi}=\sum_{x\in\{0,1\}^{n}}\alpha_{x}\ket{x},| start_ARG roman_Ψ end_ARG ⟩ = ∑ start_POSTSUBSCRIPT italic_x ∈ { 0 , 1 } start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUBSCRIPT italic_α start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT | start_ARG italic_x end_ARG ⟩ , (4)

where n𝑛nitalic_n is the number of qubits and αxsubscript𝛼𝑥\alpha_{x}italic_α start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT is the amplitude of basis state |xket𝑥\ket{x}| start_ARG italic_x end_ARG ⟩. Upon measuring, the state collapses to one of the basis states with probability |αx|2superscriptsubscript𝛼𝑥2|\alpha_{x}|^{2}| italic_α start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. This probability can be estimated by preparing and measuring the state many times. The expectation value used in VQAs uses such an estimation. The expectation value of a Hamiltonian given a state can be expressed as a weighted sum of each basis state’s associated cost values:

Ψ(𝜽)|𝑯|Ψ(𝜽)quantum-operator-productΨ𝜽𝑯Ψ𝜽\displaystyle\braket{\Psi(\bm{\theta})}{\bm{H}}{\Psi(\bm{\theta})}⟨ start_ARG roman_Ψ ( bold_italic_θ ) end_ARG | start_ARG bold_italic_H end_ARG | start_ARG roman_Ψ ( bold_italic_θ ) end_ARG ⟩ =x{0,1}n|αx(𝜽)|2x|𝑯x|xabsentsubscript𝑥superscript01𝑛superscriptsubscript𝛼𝑥𝜽2quantum-operator-product𝑥subscript𝑯𝑥𝑥\displaystyle=\sum_{x\in\{0,1\}^{n}}|\alpha_{x}(\bm{\theta})|^{2}\braket{x}{% \bm{H}_{x}}{x}= ∑ start_POSTSUBSCRIPT italic_x ∈ { 0 , 1 } start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUBSCRIPT | italic_α start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT ( bold_italic_θ ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⟨ start_ARG italic_x end_ARG | start_ARG bold_italic_H start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG | start_ARG italic_x end_ARG ⟩ (5)
=x{0,1}n|αx(𝜽)|2Cx,absentsubscript𝑥superscript01𝑛superscriptsubscript𝛼𝑥𝜽2subscript𝐶𝑥\displaystyle=\sum_{x\in\{0,1\}^{n}}|\alpha_{x}(\bm{\theta})|^{2}C_{x},= ∑ start_POSTSUBSCRIPT italic_x ∈ { 0 , 1 } start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUBSCRIPT | italic_α start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT ( bold_italic_θ ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_C start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , (6)

where Cx=x|𝑯x|xsubscript𝐶𝑥quantum-operator-product𝑥subscript𝑯𝑥𝑥C_{x}=\braket{x}{\bm{H}_{x}}{x}italic_C start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT = ⟨ start_ARG italic_x end_ARG | start_ARG bold_italic_H start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG | start_ARG italic_x end_ARG ⟩ is the cost value for solutions x𝑥xitalic_x encoded in the Hamiltonian. Besides looking at the landscape of the expectation value, we can investigate the probability landscape of specific states. In particular, for classical optimization problems, there are one or multiple basis states that correspond to the optimal solution(s). Oftentimes, the goal is to find the parameters that generate optimal basis states with a high probability. In this case, the expectation value can be misleading, as parameters that produce high-probability suboptimal solutions can yield an expectation value smaller than those that lead to the optimal solution mixed with bad solutions. Thus, it is important to look at the probability landscapes of the optimal and suboptimal solutions to gain insights into initialization and optimizer configurations for similar problem settings.

Another scenario where basis state landscapes are useful is to investigate symmetries encoded in the problem. For example, a 4-qubit MaxCut instance has 16161616 basis states as solutions. We can immediately see that the symmetry of variable mapping marks half of the solutions as duplicates. Nonetheless, the number of unique probability landscapes is only three, as shown in Fig. 6, which is related to the Hamming weight of the corresponding bitstring. Such reflections of the symmetry in the original problem on state landscapes can be analyzed for VQA development.

V Conclusion

In this paper, we propose a tensor-completion-based approach for reconstructing local cost landscape of VQAs. Our approach takes advantage of the low-rank property of local VQA landscapes and represents them in the tensor network format, which achieves exponential space reduction compared to the dense tensor representation. Our approach avoids exponential growth of required samples when the resolution increases, enabling high-resolution reconstruction with low overhead. Additionally, we show two examples of how landscapes are useful in developing VQAs.

We highlight a few directions for future work. From the reconstruction algorithm perspective, it is worth investigating the properties of specific VQAs to further reduce the sampling cost of tensor completion, such as better strategies of the initial sampling and more customized tensor network structures for functional approximation of quantum algorithms. From the application perspective, it would be interesting to apply the landscape reconstruction techniques to study the impact of quantum noise and to perform error mitigation.

Disclaimer

This paper was prepared for informational purposes with contributions from the Global Technology Applied Research center of JPMorganChase. This paper is not a product of the Research Department of JPMorganChase or its affiliates. Neither JPMorganChase nor any of its affiliates makes any explicit or implied representation or warranty and none of them accept any liability in connection with this position paper, including, without limitation, with respect to the completeness, accuracy, or reliability of the information contained herein and the potential legal, compliance, tax, or accounting effects thereof. This document is not intended as investment research or investment advice, or as a recommendation, offer, or solicitation for the purchase or sale of any security, financial instrument, financial product or service, or to be used in any way for evaluating the merits of participating in any transaction.

References

  • [1] A. M. Dalzell, S. McArdle, M. Berta, P. Bienias, C.-F. Chen, A. Gilyén, C. T. Hann, M. J. Kastoryano, E. T. Khabiboulline, A. Kubica, G. Salton, S. Wang, and F. G. S. L. Brandão, “Quantum algorithms: A survey of applications and end-to-end complexities,” arXiv:2310.03011, 2023.
  • [2] D. Herman, C. Googin, X. Liu, Y. Sun, A. Galda, I. Safro, M. Pistoia, and Y. Alexeev, “Quantum computing for finance,” Nature Reviews Physics, vol. 5, no. 8, pp. 450–465, 2023.
  • [3] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio et al., “Variational quantum algorithms,” Nature Reviews Physics, vol. 3, no. 9, pp. 625–644, 2021.
  • [4] Z. He, R. Shaydulin, S. Chakrabarti, D. Herman, C. Li, Y. Sun, and M. Pistoia, “Alignment between initial state and mixer improves QAOA performance for constrained optimization,” npj Quantum Information, vol. 9, no. 1, p. 121, 2023.
  • [5] Z. He, B. Peng, Y. Alexeev, and Z. Zhang, “Distributionally robust variational quantum algorithms with shifted noise,” arXiv preprint arXiv:2308.14935, 2023.
  • [6] T. Hao, R. Shaydulin, M. Pistoia, and J. Larson, “Exploiting in-constraint energy in constrained variational quantum optimization,” in 2022 IEEE/ACM Third International Workshop on Quantum Computing Software (QCS).   IEEE, 2022, pp. 100–106.
  • [7] R. Shaydulin, P. C. Lotshaw, J. Larson, J. Ostrowski, and T. S. Humble, “Parameter transfer for quantum approximate optimization of weighted MaxCut,” ACM Transactions on Quantum Computing, vol. 4, no. 3, pp. 1–15, feb 2023.
  • [8] R. Shaydulin, I. Safro, and J. Larson, “Multistart methods for quantum approximate optimization,” in IEEE High Performance Extreme Computing Conference, 2019.
  • [9] S. H. Sureshbabu, D. Herman, R. Shaydulin, J. Basso, S. Chakrabarti, Y. Sun, and M. Pistoia, “Parameter setting in quantum approximate optimization of weighted problems,” Quantum, vol. 8, p. 1231, 2024.
  • [10] X. Liu, A. Angone, R. Shaydulin, I. Safro, Y. Alexeev, and L. Cincio, “Layer VQE: A variational approach for combinatorial optimization on noisy quantum computers,” IEEE Transactions on Quantum Engineering, vol. 3, pp. 1–20, 2022.
  • [11] T. Hao, K. Liu, and S. Tannu, “Enabling high performance debugging for variational quantum algorithms using compressed sensing,” in Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023, pp. 1–13.
  • [12] M. S. Rudolph, S. Sim, A. Raza, M. Stechly, J. R. McClean, E. R. Anschuetz, L. Serrano, and A. Perdomo-Ortiz, “ORQVIZ: Visualizing high-dimensional landscapes in variational quantum algorithms,” arXiv preprint arXiv:2111.04695, 2021.
  • [13] A. Pérez-Salinas, H. Wang, and X. Bonet-Monroig, “Analyzing variational quantum landscapes with information content,” npj Quantum Information, vol. 10, no. 1, p. 27, 2024.
  • [14] H. Li, Z. Xu, G. Taylor, C. Studer, and T. Goldstein, “Visualizing the loss landscape of neural nets,” Advances in neural information processing systems, vol. 31, 2018.
  • [15] E. Fontana, I. Rungger, R. Duncan, and C. Cîrstoiu, “Efficient recovery of variational quantum algorithms landscapes using classical signal processing,” arXiv preprint arXiv:2208.05958, 2022.
  • [16] J. Lee, A. B. Magann, H. A. Rabitz, and C. Arenz, “Progress toward favorable landscapes in quantum combinatorial optimization,” Physical Review A, vol. 104, no. 3, p. 032401, 2021.
  • [17] J. Kim and Y. Oz, “Quantum energy landscape and circuit optimization,” Physical Review A, vol. 106, no. 5, p. 052424, 2022.
  • [18] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, “Barren plateaus in quantum neural network training landscapes,” Nature communications, vol. 9, no. 1, p. 4812, 2018.
  • [19] E. Fontana, D. Herman, S. Chakrabarti, N. Kumar, R. Yalovetzky, J. Heredge, S. H. Sureshbabu, and M. Pistoia, “The adjoint is all you need: Characterizing barren plateaus in quantum ansätze,” arXiv:2309.07902, 2023.
  • [20] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’brien, “A variational eigenvalue solver on a photonic quantum processor,” Nature communications, vol. 5, no. 1, p. 4213, 2014.
  • [21] P. K. Barkoutsos, J. F. Gonthier, I. Sokolov, N. Moll, G. Salis, A. Fuhrer, M. Ganzhorn, D. J. Egger, M. Troyer, A. Mezzacapo et al., “Quantum algorithms for electronic structure calculations: Particle-hole hamiltonian and optimized wave-function expansions,” Physical Review A, vol. 98, no. 2, p. 022322, 2018.
  • [22] T. Hogg and D. Portnov, “Quantum optimization,” Information Sciences, vol. 128, no. 3–4, p. 181–197, Oct. 2000.
  • [23] E. Farhi, J. Goldstone, and S. Gutmann, “A quantum approximate optimization algorithm,” arXiv:1411.4028, 2014.
  • [24] S. Boulebnane and A. Montanaro, “Solving Boolean satisfiability problems with the quantum approximate optimization algorithm,” arXiv:2208.06909, 2022.
  • [25] R. Shaydulin, C. Li, S. Chakrabarti, M. DeCross, D. Herman, N. Kumar, J. Larson, D. Lykov, P. Minssen, Y. Sun et al., “Evidence of scaling advantage for the quantum approximate optimization algorithm on a classically intractable problem,” arXiv preprint arXiv:2308.02342, 2023.
  • [26] R. Orús, “A practical introduction to tensor networks: Matrix product states and projected entangled pair states,” Annals of physics, vol. 349, pp. 117–158, 2014.
  • [27] G. Vidal, “Efficient classical simulation of slightly entangled quantum computations,” Physical review letters, vol. 91, no. 14, p. 147902, 2003.
  • [28] C. Ibrahim, D. Lykov, Z. He, Y. Alexeev, and I. Safro, “Constructing optimal contraction trees for tensor network quantum circuit simulation,” in 2022 IEEE High Performance Extreme Computing Conference (HPEC).   IEEE, 2022, pp. 1–8.
  • [29] Z. He and Z. Zhang, “High-dimensional uncertainty quantification via tensor regression with rank determination and adaptive sampling,” IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 11, no. 9, pp. 1317–1328, 2021.
  • [30] Y. Pang, T. Hao, A. Dugad, Y. Zhou, and E. Solomonik, “Efficient 2D tensor network simulation of quantum systems,” in International conference for high performance computing, networking, storage and analysis.   IEEE, 2020, pp. 1–14.
  • [31] T. Hao, X. Huang, C. Jia, and C. Peng, “A quantum-inspired tensor network algorithm for constrained combinatorial optimization problems,” Frontiers in Physics, vol. 10, p. 906590, 2022.
  • [32] I. V. Oseledets, “Tensor-train decomposition,” SIAM Journal on Scientific Computing, vol. 33, no. 5, pp. 2295–2317, 2011.
  • [33] U. Schollwöck, “The density-matrix renormalization group in the age of matrix product states,” Annals of physics, vol. 326, no. 1, pp. 96–192, 2011.
  • [34] L. Grasedyck, M. Kluge, and S. Kramer, “Variants of alternating least squares tensor completion in the tensor train format,” SIAM Journal on Scientific Computing, vol. 37, no. 5, pp. A2424–A2450, 2015.
  • [35] A. Chertkov, G. Ryzhakov, and I. Oseledets, “Black box approximation in the tensor train format initialized by ANOVA decomposition,” SIAM Journal on Scientific Computing, vol. 45, no. 4, pp. A2101–A2118, 2023.
  • [36] L. Yuan, Q. Zhao, L. Gui, and J. Cao, “High-order tensor completion via gradient-based optimization under tensor train format,” Signal Processing: Image Communication, vol. 73, pp. 53–61, 2019.
  • [37] M. Steinlechner, “Riemannian optimization for high-dimensional tensor completion,” SIAM Journal on Scientific Computing, vol. 38, no. 5, pp. S461–S484, 2016.
  • [38] J. Wurtz and D. Lykov, “Fixed-angle conjectures for the quantum approximate optimization algorithm on regular MaxCut graphs,” Physical Review A, vol. 104, no. 5, p. 052419, 2021.
  • [39] A. Chertkov, G. Ryzhakov, G. Novikov, and I. Oseledets, “Optimization of functions given in the tensor train format,” arXiv preprint arXiv:2209.14808, 2022.
  • [40] D. Herman, R. Shaydulin, Y. Sun, S. Chakrabarti, S. Hu, P. Minssen, A. Rattew, R. Yalovetzky, and M. Pistoia, “Constrained optimization via quantum zeno dynamics,” Communications Physics, vol. 6, no. 1, p. 219, 2023.
  • [41] P. Niroula, R. Shaydulin, R. Yalovetzky, P. Minssen, D. Herman, S. Hu, and M. Pistoia, “Constrained quantum optimization for extractive summarization on a trapped-ion quantum computer,” Scientific Reports, vol. 12, no. 1, p. 17171, 2022.