# Background¶

This chapter provides general background information on the discrete optimization of objective functions and introduces the D-Wave QPU operation and architecture to provide context for the remainder of the document.

## Discrete Optimization¶

Discrete optimization, also known as combinatorial optimization, is the optimization of objective functions defined over search spaces consisting of a finite set of objects. This section focuses on two classes of optimization objectives:

• Ising problems defined over strings of -1/+1 symbols
• Quadratic unconstrained binary optimization (QUBO) problems defined over strings of $0/1$ symbols

Note

This limited focus is not constraining: any problem in the NP complexity class may be translated into this form.

### Ising Model¶

The D-Wave QPU can be viewed as a heuristic that minimizes Ising objective functions using a physically realized version of quantum annealing. The Ising objective function of $N$ variables $\vc s=[s_1,...,s_N]$ where $s_i \in \{+1,-1\}$ is given by

\begin{equation} \text{E}_{ising}(\vc s) = \sum_{i=1}^N h_i s_i + \sum_{i=1}^N \sum_{j=i+1}^N J_{i,j} s_i s_j. \end{equation}

Each variable $s_i$ corresponds to a physical Ising spin that can be in a $+1$ or $-1$ state with a local applied field on each spin that causes it to prefer either the $+1$ or $-1$ state. The sign and magnitude of this preference—that is, the value of the local field—is denoted $h_i$. There may also be couplings between spins $i$ and $j$ such that the system prefers the pair of spins to be in either of the two sets defined by $s_i=s_j$ (ferromagnetic coupling) or $s_i = -s_j$ (antiferromagnetic coupling). The sign and magnitude of this preference is denoted $J_{i,j}$.

For example, in a single-spin system where $h_1$ has a large positive value, then the lowest energy occurs when $s_1$ is $-1$. Similarly, for two spins, $s_1$ and $s_2$, with $h_1 = h_2 = 0$ and $J_{1,2} = -1$, the system has two degenerate lowest energy states: one with $s_1 = -1$ and $s_2 = -1$, and the other with $s_1 = +1$ and $s_2 = +1$.

The D-Wave QPU is based on a physical lattice of qubits and couplers referred to as the Chimera or Pegasus architecture; see the D-Wave QPU Architecture section. The number of spins depends on the number of qubits in the QPU. The Advantage QPU, for example, contains over 5000 spins $s_i$. Each is coupled to a maximum of 15 other spins. The domains over which $h$ and $J$ can be chosen for a given system are available by querying the solver properties.

### Formulation of the Ising Problem as a QUBO¶

In computer science, it is often more convenient to model with $0/1$-valued variables than the $-1/1$-valued Ising variables. Consider, for example, a maximum independent set (MIS) problem where the goal is to select the largest subset of vertices (the independent set) of a graph such that no pair of selected vertices is connected by an edge. For such a problem, it is more intuitive and succinct to define penalties to represent constraints using $0/1$-valued variables.

QUBOs are typically expressed using matrix notation. For more information on expressing QUBOs, and for guidelines on converting between Ising and QUBO formulations, see Getting Started with the D-Wave System.

Ising and QUBO problems are NP-hard [Bar1982]. NP-hardness holds even when the problem is restricted to the Chimera architecture and to the QPU-specific range of $h_i$ and $J_{i,j}$ values. There are a number of tractable subclasses [Ber1996] of Ising/QUBO problems; for details, see [Kol2004], [Sch2009], or [Cou2009]. For information on the mapping of NP problems to the Chimera architecture, see the Connectivity section.

## Quantum Annealing with Ising Spins in a Transverse Field¶

The superconducting QPU at the heart of the D-Wave system, which operates at a temperature of approximately 12 mK, is a controllable, physical realization of the quantum Ising spin system in a transverse field. Each qubit and coupler on the QPU has several controls that are manipulated by individual on-QPU digital-to-analog converters (DACs); see [Bun2014] and [Joh2010]. Along with the DACs, a small number of analog control lines provide the time-dependent control required by the quantum Hamiltonian:

\begin{equation} {\cal H}_{ising} = - \frac{A({s})}{2} \left(\sum_i {\hat\sigma_{x}^{(i)}}\right) + \frac{B({s})}{2} \left(\sum_{i} h_i {\hat\sigma_{z}^{(i)}} + \sum_{i>j} J_{i,j} {\hat\sigma_{z}^{(i)}} {\hat\sigma_{z}^{(j)}}\right) \end{equation}

where ${\hat\sigma_{x,z}^{(i)}}$ are Pauli matrices operating on a qubit $q_i$ (the quantum one-dimensional Ising spin), and nonzero values of $h_i$ and $J_{i,j}$ are limited to those available in the QPU graph; see the D-Wave QPU Architecture section.

The D-Wave QPU uses a method called quantum annealing (QA) to return low-energy spin states of $\text{E}_{ising}$ for given inputs $\vc{h}$ and $\vc{J}$. The quantum annealing process starts at time $t=0$ with $A(0) \gg B(0)$, which leads to a trivial and easily initialized quantum ground state of the system where each spin, $s_i$, is in a delocalized combination of its classical states $s_i = \pm 1$. The system is then slowly annealed by decreasing $A$ and increasing $B$ until time $t_f$, which users specify via the annealing_time parameter. Figure 80 shows how $A$ and $B$ change over time. (For simplicity, we introduce a normalized anneal fraction, $s$, parameter ranging from 0 to 1.)

At the end of the computation, when ${s} = 1$ and $A(1) \ll B(1)$, the qubits have dephased to classical systems and the ${\hat\sigma_{z}^{(i)}}$ can be replaced by classical spin variables $s_i = \pm 1$. At this point, the system is described by the classical Ising spin system

\begin{equation} \text{E}_{ising}(\vc{s}) = \sum_{i} h_i s_i + \sum_{i>j} J_{i,j} s_i s_j \end{equation}

such that the classical spin states represent a low-energy solution.

Note

If you are using the Leap™ quantum cloud service from D-Wave, you can find the $A(s)$ and $B(s)$ values for the QPUs in the cloud here: https://support.dwavesys.com/hc/en-us/articles/360005267253-QPU-Specific-Anneal-Schedules. If you have an installed system, contact D-Wave to obtain the values for your system. Fig. 80 Annealing functions $A(s)$, $B(s)$. Annealing begins at $s=0$ with $A(s) \gg B(s)$ and ends at $s=1$ with $A(s) \ll B(s)$. Data shown are representative of Advantage systems.

## Coupled rf-SQUID Qubits¶

The D-Wave QPU is built with a network of tunably coupled rf superconducting quantum–interference device (rf-SQUID) qubits; see [Har2010_2]. The physical Hamiltonian of this set of coupled rf-SQUIDs in the qubit approximation is

\begin{equation} H = -\frac{1}{2}\sum_i\left[\Delta_q(\Phi_{\rm CCJJ}(s)) {\hat\sigma_{x}^{(i)}} - 2 h_i |I_p(\Phi_{\rm CCJJ}(s))| \Phi^x_i(s) {\hat\sigma_{z}^{(i)}} \right] + \sum_{i>j} J_{i,j} M_{\rm AFM} I_p(\Phi_{\rm CCJJ}(s))^2 {\hat\sigma_{z}^{(i)}} {\hat\sigma_{z}^{(j)}} \end{equation}

where $\Delta_q$ is the energy difference between the two eigenstates of the rf-SQUID qubit with no external applied flux (the degeneracy point) where the eigenstates are $(\ket{0} \pm \ket{1})/\sqrt{2}$. This energy difference captures the contribution of coherent tunneling between the two wells. $I_p$ represents the magnitude of the current flowing in the body of the rf-SQUID loop; see Figure 81. $M_{\rm AFM}$ is the maximum mutual inductance generated by the couplers between the qubits (typically 2 pH), $\Phi_i^x(s)$ is an external flux applied to the qubits, and $\Phi_{\rm CCJJ}(s)$ is an external flux applied to all qubit compound Josephson-junction structures to change the potential energy shape of the rf-SQUID qubit. Fig. 81 Typical $I_p$ vs $\Delta_q$. This relationship is fixed by the physical parameters of the rf-SQUID qubit. Data shown are representative of D-Wave 2X systems.

To map this system to the Ising spin in the transverse field Hamiltonian discussed in the Quantum Annealing with Ising Spins in a Transverse Field section, set $\Phi^x_i(s) = M_{\rm AFM} |I_p(s)|$. Thus, as $\Phi_{\rm CCJJ}(s)$ changes during the anneal, $\Phi^x_i(s)$ changes as required to keep the relative energy ratio between the $h$ and $J$ terms constant. In particular, the physical flux applied to the qubit to implement a fixed $h$ value increases as the anneal progresses. Then, the mapping to the Ising Hamiltonian becomes:

\begin{align} A(s) &= \Delta_q(\Phi_{\rm CCJJ}(s)) \nonumber \\ B(s) & = 2M_{\rm AFM} |I_p(\Phi_{\rm CCJJ}(s))|^2 \end{align}

The relationship between $\Delta_q(\Phi_{\rm CCJJ})$ and $I_p(\Phi_{\rm CCJJ})$ is fixed by the physical parameters of the rf-SQUID qubit. Changing the applied $\Phi_{\rm CCJJ}$ moves the rf-SQUID qubit along the curve shown in Figure 81.

For simplicity, we introduce a normalized annealing bias,

\begin{equation} c(s) = \frac{\Phi_{\rm CCJJ}(s) - \Phi_{\rm CCJJ}^{\rm initial}}{\Phi_{\rm CCJJ}^{\rm final} - \Phi_{\rm CCJJ}^{\rm initial}}, \end{equation}

where $\Phi_{\rm CCJJ}^{\rm initial}$ and $\Phi_{\rm CCJJ}^{\rm final}$ are the values of $\Phi_{\rm CCJJ}$ at $s = 0$ and $s = 1$, respectively ($c(0) = 0$ and $c(1)$ = 1).

The signal $c(s)$ is provided by an external room temperature current source. The time-dependence of this bias signal is chosen to produce a linear growth in time of the persistent current flowing in the rf-SQUID flux qubits, $I_p(s)$. Because $B(s) = 2 M_{\rm AFM} I_p(s)^2$, the problem energy scale grows quadratically in time (as seen in Figure 80).

## Annealing Energy Functions¶

This section describes quantum annealing energy functions and freezeout points on the D-Wave QPU and explains how tunneling energy changes as rf-SQUID qubits are coupled.

### Energy Scales¶

Two energy scales are relevant during quantum annealing, $A(s)$ and $B(s)$:

• $A(s)$ represents the transverse, or tunneling, energy. It equals $\Delta_q$, as defined in the Coupled rf-SQUID Qubits section.
• $B(s)$ is the energy applied to the problem Hamiltonian. It equals $2M_{\rm AFM}I_p(s)^2$, where $M_{\rm AFM}$ represents the maximum available mutual inductance achievable between pairs of flux qubit bodies.

Energy scales $A(s)$ and $B(s)$ change during the quantum annealing process. In particular, a single, global, time-dependent bias controls the trajectory of $A$ and $B$. At any intermediate value of $s$, the ratio $A(s)/B(s)$ is fixed. We can choose the trajectory of one of $A$ or $B$ with time. The standard annealing schedule, $s = t / t_f$, results in $I_p(s)$ growing linearly with time, producing a quadratic growth in $B(s)$. Typical values of $A(s)$ and $B(s)$ are shown in Figure 80.

### Freezeout Points¶

$A(s)$ sets the time scale for qubit dynamics. As annealing progresses, $A(s)$, and therefore this time scale, decreases. When the dynamics of the complex Ising spin system become slow compared to $t_f$, the network is frozen—that is, the spin state does not change appreciably as the Ising spin Hamiltonian evolves. While in general each Ising spin problem has different dynamics, it is instructive to analyze a simple system consisting of clusters of uniformly coupled qubits. These clusters of coupled qubits are called logical qubits.

Networks of logical qubits freeze out at different points in the annealing process, depending on several factors, including:

  For a logical qubit made of 3 qubits, for example, the relevant multiqubit states might be $\ket{\uparrow \uparrow \uparrow}$ and $\ket{\downarrow \downarrow \downarrow}$.
• Number of qubits in the network
• Coupling strengths between the qubits
• Overall time scale of the anneal, $t_f$

In general, freezeout points move earlier in $s$ for larger logical qubit sizes, for more strongly coupled logical qubits, and for smaller annealing time $t_f$. Figure 82 and Figure 83 show representative freezeout points for several qubit network sizes. A network of logical qubits is created by coupling multiple qubits to a single central qubit using $J = +1$; see the Using Two-Spin Systems to Measure ICE section for more details. Fig. 82 Representative annealing schedules with freezeout points for several logical qubit sizes and $t_f = 5\ \mu s$. The dashed lines show localization points for a $N = 1$ through $N = 5$ logical qubit clusters from right to left, respectively. Data shown are representative of D-Wave 2X systems. Fig. 83 Representative annealing schedules with freezeout points for several logical qubit sizes and $t_f = 100\ \mu s$. The dashed lines show localization points for a $N = 1$ through $N = 5$ logical qubit clusters from right to left, respectively. Data shown are representative of D-Wave 2X systems.

Measuring $I_p$ at the freezeout point of various-sized logical qubits results in Figure 84. The figure shows that less annealing time and larger clusters move the freezeout point earlier in the anneal, where $I_p$ is lower. Fig. 84 $I_p$ (persistent current) at the freezeout point as a function of logical qubit size and annealing time. The change in $I_p$ at freezeout, as the annealing time varies from 5 to 1000 $\mu s$ and the logical spin cluster size varies from 1 to 5. Shorter annealing time and larger clusters move the freezeout point earlier in the anneal, where $I_p$ is lower. Data shown are representative of D-Wave 2X systems.

  While absolute calibration is difficult through the SAPI interfaces, if temperature is used as an absolute calibration factor, the population statistics of a simple logical qubit transition can approximately determine the relative value of $I_p$ at freezeout.

The signal corresponding to fixed $h$ values scales as $I_p$. Larger logical qubits freeze out earlier in the annealing process: at lower $I_p$. Thus, if an erroneous fixed flux offset exists in the physical body (a static addition to the term $\Phi^x_i$), then the corresponding $\delta h$ at freezeout grows with the logical qubit size because of the lower persistent current at the freezeout point. This error should roughly double going from a logical qubit of size 1 to a logical qubit of size 5. The standard deviation of $\delta h$ versus logical qubit size plot in Figure 100 indicates that this is approximately true.

## Annealing Controls¶

This section describes the features that allow you to control the annealing process: per-qubit anneal offsets and global anneal schedule changes. The latter supports mid-anneal quench and pause as well as reverse annealing.

  Another feature that enables control of the annealing process, the time-dependent gain applied to linear coefficients (biases), $h_i$, described in the Solver Properties and Parameters Reference guide, is currently used only experimentally and not described here.

### Anneal Offsets¶

The standard annealing trajectory lowers $A(s)$ and raises $B(s)$ identically for all qubits in the QPU. This single annealing path, however, may not be ideal for some applications of quantum annealing. This section describes anneal offsets, which allow you to adjust the standard annealing path per qubit.

Note

Anneal offsets are not supported on D-Wave 2X and earlier systems. Before using this feature, query the solver properties using SAPI calls to determine whether it is supported and, if so, to obtain the available tuning ranges per qubit.

As discussed above, the annealing process is controlled via a global, time-dependent bias signal $c(s)$ that simultaneously modifies both $A(s)$ and $B(s)$. Figure 80 shows typical $A(s)$ and $B(s)$ across the annealing algorithm. Figure 85 plots the annealing bias $c(s)$ versus $s$. Because of the shape of the rf-SQUID flux qubit energy potential, $c(s)$ is not linear in $s$ but is chosen to ensure that $I_p(s)$ grows linearly with $s$. Fig. 85 Annealing control bias $c$ versus anneal fraction $s$. At $c=0, s=0,$ $A(s) \gg B(s)$, and at $c=1, s=1, A(s) \ll B(s).$ Data shown are representative of D-Wave 2000Q systems.

On-QPU DACs allow adjustments of static annealing offsets $\delta c_i$ per qubit, thereby advancing or delaying the annealing signal locally for each. Figure 86 shows an example of the annealing control bias with $\delta c_i = 0.05$ and $\delta c_i = -0.05$. Note that the anneal offset is a vertical shift up or down in annealing control bias, not a shift in $s$. Fig. 86 Annealing control bias $c$ versus anneal fraction $s$ for several anneal offset values. Data shown are representative of D-Wave 2000Q systems.

Advancing or delaying the annealing bias by setting $\delta c_i \ne 0$ changes the transverse field $A_i(s)$ with respect to the original global $A(s)$. This allows you to increase or decrease $A_i(s)$. Note that $\delta c_i > 0$ advances the annealing process ($A_i(s) < A(s)$) and $\delta c_i < 0$ delays the annealing process ($A_i(s) > A(s)$). Figure 87 shows typical $A_i(s)$ versus $s$ for two values of $\delta c_i$. Both $A(s)$ and $B(s)$ simultaneously change with control bias $c$. Thus, a consequence of advancing or delaying the annealing process with anneal offset $\delta c_i$ is that $B(s)\rightarrow B_i(s)$. Figure 87 also shows typical $B_i(s)$ versus $s$ for the same set of $\delta c_i$. Fig. 87 $A(s)$ and $B(s)$ versus anneal fraction $s$. At $c(s=0)=0$, $A(s) \gg B(s)$, and at $c(s=1) = 1, A(s) \ll B(s)$. The baseline curve ($\delta c_i = 0$) is shown along with $A_i(s), B_i(s)$ for anneal offset $\delta c_i=0.05$ (advances the annealing process for the qubit) and $A_i(s),B_i(s)$ for anneal offset $\delta c_i = -0.05$ (delays the annealing process for the qubit). Data shown are representative of D-Wave 2000Q systems.

The change of $B(s)\rightarrow B_i(s)$ has consequences for the target Ising spin Hamiltonian parameters $h_i$ and $J_{i,j}$. The anneal offset for the $i{\rm th}$ qubit deflects $h_i\rightarrow h_i(\delta c_i,s)$, and the anneal offsets for the $i{\rm th}$ and $j{\rm th}$ qubit deflect $J_{i,j} \rightarrow J_{i,j}(\delta c_i,\delta c_j,s)$. Figure 88 shows plots of $h_i(\delta c_i,s)$ and $J_{i,j}(\delta c_i,0,s)$ for several values of $\delta c_i$. Figure 89 shows plots of $J_{i,j}(\delta c_i,\delta c_j,s)$ for several values of $\delta c_i, \delta c_j$. Fig. 88 $h_i(\delta c_i, s)/h$ and $J_{i,j}(\delta c_i,0, s)/J$ versus $s$ for several values of $\delta c_i$. The deviation of these quantities is $s$-dependent. You can choose a particular value of $s$ at which to exactly compensate the change in target parameter by adjusting $h_i(\delta c_i, s)\rightarrow h_i', J_{i,j}(\delta c_i,0, s) \rightarrow J_{i,j}'$. For values of $s$ before or after this point, a residual change in target parameter remains. Data shown are representative of D-Wave 2000Q systems. Fig. 89 $J_{ij}(\delta c_i,\delta c_j, s)/J$ versus $s$ for several values of $\delta c_i$ and $\delta c_j$. The deviation of these quantities away from 1 is $s$-dependent. You can choose a particular value of $s$ at which to exactly compensate the change in target parameter by adjusting $J_{i,j}(\delta c_i,\delta c_j, s)\rightarrow J_{i,j}'$. For values of $s$ before or after this point, a residual change in target parameter remains. Data shown are representative of D-Wave 2000Q systems.

The changes shown in Figure 88 and Figure 89 are $s$-dependent. The largest changes are earlier in the annealing process. You can choose a particular value of $s^*$ at which to exactly compensate these changes in target parameters by rescaling the requested target parameters. However, for values of $s$ before or after $s^*$, a residual change in target parameter remains.

Anneal offsets may improve results for problems in which the qubits have irregular dynamics for some easily determined reason. For example, if a qubit’s final value does not affect the energy of the classical state, you can advance it (with a positive offset) to reduce quantum bias in the system; see [Kin2016]. Anneal offsets can also be useful in embedded problems with varying chain length: longer chains may freeze out earlier than shorter ones—at an intermediate point in the anneal, some variables act as fixed constants while others remain undecided. If, however, you advance the anneal of the qubits in the shorter chains, they freeze out earlier than they otherwise would. The correct offset will synchronize the annealing trajectory of shorter chains with that of the longer ones. As a general rule, if a qubit is expected to be subject to a strong effective field relative to others, delay its anneal with a negative offset.

Determining the optimum offsets for different problem types is an area of research at D-Wave. Expect that the appropriate offsets for two different qubits in the same problem to be within 0.2 normalized offset units of each other.

### Variations on the Global Anneal Schedule¶

You can make changes to the global anneal schedule by submitting a set of points that that define the piece-wise linear (PWL) waveform of the annealing pattern you want. You can change the standard (forward) schedule by introducing a pause or a quench, or you can initialize the qubits into a specific classical state and anneal in reverse from there. This section describes these features.

Note

Variations on the global anneal schedule are not supported on D-Wave 2X and earlier systems. For D-Wave 2000Q and Advantage systems, the maximum number of points in a schedule is system-dependent.

Note

The maximum number of points permitted in an anneal schedule varies by system and by release. Check the max_anneal_schedule_points solver property to obtain this value. For reverse annealing, the maximum number of points allowed is one more than the number given by this property.

#### Pause and Quench¶

Advantage and D-Wave 2000Q system provide more control over the global annealing trajectories than was possible in previous products. As before, you can scale the quadratic growth in persistent current—that is, quadratic growth in $B(t)$—using the annealing_time parameter. New with the D-Wave 2000Q and later systems, however, is the anneal_schedule parameter, which allows for a pause or quench partway through the annealing process. A pause dwells for some time at a particular anneal fraction; a quench abruptly terminates the anneal within a few hundred nanoseconds of the point specified. Unlike the anneal offsets feature—which allows you to control the annealing path of individual qubits separately—anneal schedule changes apply to all qubits in the working graph.

  The annealing_time and anneal_schedule parameters are mutually exclusive.

Changes to the schedule are controlled by a PWL waveform comprising $n$ pairs of points. The first element is time $t$ in microseconds; the second, the anneal fraction, $s$, as a value between 0 and 1. This input causes the system to produce linear changes in $s$ between $s_i$ and $s_{i+1}$. The maximum slope for each segment of the PWL waveform is the upper limit given in the Annealing Slope Range parameter.

  This and other properties of your system are given in the QPU Properties document.

The following rules apply to the set of anneal schedule points provided:

• Time $t$ must increase for all points in the schedule.

• For forward annealing, the first point must be $(0, 0)$.

• For reverse annealing, the anneal fraction $s$ must start and end at $s = 1$.

• In the final point, anneal fraction $s$ must equal 1 and time $t$ must not exceed the maximum value in the annealing_time_range property.

• The number of points must be $\geq 2$.
The upper bound is system-dependent; check the max_anneal_schedule_points property. For reverse annealing, the maximum number of points allowed is one more than the number given by this property.
• Additional rules that govern maximum slope vary by product; check the QPU properties document for your system.

The table below gives three valid examples of anneal schedule points, producing the varying patterns of $B(t)$ that appear in Figure 90.

Table 50 Anneal Schedule Tuples: Examples
Points Result
${(0.0, 0.0) (20.0, 1.0)}$ Standard trajectory of 20-$\mu s$ anneal. Here, $B(t)$ grows quadratically with time.
${(0.0, 0.0) (10.0, 0.5) (110.0, 0.5) (120.0, 1.0)}$ Mid-anneal pause at $s = 0.5$. The quadratic growth of $B(t)$ is interrupted by a 100-$\mu s$ pause halfway through.
${(0.0, 0.0) (10.0, 0.5) (12.0, 1.0)}$ Mid-anneal quench at $s = 0.5$. The quadratic growth of $B(t)$ is interrupted by a rapid 2-$\mu s$ quench halfway through. Fig. 90 Annealing schedule variations. Top: Standard quadratic growth of $B$ for $t_f = 20\ \mu s$. Middle: Mid-anneal pause of $100\ \mu s$ at $s = 0.5$ for a background $t_f = 20\ \mu s$. Bottom: Mid-anneal quench of $2\ \mu s$ at $s = 0.5$ for a background $t_f = 20\ \mu s$.

This degree of control over the global annealing schedule allows you to study the quantum annealing algorithm in more detail. For example, a pause can be a useful diagnostic tool for instances with a small perturbative anticrossing. Figure 91 shows typical measurements of the 16-qubit instance reported in [Dickson2013] with a pause inserted. While pauses early or late in the anneal have no effect, a pause near the expected perturbative anticrossing produces a large increase in the ground-state success rate. Fig. 91 Typical measurements from a 16-qubit reference instance showing pauses of different lengths inserted at different points in the annealing schedule. Pauses inserted near the expected anticrossing increase the likelihood of obtaining results from the ground state. Data shown are representative of D-Wave 2000Q systems.

Another example is a quench inserted at $s < 1$. If the quench is fast compared to problem dynamics, then the distribution of states returned by the quench can differ significantly from that returned by the standard annealing schedule. Figure 92 shows typical measurements of the same 16-qubit instance with a quench added. The probability of obtaining ground state samples depends on when in the anneal the quench occurs, with later quenches more likely to obtain samples from the ground state. Fig. 92 Typical measurements from a 16-qubit reference instance showing quenches of different speeds compared to problem dynamics at different points in the anneal schedule. Later quenches increase the likelihood of obtaining results from the ground state. Data shown are representative of D-Wave 2000Q systems.

#### Reverse Annealing¶

As described above, the annealing functions $A(s)$ and $B(s)$ are defined such that $A(s) \gg B(s)$ at $s=0$ and $A(s) \ll B(s)$ at $s = 1$, where $s$ is the normalized annealing fraction. In the standard quantum annealing protocol, $s$ increases linearly with time, with $s(0)=0$ and $s(t_f)=1$, where $t_f$ is the total annealing time. The network of qubits starts in a global superposition over all possible classical states and, as $s \rightarrow 1$, the system localizes into a single classical state; see Figure 80.

Reverse annealing allows you to initialize the qubits into a specific classical state, begin the evolution at $s = 1$, anneal along a path toward $s=0$, and then return back up to $s=1$. Figure 93 shows a typical reverse annealing process where the system reverses to $s = 0.5$, pauses for $25 \ \mu s$ at $s = 0.5$, and ends at $s=1$.

Note

Spin-reversal transforms are incompatible with reverse annealing. Fig. 93 Typical reverse annealing protocol with a $5 \ \mu s$ equivalent ramp rate from $s = 1$ to $s = 0.5$, followed by a pause at $s = 0.5$ for $25 \ \mu s$, and then a ramp back to $s = 1$.

The reverse annealing feature gives you additional control over and insight into the quantum annealing process in the D-Wave QPU. Examples of how you might use reverse annealing include:

• Quantum Boltzmann sampling—Prepare a classical state and then draw a set of samples from a probability distribution that may be changing with $s$.

• Hybrid algorithms—Prepare a classical state provided by a classical heuristic and then turn on a finite $A(s)$ and allow the system to evolve.

• Tunneling rate measurements—Prepare a particular classical state and measure the rate at which the system tunnels from this state to another for a range of $s$.

• Relaxation rate measurements—Prepare a classical state that is an excited state of the problem Hamiltonian and measure the rate at which the system relaxes to lower energy states for a range of $s$.

  For more information, see Reverse Annealing for Local Refinement of Solutions, D-Wave White Paper Series, no. 14-1018A-A, 2017.

The reverse annealing interface uses three parameters: anneal_schedule defines the waveform $s(t)$, and initial_state and reinitialize_state control the system state at the start of an anneal.

As for pause and quench, the reverse annealing schedule is controlled by a PWL waveform comprising $n$ pairs of points. The first element of the pair is time $t$ in microseconds; the second, the anneal fraction, $s$, as a value between 0 and 1. Just as in forward annealing, time $t$ must increase for all points in the schedule. For a reverse anneal, however, the anneal fraction must start and end at $s = 1$.

The following table shows the tuples that you submit to get the pattern in Figure 93.

Table 51 Reverse Annealing Schedule Tuples: Example
Points Result
${(0.0, 0.1) (2.5, 0.5) (27.5, 0.5) (30.0, 1.0)}$ Reverse anneal, therefore begins at $s = 1$. A $30.0 \ \mu s$ anneal with a mid-anneal pause at $s = 0.5$ that lasts for $25 \ \mu s$.

When supplying a reverse annealing waveform through anneal_schedule, you must also supply the initial state to which the system is set. When multiple reads are requested in a single call to SAPI, you have two options for the starting state of the system. These are controlled by the reinitialize_state Boolean parameter:

• reinitialize_state=true (default)—Reinitialize the initial state for every anneal-readout cycle. Each anneal begins from the state given in the initial_state parameter. Initialization time is required before every anneal-readout cycle. The amount of time required to reinitialize varies by system.
• reinitialize_state=false—Initialize only at the beginning, before the first anneal cycle. Each anneal (after the first) is initialized from the final state of the qubits after the preceding cycle. Initialization time is required only once.

The reinitialize_state parameter affects timing. See Solver Computation Time for more information.

## D-Wave QPU Operation¶

This section describes the programming cycle and the anneal-read cycle of the D-Wave QPU. For information about system timing, see Solver Computation Time.

### Programming Cycle¶

When an Ising problem is provided as a set of $h$ and $J$ values, the D-Wave system conveys those values to the DACs located on the QPU. Room-temperature electronics generate the raw signals that are sent via wires into the refrigerator to program the DACs. The DACs then apply static magnetic-control signals locally to the qubits and couplers. This is the programming cycle of the QPU. After the programming cycle, the QPU is allowed to cool for a postprogramming thermalization time of, typically, 1 ms; see the Temperature section for more details about this cooling time.

  Several other instructions to the system are provided by the user: for example, an annealing time over which the quantum annealing process is to occur. See Solver Properties and Parameters Reference for details.
  In some descriptions, the programming cycle is subdivided into a reset step that erases previous data stored in the DACs, followed by a programming step.

The total time spent programming the QPU, including the postprogramming thermalization time, is reported back as qpu_programming_time.

After the programming cycle, the system switches to the annealing phase during which the QPU is repeatedly annealed and read out. Annealing is performed using the analog lines over a time specified by the user as annealing_time and reported by the QPU as qpu_anneal_time_per_sample. Afterward, the digital readout system of the QPU reads and returns the spin states of the qubits. The system is then allowed to cool for a time returned by the QPU as qpu_delay_time_per_sample—an interval comprising a constant value plus any additional time optionally specified by the user via the readout_thermalization parameter.

The anneal-read cycle is also referred to as a sample. The cycle repeats for some number of samples specified by the user in the num_reads parameter, and returns one solution per sample. The total time to complete the requested number of samples is returned by the QPU as qpu_sampling_time.

## D-Wave QPU Architecture¶

The D-Wave QPU is a lattice of interconnected qubits. While some qubits connect to others via couplers, the D-Wave QPU is not fully connected. Instead, the qubits of D-Wave 2000Q and earlier generations of QPUs interconnect in a topology known as Chimera while Advantage QPUs incorporate the Pegasus topology.

In a D-Wave QPU, the set of qubits and couplers that are available for computation is known as the working graph. The yield of a working graph is typically less than the total number of qubits and couplers that are fabricated and physically present in the QPU.

A given logical problem defined on a general graph can be mapped to a physical problem defined on the working graph using chains. A chain is a collection of qubits bound together to represent a single logical node. The association between the logical problem and the physical problem is carried out by minor embedding. For more details on minor embedding, see D-Wave Problem-Solving Handbook.

The qubits, denoted $q_i$, implement the Ising spins. Their physical connectivity determines which couplings, $J_{i,j}$, can be set to nonzero values. The allowed connectivity is described with a Chimera or Pegasus graph; see Figure 94 and Figure 97.

The following subsections provide a brief overview of the topologies. For further details, see the Getting Started with the D-Wave System guide.

### Chimera¶

The Chimera architecture comprises sets of connected unit cells, each with four horizontal qubits connected to four vertical qubits via couplers. Unit cells are tiled vertically and horizontally with adjacent qubits connected, creating a lattice of sparsely connected qubits. The notation CN refers to a Chimera graph consisting of an $N {\rm x} N$ grid of unit cells. The D-Wave 2000Q QPU supports a C16 Chimera graph: its more than 2000 qubits are logically mapped into a $16 {\rm x} 16$ matrix of unit cells of 8 qubits.

An $M\times N \times L$ Chimera graph consists of an $M\times N$ two-dimensional lattice of blocks, with each block consisting of $2L$ variables, for a total of $2MNL$ variables. Fig. 94 A $3\times 3 \times 4$ Chimera graph. Nodes in an $M\times N\times L$ Chimera graph represent each of the $2MNL$ qubits, $q_i$. Edges (connections between nodes) in the graph, $J_{i,j}$, indicate couplings that may be nonzero. As an example, $J_{3,4}$ may be nonzero because an edge connects qubits 3 and 4, but $J_{2,3}$ must always be zero because no edge connects qubits 2 and 3. The basic repeating block of Chimera (a block of $2L$ variables with complete bipartite connectivity) may be tiled into an $M\times N$ lattice. The left-side variables within each block connect vertically; the right-side variables, horizontally.

As shown in Figure 95, Chimera qubits are characterized as having:

• nominal length 4—each qubit is connected to 4 orthogonal qubits through internal couplers
• degree 6—each qubit is coupled to 6 different qubits Fig. 95 Each qubit has 6 couplers, 4 within and 2 between unit cells.

### Pegasus¶

In the Pegasus topology, qubits are “oriented” vertically or horizontally, as in Chimera, but similarly aligned qubits are also shifted, as illustrated in Figure 96. Fig. 96 A cropped view of the Pegasus topology with qubits represented as horizontal and vertical loops. This graphic shows approximately three rows of 12 vertical qubits and three columns of 12 horizontal qubits for a total of 72 qubits, 36 vertical and 36 horizontal.

For QPUs with the Pegasus topology it is conceptually useful to categorize couplers as internal, external, and odd. Figure 97 shows the coupling of qubits in this topology. Fig. 97 Coupled qubits (represented as horizontal and vertical loops): the horizontal qubit in the center, shown in red and numbered 1, with its odd coupler and paired qubit also in red, is internally coupled to vertical qubits, in pairs 3 through 8, each pair and its odd coupler shown in a different color, and externally coupled to horizontal qubits 2 and 9, each shown in a different color.

Pegasus features qubits of degree 15 and native $K_4$ and $K_{6,6}$ subgraphs. Pegasus qubits are considered to have a nominal length of 12 (each qubit is connected to 12 orthogonal qubits through internal couplers) and degree of 15 (each qubit is coupled to 15 different qubits).

As we use the notation $C_n$ to refer to a Chimera graph with size parameter N, we refer to instances of Pegasus topologies by $P_n$; for example, $P_3$ is a graph with 144 nodes.

### Virtual Graphs¶

The D-Wave virtual graph feature provides tools and interactive examples that simplify the process of minor-embedding by enabling you to more easily create, optimize, use, and reuse an embedding for a given working graph. When you submit an embedding and specify a chain strength using these tools, they automatically calibrate the qubits in a chain to compensate for the effects of biases that may be introduced as a result of strong couplings. For more information on virtual graphs, see Virtual Graphs for High-Performance Embedded Topologies, D-Wave White Paper Series, no. 14-1020A, 2017. This and other white papers are available from https://www.dwavesys.com/resources/publications.

This section discusses the underlying controls that support virtual graphs, and that are also available for direct use: extended $J$ range and flux-bias offsets. These controls allow chains to behave more like physical qubits on the working graph, thereby improving the performance of embedded sampling and optimization problems.

Note

Despite the similarity in name, the virtual graphs feature is unrelated to D-Wave’s virtual full-yield chip (VFYC) solver.

#### Extended $J$ Range¶

As explained above, the Ising minimization problems that the D-Wave system solves may require that the model representing a problem be minor-embedded on the working graph, a process that involves creating qubit chains to represent logical variables. In an embedding, intra-chain qubit couplings must be strong compared to the input couplings between the chains.

Most discussions of chain strength involve the ratio of two absolute values:

• Chain coupling strength—Magnitude of couplings between qubits in a chain
• Problem scale—Maximum magnitude of couplings among qubits (physical or logical) in a problem

that is,

\begin{equation} \frac{chain\_coupling\_strength}{problem\_scale}. \end{equation}

For example, if all of our chains have $J$ values of $-1$, and our rescaled logical problem has $J$ values of $- \frac{1}{4}$ to $+ \frac{1}{4}$, we say that the chain strength is $4$. Likewise, if the chains have $J$ values of $-2$, and our rescaled logical problem has $J$ values of $- \frac{1}{2}$ to $+ \frac{1}{2}$, again we say that the chain strength is $4$.

Because the range of coupling strengths available is finite, chaining is typically accomplished by setting the coupling strength to the largest allowed negative value and scaling down the input couplings relative to that. Yet a reduced energy scale for the input couplings may make it harder for the QPU to find global optima for a problem.

To address this issue, some solvers support stronger chain couplings through an extended $J$ range. Because embedded problems typically have chain couplings that are at least twice as strong as the other couplings, and standard chain couplings are all negative, this feature effectively doubles the energy scale available for embedded problems; see Figure 98. Fig. 98 Embedding an input that does not fit directly on a D-Wave QPU. The original problem (A) has a green vertex that is replaced by a chain of two vertices connected with a ferromagnetic coupling (B). With the standard coupling range, the embedded problem must be scaled down for the QPU, which can lead to decreased performance (C). However, with an extended coupling range, no rescaling is necessary (D).

Using the available larger negative values of $J$ increases the dynamic $J$ range. On embedded problems, which use strong chains of qubits to build the underlying graph, this increased range means that the problem couplings are less affected by ICE and other effects. However, strong negative couplings can bias a chain and therefore flux-bias offsets must be applied to recalibrate it to compensate for this effect; see the next section for more information.

#### Flux-Bias Offsets¶

In an optimal QPU calibration, annealing an unbiased chain produces spin-state statistics that are equally split between spin-up and spin-down. When plotted against the $h$ values, this even distribution results in a sigmoid curve that passes through the point of origin (0,0); see Figure 99. However, qubits in a chain with strong negative $J$ values experience a $J$-induced bias—an offset magnetic field that is potentially $s$-dependent. This field shifts the sigmoid curve of plotted $h$ values from its ideal path. To compensate, chains using strong negative $J$ couplings must be recalibrated to remove the bias from the chain and reduce the incidence of analog errors that may be associated with minor-embedding.

Recalibration involves applying per-qubit flux-bias offsets to nudge the plotted $h$ sigmoid to its ideal position. The optimal offset value for a chain depends on the qubits and couplers involved and on the chain coupling strength.

Note

The applied flux bias is constant in time (and $s$), but it appears in the Hamiltonian shown in Eqn. 2.7 as a term $I_p \phi_{\rm flux bias} \sigma_z$—the applied energy grows as $\sqrt{A(s)}$. An applied flux bias is different from an applied $h$; do not use one to correct an error in the other.

  This offset occurs because of the higher susceptibility of the tunable coupler [Har2009].
  The same flux-bias offset values are used for all qubits in a chain. Fig. 99 Magnetization (mean spin value) of a single chain as a function of applied per-qubit flux-bias offset. A sigmoid function (dashed line) is fit to the data points, and the optimal per-qubit flux-bias offset is given by the x-intercept of this function.

As stated above, D-Wave provides tools that automatically calculate and apply the appropriate flux-bias offsets when you submit an embedding that has strongly coupled chains. At a high level, however, the process of determining the value for the flux-bias offsets involves:

1. Define the strong chain that you intend to use in your embedding. Define it independently from any problem.
2. Sweeping the per-qubit flux-bias offset from the minimum allowed value to the maximum allowed value, anneal the chain many times (e.g., 1000) and calculate the average spin state of the chain (the magnetization).
3. Plot the results, with the per-qubit flux-bias offset on the x-axis and the chain magnetization on the y-axis. Fit a sigmoid model (in this case, a tanh function) to the data.
4. Measure the x-intercept of the sigmoid model (i.e., the per-qubit flux-bias offset value that results in a perfectly balanced magnetization of 0).
5. Add flux-bias offsets across the qubits in the chain to achieve an equal probability of reading spin up or spin down.

This operation is known as balancing the chain.

  While you can also balance a chain by adjusting the $h$ values on each of the chained qubits, this is less effective as a means to improve performance.

#### Usage Guidelines for Virtual Graph Features¶

Whether virtual graph features are available may vary by solver; check for the extended_j_range property to see if it is present and what the range is. (The j_range property is unchanged.) When using an extended $J$ range, be aware that there are additional limits on the total coupling per qubit: the sum of the $J$ values for each qubit must be in the range specified by the per_qubit_coupling_range solver property.

Flux-biases are set through the flux_biases parameter, which takes a list of doubles (floating-point numbers) of length equal to the number of working qubits. Flux-bias units are $\Phi_0$; typical values needed are $< 10^{-4} \ \Phi_0$. Maximum allowed values are typically $> 10^{-2} \ \Phi_0$. The minimum step size is smaller than the typical levels of intrinsic $1/f$ noise; see the Flux Noise and Quantum Annealing appendix.

Note

By default, the D-Wave system automatically compensates for flux drift as described in the Flux Noise and Quantum Annealing appendix. If you choose to disable this behavior, we recommend that you apply flux-bias offsets manually through the flux_biases parameter.

Be aware of the following points when using virtual graph features:

• Neither spin-reversal transformation (SRT) nor autoscaling is possible with extended $J$ ranges or flux-bias offsets. If you specify one or more SRTs with a problem submission, the problem is scaled down to use $J$ values that fall within the j_range property instead.

• When you use these features, autoscaling is automatically disabled by default, unlike in the usual system behavior.

• The $J$ value of every coupler must fall within the extended_j_range property.

• The sum of all the $J$ values of the couplers connected to a qubit must fall within the per_qubit_coupling_range property. For example, if this property is [-9.0 6.0], the following $J$ values for a six-coupler qubit are permissible:

$1, 1, 1, 1, 1, 1,$

where the sum is $6$, and also

$1, 1, 1, -2, -2, -2,$

where the sum is $-3$. However, the following values, when summed, exceed the range and therefore are impermissible:

$-2, -2, -2, -2, -2, -2.$
• While the extended $J$ range in principle allows you to create almost arbitrarily long chains without breakage, the expected chain length where embedded problems work well is in the range of 5 to 7 qubits.

• When embedding logical qubits using the extended $J$ range, limit the degree, $D$, of each node in the logical qubit tree to

\begin{equation} D = {\rm floor}\bigg[\frac{\min(per\_qubit\_coupling\_range) - num\_couplers\_per\_qubit \times \min(j\_range)}{\min(extended\_j\_range) - \min(j\_range)}\bigg] \end{equation}

where $num\_couplers\_per\_qubit = 6$ for the D-Wave 2000Q QPU; see Figure 95.