Stating the Problem¶
Once properly stated, a problem can be formulated as an objective function to be solved on a D-Wave solver. The example in the Introduction chapter states a communication-networks problem as the well-known graph problem, vertex cover.
D-Wave provides several resources containing many reference examples:
Ocean software’s collection of code examples on GitHub.
D-Wave corporate website papers (for example, [dwave2]) and links to user applications.
For beginners, Ocean documentation provides a series of code examples, graded by level of experience, that explain various aspects of solving problems with quantum computing.
This chapter provides a sample of problems in various fields, and the available resources for each (at the time of writing); having a relevant reference problem may enable you to use similar solution steps when solving your own problems on D-Wave solvers.
Many of these problems are discrete optimization, also known as combinatorial optimization: the optimization of an objective function defined over a set of discrete values such as Booleans.
Keep in mind that there are different ways to model a given problem; for example, a constraint satisfaction problem (CSP) can have various domains, variables, and constraints. Model selection can affect solution performance, so it may be useful to consider various approaches.
Problem                        | Beginner Content Available? | Solvers     | Content
Circuits & Fault Diagnosis     |                             | QPU, hybrid | Code, papers
Computer Vision                |                             | QPU         | Papers
Database Queries (SAT Filters) |                             | QPU         | Papers
Factoring                      | Yes                         | QPU         | Code, papers
Finance                        |                             | QPU, hybrid | Papers
Graph Partitioning             | Yes                         | QPU, hybrid | Code, papers
Machine Learning               |                             | QPU         | Papers
Map Coloring                   | Yes                         | QPU, hybrid | Code, paper
Material Simulation            |                             | QPU         | Papers
Scheduling                     | Yes                         | QPU, hybrid | Code, papers
Traffic Flow                   |                             | QPU, hybrid | Papers
Whether or not you see a relevant problem here, it’s recommended that you check D-Wave’s collection of code examples and corporate website for the latest examples of problems from all fields of study and industry.
Circuits & Fault Diagnosis¶
Fault diagnosis attempts to quickly localize failures as soon as they are detected in systems such as sensor networks, process monitoring, and safety monitoring. Circuit fault diagnosis attempts to identify failed gates during manufacturing, under the assumption that gate failure is rare enough that the minimum number of gates failing is the most likely cause of the detected problem.
The Example Reformulation: Circuit Fault Diagnosis section in the Reformulating a Problem chapter shows the steps of solving a circuit fault diagnosis problem on a D-Wave QPU.
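As a toy illustration of this penalty-model idea (a minimal sketch of my own, assuming Ocean’s dimod package, not the examples below), a single AND gate can be represented as a binary quadratic model whose ground states are exactly the gate’s valid input/output assignments; an observed assignment with higher energy is evidence of a fault::

    import dimod

    # Penalty model for an AND gate: valid configurations minimize the energy
    bqm = dimod.generators.and_gate('in0', 'in1', 'out')

    valid = {'in0': 1, 'in1': 1, 'out': 1}
    faulty = {'in0': 1, 'in1': 1, 'out': 0}   # output contradicts a working AND

    print(bqm.energy(valid))    # ground-state energy: consistent with a working gate
    print(bqm.energy(faulty))   # higher energy: evidence the gate failed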
Code Examples¶
Ocean documentation example: Multiple-Gate Circuit
Solves a logic circuit problem using Ocean tools to demonstrate solving a CSP on a D-Wave QPU solver.
dwave-examples code example: circuit-fault-diagnosis
Demonstrates the use of D-Wave solvers to diagnose faults in a three-bit multiplier circuit.
dwave-examples code example: circuit-equivalence
Verifies equivalence of two representations of electronic circuits using a discrete quadratic model (DQM).
Papers¶
[Bia2016] discusses embedding fault diagnosis CSPs on the D-Wave system.
[Bis2017] discusses a problem of diagnosing faults in an electrical power-distribution system.
[Pap1976] discusses decomposing complex systems for the problem of generating tests for digital-faults detection.
[Per2015] maps fault diagnosis to a QUBO and embeds it onto a QPU.
Computer Vision¶
Computer vision develops techniques to enable computers to analyze digital images, including video. The field has applications in industrial manufacturing, healthcare, navigation, military, and many other areas.
Papers¶
[Arr2022] compares quantum and classical approaches to motion segmentation.
[Bir2021] discusses a quantum algorithm for solving a synchronization problem, specifically permutation synchronization, a nonconvex optimization problem in discrete variables.
[Gol2019] derives an algorithm for correspondence problems on point sets.
[Li2020] translates detection scores from bounding boxes and overlap ratio between pairs of bounding boxes into QUBOs for removing redundant object detections.
[Ngu2019] demonstrates good prediction performance of a regression algorithm on lattice quantum chromodynamics simulation data using a D-Wave 2000Q.
Database Queries (SAT Filters)¶
A satisfiability (SAT) filter is a small data structure that enables fast querying over a huge dataset by allowing for false positives (but not false negatives).
Factoring¶
The factoring problem is to decompose a number into its factors. There is no known method to quickly factor large integers—the complexity of this problem has made it the basis of public-key cryptography algorithms.
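For intuition, the problem can be phrased directly as constrained optimization: find integers whose product equals the target. Below is a minimal sketch of my own using Ocean’s dimod symbolic math (the examples below use a more scalable multiplication-circuit formulation)::

    import dimod

    # Factor 21: two integer variables constrained so that their product is 21
    a = dimod.Integer('a', lower_bound=2, upper_bound=7)
    b = dimod.Integer('b', lower_bound=2, upper_bound=7)

    cqm = dimod.ConstrainedQuadraticModel()
    cqm.add_constraint(a * b == 21, label='product')

    # Exhaustive enumeration works only at toy scale; larger instances
    # would go to a hybrid solver such as LeapHybridCQMSampler
    sampleset = dimod.ExactCQMSolver().sample_cqm(cqm)
    feasible = sampleset.filter(lambda d: d.is_feasible)
    print(feasible.first.sample)   # e.g., a=3, b=7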
Code Examples¶
dwave-examples code example: factoring
Demonstrates the use of D-Wave QPU solvers to solve a small factoring problem.
dwave-examples code example: factoring-notebook
Demonstrates the use of D-Wave QPU solvers to solve a small factoring problem.
Papers¶
[Dri2017] investigates prime factorization using quantum annealing and computational algebraic geometry, specifically Gröbner bases.
[Dwave3] discusses integer factoring in the context of using the D-Wave Anneal Offsets feature; see also the Anneal Offsets section.
[Bur2002] discusses factoring as optimization.
[Jia2018] develops a framework to convert an arbitrary integer factorization problem to an executable Ising model.
[Lin2021] applies deep reinforcement learning to configure adiabatic quantum computing on prime factoring problems.
Finance¶
Portfolio optimization is the problem of optimizing the allocation of a budget to a set of financial assets.
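As a sketch of the basic mean-variance (Markowitz) formulation with binary pick/don’t-pick variables (toy data of my own; real formulations add budget and other constraints)::

    import numpy as np
    import dimod

    returns = np.array([0.10, 0.07, 0.15])      # toy expected returns
    cov = np.array([[0.05, 0.01, 0.02],         # toy covariance matrix
                    [0.01, 0.04, 0.01],
                    [0.02, 0.01, 0.08]])
    gamma = 0.5                                 # risk-return trade-off

    # Minimize risk (x^T cov x) minus weighted expected return
    bqm = dimod.BinaryQuadraticModel('BINARY')
    for i in range(3):
        bqm.add_linear(i, cov[i, i] - gamma * returns[i])
        for j in range(i + 1, 3):
            bqm.add_quadratic(i, j, 2 * cov[i, j])

    print(dimod.ExactSolver().sample(bqm).first.sample)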
Papers¶
[Coh2020] investigates the use of quantum computers for building an optimal portfolio.
[Coh2020b] analyzes 3,171 US common stocks to create an efficient portfolio.
[Das2019] provides a quantum annealing algorithm in QUBO form for a dynamic asset allocation problem using expected shortfall constraint.
[Din2019] seeks the optimal configuration of a supply chain’s infrastructures and facilities based on customer demand.
[Els2017] discusses using Markowitz’s optimization of the financial portfolio selection problem on the D-Wave system.
[Gra2021] uses portfolio optimization as a case study by which to benchmark quantum annealing controls.
[Kal2019] explores how commercially available quantum hardware and algorithms can solve real-world problems in finance.
[Mug2020] implements dynamic portfolio optimization on quantum and quantum-inspired algorithms and compares with D-Wave hybrid solvers.
[Mug2021] proposes a hybrid quantum-classical algorithm for dynamic portfolio optimization with minimal holding period.
[Oru2019] looks at forecasting financial crashes.
[Pal2021] implements some complex real-life constraints on the portfolio optimization problem in a simple way.
[Phi2021] selects a set of assets for investment such that the total risk is minimized, a minimum return is realized, and a budget constraint is met.
[Ros2016a] discusses solving a portfolio optimization problem on the D-Wave system.
[Ven2019] investigates a hybrid quantum-classical solution method for mean-variance portfolio optimization problems.
Graph Partitioning¶
Graph partitioning is the problem of reducing a graph into mutually exclusive sets of nodes.
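As a small concrete instance (a sketch of my own, not the demos below), the maximum cut problem partitions a graph’s nodes into two sets so as to maximize the number of edges between them; each edge \((u,v)\) contributes \(x_u + x_v - 2 x_u x_v\) to the cut, which maps directly to a QUBO::

    import dimod
    import networkx as nx

    G = nx.cycle_graph(5)   # toy graph

    # Minimize -cut(x): each edge adds x_u + x_v - 2*x_u*x_v to the cut
    bqm = dimod.BinaryQuadraticModel('BINARY')
    for u, v in G.edges:
        bqm.add_linear(u, -1)
        bqm.add_linear(v, -1)
        bqm.add_quadratic(u, v, 2)

    best = dimod.ExactSolver().sample(bqm).first
    print(best.sample, -best.energy)   # the partition and its cut size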
Code Examples¶
dwave-examples code example: graph-partitioning
Solves a graph partitioning problem on the QPU.
dwave-examples code example: graph-partitioning-dqm
Solves a graph partitioning problem on Leap’s DQM solver.
dwave-examples code example: maximum-cut
Solves a maximum cut problem on the QPU.
dwave-examples code example: clustering
Identifies clusters in a data set.
dwave-examples code example: immunization-strategy
Fragments a population into separate groups via a “separator” on Leap’s DQM solver.
Papers¶
[Bod1994] investigates the complexity of the maximum cut problem.
[Gue2018] performs simulations of the Quantum Approximate Optimization Algorithm (QAOA) for maximum cut problems.
[Hig2022] computes coreperiphery partition for an undirected network formulated as a QUBO problem.
[Jas2019] proposes using quantum annealing on extreme clustering problems.
[Ush2017] discusses unconstrained graph partitioning as community clustering.
[Zah2019] proposes an algorithm to detect multiple communities in a signed graph.
Machine Learning¶
Artificial intelligence (AI) is transforming the world. You see it every day at home, at work, when shopping, when socializing, and even when driving a car. Machine learning algorithms operate by constructing a model with parameters that can be learned from a large amount of example input so that the trained model can make predictions about unseen data.
Most of the transformation that AI has brought to date has been based on deterministic machine learning models such as feedforward neural networks. The real world, however, is nondeterministic and filled with uncertainty. Probabilistic models explicitly handle this uncertainty by accounting for gaps in our knowledge and errors in data sources.
A probability distribution is a mathematical function that assigns a probability value to an event. Depending on the nature of the underlying event, this function can be defined for a continuous event (e.g., a normal distribution) or a discrete event (e.g., a Bernoulli distribution). In probabilistic models, probability distributions represent the unobserved quantities in a model (including noise effects) and how they relate to the data. The distribution of the data is approximated based on a finite set of samples. The model infers from the observed data, and learning occurs as it transforms the prior distributions, defined before observing the data, into posterior distributions, defined afterward. If the training process is successful, the learned distribution resembles the actual distribution of the data to the extent that the model can make correct predictions about unseen situations—correctly interpreting a previously unseen handwritten digit, for example.
In short, probabilistic modeling is a practical approach for designing machines that:
Learn from noisy and unlabeled data
Define confidence levels in predictions
Allow decision making in the absence of complete information
Infer missing data and latent correlations in data
Boltzmann Distribution¶
A Boltzmann distribution is an energybased discrete distribution that defines probability, \(p\), for each of the states in a binary vector.
Assume \(\vc{x}\) represents a set of \(N\) binary random variables. Conceptually, the space of \(\vc{x}\) corresponds to binary representations of all numbers from 0 to \(2^N - 1\). You can represent it as a column vector, \(\vc{x}^T = [x_1, x_2, \dots, x_N]\), where \(x_n \in \{0, 1\}\) is the state of the \(n^{th}\) binary random variable in \(\vc{x}\).
The Boltzmann distribution defines a probability for each possible state that \(\vc{x}\) can take using[1]

\begin{equation} p(\vc{x};\theta) = \frac{\exp(-E(\vc{x};\theta))}{Z(\theta)}, \end{equation}

where \(E(\vc{x};\theta)\) is an energy function parameterized by \(\theta\), which contains the biases, and

\begin{equation} Z(\theta) = \sum_{\vc{x}} \exp(-E(\vc{x};\theta)) \end{equation}

is the normalizing coefficient, also known as the partition function, that ensures that \(p(\vc{x};\theta)\) sums to 1 over all the possible states of \(\vc{x}\); that is,

\begin{equation} \sum_{\vc{x}} p(\vc{x};\theta) = 1. \end{equation}

[1] \(\beta\) is omitted from this equation because usually, in the context of machine learning, it is assumed to be 1.
Note that because of the negative sign for energy, \(E\), the states with high probability correspond to states with low energy.
The energy function \(E(\vc{x};\theta)\) can be represented as a QUBO: the linear coefficients bias the probability of individual binary variables in \(\vc{x}\) and the quadratic coefficients represent the correlation weights between the elements of \(\vc{x}\). The D-Wave architecture, which natively processes information through the Ising/QUBO models (linear coefficients are represented by qubit biases and quadratic coefficients by coupler strengths), can help with discrete energy-based machine learning.
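To make the distribution concrete, the probabilities of a small QUBO can be computed exactly by enumeration (a minimal sketch with \(\beta = 1\), as assumed above; the coefficients are arbitrary toy values)::

    import numpy as np
    import dimod

    bqm = dimod.BinaryQuadraticModel({'x1': -1.0, 'x2': 0.5},
                                     {('x1', 'x2'): 1.0}, 0.0, 'BINARY')
    states = dimod.ExactSolver().sample(bqm)

    Z = np.exp(-states.record.energy).sum()    # partition function
    for datum in states.data(fields=['sample', 'energy']):
        # low-energy states get high probability
        print(dict(datum.sample), np.exp(-datum.energy) / Z)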
Sampling from the D-Wave QPU¶
Sampling from energy-based distributions is a computationally intensive task that is an excellent match for the way that the D-Wave system solves problems; that is, by seeking low-energy states. Samples from the D-Wave QPU can be obtained quickly and provide an advantage over sampling from classical distributions.
When training a probabilistic model, you need a well-characterized distribution; otherwise, it is difficult to calculate gradients and you have no guarantee of convergence. While both classical Boltzmann and quantum Boltzmann distributions are well characterized, all but the smallest problems solved by the QPU should undergo postprocessing to bring them closer to a Boltzmann distribution; for example, by running a low-treewidth postprocessing algorithm. The D-Wave 2000Q system provides such an algorithm that you (optionally) enable when you submit a problem; see the QPU Solver Datasheet guide for more information on the available algorithms.
Temperature Effects¶
As in statistical mechanics, \(\beta\) represents inverse temperature: \(1/(k_B T)\), where \(T\) is the thermodynamic temperature in kelvin and \(k_B\) is Boltzmann’s constant.
The D-Wave QPU operates at cryogenic temperatures, nominally \(15\) mK, which can be translated to a scale parameter \(\beta\). The effective value of \(\beta\) varies from QPU to QPU and in fact from problem to problem since the D-Wave QPU samples are not Boltzmann and time-varying phenomena may affect samples. Therefore, to attain Boltzmann samples, run the Gibbs chain for a number of iterations starting from quantum computer samples. The objective is to further anneal the samples to the correct temperature of interest \(T = 1/{\beta}\), where \(\beta = 1.0\).
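A minimal sketch of one such Gibbs sweep for an Ising model (my own illustration; here h holds the linear biases and J the couplings)::

    import numpy as np

    def gibbs_sweep(s, h, J, beta=1.0, rng=None):
        """One sweep of single-spin Gibbs updates; s holds spins in {-1, +1}."""
        rng = rng or np.random.default_rng()
        for i in range(len(s)):
            # effective local field on spin i from its bias and couplings
            field = h[i]
            for (a, b), coupling in J.items():
                if a == i:
                    field += coupling * s[b]
                elif b == i:
                    field += coupling * s[a]
            p_up = 1.0 / (1.0 + np.exp(2.0 * beta * field))   # P(s_i = +1)
            s[i] = 1 if rng.random() < p_up else -1
        return s

    # starting from QPU samples, run a few sweeps at the beta of interest:
    # for _ in range(10): sample = gibbs_sweep(sample, h, J, beta=1.0)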
In the D-Wave software, postprocessing refines the returned solutions to target a Boltzmann distribution characterized by \(\beta\), which is represented by a floating point number without units. When choosing a value for \(\beta\), be aware that lower values result in samples less constrained to the lowest energy states. For more information on \(\beta\) and how it is used in the sampling postprocessing algorithm, see the QPU Solver Datasheet guide.
Probabilistic Sampling: RBM¶
A restricted Boltzmann machine (RBM) is a special type of Boltzmann machine with a symmetrical bipartite structure; see Figure 35.
It defines a probability distribution over a set of binary variables that are divided into visible (input), \(\vc{v}\), and hidden, \(\vc{h}\), variables, which are analogous to the retina and brain, respectively.[2] The hidden variables allow for more complex dependencies among visible variables and are often used to learn a stochastic generative model over a set of inputs. All visible variables connect to all hidden variables, but no variables in the same layer are linked. This limited connectivity makes inference and therefore learning easier because the RBM takes only a single step to reach thermal equilibrium if you clamp the visible variables to particular binary states.

[2] Analogy courtesy of Pedro Domingos in The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. Basic Books, 2015.
During the learning process, each visible variable is responsible for a feature from an item in the dataset to be learned. For example, images from the famous MNIST dataset of handwritten digits have 784 pixels, so RBMs trained on this dataset require 784 visible variables. Each variable has a bias and each connection between variables has a weight. These values determine the energy of the output.
Without the introduction of hidden variables, the energy function \(E(\vc{x})\) by itself is not sufficiently flexible to give good models. You can write \(\vc{x}=[\vc{v},\vc{h}]\) and denote the energy function as \(E(\vc{v},\vc{h})\).
Then,

\begin{equation} p(\vc{x};\theta) = p(\vc{v},\vc{h};\theta) \end{equation}

and of interest is

\begin{equation} p(\vc{v};\theta) = \sum_{\vc{h}} p(\vc{v},\vc{h};\theta), \end{equation}

which you can obtain by marginalizing over the hidden variables, \(\vc{h}\).
A standard training criterion used to determine the energy function is to maximize the log likelihood (LL) of the training data—or, equivalently, to minimize the negative log likelihood (NLL) of the data. Training data is repetitively fed to the model and corresponding improvements made to the model.
When training a model, you are given \(D\) training (visible) examples, \(\vc{v}^{(1)}, \dots, \vc{v}^{(D)}\), and would like to find a setting for \(\theta\) under which this data is highly likely. Note that the \(n^{th}\) component of the \(d^{th}\) training example is \(v_n^{(d)}\).
To find \(\theta\), maximize the likelihood of the training data,

\begin{equation} L(\theta) = \prod_{d=1}^D p(\vc{v}^{(d)};\theta). \end{equation}

It is more convenient, computationally, to maximize the log likelihood,

\begin{equation} LL(\theta) = \log(L(\theta)) = \sum_{d=1}^D \log p(\vc{v}^{(d)};\theta). \end{equation}

You can use the gradient descent method to minimize the \(NLL(\theta)\):
Starting at an initial guess for \(\theta\) (say, all zero values), calculate the gradient (the direction of fastest improvement) and then take a step in that direction.
Iterate by taking the gradient at the new point and moving downhill again.
To calculate the gradient at a particular \(\theta\), evaluate some expected values: \(E_{p(\vc{x};\theta)} f(\vc{x})\) for a set of functions \(f(\vc{x})\) known as the sufficient statistics. The expected values cannot be determined exactly, because you cannot sum over all \(2^N\) configurations; therefore, approximate by only summing over the most probable configurations, which you can obtain by sampling from the distribution given by the current \(\theta\).
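For the RBM’s weights, these sufficient statistics are the pairwise products \(v_i h_j\), giving the familiar update: the data expectation minus the model expectation (a minimal sketch; the model-side samples could come from classical Gibbs chains or, as discussed above, from the QPU)::

    import numpy as np

    def ll_gradient_W(v_data, h_data, v_model, h_model):
        """Each argument is a (num_samples, num_units) array of binary states."""
        positive = v_data.T @ h_data / len(v_data)      # <v h^T> under the data
        negative = v_model.T @ h_model / len(v_model)   # <v h^T> under the model
        return positive - negative

    # gradient ascent on LL (equivalently, descent on NLL):
    # W += learning_rate * ll_gradient_W(v_data, h_data, v_model, h_model)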
Energy-Based Models¶
Machine learning with energy-based models (EBMs) minimizes an objective function by lowering scalar energy for configurations of variables that best represent dependencies for probabilistic and non-probabilistic models.
For an RBM as a generative model, for example, where the gradient needed to maximize the log likelihood of data is intractable (due to the partition function for the energy objective function), instead of using standard Gibbs sampling, use samples from the D-Wave system. The training will have steps like these:

a. Initialize variables.
b. Teach visible nodes with training samples.
c. Sample from the D-Wave system.
d. Update and repeat as needed.
Support Vector Machines¶
Support vector machines (SVM) find a hyperplane separating data into classes with maximized margin to each class; structured support vector machines (SSVM) assume structure in the output labels; for example, a beach in a picture increases the chance the picture is of a sunset.
Boosting¶
In machine learning, boosting methods are used to combine a set of simple, “weak” predictors in such a way as to produce a more powerful, “strong” predictor.
Code Examples¶
Qboost is an example of formulating boosting as an optimization problem for solution on a QPU.
Papers¶
General machine learning and sampling:
[Bia2010] discusses using quantum annealing for machine learning applications in two modes of operation: zerotemperature for optimization and finitetemperature for sampling.
[Ben2017] discusses sampling on the D-Wave system.
[Inc2022] presents a QUBO formulation of the Graph Edit Distance problem and uses quantum annealing and variational quantum algorithms on it.
[Muc2022] proposes and evaluates a featureselection algorithm based on QUBOs.
[Per2022] presents a systematic literature review of 2017–21 published papers to identify, analyze and classify different algorithms used in quantum machine learning and their applications.
[Vah2017] discusses label noise in neural networks.
RBMs:
[Ada2015] describes implementing an RBM on the D-Wave system to generate samples for estimating model expectations of deep neural networks.
[Dum2013] discusses implementing an RBM using physical computation.
[Hin2012] is a tutorial on training RBMs.
[Kor2016] benchmarks quantum hardware on Boltzmann machines.
[Mac2018] discusses mutual information and renormalization group using RBMs.
[Rol2016] describes discrete variational autoencoders.
[Sal2007] describes RBMs used to model tabular data, such as users’ ratings of movies.
[Vin2019] describes using D-Wave quantum annealers as Boltzmann samplers to perform quantum-assisted, end-to-end training of QVAE.
Energy-Based Models:
[Lec2006] describes EBMs.
Support Vector Machines:
[Boy2007] gives a concise introduction to subgradient methods.
[Wil2019] gives a method to train SVMs on a D-Wave 2000Q, and applies it to data from biology experiments.
Boosting:
[Nev2012] describes the Qboost formulation.
Map Coloring¶
Map coloring is an example of a constraint satisfaction problem (CSP). CSPs require that all of a problem’s variables be assigned values, out of a finite domain, that satisfy all constraints. The map-coloring CSP is to assign a color to each region of a map such that any two regions sharing a border have different colors.
The Example Reformulation: Map Coloring section in the Reformulating a Problem chapter is an example of map coloring on the D-Wave system.
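The constraint structure is simple to express with Ocean’s discrete quadratic model (a minimal sketch of my own with a hypothetical three-region map; the examples below are complete)::

    import dimod

    regions = ['A', 'B', 'C']
    borders = [('A', 'B'), ('B', 'C')]

    dqm = dimod.DiscreteQuadraticModel()
    for r in regions:
        dqm.add_variable(4, label=r)   # one variable per region, 4 color cases
    for u, v in borders:
        for color in range(4):
            # penalize neighboring regions that pick the same color
            dqm.set_quadratic_case(u, color, v, color, 1.0)

    # submit to Leap's hybrid DQM solver, for example:
    # dwave.system.LeapHybridDQMSampler().sample_dqm(dqm)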
Code Examples¶
Ocean documentation example: Large Map Coloring
Demonstrates an out-of-the-box use of an Ocean classical-quantum hybrid sampler solving a problem of arbitrary structure and size.
Ocean documentation example: Map Coloring: Hybrid DQM Sampler
Demonstrates Leap’s hybrid discrete quadratic model (DQM) solver.
Ocean documentation example: Map Coloring
Demonstrates solving a mapcoloring CSP on a QPU.
dwave-examples code example: map-coloring
Demonstrates the use of D-Wave QPU solvers to solve a map-coloring problem.
Material Simulation¶
One promise of quantum computing lies in harnessing programmable quantum devices for practical applications such as efficient simulation of quantum materials and condensed matter systems; for example, simulation of geometrically frustrated magnets in which topological phenomena can emerge from competition between quantum and thermal fluctuations.
Protein folding refers to the way protein chains structure themselves in the context of providing some biological function. Although their constituent amino acids enable multiple configurations, proteins rarely misfold (misfolded proteins are a cause of disease) because the standard configuration has lower energy and so is more stable.
Papers¶
[Kin2021] reports on experimental observations of equilibration in the simulation of geometrically frustrated magnets.
[Mni2021] reduces the molecular Hamiltonian matrix in Slater determinant basis to determine the lowest energy cluster.
[Per2012] discusses using the D-Wave system to find the lowest-energy configuration for a folded protein.
[Tep2021] uses a quantum-classical solver to calculate excited electronic states of molecular systems. Note that the paper uses the qbsolv package, which has since been discontinued in favor of Leap hybrid solvers and the dwave-hybrid package.
Scheduling¶
The well-known job-shop scheduling problem is to maximize priority or minimize schedule length (known as the makespan, the time interval between starting the first job and finishing the last) of multiple jobs done on several machines, where a job is an ordered sequence of tasks performed on particular machines, with constraints that a machine executes one task at a time and must complete started tasks.
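A drastically simplified time-indexed sketch (my own illustration, far simpler than the [Ven2015] formulation used below): binary variables mark task start times, with one-hot and machine-capacity constraints, and a weighted-start-time objective standing in for the makespan::

    import dimod

    horizon = 4
    tasks = ['t1', 't2']   # two unit-length tasks sharing one machine

    cqm = dimod.ConstrainedQuadraticModel()
    x = {(j, t): dimod.Binary(f'{j},{t}') for j in tasks for t in range(horizon)}

    for j in tasks:            # each task starts exactly once
        cqm.add_constraint(sum(x[j, t] for t in range(horizon)) == 1,
                           label=f'start_{j}')
    for t in range(horizon):   # the machine executes one task at a time
        cqm.add_constraint(x['t1', t] + x['t2', t] <= 1, label=f'capacity_{t}')

    # proxy objective: prefer early start times
    cqm.set_objective(sum(t * x[j, t] for j in tasks for t in range(horizon)))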
Code Examples¶
Ocean documentation example: Vertex Cover
Shows new users how to formulate a constraint satisfaction problem (CSP) using Ocean tools and solve it on a D-Wave QPU solver.
dwave-examples code example: job-shop-scheduling
An implementation of [Ven2015] for D-Wave QPU solvers.
dwave-examples code example: employee-scheduling
A formulation of a discrete quadratic model (DQM) for solution on Leap’s hybrid DQM solver.
dwave-examples code example: nurse-scheduling
An implementation of [Ike2019] that forms a QUBO for solution by Leap’s hybrid BQM solver.
Papers¶
[Ike2019] describes an implementation of nurse scheduling.
[Kur2020] describes an implementation of job-shop scheduling on a D-Wave QPU solver.
[Liu2020] proposes using deep reinforcement learning for job-shop scheduling.
[Ven2015] describes an implementation of job-shop scheduling on the D-Wave system, which includes formulating the problem, translating to QUBO, and applying variable reduction techniques. It also talks about direct embedding of local constraints.
Traffic Flow¶
One form of the traffic-flow optimization problem is to minimize the travel time of a group of vehicles from their sources to destinations by minimizing congestion on the roads being used.
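A minimal sketch of this kind of route-assignment QUBO (toy data and labels of my own): each car picks exactly one candidate route, and the cost of a road segment grows quadratically with the number of cars using it::

    import dimod

    # candidate routes as sets of road segments
    routes = {('car1', 0): {'s1', 's2'}, ('car1', 1): {'s3'},
              ('car2', 0): {'s1'},       ('car2', 1): {'s3'}}
    x = {key: dimod.Binary(str(key)) for key in routes}

    # congestion: square the number of cars on each segment
    segments = set().union(*routes.values())
    congestion = sum(sum(x[k] for k, segs in routes.items() if seg in segs) ** 2
                     for seg in segments)

    # one-hot penalty: each car takes exactly one route (weight 2 is arbitrary)
    one_hot = sum((x[car, 0] + x[car, 1] - 1) ** 2 for car in ('car1', 'car2'))

    bqm = congestion + 2 * one_hot
    print(dimod.ExactSolver().sample(bqm).first.sample)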