
[WIP] Parallelize swap trials in stochastic swap #4781

Closed
wants to merge 18 commits into from

Conversation

mtreinish
Member

Summary

This commit adds the use of parallel_map to parallelize the execution of
individual swap trials as part of the stochastic_swap transpiler pass. The
individual trials are independent and can be run in parallel without any
issues.
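
For reference, a minimal sketch of what dispatching the trials through parallel_map can look like; the worker function and its inputs here are hypothetical stand-ins, not the actual pass internals.

from qiskit.tools.parallel import parallel_map

def _swap_trial_task(seed, trial_input):
    # hypothetical per-trial worker; the real one runs one randomized swap
    # trial over the data in trial_input and returns its best permutation
    return {"seed": seed, "input": trial_input}

# each trial index becomes one task that parallel_map fans out over processes:
# results = parallel_map(_swap_trial_task, list(range(20)), task_args=(trial_input,))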

As part of this, pickling support is added for the NLayout
Cython class, which is a parameter used in the swap trial method. Without
this the parallel workers would not be able to send or receive these
objects, since pickling is used to pass data between processes.
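
As a rough illustration in plain Python (not the actual Cython class, and with made-up attribute names), the pickling support amounts to providing __getstate__/__setstate__ so the layout arrays can cross the process boundary:

class NLayoutSketch:
    def __init__(self, virt_to_phys, phys_to_virt):
        self.virt_to_phys = list(virt_to_phys)   # virtual qubit index -> physical qubit index
        self.phys_to_virt = list(phys_to_virt)   # physical qubit index -> virtual qubit index

    def __getstate__(self):
        # hand pickle plain Python containers it knows how to serialize
        return {"v2p": self.virt_to_phys, "p2v": self.phys_to_virt}

    def __setstate__(self, state):
        # rebuild the mapping arrays on the receiving (worker) side
        self.virt_to_phys = state["v2p"]
        self.phys_to_virt = state["p2v"]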

An alternative approach to explore in the future is to leverage Cython's
builtin parallel and OpenMP support to use multiple threads (without the
GIL) to avoid the serialization overhead and have more efficient
parallelization. This will require a refactor of the cython code though,
because to effectively leverage parallel cython we'll need to minimize
the creation of new python objects (i.e. run in nogil mode).

Details and comments

Fixes #1743

The previous iteration stored shared state between processes as
attributes on the StochasticSwap pass object; this resulted in a pickle
being created for each process that contained all the contents
of the pass object, which ended up having a large serialization overhead.
This commit removes that by moving parallel_map and the worker function
outside of the pass object so the inputs are handled as standalone parameters.
The exception is the short circuit when an ideal layout is found, where a
tempfile is used because this is the only reliable way to share that state
between processes without the serialization overhead.
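
A minimal sketch of the tempfile-based short circuit, with illustrative helper names: the flag file's existence is the only state shared between processes, so nothing has to be pickled.

import os
import tempfile

def make_stop_flag_path():
    # reserve a unique path; the file is only (re)created once a worker
    # finds an ideal layout
    fd, path = tempfile.mkstemp(prefix="stochastic_swap_stop_")
    os.close(fd)
    os.unlink(path)
    return path

def signal_ideal_layout(stop_path):
    open(stop_path, "w").close()      # worker side: "ideal layout found, stop early"

def ideal_layout_found(stop_path):
    return os.path.exists(stop_path)  # cheap check, no serialization involved
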
This commit fixes an issue with the performance of the pool used
underneath parallel_map. parallel_map spawns a new worker pool each time
it is called, and we were calling parallel_map once per
layer_permutation call, which ends up being called multiple times. The
overhead of launching multiple multiprocessing.Pool objects added up and
became the performance bottleneck. This resolves that by creating a
multiprocessing pool during run() and passing that through to each call
to layer_permutation so we only launch it once.
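
A sketch of the restructuring described here, with simplified stand-ins for the real worker and layer data: the pool is created once in run() and reused by every layer_permutation call.

from multiprocessing import Pool

def _swap_trial_task(args):
    layer, seed = args
    # stand-in for one randomized routing trial; returns (depth, solution)
    return (seed % 7 + 1, {"layer": layer, "seed": seed})

def _layer_permutation(layer, trials, pool):
    # reuse the already-running pool instead of paying Pool() startup cost here
    results = pool.map(_swap_trial_task, [(layer, seed) for seed in range(trials)])
    return min(results, key=lambda res: res[0])

def run(num_layers=3, trials=20):
    with Pool() as pool:              # launched once for the whole pass
        return [_layer_permutation(layer, trials, pool) for layer in range(num_layers)]
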
This commit caps the number of workers at min(8, CPU_COUNT) and
configures each worker process to run iters / min(8, CPU_COUNT)
swap_trials. The serialization cost of all the data we need to
pass to swap_trial was adding too much overhead to each process, so
that the serialization time was greater than the amount of work run on
each worker. This commit adjusts the worker pool size to max out at 8
to try to ensure each worker has sufficient work to justify launching a
new worker. Further tuning will likely be required as this is dependent
on the CPU_COUNT of the system and the number of iterations. But this
is a starting point.
@mtreinish mtreinish requested a review from a team as a code owner July 22, 2020 18:46
@mtreinish mtreinish changed the title Parallelize swap trials in stochastic swap [WIP] Parallelize swap trials in stochastic swap Jul 22, 2020
This commit adjusts the return from each parallel worker so that instead
of returning the full list of results from the trials executed, it
returns a single result representing the local best. For a large
number of trials per worker this should cut down on the serialization
cost because less data is sent back to the parent process.
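
Schematically (with a dummy trial body standing in for the real swap_trial), the per-worker reduction looks like this:

def _single_swap_trial(trial_input, seed):
    # dummy stand-in: the real function runs one randomized swap trial
    depth = (seed * 2654435761) % 31 + 1
    return depth, {"seed": seed, "input": trial_input}

def _run_trials_worker(trial_seeds, trial_input):
    # keep only this worker's best trial so a single small object crosses the
    # process boundary instead of one result per trial
    best = None
    for seed in trial_seeds:
        result = _single_swap_trial(trial_input, seed)
        if best is None or result[0] < best[0]:
            best = result
    return best
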
This commit reworks the logic for selecting the number of workers.
To optimize the throughput on the workers we want to make sure there is
sufficient work available on each worker to justify the serialization cost
of the input parameters. A rough idea is to make sure that each worker
runs at least 5 trials. So this changes the logic from running
min(CPU_COUNT, 8) workers to making the maximum number of workers
min(trials / 5, CPU_COUNT). This should ensure that for larger
trial counts we can leverage systems with more cores but still maintain
throughput with smaller trial counts.
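
The sizing rule reads roughly like this (illustrative helper, not the actual code):

import os

CPU_COUNT = os.cpu_count() or 1

def choose_worker_count(trials, min_trials_per_worker=5):
    # at least 5 trials per worker to amortize the serialization cost,
    # and never more workers than CPUs
    return max(1, min(trials // min_trials_per_worker, CPU_COUNT))

# e.g. 20 trials -> at most 4 workers; 1000 trials on a 16-core machine -> 16 workers
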
This commit reworks the pickle methods added to the NLayout class to
improve the serialization performance by relying on numpy types. Instead
of looping over the array creating a list on getstate and then looping
over the list and copying into a new array, which could be slow, this
pivots to using numpy. On getstate we cast the arrays to numpy arrays
and then rely on numpy's pickle support for the output state. Then on
setstate when we read the state we copy from the numpy array to the new
location in memory allocated for the new object.
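
In a plain-Python stand-in (again with made-up attribute names), the numpy-backed hooks look roughly like this:

import numpy as np

class NLayoutSketchFast:
    def __init__(self, virt_to_phys, phys_to_virt):
        self.virt_to_phys = np.asarray(virt_to_phys)
        self.phys_to_virt = np.asarray(phys_to_virt)

    def __getstate__(self):
        # numpy arrays pickle their whole buffer at once, no per-element loop
        return (np.asarray(self.virt_to_phys), np.asarray(self.phys_to_virt))

    def __setstate__(self, state):
        v2p, p2v = state
        # copy from the unpickled arrays into this object's own storage
        self.virt_to_phys = v2p.copy()
        self.phys_to_virt = p2v.copy()
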
Instead of initializing a new rng object every time _swap_trial is
called, this commit adds rng initialization as a global for each child
process when the Pool is created. This should avoid unnecessary overhead
from initializing a new rng for each iteration.
There are a few arrays that are static for the life of a run of the
stochastic swap pass and are just based on the data from the coupling
map. Instead of serializing these every time we run _swap_trial, this
commit creates some shared arrays that are passed to each child process
at Pool creation. This can't be used for the other arrays because their
data changes at run time.
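
Both of these changes boil down to using the Pool initializer to set per-process globals once, instead of pickling the same data into every _swap_trial call; a rough sketch with illustrative names:

import numpy as np
from multiprocessing import Pool

_RNG = None      # per-process random generator
_CDIST = None    # read-only coupling-map distance matrix
_EDGES = None    # read-only coupling-map edge list

def _init_worker(seed, cdist, edges):
    # runs once in each child process when the pool starts
    global _RNG, _CDIST, _EDGES
    _RNG = np.random.default_rng(seed)
    _CDIST, _EDGES = cdist, edges

def _swap_trial_task(trial_args):
    # the trial body reads _RNG / _CDIST / _EDGES from the module globals
    ...

# pool = Pool(processes=4, initializer=_init_worker, initargs=(seed, cdist, edges))
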
@mtreinish
Member Author

mtreinish commented Jul 24, 2020

This needs some more investigation and tuning. As it sits now, this PR causes a bit of a performance regression for normal cases: it looks like the overhead of forking and serializing the parameters to swap_trial and the results outweighs the speedup from parallel execution in those cases. But there are some specific cases where this is significantly faster; for example, the asv qv 50x20 level3 benchmark in qiskit/qiskit is ~2X faster with this PR applied. I'm thinking before we move forward on this we might either want to make this a config flag to let users choose for their use case, or try to better characterize where this makes sense and make parallel execution conditional on those conditions.

@mtreinish
Member Author

I just put together a custom asv benchmark to try to map this out a bit better:

from qiskit.transpiler import CouplingMap
from qiskit.transpiler.passes import *
from qiskit.converters import circuit_to_dag

from .utils import random_circuit

class StochasticSwapBenchmarks:
    params = ([5, 14, 20, 50, 100, 200, 500],
              [64, 512, 1024],
              [20, 30, 50, 100, 200, 500, 1000],
              [5, 14, 20, 50, 100, 200, 500])

    param_names = ['n_qubits', 'depth', 'trials', 'coupling_map_qubits']
    timeout = 300

    def setup(self, n_qubits, depth, trials, cmap_qubits):
        if n_qubits > cmap_qubits:
            raise NotImplementedError
        seed = 42
        self.circuit = random_circuit(n_qubits, depth, measure=True,
                                      conditional=True, reset=True, seed=seed,
                                      max_operands=2)
        self.fresh_dag = circuit_to_dag(self.circuit)
        self.basis_gates = ['u1', 'u2', 'u3', 'cx', 'id']
        self.coupling_map = CouplingMap.from_full(cmap_qubits, False)
        layout_pass = DenseLayout(self.coupling_map)
        layout_pass.run(self.fresh_dag)
        self.layout = layout_pass.property_set['layout']
        full_ancilla_pass = FullAncillaAllocation(self.coupling_map)
        full_ancilla_pass.property_set['layout'] = self.layout
        self.full_ancilla_dag = full_ancilla_pass.run(self.fresh_dag)
        enlarge_pass = EnlargeWithAncilla()
        enlarge_pass.property_set['layout'] = self.layout
        self.enlarge_dag = enlarge_pass.run(self.full_ancilla_dag)
        apply_pass = ApplyLayout()
        apply_pass.property_set['layout'] = self.layout
        self.dag = apply_pass.run(self.enlarge_dag)

    def time_stochastic_swap(self, _, __, trial, ___):
        swap = StochasticSwap(self.coupling_map, trials=trial, seed=42)
        swap.property_set['layout'] = self.layout
        swap.run(self.dag)

    def track_stochastic_swap_depth(self, _, __, trial, ___):
        swap = StochasticSwap(self.coupling_map, trials=trial, seed=42)
        swap.property_set['layout'] = self.layout
        return swap.run(self.dag).depth()

My feeling is that there is a crossover point for input circuits, based on something like the number of layers in the DAG x the number of trials x the number of qubits (or some similar combination), where this makes sense versus not. But that's just a gut feeling, which is why we need to collect data.

I'm running this locally now and comparing it against master. I assume it's going to take quite some time to run; I'll update when I have data.

This commit removes the usage of RawArray for the shared objects between
the trial workers. This was unnecessary since the arrays being shared are
never modified, and it added avoidable overhead.
This switches the seed used for the per-trial-worker RNG objects to
be the supplied seed + a counter value instead of the supplied seed + PID.
The PID is assigned by the OS and makes the results from the pass not
reproducible for a fixed seed. Using an itertools count means that the
seeds are fixed for a given number of workers.
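
Roughly, the seeding change amounts to this (illustrative sketch):

import itertools

def worker_seeds(seed, num_workers):
    # seed + 0, seed + 1, ... is stable run to run, unlike seed + os.getpid()
    counter = itertools.count()
    return [seed + next(counter) for _ in range(num_workers)]

# worker_seeds(42, 4) -> [42, 43, 44, 45] on every run
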
Differences in floating point precision between different systems and
environments were causing inconsistent results: decompositions
with 0s in the unitary would have differing results because of very
subtle floating point precision errors. For example, one
system would return 0 for a value and another would return
-1.55582133e-19 (which is essentially 0). This would have compound
effects in later stages because the signs were different. This commit
attempts to fix cases like this by rounding to 13 decimal places to
prevent issues like this in the future.
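
As a small illustration of the fix (using the 13-decimal threshold mentioned above):

import numpy as np

def normalize_matrix(matrix, decimals=13):
    # flush tiny +/- noise such as -1.55582133e-19 to zero so sign-sensitive
    # later stages see the same values on every platform
    return np.round(np.asarray(matrix), decimals)

# np.round(-1.55582133e-19, 13) == 0.0  # True
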
@1ucian0 1ucian0 marked this pull request as draft September 3, 2020 16:29
@mtreinish mtreinish added the on hold Can not fix yet label Nov 18, 2020
@mtreinish
Member Author

I'm closing this as this direction isn't really practical for a performance improvement. I did run the earlier analysis back in July 2020 and there wasn't really a clear pattern of improvement anywhere; it was pretty much slower across the board. I did generate this visualization as part of it:

[runtime comparison plot]

The color bar indicates the performance ratio compared to main at the time, so anything >1 is a regression.

On the whole we're spending too much time in the overhead of launching parallel processes and moving the data back and forth between the main process and the child processes (with pickle overhead). The only path forward here, I think, is going to be to implement stochastic swap in a compiled language with real multithreading over shared memory (which is what the Rust rewrite referenced below does).

@mtreinish mtreinish closed this Jan 31, 2022
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Feb 14, 2022
mergify bot added a commit that referenced this pull request Feb 28, 2022
* Implement multithreaded stochastic swap in rust

This commit is a rewrite of the core swap trials functionality in the
StochasticSwap transpiler pass. Previously this core routine was written
using Cython (see #1789) which had great performance, but that
implementation was single threaded. The core of the stochastic swap
algorithm is by its nature well suited to be executed in parallel: it
attempts a number of random trials and then picks the best result
from all the trials and uses that for that layer. These trials can
easily be run in parallel as there is no data dependency between the
trials (there are shared inputs but read-only). As the algorithm
generally scales exponentially the speed up from running the trials in
parallel can offset this and improve the scaling of the pass. Running
the pass in parallel was previously tried in #4781 using Python
multiprocessing but the overhead of launching an additional process and
serializing the input arrays for each trial was significantly larger
than the speed gains. To run the algorithm efficiently in parallel
multithreading is needed to leverage shared memory on shared inputs.

This commit rewrites the cython routine using rust. This was done for
two reasons. The first is that rust's safety guarantees make dealing
with and writing parallel code much easier and safer. It's also
multiplatform because the rust language supports native threading
primitives in the language. The second is that writing parallel cython
code using OpenMP has limitations, mainly on Windows. In
practice it was also difficult to write and maintain parallel cython
code as it has very strict requirements on python and c code
interactions. It was much faster and easier to port it to rust and the
performance for each iteration (outside of parallelism) is the same (in
some cases marginally faster) in rust. The implementation here reuses
the data structures that the previous cython implementation introduced
(mainly flattening all the terra objects into 1d or 2d numpy arrays for
efficient access from C).

The speedups from this PR can be significant. Calling transpile() on a
400 qubit (with a depth of 10) QV model circuit targeting a 409 qubit heavy
hex coupling map goes from ~200 seconds with the single threaded cython
to ~60 seconds with this PR locally on a 32 core system. Transpiling
a 1000 qubit (also with a depth of 10) QV model circuit targeting a 1081
qubit heavy hex coupling map goes from taking ~6500 seconds to ~720
seconds.

The tradeoff with this PR is that for local qiskit-terra development a rust
compiler needs to be installed. This is made trivial using rustup
(https://rustup.rs/), but it is an additional burden and one that we
might not want to impose. If so we can look at turning this PR into a
separate repository/package that qiskit-terra can depend on. The
tradeoff there is that we'd be adding friction at the API boundary
between the pass and the core swap trials interface. But it does ease
the development dependency for qiskit-terra.

* Sanitize packaging to support future modules

This commit fixes how we package the compiled rust module in
qiskit-terra. As a single rust project only gives us a single compiled
binary output we can't use the same scheme we did previously with cython
with a separate dynamic lib file for each module. This shifts us to
making the rust code build a `qiskit._accelerate` module and in that we
have submodules for everything we need from compiled code. For this PR
there is only one submodule, `stochastic_swap`, so for example the
parallel swap_trials routine can be imported from
`qiskit._accelerate.stochastic_swap.swap_trials`. In the future we can
have additional submodules for other pieces of compiled code in qiskit.
For example, the likely next candidate is the pauli expectation value
cython module, which we'll likely port to rust and also make parallel
(for sufficiently large number of qubits). In that case we'd add a new
submodule for that functionality.

* Adjust random normal distribution to use correct mean

This commit corrects the use of the normal distribution to have the mean
set to 1.0. Previously we were doing this out of band for each value by
adding 1 to the random value which wasn't necessary because we could
just generate it with a mean of 1.0.
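
The change is numerically equivalent to the following (Python/numpy here purely for illustration, since the actual code is the Rust trial loop, and the scale value is made up):

import numpy as np

rng = np.random.default_rng(42)

old_style = 1.0 + rng.normal(loc=0.0, scale=0.1, size=4)   # add 1 out of band
new_style = rng.normal(loc=1.0, scale=0.1, size=4)         # ask for mean 1.0 directly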

* Remove unnecessary extra scope from locked read

This commit removes an unnecessary extra scope around the locked read for
where we store the best solution. The scope was previously there to
release the lock after we check if there is a solution or not. However
this wasn't actually needed as we can just do the check inline and the
lock will release after the condition block.

* Remove unnecessary explicit type from opt_edges variable

* Fix indices typo in NLayout constructor

Co-authored-by: Jake Lishman <jake@binhbar.com>

* Remove explicit lifetime annotation from swap_trials

Previously the swap_trials() function had an explicit lifetime
annotation `'p` which wasn't necessary because the compiler can
determine this on its own. Normally when dealing with numpy views and a
Python object (i.e. a GIL handle) we need a lifetime annotation to tell
the rust compiler the numpy view and the python gil handle will have the
same lifetime. But since swap_trials doesn't take a gil handle and
operates purely in rust we don't need this lifetime and the rust
compiler can deal with the lifetime of the numpy views on their own.

* Use sum() instead of fold()

* Fix lint and add rust style and lint checks to CI

This commit fixes the python lint failures and also updates the ci
configuration for the lint job to also run rust's style and lint
enforcement.

* Fix returned layout mapping from NLayout

This commit fixes the output list from the `layout_mapping()`
method of `NLayout`. Previously, it would incorrectly return the
wrong indices; it should be a list of virtual -> physical
qubit pairs. This commit corrects that error.

Co-authored-by: georgios-ts <45130028+georgios-ts@users.noreply.github.com>

* Tweak tox configuration to try and reliably build rust extension

* Make swap_trials parallelization configurable

This commit makes the parallelization of the swap_trials() configurable.
This is done in two ways: first, a new argument parallel_threshold is
added which takes an optional int which is the number of qubits at which to
switch between a parallel and serial version. The second is that it
takes into account the state of the QISKIT_IN_PARALLEL environment
variable. This variable is set to TRUE by parallel_map() when we're
running in a multiprocessing context. In those cases also running
stochastic swap in parallel will likely just cause too much load as
we're potentially oversubscribing work to the number of available CPUs.
So, if QISKIT_IN_PARALLEL is set to True we run swap_trials serially.

* Revert "Make swap_trials parallelization configurable"

This reverts commit 57790c8. That
commit attempted to solve some issues in test running, mainly around
multiple parallel dispatch causing excess load. But in practice it was
broken and caused more issues than it fixed. We'll investigate and add
control for the parallelization in a future commit separately after all
the tests are passing so we have a good baseline.

* Add docs to swap_trials() and remove unnecessary num_gates arg

* Fix race condition leading to non-deterministic behavior

Previously, in the case of circuits that had multiple best possible
depth == 1 solutions for a layer, there was a race condition in the fast
exit path between the threads which could lead to a non-deterministic
result even with a fixed seed. The output was always valid, but which
result was dependent on which parallel thread with an ideal solution
finished last and wrote to the locked best result last. This was causing
weird non-deterministic test failures for some tests because of #1794 as
the exact match result would change between runs. This could be a bigger
issue because user expectations are that with a fixed seed set on the
transpiler that the output circuit will be deterministically
reproducible.

To address this issue, this commit trades off some performance to
ensure we're always returning a deterministic result in this case. This
is accomplished by checking, when a depth == 1 solution has already been
found in another trial thread, that we only act (so either exit early or
update the already found depth == 1 solution) if that already found
solution has a trial number that is less than this thread's trial number.
This does limit the effectiveness of the fast exit, but in practice it
should hopefully not affect the speed too much.

As part of this commit some tests are updated because the new
deterministic behavior is slightly different from the previous results
from the cython serial implementation. I manually verified that the
new output circuits are still valid (it also looks like the quality
of the results in some of those cases improved, but this is strictly
anecdotal and shouldn't be taken as a general trend with this PR).
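
The selection rule being described is easiest to see in a small Python stand-in (the real implementation is the Rust trial loop with a shared locked slot):

import threading

_best_lock = threading.Lock()
_best = {"trial": None, "result": None}

def record_ideal_solution(trial_index, result):
    # among equally good depth == 1 solutions, always keep the one from the
    # lowest trial number so the winner doesn't depend on thread timing
    with _best_lock:
        if _best["trial"] is None or trial_index < _best["trial"]:
            _best["trial"] = trial_index
            _best["result"] = result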

* Apply suggestions from code review

Co-authored-by: georgios-ts <45130028+georgios-ts@users.noreply.github.com>

* Fix compiler errors in previous commit

* Revert accidental commit of parallel reduction in compute_cost

This was only for local testing to prove it was a bad idea and was
accidently included in the branch. We should not nest the parallel
execution like this.

* Eliminate short circuit for depth == 1 swap_trial() result

This commit eliminates the short circuit fast return in swap_trial()
when another trial thread has found an ideal solution. Trying to do this
in a parallel context is tricky to make deterministic because in cases
of >1 depth == 1 solutions there is an inherent race condition between
the threads for writing out their depth == 1 result to the shared
location. Different strategies were tried to make this reliably
deterministic but there was still a race condition. Since this was just a
performance optimization to avoid doing unnecessary work this commit
removes this step. Weighing improved performance against repeatability
in the output of the compiler, the reproducible results are more
important. After we've adopted a multithreaded stochastic swap we can
investigate adding this back as a potential future optimization.

* Add missing docstrings

* Add section to contributing on installing from source

* Make rust python classes pickleable

* Add rust compiler install to linux wheel jobs

* Try more tox changes to fix docs builds

* Revert "Eliminate short circuit for depth == 1 swap_trial() result"

This reverts commit c510764. The
removal there was premature and we had a fix for the non-determinism in
place, apart from a typo which was preventing it from working.

Co-Authored-By: Georgios Tsilimigkounakis <45130028+georgios-ts@users.noreply.github.com>

* Fix submodule declaration and module attribute on rust classes

* Fix rust lint

* Fix docs job definition

* Disable multiprocessing parallelism in unit tests

This commit disables the multiprocessing based parallelism when running
unittest jobs in CI. We have historically defaulted to using
multiprocessing only in environments where the "fork" start method is
available because this has the best performance and has no caveats
around how it is used by users (you don't need an
`if __name__ == "__main__"` guard). However, the use of the "fork"
method isn't always 100% reliable (see
https://bugs.python.org/issue40379), which we saw on Python 3.9 #6188.
In unittest CI (and tox) by default we use stestr which spawns (not using
fork) parallel workers to run tests in parallel. With this PR this means
in unittest we're now running multiple test runner subprocesses, which
are executing parallel dispatched code using multiprocessing's fork
start method, which is executing multithreaded rust code. These three layers
of nesting fairly reliably hang as Python's fork doesn't seem to
be able to handle this many layers of nested parallelism. There are 2
ways I've been able to fix this: the first is to change the start method
used by `parallel_map()` to either "spawn" or "forkserver", neither of
which suffers from random hanging. However, doing this in the
unittest context causes significant overhead and slows down test
execution significantly. The other is to just disable the
multiprocessing, which fixes the hanging and doesn't impact runtime
performance significantly (and might actually help in CI so we're not
oversubscribing the limited resources).

As I have not been able to reproduce `parallel_map()` hanging in
a standalone context with multithreaded stochastic swap this commit opts
for just disabling multiprocessing in CI and documenting the known issue
in the release notes as this is the simpler solution. It's unlikely that
users will nest parallel processes as it typically hurts performance
(and parallel_map() actively guards against it); we only did it in
testing previously because the tests which relied on it were a small
portion of the test suite (roughly 65 tests) and typically did not have
a significant impact on the total throughput of the test suite.

* Fix typo in azure pipelines config

* Remove unecessary extension compilation for image tests

* Add test script to explicitly verify parallel dispatch

In an earlier commit we disabled the use of parallel dispatch in
parallel_map() to avoid a bug in cpython associated with their fork()
based subprocess launch. Doing this works around the bug which was
reliably triggered by running multiprocessing in parallel subprocesses.
It also has the side benefit of providing a ~2x speed up for test suite
execution in CI. However, this meant we lost our test coverage in CI for
running parallel_map() with actual multiprocessing based parallel
dispatch. To ensure we don't inadvertently regress this code path
moving forward this commit adds a dedicated test script which runs a
simple transpilation in parallel and verifies that everything works as
expected with the default parallelism settings.
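
A hedged sketch of what such a verification script can look like (not the actual file added by the commit): transpiling a list of circuits goes through parallel_map() with the default settings, so a hang or crash here would surface a regression.

from qiskit import transpile
from qiskit.circuit.random import random_circuit
from qiskit.transpiler import CouplingMap

if __name__ == "__main__":
    circuits = [random_circuit(8, 8, measure=True, seed=i) for i in range(10)]
    coupling_map = CouplingMap.from_line(8)
    transpiled = transpile(circuits, coupling_map=coupling_map, seed_transpiler=42)
    assert len(transpiled) == len(circuits)
    print("parallel dispatch OK")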

* Avoid multi-threading when run in a multiprocessing context

This commit adds a switch between running a single threaded and a
multithreaded variant of the swap_trials loop based on whether the
QISKIT_IN_PARALLEL flag is set. If QISKIT_IN_PARALLEL is set to TRUE
this means the `parallel_map()` function is running in the outer python
context and we're running in multiprocessing already. This means we do
not want to be running in multiple threads generally as that will lead
to potential resource exhaustion by spawning n processes each potentially
running with m threads, where `n` is `min(num_phys_cpus, num_tasks)` and
`m` is num_logical_cpus (although only
`min(num_logical_cpus, num_trials)` will be active); on the typical
system there aren't enough cores to leverage both multiprocessing and
multithreading. However, in case a user does have such an environment
they can set the `QISKIT_FORCE_THREADS` env variable to `TRUE` which
will use threading regardless of the status of `QISKIT_IN_PARALLEL`.
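
In Python terms the gating amounts to something like this sketch (the environment variable names are the ones from the text; the helper itself is illustrative):

import os

def use_threads():
    in_parallel = os.getenv("QISKIT_IN_PARALLEL", "FALSE") == "TRUE"
    force_threads = os.getenv("QISKIT_FORCE_THREADS", "FALSE") == "TRUE"
    # default to multithreading, unless we're already inside a parallel_map()
    # worker process and the user hasn't explicitly forced threads
    return force_threads or not in_parallel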

* Apply suggestions from code review

Co-authored-by: Jake Lishman <jake@binhbar.com>

* Minor fixes from review comments

This commit fixes some minor details found during code review. It
expands the section on building from source to explain how to build a
release optimized binary with editable mode, makes the QISKIT_PARALLEL
env variable usage consistent across all jobs, and adds a missing
shebang to the `install_rust.sh` script which is used to install rust in
the manylinux container environment.

* Simplify tox configuration

In earlier commits the tox configuration was changed to try and fix the
docs CI job by going to great effort to try and enforce that
setuptools-rust was installed in all situations, even before it was
actually needed. However, the problem with the docs ci job was unrelated
to the tox configuration and this reverts the configuration to something
that works with more versions of tox and setuptools-rust.

* Add missing pieces of cargo configuration

Co-authored-by: Jake Lishman <jake@binhbar.com>
Co-authored-by: georgios-ts <45130028+georgios-ts@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
ikkoham added a commit to ikkoham/qiskit-terra that referenced this pull request Mar 7, 2022
commit 689dd275aaf81c46b8c8e399097373a5ecc136a5
Merge: 408745636 77219b5c7
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Mon Mar 7 09:20:32 2022 +0900

    Merge branch 'main' into primitives/base-class

commit 408745636c507003419ae2a51480554e9b0524b4
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Sat Mar 5 12:38:38 2022 +0900

    Apply suggestions from code review

commit 77219b5c7b7146b1545c5e5190739b36f4064b2f
Author: Jake Lishman <jake.lishman@ibm.com>
Date:   Fri Mar 4 20:38:04 2022 +0000

    Workaround Aer bug with subnormal floats in randomised tests (#7719)

    Aer currently sets `-ffast-math` during compilation, which when compiled
    with GCC causes the CPU's floating-point rounding mode to be set to
    "flush to zero", and subnormal numbers are disallowed.  This should not
    be the case, and Qiskit/qiskit-aer#1469 will solve the problem in
    release.  Until then, we must instruct `hypothesis` to avoid subnormal
    numbers in its floating-point strategies, as since version 6.38 it
    explicitly tests to ensure that they are functional, if used.

    This commit should be reverted once Aer no longer sets `-ffast-math`.

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 79cbb2455d2de2eead5e145785444ab42b5f8380
Author: Iulia Zidaru <iuliazidaru@users.noreply.github.com>
Date:   Fri Mar 4 20:52:03 2022 +0200

    Relocate mock backends from qiskit.test.mock to qiskit.mock (#7437)

    * Relocate mock backends from qiskit.test.mock to qiskit.mock

    * Relocate mock backends from qiskit.test.mock to qiskit.mock

    * fix: Inline literal start-string without end-string

    * change package to qiskit.providers.fake_provider

    * fix test failure

    * reformat file

    * fix review comments

    * Release note is API change not feature

    Co-authored-by: Jake Lishman <jake@binhbar.com>

commit 6e29dfe5c431a21828cc59868ebdb204c7f5c3a0
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Sat Mar 5 01:43:23 2022 +0900

    fix by the suggestion

commit 4e608871e79b89beed4c79fcb1c78014ea6874f6
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Sat Mar 5 01:40:34 2022 +0900

    fix BaseSampler's doc

commit 9f976ce01ba7e640f9fa6819c022d0b3ab4a9364
Author: Lev Bishop <18673315+levbishop@users.noreply.github.com>
Date:   Fri Mar 4 11:03:12 2022 -0500

    Update qiskit/primitives/base_estimator.py

commit db522c021b77cb7874aa1e1a29e2d44189fe440b
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Sat Mar 5 00:16:08 2022 +0900

    Apply suggestions from code review

    Co-authored-by: Ali Javadi-Abhari <ajavadia@users.noreply.github.com>

commit 7380fafc2e6203fb0b930e2769ccd817c7812c77
Merge: 7b143f7da 439f7a633
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 23:47:37 2022 +0900

    Merge pull request #40 from t-imamichi/fix-sphnix

    fix sphinx markup

commit 439f7a63395892a27140e683d8a1648a5a6dd696
Author: Takashi Imamichi <imamichi@jp.ibm.com>
Date:   Fri Mar 4 23:46:29 2022 +0900

    fix sphinx markup

commit 7b143f7dac4f5ca901a424ff09d00d518d6cb622
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 22:29:11 2022 +0900

    Fix according to comments

commit 8cf25aa16ca6318565f7fc742dfd9ef41df7fdbd
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 22:00:46 2022 +0900

    pick from https://github.com/levbishop/qiskit-terra/commit/a5033d7785d58157a365892a2f954caeecaaabdf

commit 32f769e367666cc2c0c0a2f7dcdc2f98c2da4baf
Merge: 24b8a765e a8d7f707b
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 21:58:39 2022 +0900

    Merge pull request #39 from ikkoham/primitives/base-class-remove-grouping

    Remove grouping

commit a8d7f707b8abf37840eb60e95b37003925500e9f
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 21:54:58 2022 +0900

    use typing instead of collections.abc

commit 24b8a765e114a15cc44858181a91b293fbe40357
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 21:54:58 2022 +0900

    use typing instead of collections.abc

commit 592e1bed006fd345704bd4dba6980a2808fceadf
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 21:49:08 2022 +0900

    remove grouping

commit 77140eb0d8e99f12826cb039ff45b68194070941
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 21:47:41 2022 +0900

    use Iterable

commit becf352555a50e07ab0201efa7c21ec52632e7b3
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 21:27:47 2022 +0900

    Update qiskit/primitives/base_estimator.py

    Co-authored-by: Lev Bishop <18673315+levbishop@users.noreply.github.com>

commit 67a0f19a9b2510265c3f14da51972d3eb8695c39
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 21:13:25 2022 +0900

    remove duplicated methods and remove variances

commit c532395226943df5b5fe2cc0707c07676129b46e
Merge: ce0ad7625 6bb4b1d91
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 19:29:02 2022 +0900

    Merge pull request #38 from levbishop/primitives/base-class

    Primitives/base class

commit 6bb4b1d91022795dfc51601ea3fc7bb0eb3bab34
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 19:28:54 2022 +0900

    Update qiskit/primitives/base_sampler.py

    Co-authored-by: Takashi Imamichi <31178928+t-imamichi@users.noreply.github.com>

commit bf9b52454c22878c61209f428e46e4600bd4f3c2
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 19:28:48 2022 +0900

    Update qiskit/primitives/base_sampler.py

    Co-authored-by: Takashi Imamichi <31178928+t-imamichi@users.noreply.github.com>

commit dbe5bb544d73e05c585d2c00835752b44858e9f9
Merge: b7fc45e40 ce0ad7625
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 19:00:52 2022 +0900

    Merge branch 'primitives/base-class' into primitives/base-class

commit b7fc45e40112619d6aff9085fcf041b0c6ba3bd9
Author: Lev S. Bishop <18673315+levbishop@users.noreply.github.com>
Date:   Fri Mar 4 03:50:39 2022 -0500

    Parameter ordering and docs

commit ce0ad7625d7b645dd48e9f3e2db7366dd50db3ed
Merge: 3f9cc5cbd 75b7a7a5a
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 17:47:05 2022 +0900

    Merge pull request #37 from t-imamichi/doc-fix

    Doc fix

commit 75b7a7a5a3ba1ea14d9d26c414160feeeb7d328b
Author: Takashi Imamichi <imamichi@jp.ibm.com>
Date:   Fri Mar 4 17:41:05 2022 +0900

    fix sampler example

commit eaabfc84094d19ec42fcb6ebb8947a51e9e84657
Author: Lev S. Bishop <18673315+levbishop@users.noreply.github.com>
Date:   Fri Mar 4 03:19:36 2022 -0500

    Release note

commit 3f9cc5cbdc25bf0f4ac4c3f3c04e2f2dfed48248
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 16:55:28 2022 +0900

    move dunder methods and fix lint

commit b2d8030d841223b6f3b25fafabc2c23cca790668
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 16:16:29 2022 +0900

    add parenthesis

commit 45a8d85e0dce602d59470698190bb03a47b2be68
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 16:14:48 2022 +0900

    fix docs to pass the CI

commit 5f96cd9025d0e85b8d3492daf6602d7c9bb3d70b
Author: Takashi Imamichi <imamichi@jp.ibm.com>
Date:   Fri Mar 4 15:43:47 2022 +0900

    fix docstring

commit 348338ecfb7877771a3bb49d835547f068feda99
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 12:42:33 2022 +0900

    fix type hints

commit d26515980f10f6beab7fe22ac7ce529fceeb0825
Merge: 0fb08e94c f08e647cd
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Fri Mar 4 11:37:09 2022 +0900

    Merge branch 'main' into primitives/base-class

commit f08e647cd67f0644e1080e86f30a88705bfcc449
Author: Matthew Treinish <mtreinish@kortar.org>
Date:   Thu Mar 3 21:32:17 2022 -0500

    Add rust to binder configuration (#7732)

    * Add rust to binder postBuild

    In #7658 we added rust code to the Qiskit build environment.
    This was done to accelerate performance critical portions of the
    library. However, in #7658 we overlooked the binder tests which are used
    to perform image comparisons for visualizations in a controlled
    environment. The base binder docker image does not have the rust
    compiler installed, so we need to manually install it prior to running
    pip to install terra. This commit takes care of this and adds rust to
    the binder environment.

    * Install rust via conda for binder env

    The binder image build is implicitly installing terra as it launches. So
    trying to install rust manually as part of the postBuild script is too
    late because it will have failed by then. Looking at the available
    configuration files:

        https://mybinder.readthedocs.io/en/latest/config_files.html?#configuration-files

    we can install conda packages defined in environment.yml as part of the
    image build, and this will occur prior to the installation of terra.
    This commit pivots to using the environment.yml to do this and we can
    rely on the conda packaged version of rust instead of rustup.

    * Move all binder files .binder/ dir

    * Remove duplicate pip install in postBuild

    * Revert "Remove duplicate pip install in postBuild"

    This was actually needed, without this terra isn't actually installed.

    This reverts commit d61c7a68e88747cd38f82643c2b35eca33932a52.

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 0fb08e94cd856170d1b5e57f078ea30a004c86bf
Merge: b88e398c9 6e47c2cd2
Author: Lev S. Bishop <18673315+levbishop@users.noreply.github.com>
Date:   Thu Mar 3 20:46:53 2022 -0500

    Merge branch 'primitives/base-class' of github.com:ikkoham/qiskit-terra into primitives/base-class

commit b88e398c94867325f5cc031cfceca0cfeaad93ff
Author: Lev S. Bishop <18673315+levbishop@users.noreply.github.com>
Date:   Thu Mar 3 20:45:50 2022 -0500

    Backwards compatible typechecking

commit 6e47c2cd20cc607d2c5039e2670c0a063cee34ed
Author: Lev Bishop <18673315+levbishop@users.noreply.github.com>
Date:   Thu Mar 3 18:09:47 2022 -0500

    Delete prim.md

    Added in error

commit 8a70ff0536f934be1cb4f5bdda48d5f0e2153e9d
Author: Lev S. Bishop <18673315+levbishop@users.noreply.github.com>
Date:   Thu Mar 3 17:53:29 2022 -0500

    Zip parameters and circuits

commit 0be2c44f2ce4e0b83cb49bb275d2db30aef56b3c
Author: Matthew Treinish <mtreinish@kortar.org>
Date:   Thu Mar 3 15:12:04 2022 -0500

    Update FakeWashington backend with new API snapshots (#7731)

    The FakeWashington backend was recently added in #7392 but when that PR
    was created the washington device was missing its pulse defaults
    payload. Since that PR was first created the IBM API is now returning a
    pulse defaults payload. This commit updates the FakeWashington backend
    to use current snapshots which includes the missing data. It is then
    changed to be a pulse backend now that we have the defaults payload
    available.

commit 9a757c8ae20aa88dec6841e3986da0b4ce70b4c9
Author: Matthew Treinish <mtreinish@kortar.org>
Date:   Thu Mar 3 12:47:13 2022 -0500

    Support reproducible builds of Rust library (#7728)

    By default Rust libraries don't ship a Cargo.lock file. This is to allow
    other Rust consumers of the library to pick a compatible version with
    the other upstream dependencies. [1] However, the library we build in
    Qiskit is a bit different since it's not a traditional Rust library but
    instead we're building a C dynamic library that is meant to be consumed
    by Python. This is much closer a model to developing a Rust binary
    program because we're shipping a standalone binary. To support
    reproducible builds we should include the Cargo.lock file in our source
    distribution to ensure that all builds of qiskit-terra are using the
    same versions of our upstream Rust dependencies. This commit commits the
    missing Cargo.lock file, removes it from the .gitignore (which was added
    automatically by cargo when creating a library project), and includes it
    in the sdist. This will ensure that any downstream consumer of terra
    from source will have a reproducible build. Additionally this adds a
    dependabot config file so the bot will manage proposing version bumps on
    upstream project releases, since we probably want to be using the latest
    versions on new releases in our lock file.

    [1] https://doc.rust-lang.org/cargo/faq.html#why-do-binaries-have-cargolock-in-version-control-but-not-libraries

commit 4b86e1ef052d66beada61f039057006f4e9f909f
Merge: f10d130b3 148c04448
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Thu Mar 3 17:30:54 2022 +0900

    Merge pull request #36 from t-imamichi/doc

    (wip) docstrings

commit 148c04448d564c937812c9c90c5828106c753b1a
Author: Takashi Imamichi <imamichi@jp.ibm.com>
Date:   Thu Mar 3 17:24:28 2022 +0900

    (wip) docstrings

commit f10d130b3b1102591d1d7039575c623c821692dd
Merge: 139e0cd52 ccfed937f
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Wed Mar 2 23:34:45 2022 +0900

    Merge pull request #35 from t-imamichi/doc

    (wip) docstrings

commit ccfed937f88210ae4d3beb84851e66dbc3c36cc2
Author: Takashi Imamichi <imamichi@jp.ibm.com>
Date:   Wed Mar 2 23:32:25 2022 +0900

    (wip) docstrings

commit bee5e7f62db400a4c2f6924064413371be0048eb
Author: Julien Gacon <gaconju@gmail.com>
Date:   Tue Mar 1 16:35:19 2022 +0100

    Remove deprecated methods in ``qiskit.algorithms`` (#7257)

    * rm deprecated algo methods

    * add reno

    * fix tests, remove from varalgo

    * intial point was said to be abstract in varalgo!

    * attempt to fix sphinx #1 of ?

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 139e0cd529e724c464499eb145fbb80de8e79170
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Wed Mar 2 00:35:05 2022 +0900

    add grouping and minimum implementation

commit 9eb6fc3394325684048d835ea15f9a0a5631aee1
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Tue Mar 1 23:47:37 2022 +0900

    move result files

commit 733a9b7b5fbcbd44947e66b10c1b3dc17cfad49e
Author: ikkoham <ikkoham@users.noreply.github.com>
Date:   Tue Mar 1 23:37:25 2022 +0900

    Add base classes for primitives

commit ccc371f8ff4dd8fbb7cef41a7d231800a72bda4e
Author: Matthew Treinish <mtreinish@kortar.org>
Date:   Mon Feb 28 16:49:54 2022 -0500

    Implement multithreaded stochastic swap in rust (#7658)

    * Implement multithreaded stochastic swap in rust

    This commit is a rewrite of the core swap trials functionality in the
    StochasticSwap transpiler pass. Previously this core routine was written
    using Cython (see #1789) which had great performance, but that
    implementation was single threaded. The core of the stochastic swap
    algorithm by it's nature is well suited to be executed in parallel, it
    attempts a number of random trials and then picks the best result
    from all the trials and uses that for that layer. These trials can
    easily be run in parallel as there is no data dependency between the
    trials (there are shared inputs but read-only). As the algorithm
    generally scales exponentially the speed up from running the trials in
    parallel can offset this and improve the scaling of the pass. Running
    the pass in parallel was previously tried in #4781 using Python
    multiprocessing but the overhead of launching an additional process and
    serializing the input arrays for each trial was significantly larger
    than the speed gains. To run the algorithm efficiently in parallel
    multithreading is needed to leverage shared memory on shared inputs.

    This commit rewrites the cython routine using rust. This was done for
    two reasons. The first is that rust's safety guarantees make dealing
    with and writing parallel code much easier and safer. It's also
    multiplatform because the rust language supports native threading
    primatives in language. The second is while writing parallel cython
    code using open-mp there are limitations with it, mainly on windows. In
    practice it was also difficult to write and maintain parallel cython
    code as it has very strict requirements on python and c code
    interactions. It was much faster and easier to port it to rust and the
    performance for each iteration (outside of parallelism) is the same (in
    some cases marginally faster) in rust. The implementation here reuses
    the data structures that the previous cython implementation introduced
    (mainly flattening all the terra objects into 1d or 2d numpy arrays for
    efficient access from C).

    The speedups from this PR can be significant, calling transpile() on a
    400 qubit (with a depth of 10) QV model circuit targetting a 409 heavy
    hex coupling map goes from ~200 seconds with the single threaded cython
    to ~60 seconds with this PR locally on a 32 core system, When transpiling
    a 1000 qubit (also with a depth of 10) QV model circuit targetting a 1081
    qubit heavy hex coupling map goes from taking ~6500 seconds to ~720
    seconds.

    The tradeoff with this PR is for local qiskit-terra development a rust
    compiler needs to be installed. This is made trivial using rustup
    (https://rustup.rs/), but it is an additional burden and one that we
    might not want to make. If so we can look at turning this PR into a
    separate repository/package that qiskit-terra can depend on. The
    tradeoff here is that we'll be adding friction to the api boundary
    between the pass and the core swap trials interface. But, it does ease
    the dependency on development for qiskit-terra.

    * Sanitize packaging to support future modules

    This commit fixes how we package the compiled rust module in
    qiskit-terra. As a single rust project only gives us a single compiled
    binary output we can't use the same scheme we did previously with cython
    with a separate dynamic lib file for each module. This shifts us to
    making the rust code build a `qiskit._accelerate` module and in that we
    have submodules for everything we need from compiled code. For this PR
    there is only one submodule, `stochastic_swap`, so for example the
    parallel swap_trials routine can be imported from
    `qiskit._accelerate.stochastic_swap.swap_trials`. In the future we can
    have additional submodules for other pieces of compiled code in qiskit.
    For example, the likely next candidate is the pauli expectation value
    cython module, which we'll likely port to rust and also make parallel
    (for sufficiently large number of qubits). In that case we'd add a new
    submodule for that functionality.

    * Adjust random normal distribution to use correct mean

    This commit corrects the use of the normal distribution to have the mean
    set to 1.0. Previously we were doing this out of band for each value by
    adding 1 to the random value which wasn't necessary because we could
    just generate it with a mean of 1.0.

    * Remove unecessary extra scope from locked read

    This commit removes an unecessary extra scope around the locked read for
    where we store the best solution. The scope was previously there to
    release the lock after we check if there is a solution or not. However
    this wasn't actually needed as we can just do the check inline and the
    lock will release after the condition block.

    * Remove unecessary explicit type from opt_edges variable

    * Fix indices typo in NLayout constructor

    Co-authored-by: Jake Lishman <jake@binhbar.com>

    * Remove explicit lifetime annotation from swap_trials

    Previously the swap_trials() function had an explicit lifetime
    annotation `'p` which wasn't necessary because the compiler can
    determine this on it's own. Normally when dealing with numpy views and a
    Python object (i.e. a GIL handle) we need a lifetime annotation to tell
    the rust compiler the numpy view and the python gil handle will have the
    same lifetime. But since swap_trials doesn't take a gil handle and
    operates purely in rust we don't need this lifetime and the rust
    compiler can deal with the lifetime of the numpy views on their own.

    * Use sum() instead of fold()

    * Fix lint and add rust style and lint checks to CI

    This commit fixes the python lint failures and also updates the ci
    configuration for the lint job to also run rust's style and lint
    enforcement.

    * Fix returned layout mapping from NLayout

    This commit fixes the output list from the `layout_mapping()`
    method of `NLayout`. Previously, it incorrectly would return the
    wrong indices it should be a list of virtual -> physical to
    qubit pairs. This commit corrects this error

    Co-authored-by: georgios-ts <45130028+georgios-ts@users.noreply.github.com>

    * Tweak tox configuration to try and reliably build rust extension

    * Make swap_trials parallelization configurable

    This commit makes the parallelization of the swap_trials() configurable.
    This is dones in two ways, first a new argument parallel_threshold is
    added which takes an optional int which is the number of qubits to
    switch between a parallel and serial version. The second is that it
    takes into account the the state of the QISKIT_IN_PARALLEL environment
    variable. This variable is set to TRUE by parallel_map() when we're
    running in a multiprocessing context. In those cases also running
    stochastic swap in parallel will likely just cause too much load as
    we're potentially oversubscribing work to the number of available CPUs.
    So, if QISKIT_IN_PARALLEL is set to True we run swap_trials serially.

    * Revert "Make swap_trials parallelization configurable"

    This reverts commit 57790c84b03da10fd7296c57b38b54c5bccebf4c. That
    commit attempted to sovle some issues in test running, mainly around
    multiple parallel dispatch causing exceess load. But in practice it was
    broken and caused more issues than it fixed. We'll investigate and add
    control for the parallelization in a future commit separately after all
    the tests are passing so we have a good baseline.

    * Add docs to swap_trials() and remove unecessary num_gates arg

    * Fix race condition leading to non-deterministic behavior

    Previously, in the case of circuits that had multiple best possible
    depth == 1 solutions for a layer, there was a race condition in the fast
    exit path between the threads which could lead to a non-deterministic
    result even with a fixed seed. The output was always valid, but which
    result was dependent on which parallel thread with an ideal solution
    finished last and wrote to the locked best result last. This was causing
    weird non-deterministic test failures for some tests because of #1794 as
    the exact match result would change between runs. This could be a bigger
    issue because user expectations are that with a fixed seed set on the
    transpiler that the output circuit will be deterministically
    reproducible.

    To address this is issue this commit trades off some performance to
    ensure we're always returning a deterministic result in this case. This
    is accomplished by updating/checking if a depth==1 solution has been
    found in another trial thread we only act (so either exit early or
    update the already found depth == 1 solution) if that solution already
    found has a trial number that is less than this thread's trial number.
    This does limit the effectiveness of the fast exit, but in practice it
    should hopefully not effect the speed too much.

    As part of this commit some tests are updated because the new
    deterministic behavior is slightly different from the previous results
    from the cython serial implementation. I manually verified that the
    new output circuits are still valid (it also looks like the quality
    of the results in some of those cases improved, but this is strictly
    anecdotal and shouldn't be taken as a general trend with this PR).

    * Apply suggestions from code review

    Co-authored-by: georgios-ts <45130028+georgios-ts@users.noreply.github.com>

    * Fix compiler errors in previous commit

    * Revert accidental commit of parallel reduction in compute_cost

    This was only for local testing to prove it was a bad idea and was
    accidentally included in the branch. We should not nest the parallel
    execution like this.

    * Eliminate short circuit for depth == 1 swap_trial() result

    This commit eliminates the short circuit fast return in swap_trial()
    when another trial thread has found an ideal solution. Trying to do this
    in a parallel context is tricky to make deterministic because in cases
    of >1 depth == 1 solutions there is an inherent race condition between
    the threads for writing out their depth == 1 result to the shared
    location. Different strategies were tried to make this reliably
    deterministic but there was still a race condition. Since this was just a
    performance optimization to avoid doing unnecessary work this commit
    removes this step. Weighing improved performance against repeatability
    in the output of the compiler, the reproducible results are more
    important. After we've adopted a multithreaded stochastic swap we can
    investigate adding this back as a potential future optimization.

    * Add missing docstrings

    * Add section to contributing on installing from source

    * Make rust python classes pickleable

    * Add rust compiler install to linux wheel jobs

    * Try more tox changes to fix docs builds

    * Revert "Eliminate short circuit for depth == 1 swap_trial() result"

    This reverts commit c510764a770cb610661bdb3732337cd45ab587fd. The
    removal there was premature: we already had a fix for the non-determinism in
    place, aside from a typo which was preventing it from working.

    Co-Authored-By: Georgios Tsilimigkounakis <45130028+georgios-ts@users.noreply.github.com>

    * Fix submodule declaration and module attribute on rust classes

    * Fix rust lint

    * Fix docs job definition

    * Disable multiprocessing parallelism in unit tests

    This commit disables the multiprocessing based parallelism when running
    unittest jobs in CI. We have historically defaulted to using
    multiprocessing only in environments where the "fork" start method is
    available, because this has the best performance and has no caveats
    around how it is used by users (you don't need an
    `if __name__ == "__main__"` guard). However, the use of the "fork"
    method isn't always 100% reliable (see
    https://bugs.python.org/issue40379), which we saw on Python 3.9 #6188.
    In unittest CI (and tox) by default we use stestr which spawns (not using
    fork) parallel workers to run tests in parallel. With this PR this means
    in unittest CI we're now running multiple test runner subprocesses, which
    are executing parallel dispatched code using multiprocessing's fork
    start method, which is executing multithreaded rust code. These three
    layers of nesting fairly reliably hang as Python's fork doesn't seem to
    be able to handle this many layers of nested parallelism. There are two
    ways I've been able to fix this, the first is to change the start method
    used by `parallel_map()` to either "spawn" or "forkserver" either of
    these does not suffer from random hanging. However, doing this in the
    unittest context causes significant overhead and slows down test
    execution significantly. The other is to just disable the
    multiprocessing, which fixes the hanging and doesn't impact runtime
    performance significantly (and might actually help in CI so we're not
    oversubscribing the limited resources).

    As I have not been able to reproduce `parallel_map()` hanging in
    a standalone context with multithreaded stochastic swap this commit opts
    for just disabling multiprocessing in CI and documenting the known issue
    in the release notes as this is the simpler solution. It's unlikely that
    users will nest parallel processes as it typically hurts performance
    (and parallel_map() actively guards against it), we only did it in
    testing previously because the tests which relied on it were a small
    portion of the test suite (roughly 65 tests) and typically did not have
    a significant impact on the total throughput of the test suite.
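
    For reference, the same effect can be reproduced locally by turning off
    parallel dispatch before Qiskit spawns any workers; a minimal sketch,
    assuming `parallel_map()` honors the QISKIT_PARALLEL environment
    variable:

        import os

        # Disable multiprocessing-based parallel dispatch for this process
        # (and any test runners it launches) before importing qiskit.
        os.environ["QISKIT_PARALLEL"] = "FALSE"

        from qiskit import QuantumCircuit, transpile

        qc = QuantumCircuit(2)
        qc.h(0)
        qc.cx(0, 1)
        # transpile() on a list would normally fan out via parallel_map();
        # with the flag above the circuits are compiled serially in-process.
        result = transpile([qc] * 4, basis_gates=["u", "cx"])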

    * Fix typo in azure pipelines config

    * Remove unnecessary extension compilation for image tests

    * Add test script to explicitly verify parallel dispatch

    In an earlier commit we disabled the use of parallel dispatch in
    parallel_map() to avoid a bug in cpython associated with their fork()
    based subprocess launch. Doing this works around the bug which was
    reliably triggered by running multiprocessing in parallel subprocesses.
    It also has the side benefit of providing a ~2x speed up for test suite
    execution in CI. However, this meant we lost our test coverage in CI for
    running parallel_map() with actual multiprocessing based parallel
    dispatch. To ensure we don't inadvertently regress this code path
    moving forward this commit adds a dedicated test script which runs a
    simple transpilation in parallel and verifies that everything works as
    expected with the default parallelism settings.
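
    A standalone check along these lines can be as small as the sketch
    below (the circuit, target basis, and circuit count are illustrative,
    not the exact contents of the CI script):

        #!/usr/bin/env python3
        """Smoke test that parallel_map-based dispatch in transpile() works."""
        from qiskit import QuantumCircuit, transpile

        def main():
            qc = QuantumCircuit(3)
            qc.h(0)
            qc.cx(0, 1)
            qc.cx(1, 2)
            qc.measure_all()
            # A list input makes transpile() dispatch the circuits through
            # parallel_map() with the default parallelism settings.
            results = transpile([qc] * 8, basis_gates=["u", "cx"],
                                optimization_level=1)
            assert len(results) == 8
            print("parallel dispatch OK")

        if __name__ == "__main__":
            main()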

    * Avoid multi-threading when run in a multiprocessing context

    This commit adds a switch on running between a single threaded and a
    multithreaded variant of the swap_trials loop based on whether the
    QISKIT_IN_PARALLEL flag is set. If QISKIT_IN_PARALLEL is set to TRUE
    this means the `parallel_map()` function is running in the outer python
    context and we're running in multiprocessing already. This means we do
    not want to be running in multiple threads generally as that will lead
    to potential resource exhaustion by spawning n processes, each potentially
    running with m threads, where `n` is `min(num_phys_cpus, num_tasks)` and
    `m` is num_logical_cpus (although only
    `min(num_logical_cpus, num_trials)` will be active); on a typical
    system there aren't enough cores to leverage both multiprocessing and
    multithreading. However, in case a user does have such an environment
    they can set the `QISKIT_FORCE_THREADS` env variable to `TRUE` which
    will use threading regardless of the status of `QISKIT_IN_PARALLEL`.
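
    The decision logic boils down to something like the following sketch
    (shown in Python for readability; the helper name is an assumption and
    the real switch selects between the serial and multithreaded trial
    loops in the extension code):

        import os

        def use_threads() -> bool:
            """Decide whether swap_trials should run multithreaded."""
            in_parallel = os.getenv("QISKIT_IN_PARALLEL", "FALSE") == "TRUE"
            force_threads = os.getenv("QISKIT_FORCE_THREADS", "FALSE") == "TRUE"
            # Avoid nesting threads inside parallel_map() worker processes
            # unless the user explicitly opts in.
            return force_threads or not in_parallel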

    * Apply suggestions from code review

    Co-authored-by: Jake Lishman <jake@binhbar.com>

    * Minor fixes from review comments

    This commit fixes some minor details found during code review. It
    expands the section on building from source to explain how to build a
    release optimized binary with editable mode, makes the QISKIT_PARALLEL
    env variable usage consistent across all jobs, and adds a missing
    shebang to the `install_rust.sh` script which is used to install rust in
    the manylinux container environment.

    * Simplify tox configuration

    In earlier commits the tox configuration was changed to try and fix the
    docs CI job by going to great effort to try and enforce that
    setuptools-rust was installed in all situations, even before it was
    actually needed. However, the problem with the docs CI job was unrelated
    to the tox configuration and this reverts the configuration to something
    that works with more versions of tox and setuptools-rust.

    * Add missing pieces of cargo configuration

    Co-authored-by: Jake Lishman <jake@binhbar.com>
    Co-authored-by: georgios-ts <45130028+georgios-ts@users.noreply.github.com>
    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 44f794aa7afa545900bda4ed361a7a27d71dff4f
Author: Edwin Navarro <enavarro@comcast.net>
Date:   Sat Feb 26 13:56:55 2022 -0800

    Fix display of sidetext gates with conditions (#7673)

    * First testing

    * Fix display of sidetext gates with conditions in text

    * Add comment

    * Start it up again

    * Add mpl and latex tests

    * Add cu1 and rzz tests

    * Start it up again

    * Break out RZZ and CU1

    * Restart

commit 5b53a15d047b51079b8d8269967514fd34ab8d81
Author: Ikko Hamamura <ikkoham@users.noreply.github.com>
Date:   Sat Feb 26 08:05:57 2022 +0900

    Fix endianness in result.mitigator (#7689)

    * fix endian

    * add a release note

    * Reword release note

    * Remove debugging print

    Co-authored-by: Jake Lishman <jake.lishman@ibm.com>
    Co-authored-by: Jake Lishman <jake@binhbar.com>
    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 6d3a4f8f9ebbc55f2f6f6ecf01b49ffed800db53
Author: Lolcroc <Lolcroc@users.noreply.github.com>
Date:   Fri Feb 25 17:20:45 2022 +0100

    Add `__slots__` for `Bit` subclasses (#7708)

    * Add __slots__ for Bit subclasses

    * Add release note

    Co-authored-by: Jake Lishman <jake.lishman@ibm.com>

commit 15a109e05f6ecb4388512b428b81adb709847244
Author: Daniel J. Egger <38065505+eggerdj@users.noreply.github.com>
Date:   Fri Feb 25 01:38:43 2022 +0100

    Parameters in InstructionDurations. (#7321)

    * * First draft of the instruction duration modification.

    * * Adding suggestion by Itoko

    * * Fix bug where duration and parameters were switched

    * * Remove None from tests.

    * * black.

    * * Added check on None duration.

    * * Added test.

    * * Reno

    * * Test fix.

    * * Moved test and updated reno.

    * * Docstring.

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 2600586c9caa019634b6219c6e58bc4b640c680d
Author: Alexander Ivrii <alexi@il.ibm.com>
Date:   Thu Feb 24 23:19:06 2022 +0200

    Define LinearFunction class and collect blocks of gates that make a LinearFunction (#7361)

    * Implementing LinearFunction gate and a transpiler pass that collects a sequence of linear gates into a LinearFunction

    * removing test file

    * running black and pylint

    * Reimplementing linear function to inherit from Gate; adding tests for linear functions

    * adding tests for CollectLinearFunctions transpiler optimization pass

    * Improving tests for CollectLinearFunctions pass

    * style

    * style

    * Adding LinearFunction to exclude in test_gate_definitions

    * Normalizing internal representation to numpy array format; adding example of a linear matrix

    * using find_bit command

    * removing stray pylint comment

    * fixing lint error

    * adding a comment

    * Update qiskit/circuit/library/generalized_gates/linear_function.py

    Co-authored-by: Kevin Krsulich <kevin@krsulich.net>

    * adding a transpiler pass that synthesizes linear functions, and updating tests

    * renaming; changing default behavior to copy

    * Adding comments regarding synthesis and big-endian

    * adding transpiler passes to synthesize linear functions and to promote them to permutations whenever possible

    * adding release notes

    * code improvements following the review

    * adding explicit reference to pdf

    * removing redundant is_permutation check

    * First pass over comments in the review

    * minor tweak to release notes

    * trying to get links in release notes to work

    * trying to get links in release notes to work

    * pass over documentation

    * improving tests

    * treating other review comments

    * removing accidentally added param

    * Use specific testing assertions

    * Specific assertion stragglers

    * changing the assert

    Co-authored-by: Kevin Krsulich <kevin@krsulich.net>
    Co-authored-by: Jake Lishman <jake@binhbar.com>

commit 7cab49fb7f223798d31ff295f1b9d0b2f7e15fed
Author: Matthew Treinish <mtreinish@kortar.org>
Date:   Thu Feb 24 07:09:27 2022 -0500

    Use VF2Layout in all preset passmanagers (#7213)

    * Use VF2Layout in all preset passmanagers

    With the introduction of the VF2Layout pass we now have a very fast
    method of searching for a perfect layout. Previously we only had the
    CSPLayout method for doing this which could be quite slow and we only
    used in level 2 and level 3. Since VF2Layout operates quickly adding the
    pass to each preset pass manager makes sense so we always use a perfect
    layout if available (or unless a user explicitly specifies an initial
    layout or layout method). This commit makes this change and adds
    VF2Layout to each optimization level and uses a perfect layout if found
    by default.
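
    In practice this means a circuit whose interaction graph fits the
    device connectivity should come out of transpile() without any inserted
    swaps; a minimal sketch (the snapshot backend and gate counts are only
    illustrative):

        from qiskit import QuantumCircuit, transpile
        from qiskit.test.mock import FakeManila  # 5-qubit snapshot backend

        # A 2-qubit circuit always has a perfect mapping on a connected
        # device, so VF2Layout should find a layout needing no swap insertion.
        qc = QuantumCircuit(2)
        qc.h(0)
        qc.cx(0, 1)
        qc.measure_all()

        transpiled = transpile(qc, backend=FakeManila(), optimization_level=2)
        assert "swap" not in transpiled.count_ops()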

    Fixes #7156

    * Revert changes to level 0

    For optimization level 0 we don't actually want to use VF2Layout because
    while it can find a perfect layout it would be potentially surprising to
    users that the level which is supposed to have no optimizations picks a
    non-trivial layout.

    * Set seed on perfect layout test

    * Fix test failures and unexpected change in behavior

    This commit makes several changes to fix both unexpected changes in
    behavior and also update the default behavior of the vf2 layout pass.
    The first issue is that the vf2 pass was raising an exception if it was
    run with >2 qubit gates. This caused issues if we run with calibrations
    (or backends that support >2q gates) since vf2 layout is used
    opportunistically; if there is, for example, a 5q gate being used
    we shouldn't fail the entire transpile just because vf2 can't deal with
    it. It's only an issue if the later passes can't either; otherwise it just
    means vf2 won't be able to find a perfect layout. The second change is around
    the default seeding. For the preset pass managers to have a consistent
    output this removes the randomization if no seed is specified and just
    uses an in-order comparison. This is necessary to have a consistent
    layout for testing and reproducibility; while we can set a seed
    everywhere, the previous behavior was more stable as it would default to
    the trivial layout most of the time (assuming that was perfect). When we add
    multiple vf2 trials and are picking the best choice among those for a
    given time budget we can add back in the default seed randomization. The
    last change made here is that several tests implicitly expected a
    trivial layout (mainly around device aware transpilation or
    calibrations). In those cases the transpile wasn't valid for an
    arbitrary layout, for example if a calibrated gate is only defined on a
    single qubit. Using vf2 layout in those cases doesn't work because the
    gate is only defined on a single qubit so picking a non-trivial layout
    correctly errors. To fix these cases the tests are updated to explicitly
    state they require a trivial layout instead of assuming the transpiler
    will implicitly give them that if it's a perfect layout.

    * Add missing release note

    * Apply suggestions from code review in release note

    Co-authored-by: Jake Lishman <jake@binhbar.com>

    * Make vf2 layout stop reason an Enum

    * Update releasenotes/notes/vf2layout-preset-passmanager-db46513a24e79aa9.yaml

    Co-authored-by: Kevin Krsulich <kevin@krsulich.net>

    * Add back initial layout to level1

    There seems to be a pretty baked-in assumption for level 1 that it
    will use the trivial layout by default if it's a perfect mapping. This
    was causing the majority of the test failures and might be an unexpected
    breakage for people. However, in a future release we should remove this
    (likely when vf2layout is made noise aware). To anticipate this a
    FutureWarning is emitted when a trivial layout is used to indicate that
    this behavior will change in the future for level 1 and if you're
    relying on it you should explicitly set the layout_method='trivial'.

    * Tweak seeds to reduce effect of noise on fake_yorktown with aer

    * Update release note

    * Use id_order=True on vf2_mapping() for VF2Layout pass

    Using id_order=False orders the nodes by degree which biases the
    mapping found by vf2 towards nodes with higher connectivity which
    typically have higher error rates. Until the pass is made noise aware
    to counter this bias we should just use id_order=True which uses the
    node id for the order (which roughly matches insertion order) which
    won't have this bias in the results.

    * Revert "Use id_order=True on vf2_mapping() for VF2Layout pass"

    VF2 without the Vf2++ heuristic is slow enough for some common use cases
    that we probably don't want to use it by default. Instead we should use
    some techniques to improve the quality of the results. The first
    approach will be applying a score heuristic to a found mapping. A
    potential follow on after that could be to do some pre-filtering of
    noisy nodes.

    This reverts commit 53d54c3a3648288e40d3ae21e76206f88cb7b981.

    * Set real limits on calling vf2 and add quality heuristic

    This commit adds options to set limits on the vf2 pass: the
    internal call limit for the vf2 execution in retworkx, the total time
    spent in the pass trying multiple layouts, and the number of trials to
    attempt. These are then set in the preset pass manager to ensure we
    don't sit spinning on vf2 forever in the real world. While the pass is
    generally fast there are edge cases where it can get stuck. At the same
    time this adds a rough quality heuristic (based on readout error falling
    back to connectivity) to select between multiple mappings found by
    retworkx. This addresses the poor quality results we were getting with
    vf2++ in earlier revisions as we can find the best from multiple
    mappings.
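
    A standalone use of the pass with such limits might look like the
    sketch below (the exact keyword names follow the description above and
    should be treated as assumptions):

        from qiskit.transpiler import CouplingMap
        from qiskit.transpiler.passes import VF2Layout

        # Cap both the retworkx vf2 call count and the total time spent
        # trying candidate layouts, and score a bounded number of mappings.
        layout_pass = VF2Layout(
            coupling_map=CouplingMap.from_line(5),
            seed=42,
            call_limit=int(3e7),
            time_limit=10.0,
            max_trials=2500,
        )
        # layout_pass can then be run on a circuit's DAG via a PassManager.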

    * Remove initial layout default from level 1

    Now that vf2 layout has limited noise awareness and multiple trials it
    no longer will be defaulting to the worst qubits like it was with only a
    single sample when vf2++ is used. This commit removes the implicit
    trivial layout attempt as it's no longer needed.

    * Remove unused imports

    * Use vigo instead of yorktown for oracle tests

    * Revert "Remove initial layout default from level 1"

    This is breaking the pulse tutorials, as fixing that is a more
    involved change and probably indicates we should continue to
    assume trivial layout by default if perfect for level 1 and raise a
    warning to users that it's going away to give everyone who needs a
    default trivial layout time to adjust their code. This commit reverts
    only using vf2 for level 1 and adds back in a trivial layout stage.

    This reverts commit df478357b9c59fe88f8d464c361de7a2e0c03976.
    This reverts commit 65ae6ee0da156ecb1659e0abfb0b19df3bbbd367.

    * Fix warning and update docs to not emit one

    This commit fixes the warning so it's only emitted if the trivial layout
    is used; previously it would also be emitted if an initial layout was
    set. The docs are updated to not emit the warning; in some cases
    code examples are updated to explicitly use a trivial layout if that's
    what's needed. In others there were needless jupyter-execute directives
    being used when there was no visualization and a code-block is just as
    effective (which avoids the execution during doc builds).

    * Add debug logging and fix heuristic usage

    * Add better test coverage of new pass features

    * Only run a single trial if the graphs are the same size

    If the interaction graph and the coupling graph are the same size
    currently the score heuristic will produce the same results since they
    just look at the sum of qubit noise (or degree). We don't need to run
    multiple trials or bother scoring things since we'll just pick the first
    mapping anyway.

    * Fix rebase error

    * Use enum type for stop reason condition

    * Add comment about call_limit value

    * Undo unnecessary seed change

    * Add back default seed randomization

    After all the improvements to the VF2Layout pass in #7276 this was no
    longer needed. It was added back prior to #7276 where the vf2 layout
    pass was not behaving well for simple cases.

    * Permute operator bits based on layout

    The backendv2 transpilation tests were failing with vf2 layout enabled
    by default because we were no longer guaranteed to get an initial layout
    by default. The tests were checking for an equivalent operator between
    the output circuit and the input one. However, the use of vf2layout was
    potentially changing the bit order (especially at higher optimization
    levels) in the operator because a non-trivial layout was selected. This
    caused the tests to fail. This commit fixes the test failures by adding
    a helper function to permute the qubits back based on the layout
    property in the transpiled circuit.

    * Fix docs build

    * Remove warning on use of TrivialLayout in opt level 1

    The warning emitted by an optimization level 1 transpile() when a
    trivial layout was used was decided to be potentially too noisy for
    users, especially because it wasn't directly actionable. For the first
    step of using vf2 layout everywhere we decided to leave level1 as trying
    a trivial layout first and then falling back to vf2 layout if the
    trivial layout isn't a perfect match. We'll investigate whether it makes
    sense in the future to change this behavior and come up with a migration
    plan when that happens.

    * Apply suggestions from code review

    Co-authored-by: Jake Lishman <jake@binhbar.com>

    * Cleanup inline comment numbering in preset pass manager modules

    * Fix lint

    * Remove operator_permuted_layout() from backendv2 tests

    This commit removes the custom operator_permuted_layout() function from
    the backendv2 tests. This function was written to permute the qubits
    based on the output layout from transpile() so it can be compared to the
    input circuit for equivalence. However, since this PR was first opened a
    new constructor method Operator.from_circuit() was added in #7616 to
    handle this directly in the Operator construction instead of doing it
    out of band. This commit just leverages the new constructor instead of
    having a duplicate local test function.
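
    The resulting equivalence check then reads roughly as follows (a
    sketch assuming the #7616 constructor undoes the layout permutation
    recorded on the transpiled circuit):

        from qiskit import QuantumCircuit, transpile
        from qiskit.quantum_info import Operator
        from qiskit.transpiler import CouplingMap

        qc = QuantumCircuit(2)
        qc.h(0)
        qc.cx(0, 1)

        # Transpile onto a 2-qubit coupling map so the widths match; a
        # non-trivial layout may still permute the virtual-to-physical mapping.
        transpiled = transpile(qc, coupling_map=CouplingMap([[0, 1], [1, 0]]),
                               basis_gates=["rz", "sx", "x", "cx"],
                               optimization_level=3)
        # from_circuit() reads the layout stored on the transpiled circuit and
        # permutes the operator back so it can be compared with the input.
        assert Operator.from_circuit(transpiled).equiv(Operator(qc))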

    * Remove unused import

    Co-authored-by: Jake Lishman <jake@binhbar.com>
    Co-authored-by: Kevin Krsulich <kevin@krsulich.net>
    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 9ec0c4eed3cf2f72e9f1f3c24233bfbe62df3511
Author: Omar Costa Hamido <omarcostinha@gmail.com>
Date:   Wed Feb 23 22:07:55 2022 +0000

    Use pulse configuration in fake Bogota, Rome, Manila and Santiago  (#7688)

    * Update fake_bogota.py

    - refer to the defs file and turn it into a FakePulseBackend

    * Update fake_manila.py

    - refer to the defs file and turn it into a FakePulseBackend

    * Update fake_rome.py

    - refer to the defs file and turn it into a FakePulseBackend

    * Update fake_santiago.py

    - refer to the defs file and turn it into a FakePulseBackend

    * Update fake_bogota.py

    - make sure we are using FakePulseLegacyBackend where it is needed.

    * Create bogota-manila-rome-santiago-as-fakepulsebackends-2907dec149997a27.yaml

    - add release notes

    * Update releasenotes/notes/bogota-manila-rome-santiago-as-fakepulsebackends-2907dec149997a27.yaml

    no need for prelude 🎹🎵 😕

    Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

    * Update releasenotes/notes/bogota-manila-rome-santiago-as-fakepulsebackends-2907dec149997a27.yaml

    🧐 using proper notation.

    Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

    * Fix typo

    Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
    Co-authored-by: Jake Lishman <jake@binhbar.com>
    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 3591fa635db75e3d7f1afea19359acb2fb3a6740
Author: Jake Lishman <jake.lishman@ibm.com>
Date:   Wed Feb 23 19:21:22 2022 +0000

    Rework `QuantumCircuit._append` and bit resolver (#7618)

    The previous resolver of indices for bits involved catching several
    exceptions even when resolving a valid specifier.  This is comparatively
    slow for inner-loop code.  The implementation also assumed that if a
    type could be cast to an integer, the only way it could be a valid
    specifier was as an index.  This broke for size-1 Numpy arrays, which
    can be cast to `int`, but should be treated as iterables.

    Since `QuantumCircuit.append` necessarily checks the types of all its
    arguments, it is unnecessary for `QuantumCircuit._append` to do so as
    well.  This also allows anywhere that is constructing a `QuantumCircuit`
    from known-safe data (such as copying from an existing circuit, or
    building templates) to do so without the checks.  This is now
    documented as its contract.
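
    As a sketch of the resulting contract, copying instructions between
    circuits whose bits are known-safe by construction can skip the
    argument validation that `append` performs:

        from qiskit import QuantumCircuit

        source = QuantumCircuit(2, name="source")
        source.h(0)
        source.cx(0, 1)

        dest = QuantumCircuit(2, name="dest")
        for instruction, qargs, cargs in source.data:
            # qargs/cargs are already Qubit/Clbit objects, so no broadcasting
            # or type checking is needed; map them onto dest's bits by index.
            new_qargs = [dest.qubits[source.qubits.index(q)] for q in qargs]
            dest._append(instruction, new_qargs, cargs)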

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 2a67dc1342517b88367f88475bff6a524f46d295
Author: Naoki Kanazawa <nkanazawa1989@gmail.com>
Date:   Thu Feb 24 02:37:53 2022 +0900

    Move QPY serializer to own module (#7582)

    * Move qpy to own module.

    qpy_serialization.py is split into several files for maintainability. This commit also adds several bytes Enum classes for type keys in the header, and provides several helper functions. Some namedtuple class names are updated because, for example, INSTRUCTION would be vague once we add schedules, i.e. a schedule is basically a different program and has its own instructions with a different data format. Basically a CIRCUIT_ prefix is added to them.

    * manually cherry-pick #7584 with some cleanup

    - change qiskit.qpy.objects -> qiskit.qpy.binary_io
    - TUPLE -> SEQUENCE (we may use this for list in future)
    - add QpyError
    - add _write_register in circuit io to remove boilerplate code

    * respond to review comments
    - expose several private methods for backward compatibility
    - use options for symengine
    - rename alphanumeric -> value
    - rename write, read methods and remove alias
    - improve container read

    * remove import warning

    * replace alphanumeric with value in comments and messages.

    * private functions import

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit a6cb1a37f52fd98f90d0685d1fdcba447a5345b7
Author: Naoki Kanazawa <nkanazawa1989@gmail.com>
Date:   Tue Feb 22 15:50:33 2022 +0900

    Fix ASAP/ALAP scheduling pass (#7655)

    * fix scheduling pass

    This commit updates both ASAP and ALAP passes not to allow measurement instructions to simultaneously write to the same register.
    In addition, delays appended after the end of the circuit are removed since these instructions have no effect.

    * Update behavior of passes

    Added `BaseScheduler` as a parent class of the ASAP and ALAP passes. These schedulers can take two control parameters, `clbit_write_latency` and `conditional_latency`, which represent the I/O latency of clbits.
    In addition, delays at the very end of the scheduled circuit are re-added because dynamical decoupling passes insert echo sequences there. More unittests and a reno are also added.
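
    A sketch of how these latencies would be supplied (the pass and
    InstructionDurations names are real, but the exact keyword arguments
    follow the description above and are assumptions):

        from qiskit.transpiler import PassManager, InstructionDurations
        from qiskit.transpiler.passes import ALAPSchedule

        durations = InstructionDurations(
            [("h", None, 160), ("cx", None, 800), ("measure", None, 4300)],
            dt=2.2e-10,
        )
        pm = PassManager(
            ALAPSchedule(
                durations,
                clbit_write_latency=1600,  # when the measured value reaches the clbit
                conditional_latency=320,   # latency of reading a clbit for a condition
            )
        )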

    * fix ASAP conditional bug

    The conditional bit start time was only looking at cregs, but this should start right before the gate.

    Co-authored-by: Toshinari Itoko <itoko@jp.ibm.com>

    * respond to review comments
    - fix typo
    - update model drawing in comment
    - more comment

    * update documentation and add todo comment

    * update logic to insert conditional bit assert

    * add more docs on topological ordering

    * Update qiskit/transpiler/passes/scheduling/base_scheduler.py

    Co-authored-by: Toshinari Itoko <15028342+itoko@users.noreply.github.com>

    * lint fix

    Co-authored-by: Toshinari Itoko <itoko@jp.ibm.com>
    Co-authored-by: Toshinari Itoko <15028342+itoko@users.noreply.github.com>
    Co-authored-by: Ikko Hamamura <ikkoham@users.noreply.github.com>

commit 6eb12965fd2ab3e9a6816353467c8e1c4ad2a477
Author: Matthew Treinish <mtreinish@kortar.org>
Date:   Mon Feb 21 12:47:46 2022 -0500

    Bump minimum supported symengine version for built-in pickle support (#7682)

    * Bump minimum supported symengine version for built-in pickle support

    The new symengine 0.9 release added native support in the package for
    pickling symengine objects. Previously we had been converting symengine
    objects to sympy objects so we could pickle them. With native support
    for pickle in symengine, we no longer need this, which besides
    removing unnecessary conversion code should hopefully make pickling (which
    we do internally as part of using multiprocessing) more reliable.

    This also seems to fix the hanging we were seeing with multiprocessing
    with Python 3.9 on Linux. While investigating that issue it points to
    the underlying cause being a bug in cPython with the `fork()` based
    start method, but we were only able to reliably trigger it after
    switching to symengine in #6270 and having to rely on importing
    symengine to pickle the symengine objects. Since we're no longer doing
    that after bumping the minimum symengine version this removes the
    default disabling of parallel dispatch with Python 3.9. While I'm not
    100% confident this fixes the bug, in my testing locally I haven't been
    able to reproduce the hang we were encountering (but this is anecdotal
    at best). If we do encounter issues with multiprocess hanging in the
    future we can look at rewriting the internals of `parallel_map()` or
    switching it back to disabled by default.
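
    The behavior being relied on is simply that parameter expressions now
    round-trip through pickle without a sympy detour; a quick sketch:

        import pickle

        from qiskit.circuit import Parameter

        theta = Parameter("theta")
        expr = 2 * theta + 1  # backed by symengine when it is installed

        # parallel_map() sends objects like this between processes via pickle.
        restored = pickle.loads(pickle.dumps(expr))
        assert restored == expr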

    Fixes #6188

    * Fix typos in release notes

    Co-authored-by: Jake Lishman <jake@binhbar.com>

    Co-authored-by: Jake Lishman <jake@binhbar.com>
    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit a185ee6e3dd8d1f87c773e4e894a458ed1930ad2
Author: Matthew Treinish <mtreinish@kortar.org>
Date:   Mon Feb 21 11:33:56 2022 -0500

    Add fake backends for new IBM Quantum systems (#7392)

    * Add fake backends for new IBM Quantum systems (#6808)

    This commit adds new fake backend classes for new IBM Quantum systems:
    Cairo, Hanoi, Kolkata, Nairobi, and Washington. Just as with the other
    fake backends these new classes contain snapshots of calibration and error
    data taken from the real system, and can be used for local testing,
    compilation and simulation.

    Legacy backends are not added for these new fake backends as the
    legacy backend interface is deprecated and will be removed in a future
    release so there is no need to expose that for the new backends (it was
    only added for compatibility testing on the old fake backends).

    * Update qiskit/test/mock/backends/washington/fake_washington.py

    Co-authored-by: Ali Javadi-Abhari <ajavadia@users.noreply.github.com>

    * Update releasenotes/notes/new-fake-backends-04ea9cb26374e385.yaml

    Co-authored-by: Luciano Bello <bel@zurich.ibm.com>

    * Fix lint

    Co-authored-by: Ali Javadi-Abhari <ajavadia@users.noreply.github.com>
    Co-authored-by: Luciano Bello <bel@zurich.ibm.com>
    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels: on hold (Can not fix yet), performance

Successfully merging this pull request may close these issues.

Execute trials in StochasticSwap in parallel