krotov.parallelization module

Support routines for running the optimization in parallel across the objectives

The time-propagation that is the main numerical effort in an optimization with Krotov’s method can naturally be performed in parallel for the different objectives. There are three time-propagations that happen inside optimize_pulses():

1. A forward propagation of the initial_state of each objective under the initial guess pulse.
2. A backward propagation of the states \(\ket{\chi_k}\) constructed by the chi_constructor routine that is passed to optimize_pulses(), where the number of states is the same as the number of objectives.
3. A forward propagation of the initial_state of each objective under the optimized pulse in each iteration. This can only be parallelized per time step, as the propagated states from each time step collectively determine the pulse update for the next time step, which is then used for the next propagation step. (In this sense, Krotov’s method is “sequential”.)
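The per-time-step constraint on the last of these propagations can be illustrated with a toy loop. This is only a sketch of the control flow, not Krotov's actual update equation: `propagate_step`, `toy_update`, and `forward_with_updates` are hypothetical names, the states are plain floats, and the "update" is an arbitrary function of all current states.

```python
def propagate_step(state, pulse_value, dt):
    """Toy one-step propagator (hypothetical): a single explicit Euler step."""
    return state + dt * pulse_value * state


def toy_update(states):
    """Stand-in for the pulse update; it needs *all* current states."""
    return 0.5 * sum(states) / len(states)


def forward_with_updates(states, guess_pulse, dt, step_map=map):
    """Propagate all states step by step, updating the pulse in between."""
    new_pulse = []
    for guess_value in guess_pulse:
        # Synchronization point: the update depends on every state.
        value = guess_value + toy_update(states)
        new_pulse.append(value)
        # Only this single step can be distributed over the objectives.
        states = list(step_map(lambda s: propagate_step(s, value, dt), states))
    return states, new_pulse


final_states, new_pulse = forward_with_updates([1.0, 3.0], [0.0, 0.0], dt=0.5)
print(final_states, new_pulse)
```

Whatever `step_map` is used, it is called once per time step and must return before the next pulse value can be computed, which is why the cost of each call matters so much for this propagation.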

The optimize_pulses() routine has a parameter parallel_map that can receive a tuple of three “map” functions to enable parallelization, corresponding to the three propagations listed above. If not given, qutip.parallel.serial_map() is used for all three propagations, running in serial. Any alternative “map” must have the same interface as qutip.parallel.serial_map().
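As a rough illustration of what "the same interface" means, the sketch below assumes the calling convention `map(task, values, task_args, task_kwargs, **kwargs)`, where `task(value, *task_args, **task_kwargs)` is evaluated for each value and the results are returned as a list. The names `my_map` and `scale` are hypothetical, and the sketch does not import QuTiP.

```python
def my_map(task, values, task_args=None, task_kwargs=None, **kwargs):
    """Hypothetical drop-in map: apply `task` to each value, in order."""
    task_args = () if task_args is None else task_args
    task_kwargs = {} if task_kwargs is None else task_kwargs
    # Extra keyword arguments (e.g. progress-bar options) are ignored here.
    return [task(value, *task_args, **task_kwargs) for value in values]


def scale(value, factor, offset=0):
    """Toy task standing in for a propagation routine."""
    return factor * value + offset


results = my_map(scale, [0, 1, 2], task_args=(10,), task_kwargs={"offset": 1})
print(results)
```

A parallel variant would keep the same signature and return value, and only change how the individual `task` calls are scheduled.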

It would be natural to assume that qutip.parallel.parallel_map() would be a good choice for parallel execution, using multiple CPUs on the same machine. However, this function is only a good choice for the propagations (1) and (2): these run in parallel over the entire time grid without any communication, and thus with minimal overhead. This is not true for the propagation (3), which must synchronize after each time step. In that case, the “naive” use of qutip.parallel.parallel_map() results in a communication overhead that completely dominates the propagation and actually makes the optimization slower (potentially by more than an order of magnitude).

The function parallel_map_fw_prop_step() provided in this module is an appropriate alternative implementation that uses long-running processes, internal caching, and minimal inter-process communication to eliminate the communication overhead as much as possible. However, the internal caching is valid only under the assumption that the propagate function does not have side effects.

In general,

parallel_map=(
    qutip.parallel_map,
    qutip.parallel_map,
    krotov.parallelization.parallel_map_fw_prop_step,
)

is a decent choice for enabling parallelization for a typical multi-objective optimization.

You may implement your own “map” functions to exploit parallelization paradigms other than Python’s built-in multiprocessing, which is what this module provides. This includes distributed propagation, e.g. through ipyparallel clusters. To write your own parallel_map functions, review the source code of optimize_pulses() in detail.

In most cases, it will be difficult to obtain a linear speedup from parallelization: even with carefully tuned manual interprocess communication, the communication overhead can be substantial. For best results, it would be necessary to use parallel_map functions implemented in Cython, where the GIL can be released and the entire propagation (and storage of propagated states) can be done in shared memory with no overhead.

Summary

Classes:

`Consumer`	A process-based task consumer
`FwPropStepTask`	A task that performs a single forward-propagation step

Functions:

`parallel_map_fw_prop_step`	parallel_map function for the forward-propagation by one time step

__all__: Consumer, FwPropStepTask, parallel_map_fw_prop_step

Reference

class krotov.parallelization.Consumer(task_queue, result_queue, data)

Bases: multiprocessing.context.Process

A process-based task consumer

Parameters

task_queue (multiprocessing.JoinableQueue) – A queue from which to read tasks.
result_queue (multiprocessing.Queue) – A queue on which to put the results of a task.
data – Cached (in-process) data that will be passed to each task.

run()

Execute all tasks on the task_queue.

Each task must be a callable that takes data as its only argument. The return value of the task will be put on the result_queue. A None value on the task_queue acts as a “poison pill”, causing the Consumer process to shut down.
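The consumer loop described above can be sketched as follows. To keep the example short and deterministic, it swaps in a thread and `queue.Queue` in place of a separate process and the multiprocessing queues; the names `consume` and `add_to_counter` are hypothetical, and this is not the actual `Consumer` source.

```python
import queue
import threading


def consume(task_queue, result_queue, data):
    """Run tasks from `task_queue` until a None ("poison pill") arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            task_queue.task_done()  # acknowledge the poison pill
            break
        result_queue.put(task(data))  # each task receives the cached data
        task_queue.task_done()


def add_to_counter(data):
    """Toy task: update the cached data and return the new value."""
    data["counter"] += 1
    return data["counter"]


tasks, results = queue.Queue(), queue.Queue()
worker = threading.Thread(target=consume, args=(tasks, results, {"counter": 0}))
worker.start()
for _ in range(3):
    tasks.put(add_to_counter)
tasks.put(None)  # poison pill: ask the worker to shut down
tasks.join()  # block until every queued item has been processed
worker.join()
collected = [results.get() for _ in range(3)]
print(collected)
```

Because the worker keeps running between tasks, the cached `data` survives from one task to the next, and only the task objects and their results travel through the queues.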

class krotov.parallelization.FwPropStepTask(i_state, pulse_vals, time_index)

Bases: object

A task that performs a single forward-propagation step

The task object is a callable that receives as input a single tuple of the same form as task_args in parallel_map_fw_prop_step(). This data is internally cached by the Consumer that will execute the task.

Parameters

i_state (int) – The index of the state to propagate, that is, the index of the objective from whose initial_state the propagation started.
pulse_vals (list[float]) – The values of the pulses at time_index to use.
time_index (int) – The index of the interval on the time grid covered by the propagation step.

The passed arguments update the internal state (data) of the Consumer executing the task; they are the minimal information that must be passed via inter-process communication to enable the forward propagation (assuming propagate in optimize_pulses() has no side effects).
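The idea of a small task object acting on larger cached data can be sketched as below. This is an illustration under stated assumptions, not the real `FwPropStepTask`: the class name `StepTask`, the dictionary layout of the cache, and the toy "propagation" (adding `dt` times the sum of the pulse values) are all hypothetical.

```python
class StepTask:
    """Hypothetical task: advance one cached state by one time step."""

    def __init__(self, i_state, pulse_vals, time_index):
        # Only these three small values would need to cross process borders.
        self.i_state = i_state
        self.pulse_vals = pulse_vals
        self.time_index = time_index

    def __call__(self, data):
        states, tlist = data["states"], data["tlist"]
        dt = tlist[self.time_index + 1] - tlist[self.time_index]
        # Toy "propagation": shift the state by dt times the summed pulses.
        states[self.i_state] += dt * sum(self.pulse_vals)
        return states[self.i_state]


# The cache stays with the worker; only the task objects are sent to it.
cache = {"states": [0.0, 1.0], "tlist": [0.0, 0.5, 1.0]}
first = StepTask(i_state=1, pulse_vals=[2.0], time_index=0)(cache)
second = StepTask(i_state=1, pulse_vals=[4.0], time_index=1)(cache)
print(first, second)
```

The second call starts from the state left behind by the first, which is the sense in which the cached data must stay consistent with what a side-effect-free propagation would produce.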

krotov.parallelization.parallel_map_fw_prop_step(shared, values, task_args)

parallel_map function for the forward-propagation by one time step

Parameters

shared – A global object to which we can attach attributes for sharing data between different calls to parallel_map_fw_prop_step(), allowing us to have long-running Consumer processes, avoiding process-management overhead. This happens to be a callable (the original internal routine for performing a forward-propagation), but here, it is (ab-)used as a storage object only.
values (list) – The list of indices 0..(N-1), where N is the number of objectives.
task_args (tuple) –
A tuple of 7 components:
1. A list of states to propagate, one for each objective.
2. The list of objectives
3. The list of optimized pulses (updated up to time_index)
4. The “pulses mapping”, cf. extract_controls_mapping()
5. The list of time grid points
6. The index of the interval on the time grid over which to propagate
7. A list of propagate callables, as passed to optimize_pulses(). The propagators must not have side-effects in order for parallel_map_fw_prop_step() to work correctly.