multiprocess.multiproc_distributor
Multiprocess executes tasks in parallel if multiple threads are available.
Attributes
Classes
PickableTask | Object passed between processes.
Consumer | Consumes the tasks to be run.
MultiprocessDistributor | Executes tasks locally in parallel.
MultiprocessWorkerNode | Task instance info for an instance run with the Multiprocessing distributor.
Module Contents
- class multiprocess.multiproc_distributor.PickableTask(distributor_id: str, command: str, task_type: str = 'array')
Object passed between processes.
- class multiprocess.multiproc_distributor.Consumer(task_queue: multiprocessing.JoinableQueue, response_queue: multiprocessing.Queue)
Bases: multiprocessing.Process
Consumes the tasks to be run.
- task_queue: multiprocessing.JoinableQueue[PickableTask | None]
- class multiprocess.multiproc_distributor.MultiprocessDistributor(cluster_name: str, parallelism: int = 3, *args: tuple, **kwargs: dict)
Bases: jobmon.core.cluster_protocol.ClusterDistributor
Executes tasks locally in parallel.
It uses the multiprocessing Python library and queues to parallelize the execution of tasks. The subprocessing pattern looks like this:
LocalExec
--> consumer1
----> subconsumer1
--> consumer2
----> subconsumer2
...
--> consumerN
----> subconsumerN
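The consumer and subconsumer layers above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: the `consume` function, the plain `(distributor_id, command)` tuples standing in for `PickableTask`, and the use of `None` as a shutdown sentinel are all assumptions made for the example. The loop is called in-process here to keep the sketch short, and the shell commands assume a POSIX shell.

```python
import multiprocessing as mp
import subprocess
from typing import Dict, Optional, Tuple


def consume(task_queue, response_queue) -> None:
    """Run queued commands until a None sentinel arrives (assumed shutdown signal)."""
    while True:
        task: Optional[Tuple[str, str]] = task_queue.get()
        if task is None:
            task_queue.task_done()  # account for the sentinel, then stop
            break
        distributor_id, command = task
        # The "subconsumer" step: the command runs in its own child process.
        result = subprocess.run(command, shell=True)
        response_queue.put((distributor_id, result.returncode))
        task_queue.task_done()


def demo() -> Dict[str, int]:
    """Queue two commands plus a sentinel and collect their exit codes."""
    task_queue = mp.JoinableQueue()
    response_queue = mp.Queue()
    task_queue.put(("1", "exit 0"))
    task_queue.put(("2", "exit 3"))
    task_queue.put(None)
    consume(task_queue, response_queue)  # in-process call, for illustration only
    return dict(response_queue.get(timeout=5) for _ in range(2))
```

In a real distributor each `consume` loop would run inside its own `multiprocessing.Process`, so that several commands execute at once.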
- task_queue: multiprocessing.JoinableQueue[PickableTask | None]
- _get_subtask_id(distributor_id: int, array_step_id: int) -> str
Get the subtask_id based on distributor_id and array_step_id.
- start() -> None
Fire up N task consuming processes using Multiprocessing.
Number of consumers is controlled by parallelism.
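As a rough sketch of what starting N consumers could look like, the example below launches `parallelism` worker processes that share one `JoinableQueue`. The names `worker` and `run_with_parallelism`, the toy squaring workload, and the one-sentinel-per-consumer shutdown are assumptions for illustration and are not taken from the module. The `fork` start method is chosen only so the sketch runs as a single script on POSIX systems.

```python
import multiprocessing as mp
from typing import List

# "fork" keeps this sketch runnable as one script on POSIX systems; with other
# start methods, launch processes behind an `if __name__ == "__main__":` guard.
ctx = mp.get_context("fork")


def worker(task_queue, response_queue) -> None:
    """Square numbers from the queue until a None sentinel arrives."""
    while True:
        item = task_queue.get()
        if item is None:
            task_queue.task_done()
            break
        response_queue.put(item * item)
        task_queue.task_done()


def run_with_parallelism(items: List[int], parallelism: int = 3) -> List[int]:
    """Start `parallelism` consumers, feed them `items`, and gather the results."""
    task_queue = ctx.JoinableQueue()
    response_queue = ctx.Queue()
    consumers = [
        ctx.Process(target=worker, args=(task_queue, response_queue))
        for _ in range(parallelism)
    ]
    for consumer in consumers:
        consumer.start()
    for item in items:
        task_queue.put(item)
    for _ in consumers:
        task_queue.put(None)  # one sentinel per consumer
    task_queue.join()  # blocks until every put() has a matching task_done()
    results = [response_queue.get(timeout=5) for _ in items]
    for consumer in consumers:
        consumer.join()
    return sorted(results)
```

Results are drained from the response queue before the consumers are joined, which avoids a child blocking on exit while its queued output is still unread.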
- terminate_task_instances(distributor_ids: List[str]) -> None
Terminate task instances.
Only task instances that are currently running are terminated; jobs that are still in a waiting or transitioning state are not killed.
- Parameters:
distributor_ids – A list of distributor IDs.
- get_submitted_or_running(distributor_ids: List[str] | None = None) -> Set[str]
Get tasks that are active.
- submit_to_batch_distributor(command: str, name: str, requested_resources: Dict[str, Any]) -> str
Submit the command on the cluster technology and return a distributor_id.
The distributor_id can be used to identify the associated TaskInstance, terminate it, monitor for missingness, or collect usage statistics. If an exception is raised by this method the task instance will move to “W” state and the exception will be logged in the database under the task_instance_error_log table.
- Parameters:
command – command to be run
name – name of task
requested_resources – resource requests sent to distributor API
- Returns:
The distributor_id of the submitted task instance, as a string.
- submit_array_to_batch_distributor(command: str, name: str, requested_resources: Dict[str, Any], array_length: int) -> Dict[int, str]
Submit an array task to the multiprocess cluster.
- Returns:
A mapping of array_step_id to distributor_id.
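The `Dict[int, str]` return annotation suggests one distributor id per array step. The sketch below shows one way such a mapping might be built. The `"<distributor_id>.<array_step_id>"` id format, the zero-based step numbering, and the helper names are all assumptions for illustration; the module's real `_get_subtask_id` may format ids differently.

```python
from typing import Dict


def get_subtask_id(distributor_id: int, array_step_id: int) -> str:
    """Assumed format: '<distributor_id>.<array_step_id>'."""
    return f"{distributor_id}.{array_step_id}"


def map_array_steps(distributor_id: int, array_length: int) -> Dict[int, str]:
    """Map each (assumed zero-based) array_step_id to a per-step id."""
    return {
        step: get_subtask_id(distributor_id, step)
        for step in range(array_length)
    }
```

For example, `map_array_steps(7, 3)` yields one entry for each of the three steps, keyed by step number.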
- class multiprocess.multiproc_distributor.MultiprocessWorkerNode
Bases: jobmon.core.cluster_protocol.ClusterWorkerNode
Task instance info for an instance run with the Multiprocessing distributor.