client.tool

Tool represents a project or model that will be run many times over.

The Tool may evolve over time.

Attributes

logger

Exceptions

InvalidToolError

Exception for Tools that do not exist in the DB.

InvalidToolVersionError

Exception for Tool version that is not valid.

Classes

Tool

Tool represents a project or model that will be run many times over.

Module Contents

client.tool.logger
exception client.tool.InvalidToolError

Bases: Exception

Exception for Tools that do not exist in the DB.

exception client.tool.InvalidToolVersionError

Bases: Exception

Exception for Tool version that is not valid.

class client.tool.Tool(name: str = _DEFAULT_TOOL_NAME, active_tool_version_id: str | int = 'latest', requester: jobmon.core.requester.Requester | None = None)

Tool represents a project or model that will be run many times over.

The Tool may evolve over time.

A tool is an application that is expected to run many times on variable inputs and that serves a certain purpose over time, even as its internal pipeline changes. Example tools are Dismod, Burdenator, and Codem.

Parameters:
  • name – the name of the tool

  • active_tool_version_id – which version of the tool to attach task templates and workflows to.

  • requester – Requester object used to communicate with the FastAPI services.

requester = None
name
tool_versions
get_new_tool_version() int

Create a new tool version for the current tool and activate it.

Returns: the id of the new tool version

property active_task_templates: Dict[str, jobmon.client.task_template.TaskTemplate]

Mapping of template_name to TaskTemplate for the active tool version.

property active_tool_version: jobmon.client.tool_version.ToolVersion

Tool version to use when spawning task templates.

property default_compute_resources_set: Dict[str, Dict[str, Any]]

Default compute resources associated with active tool version.

property default_resource_scales_set: Dict[str, Dict[str, float]]

Default resource scales associated with active tool version.

property default_cluster_name: str

Default cluster_name associated with active tool version.

property default_max_attempts: int | None

Default max attempts of the active tool version.

set_active_tool_version_id(tool_version_id: str | int) None

Set the tool version that is active on this object (the latest version is the default at instantiation).

Parameters:

tool_version_id – which tool version to set as active on this object.
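
How a str | int version id might be resolved can be sketched without a Jobmon server. The sketch below is an illustration only: the helper resolve_tool_version_id is hypothetical, the exception is a local stand-in for client.tool.InvalidToolVersionError, and the rule that 'latest' means the highest known id is an assumption rather than documented behavior.

```python
class InvalidToolVersionError(Exception):
    """Local stand-in for client.tool.InvalidToolVersionError."""


def resolve_tool_version_id(known_ids, requested="latest"):
    """Return the version id that `requested` refers to.

    Assumption: 'latest' selects the highest id in `known_ids`.
    """
    if not known_ids:
        raise InvalidToolVersionError("no tool versions are known")
    if requested == "latest":
        return max(known_ids)
    version_id = int(requested)  # accepts 3 as well as "3"
    if version_id not in known_ids:
        raise InvalidToolVersionError(f"tool version {version_id} is not valid")
    return version_id


latest_id = resolve_tool_version_id([1, 2, 5])
explicit_id = resolve_tool_version_id([1, 2, 5], "2")
```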

get_task_template(template_name: str, command_template: str, node_args: List[str] | None = None, task_args: List[str] | None = None, op_args: List[str] | None = None, default_cluster_name: str = '', default_compute_resources: Dict[str, Any] | None = None, default_resource_scales: Dict[str, float] | None = None, yaml_file: str | None = None, max_attempts: int | None = None) jobmon.client.task_template.TaskTemplate

Create or get a task template.

Parameters:
  • template_name – the name of this task template.

  • command_template – an abstract command representing a task, where the arguments to the command have defined names but the values are not assigned, e.g. '{python} {script} --data {data} --para {para} {verbose}'

  • node_args – any named arguments in command_template that make the command unique within this template for a given workflow run. Generally these are arguments that can be parallelized over.

  • task_args – any named arguments in command_template that make the command unique across workflows if the node args are the same as a previous workflow. Generally these are arguments about data moving through the task.

  • op_args – any named arguments in command_template that can change without changing the identity of the task. Generally these are things like the task executable location or the verbosity of the script.

  • default_cluster_name – the default cluster to run each task associated with this template on.

  • default_compute_resources – dictionary of default compute resources to run tasks with. Can be overridden at task level. dict of {resource_name: resource_value}. Must specify default_cluster_name when this option is used.

  • default_resource_scales – dictionary of default resource scales to adjust task resources with. Can be overridden at task level. dict of {resource_name: scale_factor}. Scale factor can be a numeric value, a Callable that will be applied to the existing resources, or an Iterator. Any Callable should take a single numeric value as its sole argument. Any Iterator should only yield numeric values. Any Iterable can be easily converted to an Iterator by using the iter() built-in (e.g. iter([80, 160, 190])).

  • yaml_file – path to YAML file that contains user-specified compute resources.

  • max_attempts – maximum number of attempts for tasks created from this task template.
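
The relationship between command_template and the three argument lists can be illustrated with plain string formatting. This is a minimal sketch that does not call Jobmon: the assignment of placeholders to node_args, task_args, and op_args is one hypothetical choice, the helper template_fields and the concrete values are made up, and the check that every placeholder belongs to some group is an assumption about how templates are meant to be written.

```python
from string import Formatter
from typing import List

command_template = "{python} {script} --data {data} --para {para} {verbose}"

# Hypothetical split of the placeholders into the three groups.
node_args = ["para"]                        # parallelized over within a workflow
task_args = ["data"]                        # data flowing through the task
op_args = ["python", "script", "verbose"]   # do not change task identity


def template_fields(template: str) -> List[str]:
    """Return the named placeholders of a format string, in order."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


fields = template_fields(command_template)

# Placeholders not assigned to any group (assumed to be a mistake).
unclassified = set(fields) - set(node_args) - set(task_args) - set(op_args)

# Filling in every placeholder yields one concrete command.
command = command_template.format(
    python="python3",
    script="model.py",
    data="in.csv",
    para="7",
    verbose="--verbose",
)
```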

create_workflow(workflow_args: str = '', name: str = '', description: str = '', workflow_attributes: List | dict | None = None, max_concurrently_running: int = MaxConcurrentlyRunning.MAXCONCURRENTLYRUNNING, chunk_size: int = 500, default_cluster_name: str = '', default_compute_resources_set: Dict | None = None, default_resource_scales_set: Dict[str, float] | None = None, default_max_attempts: int | None = None) jobmon.client.workflow.Workflow

Create a workflow object associated with the active tool version.

Parameters:
  • workflow_args – Unique identifier of a workflow.

  • name – Name of the workflow.

  • description – Description of the workflow.

  • workflow_attributes – Any key/value pair that the user wants to record for this workflow

  • max_concurrently_running – How many running jobs to allow in parallel.

  • chunk_size – how many tasks to bind in a single request

  • default_cluster_name – name of cluster to run tasks on by default. Can be overridden at the task template or task level.

  • default_compute_resources_set – dictionary of default compute resources to run tasks with. Can be overridden at task template or task level. dict of {cluster_name: {resource_name: resource_value}}

  • default_resource_scales_set – dictionary of default resource_scales to adjust the resources with. Can be overridden at task template or task level. dict of {resource_name: scale_value}

  • default_max_attempts – the default max_attempts value to use when creating the workflow.
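
Several parameters above can be overridden at the task template or task level. One way to picture that precedence is a layered lookup, sketched here with collections.ChainMap. The resource names and values are hypothetical, and key-by-key merging (rather than replacing the whole dictionary) is an assumption, not something this page states.

```python
from collections import ChainMap

# Hypothetical resource dictionaries for a single cluster.
workflow_defaults = {"cores": 1, "memory": "1G", "runtime": 600}
template_defaults = {"memory": "4G"}
task_overrides = {"cores": 2}

# Earlier maps win: task level, then task template level, then workflow level.
resolved = dict(ChainMap(task_overrides, template_defaults, workflow_defaults))
```

In this sketch the task keeps the workflow's runtime, takes memory from the task template, and uses its own cores value.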

update_default_compute_resources(cluster_name: str, **kwargs: Any) None

Update default compute resources in place, overriding only the specified keys.

If no default cluster is specified when this method is called, cluster_name will become the default cluster.

Parameters:
  • cluster_name – name of cluster to modify default values for.

  • **kwargs – any key/value pair you want to update specified as an argument.
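
The in-place update described above can be sketched with a small stand-in class. DefaultResources is hypothetical and is not Jobmon's implementation. It only mirrors the two documented behaviors: specified keys are overridden while the others are kept, and the first cluster_name becomes the default cluster when none is set. The cluster and resource names are made up.

```python
class DefaultResources:
    """Hypothetical stand-in for the defaults held by a tool version."""

    def __init__(self):
        self.default_cluster_name = ""
        self.compute_resources_set = {}

    def update(self, cluster_name, **kwargs):
        # The first cluster seen becomes the default if none is set yet.
        if not self.default_cluster_name:
            self.default_cluster_name = cluster_name
        # Override only the keys passed in; keep the rest untouched.
        self.compute_resources_set.setdefault(cluster_name, {}).update(kwargs)


defaults = DefaultResources()
defaults.update("cluster_a", cores=1, memory="1G")
defaults.update("cluster_a", memory="2G")   # cores stays at 1
defaults.update("cluster_b", cores=4)       # default cluster is unchanged
```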

update_default_resource_scales(cluster_name: str, **kwargs: Any) None

Update default resource scales in place, overriding only the specified keys.

If no default cluster is specified when this method is called, cluster_name will become the default cluster.

Parameters:
  • cluster_name – name of cluster to modify default values for.

  • **kwargs – any key/value pair you want to update specified as an argument.

set_default_compute_resources_from_yaml(default_cluster_name: str, yaml_file: str, set_task_templates: bool = False, ignore_missing_keys: bool = False) None

Set tool-level default compute resources from a user-provided YAML file.

Parameters:
  • default_cluster_name – name of cluster to set default values for.

  • yaml_file – the yaml file that is providing the default compute resource values.

  • set_task_templates – whether or not the user wants to set the default compute resource values for all of the TaskTemplates associated with Tool.

  • ignore_missing_keys – Whether or not to raise an error if a key is missing from the yaml file.

set_default_resource_scales_from_yaml(default_cluster_name: str, yaml_file: str, set_task_templates: bool = False, ignore_missing_keys: bool = False) None

Set tool-level default resource scales from a user-provided YAML file.

Parameters:
  • default_cluster_name – name of cluster to set default values for.

  • yaml_file – the yaml file that is providing the default resource scale values.

  • set_task_templates – whether or not the user wants to set the default resource scale values for all of the TaskTemplates associated with Tool.

  • ignore_missing_keys – Whether or not to raise an error if a key is missing from the yaml file.

set_default_compute_resources_from_dict(cluster_name: str, compute_resources: Dict[str, Any]) None

Set default compute resources for a given cluster_name.

If no default cluster is specified when this method is called, cluster_name will become the default cluster.

Parameters:
  • cluster_name – name of cluster to set default values for.

  • compute_resources – dictionary of default compute resources to run tasks with. Can be overridden at task template or task level. dict of {resource_name: resource_value}
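
A call sequence built from the signatures on this page might look like the sketch below. To keep it runnable without a Jobmon server, the tool argument is duck-typed and RecordingTool is a hypothetical stand-in that only records calls. With a real client.tool.Tool, the same two method calls would apply. The helper apply_defaults, the cluster name, and the resource keys are all illustrative assumptions.

```python
class RecordingTool:
    """Hypothetical stand-in that records calls instead of contacting Jobmon."""

    def __init__(self):
        self.calls = []

    def set_default_compute_resources_from_dict(self, cluster_name, compute_resources):
        self.calls.append(
            ("set_default_compute_resources_from_dict", cluster_name, dict(compute_resources))
        )

    def set_default_max_attempts(self, value):
        self.calls.append(("set_default_max_attempts", value))


def apply_defaults(tool, cluster_name, compute_resources, max_attempts):
    """Set tool-level defaults using the method signatures documented here."""
    tool.set_default_compute_resources_from_dict(
        cluster_name=cluster_name, compute_resources=compute_resources
    )
    tool.set_default_max_attempts(max_attempts)


tool = RecordingTool()
apply_defaults(tool, "my_cluster", {"cores": 1, "memory": "1G"}, 3)
```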

set_default_resource_scales_from_dict(cluster_name: str, resource_scales: Dict[str, float]) None

Set default resource scales for a given cluster_name.

If no default cluster is specified when this method is called, cluster_name will become the default cluster.

Parameters:
  • cluster_name – name of cluster to set default values for.

  • resource_scales – dictionary of default resource scales to adjust task resources with. Can be overridden at task level. dict of {resource_name: scale_value}
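
The get_task_template description says a scale factor can be a numeric value, a Callable, or an Iterator. The sketch below shows one way those three kinds could be told apart. The helper scaled_value is hypothetical, and the numeric rule used here (grow the current value by that fraction) is an assumption, since this page does not say how a numeric scale is applied.

```python
from collections.abc import Iterator
from numbers import Number


def scaled_value(current, scale):
    """Return the next resource value for one of the three kinds of scale."""
    if isinstance(scale, Number):
        # Assumption: a numeric scale grows the current value by that fraction.
        return current * (1 + scale)
    if isinstance(scale, Iterator):
        # An Iterator yields the next numeric value directly.
        return next(scale)
    if callable(scale):
        # A Callable takes the current numeric value as its sole argument.
        return scale(current)
    raise TypeError(f"unsupported scale: {scale!r}")


steps = iter([80, 160, 190])  # any Iterable becomes an Iterator via iter()
by_number = scaled_value(10, 0.5)
by_callable = scaled_value(10, lambda value: value * 2)
by_iterator = [scaled_value(10, steps), scaled_value(10, steps)]
```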

set_default_cluster_name(cluster_name: str) None

Set default cluster.

Parameters:

cluster_name – name of cluster to set as default.

set_default_max_attempts(value: int) None

Set default max_attempts.

Parameters:

value – value of max_attempts.

set_default_clu(cluster_name: str) None

Set default cluster.

Parameters:

cluster_name – name of cluster to set as default.