anatools.anaclient.datasets module
Dataset Functions
- cancel_dataset(self, datasetId, workspaceId=None)
Stop a running job.
- Parameters
datasetId (str) – Dataset ID of the running job to stop.
workspaceId (str) – Workspace ID of the running job. If none is provided, the default workspace is used.
- Returns
Success or error message about stopping the job execution.
- Return type
str
- create_dataset(self, name, graphId, description='', runs=1, priority=1, seed=1, workspaceId=None)
Create a new dataset based on an existing staged graph. This starts a new job.
- Parameters
name (str) – Name for the new dataset.
graphId (str) – ID of the staged graph to create the dataset from.
description (str) – Description for the new dataset.
runs (int) – Number of times the channel runs within a single job. This is also the number of different images created in the dataset.
priority (int) – Job priority.
seed (int) – Seed number.
workspaceId (str) – Workspace ID of the staged graph’s workspace. If none is provided, the current workspace is used.
- Returns
Success or failure message about dataset creation.
- Return type
str
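A minimal sketch of starting a dataset job with the parameters above. It assumes `client` is an already authenticated anatools client object that exposes `create_dataset` as documented here. The helper name `start_dataset_job`, the generated description, and the check on `runs` are illustrative additions, not part of anatools.

```python
def start_dataset_job(client, name, graph_id, runs=1, seed=1, workspace_id=None):
    """Hypothetical wrapper around client.create_dataset (not part of anatools)."""
    # Assumption: a job needs at least one run; the source states no minimum.
    if not isinstance(runs, int) or runs < 1:
        raise ValueError("runs must be a positive integer")
    # Pass every argument by keyword so the call mirrors the documented signature.
    return client.create_dataset(
        name=name,
        graphId=graph_id,
        description=f"{runs} run(s), seed {seed}",
        runs=runs,
        priority=1,
        seed=seed,
        workspaceId=workspace_id,  # None lets the client fall back to the current workspace
    )
```

The return value is passed through unchanged, so the caller still receives the success or failure message described above.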
- delete_dataset(self, datasetId, workspaceId=None)
Delete an existing dataset.
- Parameters
datasetId (str) – ID of the dataset to delete.
workspaceId (str) – Workspace ID that the dataset is in. If none is provided, the current workspace is used.
- Returns
Success or failure message about dataset deletion.
- Return type
str
- download_dataset(self, datasetId, workspaceId=None, localDir=None)
Download a dataset.
- Parameters
datasetId (str) – ID of the dataset to download.
workspaceId (str) – Workspace ID that the dataset is in. If none is provided, the default workspace is used.
localDir (str) – Path to download the dataset to. If none is provided, the current working directory is used.
- Returns
Success or failure message about dataset download.
- Return type
str
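A short sketch of downloading into a specific directory, assuming `client` is an authenticated anatools client with `download_dataset` as documented here. The helper `download_to` is hypothetical. The source does not say whether `download_dataset` creates a missing `localDir`, so the sketch creates it up front as a precaution.

```python
import os


def download_to(client, dataset_id, local_dir, workspace_id=None):
    """Hypothetical helper: make sure local_dir exists, then download into it."""
    # Precaution only: the source does not state whether the client creates
    # the directory itself.
    os.makedirs(local_dir, exist_ok=True)
    message = client.download_dataset(
        datasetId=dataset_id,
        workspaceId=workspace_id,
        localDir=local_dir,
    )
    # The documented return value is a success or failure message string.
    return message
```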
- edit_dataset(self, datasetId, description=None, name=None, workspaceId=None)
Update a dataset’s name or description.
- Parameters
datasetId (str) – ID of the dataset to update.
description (str) – New description.
name (str) – New name for dataset.
workspaceId (str) – Workspace ID of the dataset to update. If none is provided, the current workspace is used.
- Returns
Success or failure message about dataset update.
- Return type
str
- get_dataset_jobs(self, datasetId=None, workspaceId=None)
Queries the workspace’s dataset jobs based on the provided parameters.
- Parameters
datasetId (str) – Dataset ID to filter by.
workspaceId (str) – Workspace ID of the dataset’s workspace. If none is provided, the current workspace is used.
- Returns
Information about the dataset jobs matching the query parameters, or a failure message.
- Return type
str
- get_dataset_log(self, datasetId, runId, saveLogFile=False, workspaceId=None)
Shows dataset log information to the user.
- Parameters
datasetId (str) – The dataset the run belongs to.
runId (str) – The run to retrieve the log for.
saveLogFile (bool) – If True, saves the log file to the current working directory.
workspaceId (str) – The workspace the run belongs to.
- Returns
Log information for the run identified by runId.
- Return type
list[dict]
- get_dataset_runs(self, datasetId, state=None, workspaceId=None)
Shows all dataset run information to the user. Can filter by state.
- Parameters
datasetId (str) – The dataset to retrieve runs for.
state (str) – Filter run list by status.
workspaceId (str) – The workspace the dataset is in.
- Returns
List of runs associated with the datasetId.
- Return type
list[dict]
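The `state` filter can be used to summarize how a job is progressing. The sketch below assumes `client` is an authenticated anatools client with `get_dataset_runs` as documented here. The helper `count_runs_by_state` is hypothetical, and the state names are supplied by the caller because the source does not list the valid values.

```python
def count_runs_by_state(client, dataset_id, states, workspace_id=None):
    """Hypothetical helper: count the runs returned for each requested state."""
    counts = {}
    for state in states:
        runs = client.get_dataset_runs(
            datasetId=dataset_id,
            state=state,
            workspaceId=workspace_id,
        )
        # The documented return type is list[dict]; anything else (for example
        # an error value) is counted as zero runs in this sketch.
        counts[state] = len(runs) if isinstance(runs, list) else 0
    return counts
```

This makes one query per state, which keeps the sketch independent of the keys inside each run dictionary.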
- get_datasets(self, datasetId=None, name=None, email=None, workspaceId=None)
Queries the workspace’s datasets based on the provided parameters. Filters are checked in the order datasetId, name, owner within the specified workspace. If only a workspace ID is provided, all datasets in that workspace are returned.
- Parameters
datasetId (str) – Dataset ID to filter.
name (str) – Dataset name.
email (str) – Owner of the dataset.
workspaceId (str) – Workspace ID of the dataset’s workspace. If none is provided, the current workspace is used.
- Returns
Information about the datasets matching the query parameters, or a failure message.
- Return type
str
- upload_dataset(self, filename, description=None, workspaceId=None)
Uploads a user dataset using a multipart upload with 8 threads.
- Parameters
filename (str) – Path to the dataset folder or file to upload. Must be a zip or tar file type.
workspaceId (str) – Workspace ID to upload the dataset to. If none is provided, the current workspace is used.
description (str) – Description for the new dataset.
- Returns
datasetId – The unique identifier for this dataset.
- Return type
str
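Because the upload expects a zip or tar file, it can help to check the file before starting a multipart upload. The sketch below assumes `client` is an authenticated anatools client with `upload_dataset` as documented here. The helper `upload_archive` and its pre-check are hypothetical and only cover archive files, not folders.

```python
import tarfile
import zipfile


def upload_archive(client, filename, description=None, workspace_id=None):
    """Hypothetical helper: verify the archive type locally, then upload it."""
    # Inspect the file contents rather than trusting the extension.
    if not (zipfile.is_zipfile(filename) or tarfile.is_tarfile(filename)):
        raise ValueError(f"{filename} is not a zip or tar archive")
    # Returns the datasetId string documented above.
    return client.upload_dataset(
        filename=filename,
        description=description,
        workspaceId=workspace_id,
    )
```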