anatools.anaclient.datasets module

Dataset Functions

cancel_dataset(self, datasetId, workspaceId=None)

Stop a running job.

Parameters
  • datasetId (str) – Dataset ID of the running job to stop.

  • workspaceId (str) – Workspace ID of the running job. If none is provided, the default workspace is used.

Returns

Success or error message about stopping the job execution.

Return type

str

create_dataset(self, name, graphId, description='', runs=1, priority=1, seed=1, workspaceId=None)

Create a new dataset based on an existing staged graph. This starts a new job.

Parameters
  • name (str) – Name for the new dataset.

  • graphId (str) – ID of the staged graph to create the dataset from.

  • description (str) – Description for new dataset.

  • runs (int) – Number of times a channel will run within a single job. This is also the number of different images created within the dataset.

  • priority (int) – Job priority.

  • seed (int) – Seed number.

  • workspaceId (str) – Workspace ID of the staged graph’s workspace. If none is provided, the current workspace is used.

Returns

Success or failure message about dataset creation.

Return type

str
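
Example (a minimal sketch, not part of anatools). It assumes client is an already authenticated client object that exposes create_dataset with the signature documented above. The helper name start_dataset_job and its snake_case parameters are hypothetical and only illustrate how the documented keyword arguments might be passed.

```python
def start_dataset_job(client, name, graph_id, description="", runs=1, seed=1,
                      workspace_id=None):
    """Start a dataset job and return the message from create_dataset.

    `client` is assumed to expose create_dataset as documented above.
    """
    if runs < 1:
        # Local sanity check only; the service may apply its own limits.
        raise ValueError("runs must be at least 1")
    return client.create_dataset(
        name=name,
        graphId=graph_id,
        description=description,
        runs=runs,
        seed=seed,
        workspaceId=workspace_id,  # None falls back to the current workspace
    )
```

Because create_dataset is documented as returning a success or failure message, the caller should inspect the returned string rather than rely on an exception.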

delete_dataset(self, datasetId, workspaceId=None)

Delete an existing dataset.

Parameters
  • datasetId (str) – ID of the dataset to delete.

  • workspaceId (str) – ID of the workspace that contains the dataset. If none is provided, the current workspace is used.

Returns

Success or failure message about dataset deletion.

Return type

str

download_dataset(self, datasetId, workspaceId=None, localDir=None)

Download a dataset.

Parameters
  • datasetId (str) – ID of the dataset to download.

  • workspaceId (str) – ID of the workspace that contains the dataset. If none is provided, the default workspace is used.

  • localDir (str) – Directory to download the dataset into. If none is provided, the current working directory is used.

Returns

Success or failure message about dataset download.

Return type

str
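
Example (a minimal sketch, not part of anatools). It assumes client exposes download_dataset as documented above. The helper download_to is hypothetical. It resolves the target directory up front so the fallback to the current working directory is explicit on the caller's side.

```python
import os


def download_to(client, dataset_id, local_dir=None, workspace_id=None):
    """Download a dataset into local_dir, creating the directory if needed."""
    # Mirror the documented default: no localDir means the current directory.
    target = os.path.abspath(local_dir) if local_dir else os.getcwd()
    os.makedirs(target, exist_ok=True)
    return client.download_dataset(
        datasetId=dataset_id,
        workspaceId=workspace_id,
        localDir=target,
    )
```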

edit_dataset(self, datasetId, description=None, name=None, workspaceId=None)

Update the name or description of a dataset.

Parameters
  • datasetId (str) – ID of the dataset to update.

  • description (str) – New description.

  • name (str) – New name for dataset.

  • workspaceId (str) – Workspace ID of the dataset to update. If none is provided, the current workspace is used.

Returns

Success or failure message about dataset update.

Return type

str
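
Example (a minimal sketch, not part of anatools). It assumes client exposes edit_dataset as documented above. The helper update_dataset_metadata is hypothetical. It rejects a call that changes nothing, since both description and name default to None in the signature.

```python
def update_dataset_metadata(client, dataset_id, name=None, description=None,
                            workspace_id=None):
    """Change a dataset's name and/or description via edit_dataset."""
    if name is None and description is None:
        # Nothing to change; avoid a request that has no effect.
        raise ValueError("provide a new name, a new description, or both")
    return client.edit_dataset(
        datasetId=dataset_id,
        description=description,
        name=name,
        workspaceId=workspace_id,
    )
```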

get_dataset_jobs(self, datasetId=None, workspaceId=None)

Queries the dataset jobs in a workspace based on the provided parameters.

Parameters
  • datasetId (str) – Dataset ID to filter.

  • workspaceId (str) – Workspace ID of the dataset’s workspace. If none is provided, the current workspace is used.

Returns

Information about the dataset jobs that match the query parameters, or a failure message.

Return type

str

get_dataset_log(self, datasetId, runId, saveLogFile=False, workspaceId=None)

Retrieves the log information for a single dataset run.

Parameters
  • datasetId (str) – The dataset the run belongs to.

  • runId (str) – The run to retrieve the log for.

  • saveLogFile (bool) – If True, saves the log file to the current working directory.

  • workspaceId (str) – The workspace the run belongs to.

Returns

Log information for the specified runId.

Return type

list[dict]

get_dataset_runs(self, datasetId, state=None, workspaceId=None)

Retrieves information about all runs of a dataset. The results can be filtered by state.

Parameters
  • datasetId (str) – The dataset to retrieve runs for.

  • state (str) – Filter the run list by state.

  • workspaceId (str) – The workspace the dataset is in.

Returns

List of runs associated with datasetId.

Return type

list[dict]
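
Example (a minimal sketch, not part of anatools). It assumes client exposes get_dataset_runs as documented above and that each returned dict carries a "state" key. That key name is an assumption, because the fields of a run are not listed here. The helper count_runs_by_state is hypothetical.

```python
from collections import Counter


def count_runs_by_state(client, dataset_id, workspace_id=None):
    """Tally the runs of a dataset by their state."""
    runs = client.get_dataset_runs(datasetId=dataset_id, workspaceId=workspace_id)
    # ASSUMPTION: each run dict has a "state" key; runs without one are
    # grouped under "unknown" instead of raising KeyError.
    return Counter(run.get("state", "unknown") for run in runs)
```

A tally like this gives a quick view of job progress without passing the state filter once per value.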

get_datasets(self, datasetId=None, name=None, email=None, workspaceId=None)

Queries the datasets in a workspace based on the provided parameters. The filters are checked in this order: datasetId, name, owner (email). If only a workspace ID is provided, all datasets in that workspace are returned.

Parameters
  • datasetId (str) – Dataset ID to filter.

  • name (str) – Dataset name.

  • email (str) – Owner of the dataset.

  • workspaceId (str) – Workspace ID of the dataset’s workspace. If none is provided, the current workspace is used.

Returns

Information about the datasets that match the query parameters, or a failure message.

Return type

str
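
Example (a minimal sketch, not part of anatools). It assumes client exposes get_datasets as documented above. The helper find_datasets is hypothetical. It forwards only the filters that were actually set, so unset filters keep their documented default of None.

```python
def find_datasets(client, dataset_id=None, name=None, email=None,
                  workspace_id=None):
    """Query datasets, passing along only the filters that were provided."""
    filters = {
        "datasetId": dataset_id,
        "name": name,
        "email": email,
        "workspaceId": workspace_id,
    }
    # Drop unset filters; with no filters at all, the call lists every
    # dataset in the current workspace, as described above.
    kwargs = {key: value for key, value in filters.items() if value is not None}
    return client.get_datasets(**kwargs)
```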

upload_dataset(self, filename, description=None, workspaceId=None)

Uploads a user dataset using a multipart upload with 8 threads.

Parameters
  • filename (str) – Path to the dataset folder or file to upload. Must be a zip or tar archive.

  • description (str) – Description for the new dataset.

  • workspaceId (str) – ID of the workspace to upload the dataset to. If none is provided, the current workspace is used.

Returns

datasetId – The unique identifier for this dataset.

Return type

str
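
Example (a minimal sketch, not part of anatools). It assumes client exposes upload_dataset as documented above. The helper upload_archive and the ALLOWED_SUFFIXES tuple are hypothetical. The exact set of extensions the service accepts is an assumption based on the "zip or tar" requirement.

```python
import os

# ASSUMPTION: only these suffixes are treated as valid. The documentation
# says "zip or tar" without listing exact extensions.
ALLOWED_SUFFIXES = (".zip", ".tar")


def upload_archive(client, filename, description=None, workspace_id=None):
    """Check the path locally, then upload it and return the new dataset ID."""
    if not os.path.exists(filename):
        raise FileNotFoundError(filename)
    if not filename.lower().endswith(ALLOWED_SUFFIXES):
        raise ValueError(f"expected a zip or tar archive, got: {filename}")
    return client.upload_dataset(
        filename=filename,
        description=description,
        workspaceId=workspace_id,
    )
```

Checking the path before the call avoids starting a multipart upload for a file that would be rejected.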