API

ls([prefix, project_id, show_hidden, s3_client]) List contents of project datasets.
get(project_path, local_path[, project_id]) Copy from a project’s datasets to the local filesystem.
put(local_path, project_path[, project_id]) Copy from the local filesystem to a project’s datasets.
open(project_path[, mode, temp_dir]) Open a file from a project’s datasets for reading.
mv(source_path, destination_path[, project_id]) Move a file within a project’s datasets.
cp(source_path, destination_path[, …]) Copy a file within a project’s datasets.
rm(project_path[, project_id, s3_client]) Remove a file from a project’s datasets.
etag(project_path[, project_id]) Get a unique identifier for the current version of a file.
sherlockml.filesystem.ls(prefix='/', project_id=None, show_hidden=False, s3_client=None)

List contents of project datasets.

Parameters:
prefix : str, optional

List only files in the datasets matching this prefix. Default behaviour is to list all files.

project_id : str, optional

The project to list files from. You need to have access to this project for it to work. Defaults to the project set by SHERLOCKML_PROJECT_ID in your environment.

show_hidden : bool, optional

Include hidden files in the output. Defaults to False.

s3_client : botocore.client.S3, optional

Advanced - a specific boto client for AWS S3 to use.

Returns:
list

The list of files from the project datasets.
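
As an illustration of how the returned list might be consumed, here is a minimal sketch that keeps only the entries ending in a given suffix. The helper name list_with_suffix is hypothetical, and the sketch assumes that the list contains path strings, which the description above does not state explicitly. The ls function is passed in as an argument so the helper can be tried without project access.

```python
def list_with_suffix(ls, suffix, prefix="/"):
    """Return the listed entries under `prefix` that end with `suffix`.

    `ls` is any callable accepting a `prefix` keyword, such as
    sherlockml.filesystem.ls. Entries are assumed to be path strings.
    """
    return [path for path in ls(prefix=prefix) if path.endswith(suffix)]


# Intended use on a machine with project access (not executed here):
#   from sherlockml import filesystem
#   csv_files = list_with_suffix(filesystem.ls, ".csv", prefix="/input/")
```

The '/input/' prefix in the usage comment is only a placeholder for a path in your own project.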

sherlockml.filesystem.get(project_path, local_path, project_id=None)

Copy from a project’s datasets to the local filesystem.

Parameters:
project_path : str

The source path in the project datasets to copy.

local_path : str or os.PathLike

The destination path in the local filesystem.

project_id : str, optional

The project to get files from. You need to have access to this project for it to work. Defaults to the project set by SHERLOCKML_PROJECT_ID in your environment.
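
One possible pattern is to download a dataset file into a fresh temporary directory so that it does not clutter the working directory. The sketch below is not part of the library: fetch_to_temp_dir is a hypothetical helper, and it assumes project paths use '/' as the separator. The get function is passed in as an argument.

```python
import os
import posixpath
import tempfile


def fetch_to_temp_dir(get, project_path, project_id=None):
    """Copy `project_path` into a new local temporary directory.

    `get` is any callable with the signature documented above, such as
    sherlockml.filesystem.get. Returns the local path of the copy.
    """
    local_dir = tempfile.mkdtemp()
    # Project paths are assumed to be '/'-separated, hence posixpath.
    local_path = os.path.join(local_dir, posixpath.basename(project_path))
    get(project_path, local_path, project_id=project_id)
    return local_path


# Intended use on a machine with project access (not executed here):
#   from sherlockml import filesystem
#   local_copy = fetch_to_temp_dir(filesystem.get, "/input/data.csv")
```

The caller is responsible for deleting the temporary directory once the file is no longer needed.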

sherlockml.filesystem.put(local_path, project_path, project_id=None)

Copy from the local filesystem to a project’s datasets.

Parameters:
local_path : str or os.PathLike

The source path in the local filesystem to copy.

project_path : str

The destination path in the project datasets.

project_id : str, optional

The project to put files in. You need to have access to this project for it to work. Defaults to the project set by SHERLOCKML_PROJECT_ID in your environment.
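
Because put copies from a local path, in-memory content has to be written to a local file first. A minimal sketch of that round trip follows; upload_text is a hypothetical helper, not a library function, and put is passed in as an argument.

```python
import os
import tempfile


def upload_text(put, text, project_path, project_id=None):
    """Write `text` to a temporary local file, then copy it to `project_path`.

    `put` is any callable with the signature documented above, such as
    sherlockml.filesystem.put. The temporary file is removed afterwards.
    """
    fd, local_path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        put(local_path, project_path, project_id=project_id)
    finally:
        os.remove(local_path)


# Intended use on a machine with project access (not executed here):
#   from sherlockml import filesystem
#   upload_text(filesystem.put, "run finished", "/output/status.txt")
```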

sherlockml.filesystem.open(project_path, mode='r', temp_dir=None, **kwargs)

Open a file from a project’s datasets for reading.

This downloads the file into a temporary directory before opening it, so if your files are very large, this function can take a long time.

Parameters:
project_path : str

The path of the file in the project’s datasets to open.

mode : str, optional

The opening mode, either 'r' or 'rb'. This is passed down to the standard Python open function. Writing is currently not supported.

temp_dir : str, optional

A directory on the local filesystem in which to store the downloaded file temporarily. On SherlockML servers, the default temporary directory can break with large files, so for files larger than about 2 GB it is recommended to specify temp_dir='/project'.
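
A minimal sketch of reading from a dataset file is shown below. It assumes that the value returned by open can be used in a with statement like the object returned by the standard Python open function, which the description above suggests but does not state. read_header is a hypothetical helper, and the open function is passed in as an argument.

```python
def read_header(open_file, project_path, temp_dir=None):
    """Return the first line of a text file in the project datasets.

    `open_file` is any callable with the signature documented above,
    such as sherlockml.filesystem.open. Its return value is assumed to
    work as a context manager yielding a file-like object.
    """
    with open_file(project_path, mode="r", temp_dir=temp_dir) as handle:
        return handle.readline().rstrip("\n")


# Intended use on a machine with project access (not executed here):
#   from sherlockml import filesystem
#   header = read_header(filesystem.open, "/input/big.csv", temp_dir="/project")
```

Note that the whole file is still downloaded before the first line is read, so this does not avoid the transfer cost for large files.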

sherlockml.filesystem.mv(source_path, destination_path, project_id=None)

Move a file within a project’s datasets.

Parameters:
source_path : str

The source path in the project datasets to move.

destination_path : str

The destination path in the project datasets.

project_id : str, optional

The project in which to move files. You need to have access to this project for it to work. Defaults to the project set by SHERLOCKML_PROJECT_ID in your environment.
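
As one example of building a destination path, the sketch below moves a file into an archive directory while keeping its file name. The helpers archive_path and archive, and the '/archive' directory, are hypothetical. Project paths are assumed to be '/'-separated, and mv is passed in as an argument.

```python
import posixpath


def archive_path(source_path, archive_dir="/archive"):
    """Build the destination path: same file name, under `archive_dir`."""
    return posixpath.join(archive_dir, posixpath.basename(source_path))


def archive(mv, source_path, archive_dir="/archive", project_id=None):
    """Move `source_path` under `archive_dir` and return the new path.

    `mv` is any callable with the signature documented above, such as
    sherlockml.filesystem.mv.
    """
    destination_path = archive_path(source_path, archive_dir)
    mv(source_path, destination_path, project_id=project_id)
    return destination_path


# Intended use on a machine with project access (not executed here):
#   from sherlockml import filesystem
#   archive(filesystem.mv, "/input/old_run.csv")
```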

sherlockml.filesystem.cp(source_path, destination_path, project_id=None, s3_client=None)

Copy a file within a project’s datasets.

Parameters:
source_path : str

The source path in the project datasets to copy.

destination_path : str

The destination path in the project datasets.

project_id : str, optional

The project in which to copy files. You need to have access to this project for it to work. Defaults to the project set by SHERLOCKML_PROJECT_ID in your environment.

s3_client : botocore.client.S3, optional

Advanced - a specific boto client for AWS S3 to use.
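
A copy within the datasets can serve as a simple backup before a file is overwritten. The following is a sketch only: backup is a hypothetical helper, the '.bak' suffix is an arbitrary choice, and cp is passed in as an argument.

```python
def backup(cp, project_path, suffix=".bak", project_id=None):
    """Copy `project_path` to a sibling path with `suffix` appended.

    `cp` is any callable with the signature documented above, such as
    sherlockml.filesystem.cp. Returns the path of the backup copy.
    """
    backup_path = project_path + suffix
    cp(project_path, backup_path, project_id=project_id)
    return backup_path


# Intended use on a machine with project access (not executed here):
#   from sherlockml import filesystem
#   backup(filesystem.cp, "/models/weights.h5")
```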

sherlockml.filesystem.rm(project_path, project_id=None, s3_client=None)

Remove a file from a project’s datasets.

Parameters:
project_path : str

The path in the project datasets to remove.

project_id : str, optional

The project to remove files from. You need to have access to this project for it to work. Defaults to the project set by SHERLOCKML_PROJECT_ID in your environment.

s3_client : botocore.client.S3, optional

Advanced - a specific boto client for AWS S3 to use.
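
Since rm takes a single path, removing several files means combining it with a listing. The sketch below does this with a dry-run default so that the targets can be inspected first. It rests on two assumptions that are not confirmed above: that ls returns path strings, and that directory entries end with '/'. remove_under_prefix is a hypothetical helper, and both functions are passed in as arguments.

```python
def remove_under_prefix(ls, rm, prefix, dry_run=True):
    """Remove every file listed under `prefix` and return their paths.

    `ls` and `rm` are callables such as sherlockml.filesystem.ls and
    sherlockml.filesystem.rm. With dry_run=True (the default) nothing is
    removed and the candidate paths are only returned.
    """
    # Assumption: entries ending in '/' are directories, so skip them.
    targets = [path for path in ls(prefix=prefix) if not path.endswith("/")]
    if not dry_run:
        for path in targets:
            rm(path)
    return targets


# Intended use on a machine with project access (not executed here):
#   from sherlockml import filesystem
#   print(remove_under_prefix(filesystem.ls, filesystem.rm, "/scratch/"))
```

Defaulting to a dry run is a deliberate choice here, since nothing in the description above indicates that a removal can be undone.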

sherlockml.filesystem.etag(project_path, project_id=None)

Get a unique identifier for the current version of a file.

Parameters:
project_path : str

The path in the project datasets.

project_id : str, optional

The project containing the file. You need to have access to this project for it to work. Defaults to the project set by SHERLOCKML_PROJECT_ID in your environment.

Returns:
str

A unique identifier for the current version of the file.
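
Because the identifier changes when the file's version changes, it can be used to skip downloads of files that are already up to date locally. The sketch below is a minimal illustration of that idea. refresh_if_changed and the known_etags dictionary are hypothetical, and the etag and get functions are passed in as arguments.

```python
def refresh_if_changed(etag, get, project_path, local_path, known_etags):
    """Download `project_path` only if its identifier has changed.

    `etag` and `get` are callables such as sherlockml.filesystem.etag and
    sherlockml.filesystem.get. `known_etags` is a dict mapping project
    paths to the identifier seen at the last download; it is updated in
    place. Returns True if a download happened, False otherwise.
    """
    current = etag(project_path)
    if known_etags.get(project_path) == current:
        return False
    get(project_path, local_path)
    known_etags[project_path] = current
    return True


# Intended use on a machine with project access (not executed here):
#   from sherlockml import filesystem
#   seen = {}
#   refresh_if_changed(filesystem.etag, filesystem.get,
#                      "/input/data.csv", "data.csv", seen)
```

The dictionary only lives in memory in this sketch, so persisting it between sessions is left to the caller.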