Accessing data in Amazon S3

Because you have full command-line access in SherlockML, you can connect to external tools in the same way as you do from your local machine. This section shows how to configure AWS credentials and access the S3 buckets available to you.

First, configure your AWS credentials on a SherlockML server. Create a .aws directory in your home directory, and within it a file named credentials:

$ mkdir -p ~/.aws
$ touch ~/.aws/credentials

Edit this file, and add the following contents, replacing YOUR_ACCESS_KEY and YOUR_SECRET_KEY with the appropriate values:

[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
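
If you would rather script this step than edit the file by hand, the following is a minimal sketch using only Python's standard library. The function name write_aws_credentials is hypothetical, and restricting the file to the current user with mode 0o600 is an assumption made here because the file holds secrets; neither is part of the SherlockML setup itself.

```python
import configparser
import os
from pathlib import Path


def write_aws_credentials(path, access_key, secret_key, profile="default"):
    """Write (or update) one profile in an AWS credentials file.

    Hypothetical helper for illustration; `path` would normally be
    "~/.aws/credentials".
    """
    path = Path(path).expanduser()
    # Equivalent of `mkdir -p ~/.aws`.
    path.parent.mkdir(parents=True, exist_ok=True)

    # interpolation=None so that special characters in keys are stored verbatim.
    config = configparser.ConfigParser(interpolation=None)
    # Keep any profiles already in the file; read() silently skips a missing file.
    config.read(path)
    config[profile] = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }

    with open(path, "w") as f:
        config.write(f)
    # Assumption: make the file readable and writable by the current user only.
    os.chmod(path, 0o600)
    return path
```

Calling write_aws_credentials("~/.aws/credentials", "YOUR_ACCESS_KEY", "YOUR_SECRET_KEY") should produce a file with the same [default] section shown above.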

To list the S3 buckets you have access to from the command line, first install the AWS command line interface with pip:

$ pip install awscli

You can now list the buckets you have access to with:

$ aws s3 ls

Alternatively, from Python you can print the names of the buckets you have access to with:

import boto3

s3 = boto3.resource('s3')

for bucket in s3.buckets.all():
    print(bucket.name)

Here we use boto3, a Python library for interacting with AWS services. See the boto3 documentation for more details.
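
Once you can see your buckets, reading the contents of an object follows the same pattern. The sketch below is one way to do it, assuming the object is a text file small enough to hold in memory. The function name read_s3_text, its optional client parameter, and the bucket and key names in the usage note are all hypothetical; get_object is the boto3 S3 client call that fetches an object.

```python
def read_s3_text(bucket_name, key, client=None, encoding="utf-8"):
    """Return the contents of one S3 object as a string.

    Hypothetical helper for illustration. If no client is supplied, a
    boto3 S3 client is created from the credentials configured earlier.
    """
    if client is None:
        # Imported lazily so the function can also be called with a stand-in client.
        import boto3

        client = boto3.client("s3")

    response = client.get_object(Bucket=bucket_name, Key=key)
    # response["Body"] is a file-like stream; read() returns the raw bytes.
    return response["Body"].read().decode(encoding)
```

For example, text = read_s3_text("my-bucket", "data/example.csv") would return the file's contents as a single string, where my-bucket and data/example.csv stand in for a bucket and key you actually have access to. For larger files, it may be preferable to save the object to disk instead, for instance with s3.Bucket("my-bucket").download_file("data/example.csv", "example.csv") on the boto3 resource shown above.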