Access external resources from SherlockML

It’s common practice to limit network traffic to sensitive resources like databases as an added layer of security on top of authentication methods like a username and password. When access is permitted only from a set of whitelisted IP addresses, attackers cannot even attempt to log in unless they have access to one of the approved computers.

To enable users to impose this kind of network rule while permitting access from SherlockML, we provide an API which returns the list of IP addresses where your SherlockML user servers run, at https://sherlockml.com/api/cluster/ip-addresses.

Note

On a VPC deployment of SherlockML, replace sherlockml.com in the above URL with the URL you use to access SherlockML. For example, if you access SherlockML on https://example.sherlockml.net, the API will be at https://example.sherlockml.net/api/cluster/ip-addresses.
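If a script needs to work against either kind of deployment, a small helper can build the endpoint from the address you use to reach SherlockML. The following is a minimal sketch: `ip_addresses_url` and `API_PATH` are hypothetical names chosen for illustration, not part of any SherlockML client library.

```python
from urllib.parse import urlsplit, urlunsplit

# Path of the IP address API, as given in the URLs above
API_PATH = '/api/cluster/ip-addresses'


def ip_addresses_url(base_url='https://sherlockml.com'):
    """Return the IP address API URL for a SherlockML deployment."""
    parts = urlsplit(base_url)
    # Keep only the scheme and host; drop any path, query or fragment
    return urlunsplit((parts.scheme, parts.netloc, API_PATH, '', ''))
```

For a VPC deployment you would call, for example, `ip_addresses_url('https://example.sherlockml.net')`.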

This API returns a JSON object containing the current IP addresses of the cluster, formatted as follows:

{
    "ipAddresses": [
        "101.2.3.4",
        "105.6.7.8"
    ]
}
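Before turning these addresses into firewall rules, it can be worth checking that the response has the expected shape and that each entry really is an IP address. The sketch below uses only the Python standard library; `parse_ip_addresses` is a hypothetical helper name. The example response above contains only IPv4 addresses, so the IPv6 handling here is a precaution rather than documented behaviour.

```python
import ipaddress
import json


def parse_ip_addresses(body):
    """Return the addresses in an API response body as single-host CIDR strings."""
    payload = json.loads(body)
    cidrs = []
    for address in payload['ipAddresses']:
        # ip_address raises ValueError for anything that is not a valid IP
        parsed = ipaddress.ip_address(address)
        # max_prefixlen is 32 for IPv4 and 128 for IPv6
        cidrs.append('{}/{}'.format(parsed, parsed.max_prefixlen))
    return cidrs
```

A missing `ipAddresses` key raises `KeyError` and a malformed address raises `ValueError`, so a bad response stops the script before any rule is changed.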

Warning

The list of IP addresses will change when necessary software updates are applied or when the SherlockML compute cluster is scaled in size. Do not treat the list as fixed; instead, set up a periodic task that updates your relevant network rules.
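One way to picture such a periodic task is a simple loop that repeats an update function at a fixed interval. This is only a sketch: `run_periodically` and its parameters are hypothetical, and in practice a scheduler such as cron may well be a better fit than a long-running process.

```python
import time


def run_periodically(task, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call task repeatedly, waiting interval_seconds between calls."""
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            task()
        except Exception as error:
            # Keep looping so a transient failure does not stop later updates
            print('Update failed: {}'.format(error))
        runs += 1
        # Only wait if another run will follow
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

Here `task` would be a function of your own that fetches the current IP addresses and applies them to your network rules. The `max_runs` and `sleep` parameters exist mainly to make the loop easy to test.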

Scripting network rule updates

You can write Python scripts that automatically update the network rules permitting access from SherlockML. For example, you can retrieve the IP addresses from the above API with the requests module:

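import requests

# Fail fast on network problems or an error response instead of parsing a bad body
response = requests.get('https://sherlockml.com/api/cluster/ip-addresses', timeout=10)
response.raise_for_status()
sherlockml_ips = response.json()['ipAddresses']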

You can then write some code that uses a relevant API to update network rules. For example, on Amazon Web Services (AWS), you can use boto3 to update AWS security group rules:

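import boto3
from botocore.exceptions import ClientError

ec2 = boto3.resource('ec2')
security_group = ec2.SecurityGroup('your-security-group-id')
for ip_address in sherlockml_ips:
    cidr = '{}/32'.format(ip_address)  # /32 matches exactly one IPv4 address
    try:
        security_group.authorize_ingress(
            IpProtocol='tcp',
            FromPort=5432,  # For accessing a PostgreSQL database
            ToPort=5432,
            CidrIp=cidr
        )
    except ClientError as error:
        # A rule that already exists is not a failure when run periodically
        if error.response['Error']['Code'] != 'InvalidPermission.Duplicate':
            raise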

Be sure to also implement logic that removes access rules for IPs that are no longer in the list retrieved from SherlockML.
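The removal logic can be reduced to comparing two sets: the CIDRs currently allowed and the CIDRs that should be allowed. The sketch below keeps that comparison free of any AWS calls so it can be tested on its own. `cidrs_for_port` and `plan_rule_changes` are hypothetical helper names, and the sketch assumes that every rule on the chosen port of this security group exists only for SherlockML access; any other rule on that port would be marked for removal.

```python
def cidrs_for_port(ip_permissions, port, protocol='tcp'):
    """Collect the IPv4 CIDRs allowed on a single port of a security group."""
    cidrs = set()
    for permission in ip_permissions:
        if (permission.get('IpProtocol') == protocol
                and permission.get('FromPort') == port
                and permission.get('ToPort') == port):
            for ip_range in permission.get('IpRanges', []):
                cidrs.add(ip_range['CidrIp'])
    return cidrs


def plan_rule_changes(current_cidrs, sherlockml_ips):
    """Return (to_authorize, to_revoke) as sorted lists of CIDR strings."""
    wanted = {'{}/32'.format(ip) for ip in sherlockml_ips}
    current = set(current_cidrs)
    # Missing rules must be added; rules for departed IPs must be removed
    return sorted(wanted - current), sorted(current - wanted)
```

With boto3, the current rules are available as `security_group.ip_permissions`, and each CIDR in the second list could be passed to `security_group.revoke_ingress` with the same `IpProtocol`, `FromPort`, `ToPort` and `CidrIp` arguments used for `authorize_ingress`.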

Note

See our example script for updating an AWS security group.