Servers

Servers are computing resources you can access on demand.

Running computations on your servers is just like running computations on your local computer, except you get to recreate your computer at any time with whatever memory and computing power you need. You can have many servers or none. You can create and terminate them in seconds.

Servers provide the power you need to work. The workspace is the brains of the operation: it remembers your files across all the servers in your project, while the servers are just the muscle. Terminating servers has no effect on your files in the workspace. Anything stored on a server outside the workspace is deleted when that server is terminated.

Creating a server

You create servers from the workspace. To launch a new server, use the (+) button in the bottom left-hand corner of the workspace.

../_images/create-new-server.png

When you have multiple servers running, you can select in the workspace which server to use when running a notebook.

Choosing a server size

SherlockML offers you several different server size options.

../_images/servers-on-shared-infrastructure-dropdown.png

You can create small servers that run on a shared cluster, or larger servers that run on their own dedicated computer.

Servers on shared infrastructure are good for development and prototyping. They are very fast to create (ten to twenty seconds). If this is your first time using SherlockML, or if you are unsure what server to choose, these are probably a good choice.

Servers on dedicated infrastructure are suitable for workloads that require a lot of computation or memory. When you create a server on dedicated infrastructure, SherlockML will create a new AWS server for you. These servers can take five to ten minutes to start.

../_images/servers-on-dedicated-infrastructure-dropdown.png

Within each category, SherlockML offers you different sizes. If you are unsure what size to choose, we recommend starting with a small server. If your server runs out of memory, you can then switch to a larger one. You will know that your server has run out of memory because your Jupyter kernel or RStudio session (or whatever memory-hungry process you are running) will be killed.
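To judge whether a server is close to running out of memory before a process is killed, you can inspect the memory figures that Linux exposes. The following is a minimal sketch, not part of SherlockML: the helper names parse_meminfo, memory_summary and read_server_memory are hypothetical, and it assumes the standard /proc/meminfo file is readable on the server. On shared infrastructure the figures may describe the underlying node rather than the allowance of your own server, so treat them as a rough guide.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(text):
    """Summarise total and available memory in GiB from /proc/meminfo text."""
    info = parse_meminfo(text)
    kib_per_gib = 1024 * 1024
    return {
        "total_gib": info["MemTotal"] / kib_per_gib,
        "available_gib": info["MemAvailable"] / kib_per_gib,
    }


def read_server_memory(path="/proc/meminfo"):
    """Read the memory summary for the machine this code runs on (Linux only)."""
    with open(path) as f:
        return memory_summary(f.read())
```

Calling read_server_memory() from a notebook cell returns the two figures as a dictionary. If the available memory is consistently close to zero while you work, a larger server is likely the better choice.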

Note

Servers on dedicated infrastructure are not available in SherlockML cloud. They are only available in custom deployments.

It only takes a few seconds to terminate a server and spin up a new one. For bespoke server configurations, use the sml command line interface.

SherlockML server types

You can choose one of four different server types:

Jupyter servers run the Jupyter notebook with the conda package manager. You should use these if you intend to develop primarily in Python.

JupyterLab servers are identical to Jupyter servers, except that they run JupyterLab instead of the Jupyter notebook.

RStudio servers run RStudio. Perhaps unsurprisingly, you should use these if you intend to develop in R.

Zeppelin servers run Apache Zeppelin. You can use these if you want to experiment with other languages, like Scala.

All these servers run on an Ubuntu operating system. You can install additional system packages via the apt package manager.

Sharing workspaces between servers

In SherlockML, servers are independent computers. Every time you create a new server, you get a fresh start. Some directories are persistent, though, and retain their data when you terminate a server:

  • The /project directory is shared between every server in your project. If multiple people have access to a project, they will all see the same /project directory. Use this for your day-to-day work.
  • The /home/sherlock directory is shared between every server that you own. Use this to store configuration files to customise your environment.
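
Because anything written outside these two directories disappears with the server, it can help to check an output path before a long-running job writes to it. The sketch below is illustrative only: is_persistent and PERSISTENT_DIRS are hypothetical names, not part of SherlockML, and the check is purely textual, so it does not follow symbolic links.

```python
import posixpath
from pathlib import PurePosixPath

# The two directories described above as surviving server termination.
PERSISTENT_DIRS = (PurePosixPath("/project"), PurePosixPath("/home/sherlock"))


def is_persistent(path):
    """Return True if an absolute path lies inside a persistent directory."""
    # Collapse "." and ".." segments so "/project/../tmp" is treated as "/tmp".
    normalised = PurePosixPath(posixpath.normpath(str(path)))
    return any(
        normalised == directory or directory in normalised.parents
        for directory in PERSISTENT_DIRS
    )
```

For example, is_persistent("/project/data/train.csv") is true, while is_persistent("/tmp/scratch.csv") is false, which signals that the file would be lost when the server is terminated.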

SherlockML servers under the hood

SherlockML is backed by Amazon Web Services. In most deployments, SherlockML is allocated a fixed pool of large Amazon EC2 nodes. When you create a server on shared infrastructure, a small fraction of one of these large EC2 nodes is allocated to that server. Critically, because development workflows are typically uneven (they use almost no resources most of the time), SherlockML overallocates space on these EC2 nodes, on the expectation that not every server will run CPU-heavy or memory-heavy workloads at the same time. Because each non-dedicated server only reserves space in this existing fixed pool, creating a new server does not directly increase the cost of SherlockML. Your cluster administrators control the size of the pool.
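
The arithmetic behind overallocation can be sketched with a few lines of Python. Everything here is an assumption made for illustration: the function names, the node and server sizes, and the simplification of treating the whole pool as a single block of memory. It does not describe how SherlockML actually places servers on nodes.

```python
def overallocation_ratio(node_memory_gib, node_count, server_memory_gib):
    """Memory promised to servers divided by memory physically in the pool."""
    pool_gib = node_memory_gib * node_count
    return sum(server_memory_gib) / pool_gib


def pool_is_overloaded(node_memory_gib, node_count, actual_usage_gib):
    """True if the memory servers are really using exceeds the pool."""
    return sum(actual_usage_gib) > node_memory_gib * node_count


# Hypothetical pool: 2 nodes of 64 GiB each, hosting 40 servers of 4 GiB.
nominal = [4] * 40
ratio = overallocation_ratio(64, 2, nominal)  # 160 GiB promised / 128 GiB real

# Typical uneven usage: 36 mostly idle servers and 4 busy ones.
typical = [0.5] * 36 + [4] * 4
fits = not pool_is_overloaded(64, 2, typical)  # 34 GiB in use, well within 128
```

With these made-up numbers the pool promises 1.25 times the memory it has, yet typical usage fits comfortably. Problems only arise if most servers run heavy workloads at once, which is the case dedicated infrastructure avoids.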

When you create a server on dedicated infrastructure, SherlockML will create a new EC2 instance just for that server. There is no overallocation and, since the server has its own dedicated instance, it does not suffer from multi-tenancy issues such as contention for network bandwidth. However, because each of these servers runs on its own EC2 node, it has an associated cost. We quote the on-demand price for the corresponding node (this may be more than what you actually pay if you have reserved instances).