MySQL

Connecting to MySQL databases from the command line

You can connect to a MySQL database with the mysql client. This is preinstalled on SherlockML servers.

To connect to a MySQL database, you need to know:

  • its hostname: this is usually a string like customers.mydomain.com;
  • its port: the default port for MySQL databases is 3306;
  • the name of the database that you want to connect to. If this is a new server, the only database on it is likely to be named mysql;
  • your username and password for the database server (note that this is different to your SherlockML username). If you are unsure of these, you should ask your database administrator.

You can then connect to the database with:

$ mysql --host HOSTNAME --port PORT --user USERNAME --database DATABASE_NAME --password

You should then be prompted for a password. If the port is the default 3306, you may omit the --port PORT argument.

For instance, I can connect as user funny_bohr to database example on host exampledb.cqx0gge7x5bc.eu-west-1.r ds.amazonaws.com with the following command:

../../_images/mysql_example.png

If you get stuck, run mysql --help | less in a terminal to view a list of all available options.

Connecting to MySQL databases from Python

Unfortunately, there seems to be some confusion as to what the best Python MySQL client is. The official client, MySQLdb, does not support Python 3 yet. We recommend using a fork, mysqlclient-python, which exposes the same interface as the official client, and does support Python 3.

You can install mysqlclient-python on a SherlockML server with:

$ sudo apt install libmysqlclient-dev
$ pip install mysqlclient

You can now connect to your MySQL database:

import MySQLdb as mysql

connection = mysql.connect(
    host='customers.mydomain.com',  # host on which the database is running
    database='database_name',  # name of the database to connect to
    user='username',  # username to connect with
    passwd='password'  # password associated with your username; note
                       # the keyword is `passwd`, not `password`

Note

We close the connection to allow the database server to reclaim resources. This can be critical in a Jupyter notebook, since the kernel remains alive for a long time.

The mysql connection can also be used in a with statement to ensure it gets closed automatically:

import mysql

with mysql.connect(
    host='customers.mydomain.com', database='database_name',
    user='username', password='password'
) as connection:
    cursor = connection.cursor()
    cursor.execute('SELECT * FROM customers')
    customers = list(cursor.fetchall())
    print('We have {} customers'.format(len(customers)))

You can also use the read_sql() function in pandas to read the result of a query directly into a DataFrame:

import mysql
import pandas

with mysql.connect(
    host='customers.mydomain.com', database='database_name',
    user='username', password='password'
) as connection:
    df = pandas.read_sql('SELECT * FROM customers', connection)

print(df)

Note

We recommend avoiding pasting database passwords and other connection details in many notebooks in a project. Have a look at Factoring connection details into a package for a recommended strategy for managing database connection details.