Microsoft SQL Server ==================== Connecting to Microsoft SQL Server databases from the command line ------------------------------------------------------------------ Microsoft provides the `sqlcmd `_ tool for connecting the SQL Server databases from the command line. You need to install it on your Faculty server first: .. code-block:: bash $ curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add - $ curl https://packages.microsoft.com/config/ubuntu/18.04/prod.list | sudo tee /etc/apt/sources.list.d/msprod.list $ sudo apt-get update $ sudo apt-get install mssql-tools unixodbc-dev The last step above will ask you to accept some license agreements. The command line tool gets installed to ``/opt/mssql-tools/bin/sqlcmd``. To connect to a SQL Server database, you need to know: - its `server`: this is usually a string like ``customers.mydomain.com``; - its `port`: the default port for SQL Server databases is 1433; - the name of the database that you want to connect to. If this is a new server, the only database on it is likely to be named ``master``; - your username and password for the database server (note that this is different to your Faculty username). If you are unsure of these, you should ask your database administrator. You can then connect to the database with: .. code-block:: bash $ /opt/mssql-tools/bin/sqlcmd -S SERVER,PORT -U USERNAME -d DATABASE_NAME You should then be prompted for a password. If the port is the default 1433, you may omit the ``,PORT`` part of the server argument. For instance, I can connect to database ``demos`` running on host ``customer-database.example.com`` with username ``admin``: .. thumbnail:: images/mssqlserver_example.png If you get stuck, run ``/opt/mssql-tools/bin/sqlcmd -? | less`` in a terminal to view a list of all available options. Connecting to Microsoft SQL Server databases from Python -------------------------------------------------------- To connect to Microsoft SQL Server databases from Python, you can use the Python module `pymssql `_. You need to install it on your Faculty server first: .. code-block:: bash $ conda install pymssql --channel conda-forge .. note:: We recommend installing pymssql from the 'conda-forge' conda channel rather from pip or the standard conda repositories to ensure you get all the required dependencies. .. code-block:: python import pymssql connection = pymssql.connect( server='customers.mydomain.com', # host on which the database is running database='database_name', # name of the database to connect to user='username', # username to connect with password='password' # password associated with your username ) cursor = connection.cursor() cursor.execute('SELECT * FROM customers') customers = list(cursor.fetchall()) print('We have {} customers'.format(len(customers))) # This is data science! connection.close() .. note:: We close the connection to allow the database server to reclaim resources. This can be critical in a Jupyter notebook, since the kernel remains alive for a long time. The pymssql connection object can also be used in a ``with`` statement to ensure it gets closed automatically: .. code-block:: python import pymssql with pymssql.connect( server='customers.mydomain.com', database='database_name', user='username', password='password' ) as connection: cursor = connection.cursor() cursor.execute('SELECT * FROM customers') customers = list(cursor.fetchall()) print('We have {} customers'.format(len(customers))) You can also use the ``read_sql()`` function in pandas to read the result of a query directly into a DataFrame: .. code-block:: python import pymssql import pandas with pymssql.connect( server='customers.mydomain.com', database='database_name', user='username', password='password' ) as connection: df = pandas.read_sql('SELECT * FROM customers', connection) print(df) .. note:: We recommend avoiding pasting database passwords and other connection details in many notebooks in a project. Have a look at :ref:`package_connection_details` for a recommended strategy for managing database connection details.