Connecting to MongoDB from the command line
The MongoDB shell is the standard tool for connecting to MongoDB databases from the command line.
To install the MongoDB client on a SherlockML server, run:
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6
$ echo "deb [arch=amd64] https://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.6 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.6.list
$ sudo apt-get update
$ sudo apt install mongodb-org-shell
This installs version 3.6 of the MongoDB client. This version of the client will not work with servers running MongoDB version 2.6 or earlier. To install a different version of the MongoDB client, refer to the MongoDB documentation for that version, following the instructions for Ubuntu 16.04. You only need to install mongodb-org-shell, rather than the entire mongodb-org package.
To connect to a MongoDB database, you need to know:
- its hostname: this is usually a string like
- its port: the default port for MongoDB is 27017;
- the name of the database that you want to connect to;
- your username and password for the database server (note that this is different to your SherlockML username). If you are unsure of these, you should ask your database administrator.
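The four details above can also be combined into a single MongoDB connection string. The following is a minimal Python sketch, not an official recipe: the helper name `build_mongo_uri` is hypothetical, and it assumes that your credentials are stored in the `admin` database (passed as `authSource`), as in the shell session in this section. Credentials are percent-encoded so that characters such as `@` or `/` in a password do not break the URI.

```python
from urllib.parse import quote_plus


def build_mongo_uri(hostname, database, username, password,
                    port=27017, auth_source="admin"):
    """Assemble a mongodb:// connection string (hypothetical helper)."""
    # Percent-encode the credentials: '@', ':' and '/' are URI delimiters
    user = quote_plus(username)
    secret = quote_plus(password)
    # auth_source="admin" is an assumption; ask your administrator if unsure
    return (
        f"mongodb://{user}:{secret}@{hostname}:{port}"
        f"/{database}?authSource={auth_source}"
    )


uri = build_mongo_uri("HOSTNAME", "my_org_db", "USERNAME", "p@ss/word")
print(uri)
# mongodb://USERNAME:p%40ss%2Fword@HOSTNAME:27017/my_org_db?authSource=admin
```

A string in this form can be passed to PyMongo's `MongoClient` as its first argument.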
Once you have the MongoDB client installed, connect to the database with:
$ mongo --host HOSTNAME
This will open a MongoDB shell, which you can use to authenticate:
// switch to the admin database for your MongoDB server
// (this is probably 'admin', but if not, you should talk to your server administrator)
> use admin
> db.auth('USERNAME', 'PASSWORD'); // log in
> use my_org_db // switch to the database you want to use
> db.customers.count(); // explore your database
Connecting to MongoDB from Python
The official Python package for interacting with MongoDB databases is PyMongo. Install it on a SherlockML server with:
$ pip install pymongo
You can then connect to a MongoDB database with:
from pymongo import MongoClient

client = MongoClient(
    'HOSTNAME',
    username='USERNAME',
    password='PASSWORD'
)

# You can now connect to your database and do some data science
db = client.get_database('my_org_db')
# count_documents({}) replaces Collection.count(), which was removed in PyMongo 4
db['customers'].count_documents({})

client.close()  # Close the connection to free up resources on the server
We close the connection so that the database server can reclaim the resources it holds. This matters particularly in a Jupyter notebook, where the kernel, and any connection it still has open, can stay alive for a long time.
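One way to make sure the connection is closed even when a query raises an exception is to wrap the client in `contextlib.closing`. This is a minimal sketch under that assumption: the helper name `count_collection` is hypothetical, and it accepts any client object that provides `get_database` and `close`, such as a PyMongo `MongoClient`.

```python
from contextlib import closing


def count_collection(client, database_name, collection_name):
    """Count the documents in a collection, then close the client.

    Hypothetical helper: `client` is expected to be an already
    constructed pymongo.MongoClient (or an object with the same
    get_database/close methods).
    """
    # closing() calls client.close() on exit, even if the query fails
    with closing(client):
        db = client.get_database(database_name)
        return db[collection_name].count_documents({})
```

With PyMongo installed, this could be called as `count_collection(MongoClient('HOSTNAME', username='USERNAME', password='PASSWORD'), 'my_org_db', 'customers')`. `MongoClient` can also be used directly in a `with` statement, which closes the client when the block exits.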
We recommend against pasting database passwords and other connection details into multiple notebooks in a project. See Factoring connection details into a package for a recommended strategy for managing database connection details.
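As a simple illustration of keeping credentials out of notebook cells, the connection details could be read from environment variables. This is only a sketch and not necessarily the approach taken in the guide mentioned above: the variable names (`MONGO_HOST`, `MONGO_PORT`, `MONGO_USERNAME`, `MONGO_PASSWORD`) and the helper `load_mongo_settings` are assumptions made for this example.

```python
import os


def load_mongo_settings(environ=os.environ):
    """Read MongoDB connection details from environment variables.

    The variable names used here are hypothetical; pick whatever
    convention suits your project.
    """
    required = ("MONGO_HOST", "MONGO_USERNAME", "MONGO_PASSWORD")
    missing = [name for name in required if name not in environ]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": environ["MONGO_HOST"],
        # Fall back to MongoDB's default port when MONGO_PORT is unset
        "port": int(environ.get("MONGO_PORT", "27017")),
        "username": environ["MONGO_USERNAME"],
        "password": environ["MONGO_PASSWORD"],
    }
```

The returned dictionary uses keyword names that `MongoClient` accepts, so a notebook could connect with `MongoClient(**load_mongo_settings())` without containing any credentials itself.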