
An often overlooked part of machine learning is the fundamental setup of a functional, clean environment.

I’ve seen the question asked time and time again and was recently asked myself — how do we set up all of these things, from start to finish?

We will learn how to add new environments to Jupyter Lab too.

So this article will cover exactly that. We will go from installing Python through to setting up Pandas, TensorFlow, PyTorch, and more — and even adding a separate Machine Learning environment in Jupyter Lab (above).

Check out the video version of this article here

Installing Python

We have a plethora of options for how we install Python. However, for keeping things simple while maintaining a fine level of control, we cannot beat the Anaconda distribution.

Anaconda gives us an easily managed environment that includes tools such as Spyder and Jupyter. You can download it here.

Anaconda's installation steps vary between operating systems, but once it is installed, the remaining steps should be essentially the same (this is true for Windows and Linux; I haven’t done it on Mac).

Check this guide for help with installation on Linux

Anaconda Prompt window. Typing python -V should display the current version of Python.

Once installed, we should be able to open up Anaconda Prompt. Type python -V to get your current Python version and make sure everything is set up correctly.
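The same check can be made from inside Python itself. The snippet below is a minimal sketch, not part of the original walkthrough: python_is_at_least is a hypothetical helper name, and the 3.8 threshold is simply the version we request for our environment later on.

```python
import sys


def python_is_at_least(major, minor, version_info=sys.version_info):
    """Return True if the given interpreter version is at least major.minor."""
    # Compare only the (major, minor) pair, ignoring the patch level.
    return tuple(version_info[:2]) >= (major, minor)


# Report the running interpreter's version and whether it meets the threshold.
print("Python", ".".join(str(part) for part in sys.version_info[:3]))
print("At least 3.8:", python_is_at_least(3, 8))
```

If the second line prints False, the interpreter on the path is probably not the one Anaconda installed.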

Creating a New Environment

Anaconda allows us to create different instances of Python called environments. After installing Anaconda, we have a single core environment called base.

We see this environment name, shown as (base), whenever we open Anaconda Prompt.

Nothing is stopping us from installing new Python packages (such as Pandas/TensorFlow) within the base environment. However, it is recommended to use separate virtual environments for different projects or use cases.

Our use case is machine learning, and so we will create a new environment for this using the command conda create, like so:

conda create -n mlenv python=3.8 anaconda

A list of packages will be displayed, and conda will ask if we want to proceed. We type y to continue.

We accept the proposed list of package installations with y + [ENTER]

After everything has been installed, we can switch to our new environment by typing conda activate mlenv.

We activate our environment with conda activate mlenv

We should see that (base) has been replaced with (mlenv), which means we are now working from inside our new virtual environment. We can now get started with installing all of the packages we need for ML.
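If the prompt is ever ambiguous, we can also ask Python which environment it is running in. This is a small sketch that relies on the CONDA_DEFAULT_ENV environment variable, which conda sets when an environment is activated. The helper name active_conda_env is hypothetical.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if none is set."""
    # Fall back to the real process environment when no mapping is passed in.
    environ = os.environ if environ is None else environ
    # An empty string is treated the same as a missing variable.
    return environ.get("CONDA_DEFAULT_ENV") or None


print("Active conda environment:", active_conda_env())
print("Interpreter prefix:", sys.prefix)
```

After activating mlenv, we would expect the first line to report mlenv and the prefix to point inside that environment's directory.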

For most packages, it makes sense to attempt a conda install first. If this doesn’t work, try pip install.

A few essentials that we almost always need are NumPy, Pandas, and Matplotlib. We can install them all using conda install:

conda install pandas matplotlib

NumPy is a Pandas dependency, so there is no need to include it explicitly. It will be installed with Pandas.

Depending on what you are working on/with, you will probably need some of the most popular packages for ML too. We will install TensorFlow, Transformers, and PyTorch.

We can install TensorFlow easily with conda install tensorflow.

Conda does not recognize the most recent versions of the Transformers library, so we instead install that with pip install transformers.

And finally, we have PyTorch. PyTorch is a slightly more complex installation, but the PyTorch installation guide here makes it easy.

The PyTorch install locally guide will give us the commands we need to run to install everything we need for PyTorch.

We will need to specify our OS, package manager (Conda), language (Python), and whether we have CUDA or not.

A Note on CUDA

If you have an Nvidia GPU, CUDA lets you use it to speed up machine learning tasks in PyTorch and TensorFlow. You can read TensorFlow’s GPU setup guide and Nvidia’s CUDA installation guide for help with installation.

If you have an Nvidia GPU, I would recommend using CUDA 10.2, unless you are using the latest RTX 30 series.

Support for these cards is patchy at the time of writing: you will need CUDA 11.0 alongside the nightly builds of PyTorch and TensorFlow. Reach out on Twitter if you’re having issues with this.

If you do not have a GPU or don’t care for CUDA, select None on the PyTorch page.

Finally, the Run this Command box will show the command we need. Take that and run it in Anaconda Prompt to install:

conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
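Once the install has finished, a quick way to see whether PyTorch can actually reach the GPU is to ask it directly. The sketch below assumes nothing beyond PyTorch's own torch.cuda.is_available() and torch.cuda.get_device_name() calls. The wrapper describe_gpu_support is a hypothetical name, and it falls back to a plain message when PyTorch is not installed.

```python
def describe_gpu_support():
    """Return a short message describing whether PyTorch can use CUDA."""
    try:
        import torch
    except ImportError:
        # PyTorch has not been installed into the active environment.
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        # Device 0 is the first GPU that PyTorch can see.
        return "CUDA is available: " + torch.cuda.get_device_name(0)
    return "PyTorch is installed, but CUDA is not available (CPU only)"


print(describe_gpu_support())
```

If we selected None for CUDA on the PyTorch page, the CPU-only message is the expected result.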

That is it for our ML packages. We now need to finish integrating our new environment with Jupyter.
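Before moving on, it can be worth confirming that everything we installed is visible to Python. The following is a rough sketch using only the standard library: missing_packages is a hypothetical helper, and the list uses each package's import name (PyTorch is imported as torch).

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the names from import_names that Python cannot locate."""
    # find_spec returns None for a top-level module that cannot be found.
    return [name for name in import_names if find_spec(name) is None]


# Import names of the packages installed in this article.
ML_PACKAGES = ["numpy", "pandas", "matplotlib", "tensorflow", "transformers", "torch"]

missing = missing_packages(ML_PACKAGES)
print("Missing packages:", missing if missing else "none")
```

Anything listed as missing can be installed from Anaconda Prompt with the new environment active.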

Adding The Environment to Jupyter

We’re almost there. All we need to do now is add our new ML environment to Jupyter. Fortunately, it’s super easy.

All we need to do is register a Jupyter kernel for our environment, like so:

python -m ipykernel install --user --name mlenv --display-name "ML environment"

All we’re doing here is:

Installing a Jupyter kernel (ipykernel) for our ML environment
Specifying the Anaconda-known name of our ML environment with --name
Giving it a display name to be shown in Jupyter with --display-name
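To make this less of a black box: the command writes a small kernel.json file that tells Jupyter which Python interpreter to launch and what to call the kernel. The sketch below builds an approximation of that file's contents. It is illustrative only, build_kernel_spec is a hypothetical helper, and the real file may contain additional fields.

```python
import json
import sys


def build_kernel_spec(python_executable, display_name):
    """Build a dictionary resembling the kernel.json that ipykernel writes."""
    return {
        # Jupyter substitutes {connection_file} when it starts the kernel.
        "argv": [python_executable, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


# Use the current interpreter as a stand-in for the mlenv interpreter.
spec = build_kernel_spec(sys.executable, "ML environment")
print(json.dumps(spec, indent=2))
```

The important detail is the first argv entry: because it points at the interpreter inside mlenv, notebooks started from this kernel see the packages we installed there, even when Jupyter itself is launched from base.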

Now, all we need to do is switch back to our base environment, which will remain the default whenever we open Anaconda Prompt, and enter jupyter lab to open our Jupyter environment.

We use conda activate base to switch back to our base environment and jupyter lab to open a Jupyter Lab instance.

We’ve added our new ML environment to the Jupyter Lab launcher window.

We will now be able to see our new environment in the Jupyter Lab launcher. Clicking on this will initialize a notebook using our new ML environment with all of the tools we need for ML ready to go!
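A quick sanity check in the first cell of a new notebook is to look at which interpreter the kernel is running. This is a heuristic sketch that assumes the environment lives in a directory named after it, as conda environments normally do. The helper interpreter_belongs_to_env is a hypothetical name.

```python
import sys
from pathlib import Path


def interpreter_belongs_to_env(executable, env_name):
    """Heuristic: is env_name one of the directories in the interpreter path?"""
    return env_name in Path(executable).parts


print("Interpreter:", sys.executable)
print("Inside mlenv:", interpreter_belongs_to_env(sys.executable, "mlenv"))
```

If the second line prints False in a notebook started from the ML environment entry, the kernel is probably pointing at a different Python installation.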

That’s all you need to start working with your very own machine learning environment in Python!

Whenever we find a new package or need to change an existing package in the environment, we can simply open Anaconda Prompt and type conda activate mlenv to begin making changes.

I hope you enjoyed this article! If you have any questions, let me know via Twitter. If you’d like more content like this, I post on YouTube and Medium too.

Thanks for reading!