How to Set Up Python for Machine Learning
Learn how to set up the perfect machine learning environment in Python.
An often overlooked part of machine learning is the fundamental setup of a functional, clean environment.
I’ve seen the question asked time and time again and was recently asked myself — how do we set up all of these things, from start to finish?
So this article will cover exactly that. We will go from installing Python through to setting up Pandas, TensorFlow, PyTorch, and more, and even add a separate machine learning environment to Jupyter Lab.
We have a plethora of options for how we install Python. However, for keeping things simple while maintaining a fine level of control, we cannot beat the Anaconda distribution.
Anaconda gives us an easily managed environment that includes tools such as Spyder and Jupyter. You can download it from the Anaconda website.
Anaconda's installation steps may vary between each OS, but once installed, the remaining steps should be essentially the same (this is true for Windows and Linux — I haven’t done it on Mac).
Check this guide for help with installation on Linux
Once installed, we should be able to open up Anaconda Prompt. Type python --version to get your current Python version and make sure everything is set up correctly.
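The same check can be made from inside Python itself. The snippet below is a minimal sketch; the function name python_version_summary and the minimum version of 3.8 are assumptions chosen for illustration, not requirements stated by Anaconda or by this article.

```python
import sys
import platform


def python_version_summary(minimum=(3, 8)):
    """Return the running Python version as text, plus whether it meets `minimum`.

    The default minimum of (3, 8) is an assumed threshold for illustration only.
    """
    current = sys.version_info[:3]  # (major, minor, micro)
    version_text = ".".join(str(part) for part in current)
    # Tuples compare element by element, so (3, 11, 2) >= (3, 8) is True.
    return version_text, current >= tuple(minimum)


version_text, is_recent_enough = python_version_summary()
print(f"Python {version_text} ({platform.python_implementation()})")
print("Meets the assumed minimum" if is_recent_enough else "Older than the assumed minimum")
```

Save it to a file and run it with python from Anaconda Prompt; it reports on whichever interpreter is currently on the path.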
Creating a New Environment
Anaconda allows us to create different instances of Python called environments. After installing Anaconda, we have a single core environment called base.
We see this environment name, shown in parentheses at the start of the prompt, whenever we open Anaconda Prompt.
Nothing is stopping us from installing new Python packages (such as Pandas/TensorFlow) within the base environment. However, it is recommended to instead use different virtual environments either for different projects or use-cases.
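Our use case is machine learning, and so we will create a new environment for this using the conda create command. For example, conda create --name ml python creates an environment called ml with Python installed (the name ml is assumed here for illustration; any name will do).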
A list of packages will be displayed, and conda will ask if we want to proceed. We type y to continue.
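After everything has been installed, we will be able to switch to our new environment by typing conda activate followed by the environment name (conda activate ml, if the environment was named ml).
We should see that (base) has been replaced with the name of the new environment, for example (ml). This means we are now working from inside our new virtual environment, so we can get started with installing all of the packages we need for ML.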
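If the prompt prefix is ever unclear, the active environment can also be confirmed from Python. This is a small sketch that relies on the CONDA_DEFAULT_ENV environment variable, which conda sets on activation; the helper name active_conda_env is hypothetical.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is active.

    conda sets CONDA_DEFAULT_ENV when an environment is activated.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") or None


print("Active conda environment:", active_conda_env())
# sys.prefix is the installation directory of the interpreter running this script.
print("Interpreter prefix:", sys.prefix)
```

Inside the new environment, the interpreter prefix should point at that environment's folder rather than at the base Anaconda installation.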
For most packages, it makes sense to attempt a conda install first. If this doesn’t work, try pip install.
A few essentials that we almost always need are NumPy, Pandas, and Matplotlib. We can install them all with a single conda install command, for example conda install numpy pandas matplotlib.
We can install TensorFlow easily with conda install tensorflow.
Conda does not recognize the most recent versions of the Transformers library, so we instead install that with pip, using pip install transformers.
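Once the installs have finished, it can be worth confirming what actually landed in the environment. The sketch below uses the standard library's importlib.metadata (Python 3.8+) to look up installed versions without importing the packages. The PACKAGES list and the helper name installed_versions are assumptions for illustration, and some builds register under a different distribution name (for example a GPU-specific TensorFlow package), in which case they will show as not installed here.

```python
from importlib import metadata

# Distribution names as typically registered by pip/conda (assumed for illustration).
PACKAGES = ["numpy", "pandas", "matplotlib", "tensorflow", "transformers"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name:<14}{version or 'not installed'}")
```

Run it from inside the activated ML environment; running it from base reports on base instead.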
And finally, we have PyTorch. PyTorch is a slightly more complex installation, but it is made easy by the installation guide on the PyTorch website.
We will need to specify our OS, package manager (Conda), language (Python), and whether we have CUDA or not.
A Note on CUDA
If you have an Nvidia GPU, CUDA lets you use it to speed up machine learning tasks in PyTorch and TensorFlow. You can read TensorFlow’s GPU setup guide and Nvidia’s CUDA installation guide for help with installation.
If you have an Nvidia GPU, I would recommend using CUDA 10.2 — unless you are using the latest RTX 30 series.
Support for these is sketchy (at the time of writing), and you will need CUDA 11.0, alongside the nightly builds of PyTorch/TensorFlow, etc. Reach out on Twitter if you’re having issues with this.
If you do not have a GPU or don’t care for CUDA, select None on the PyTorch page.
Finally, the Run this Command box will show the command we need. Copy it and run it in Anaconda Prompt to install PyTorch.
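After the install completes, a quick check shows whether PyTorch can see a CUDA device. This is a minimal sketch, and the function name describe_torch is hypothetical. It uses torch.cuda.is_available() and torch.cuda.get_device_name(), and falls back gracefully if PyTorch is missing from the environment.

```python
import importlib.util


def describe_torch():
    """Return a one-line summary of the PyTorch install and whether CUDA is usable."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch  # imported lazily so the script still runs without PyTorch

    if torch.cuda.is_available():
        device_name = torch.cuda.get_device_name(0)  # name of the first visible GPU
        return f"PyTorch {torch.__version__} with CUDA on {device_name}"
    return f"PyTorch {torch.__version__}, CPU only"


print(describe_torch())
```

A CPU-only result on a machine with an Nvidia GPU usually means the CPU build was selected on the PyTorch page, or that the installed CUDA version and driver do not match.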
That is it for our ML packages. We now need to finish integrating our new environment with Jupyter.
Adding the Environment to Jupyter
We’re almost there. All we need to do now is add our new ML environment to Jupyter. Fortunately, it’s super easy.
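All we need to do is add a Jupyter kernel for our environment, using python -m ipykernel install with the --user, --name, and --display-name options. For example, assuming the environment is named ml and the ipykernel package is installed in it: python -m ipykernel install --user --name ml --display-name "ML".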
All we’re doing here is:
- Installing a Jupyter kernel for our ML environment
- Specifying the Anaconda-known name of our ML environment
- Giving it a display name to be displayed in Jupyter
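To confirm that the kernel was registered, the kernels Jupyter knows about can be listed from Python. The sketch below uses KernelSpecManager from the jupyter_client package, which is installed alongside ipykernel; the helper name list_jupyter_kernels is hypothetical, and it returns an empty dictionary if jupyter_client is not available.

```python
def list_jupyter_kernels():
    """Return a dict mapping registered kernel names to their resource directories.

    Returns an empty dict when jupyter_client is not installed.
    """
    try:
        from jupyter_client.kernelspec import KernelSpecManager
    except ImportError:
        return {}
    return KernelSpecManager().find_kernel_specs()


for name, path in sorted(list_jupyter_kernels().items()):
    print(f"{name:<12}{path}")
```

The name given to --name should appear in this list. The display name is what Jupyter Lab shows in its launcher.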
Now, all we need to do is switch back to our base environment (conda activate base), which will remain the default whenever we open Anaconda Prompt. Then we enter jupyter lab to open Jupyter Lab.
We will now be able to see our new environment in the Jupyter Lab launcher. Clicking on this will initialize a notebook using our new ML environment with all of the tools we need for ML ready to go!
That’s all you need to start working with your very own machine learning environment in Python!
Whenever we find a new package or need to change an existing package in the environment, we can simply open Anaconda Prompt and type conda activate followed by the environment name to begin making changes.
Thanks for reading!