5 Must-Have Applications for Data Science

These are some must-have apps for data science.


Emmett Boudreau

3 years ago | 3 min read

With data science a growing field in the software industry, and machine learning at the forefront of technology, new applications that make the job easier and faster are being developed every day. With that exciting growth comes a constant influx of new creators, scientists, and analysts joining the ranks of lifelong learners.

Today I want to share my personal top five favorite tools, some of them obvious and some not so obvious. Either way, hopefully somebody finds these recommendations useful.

5. WSL

The first tool on this list, the Windows Subsystem for Linux (WSL), is exclusive to Windows users, i.e. not me, but it is a great way to improve your workflow inside of Windows. For those new to the concept, Windows doesn’t ship with a traditional Bash terminal, because the operating system is designed very differently from Unix-based systems. This can be a serious detriment to the workflow of the average developer, and data scientists are no exception.

However, WSL allows you to run a Linux environment, usually Ubuntu, inside of Microsoft Windows. Normally, the tool suite on Windows requires constant application switching between “Git Bash”, “Anaconda Prompt”, and so on. Of course there is nothing wrong with taking that route, but WSL makes this integration a lot easier and allows for a significantly smoother workflow, especially when working with a team.

4. DB Browser

A lot of people may not have heard of DB Browser. DB Browser allows you to view the internals of a database and get familiar with its schema without writing a single query. I use DB Browser a lot when I have some mystery database, or want to test some queries to make sure they do what I want before pushing the code. DB Browser is also available on Windows, Linux, and Mac, making it a great free tool that anyone can use.
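
For comparison, the same kind of schema inspection can be scripted. The sketch below is only an illustration and assumes the mystery database is a SQLite file; the helper name describe_schema and the measurements table are made up for this example. It uses Python's built-in sqlite3 module to list each table with its column names and declared types, which is roughly the information a database browser shows in its structure view.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite connection."""
    schema = {}
    # sqlite_master lists every object in the database; keep user tables only
    # and skip SQLite's internal tables (names starting with sqlite_).
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


if __name__ == "__main__":
    # An in-memory database stands in for the "mystery DB" here.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurements (id INTEGER PRIMARY KEY, label TEXT, value REAL)"
    )
    for table, columns in describe_schema(conn).items():
        print(table, columns)
    conn.close()
```

To point this at a real file, replace ":memory:" with the path to the database. A GUI is still quicker for casual browsing, which is the whole appeal of DB Browser.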

3. RStudio

If you’re primarily an R programmer, or you prefer Python, Scala, MATLAB, or Julia but frequently use R, RStudio is definitely a DS tool to look into. The only significant downside is that the commercial editions aren’t cheap, although the open-source desktop edition is free. Regardless of pricing or usage, RStudio is a cool environment to work in, and one I enjoy a lot.

2. Docker

Docker is another one you might have thought about but not expected to see on this list. As this is a more opinionated list, I should remind you that Docker certainly isn’t the best choice for everything. However, as someone who loves DevOps and Linux alike, I find Docker a great tool for setting up isolated environments to get your work done. Not only do we have the advantage of language package managers like pip with the Python Package Index, we also have the advantage of the Linux distribution’s package manager.

Although these benefits are certainly there, for most people it might be a better idea to just use pip with a virtual environment. Those tools are definitely useful for a quick setup, tracking pip wheels, and deployment. There are advantages and disadvantages to either approach, but in my case, I recommend Docker.
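
For readers who go the pip and virtual environment route instead, here is a minimal sketch using Python's standard venv module. The function name create_project_env and the ds-env directory are made up for illustration; the sketch only creates a bare environment and does not install any packages.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(env_dir):
    """Create a bare virtual environment at env_dir and return its path."""
    env_path = Path(env_dir)
    # with_pip=False keeps the sketch fast; set it to True if you want
    # pip bootstrapped into the new environment.
    builder = venv.EnvBuilder(with_pip=False, clear=True)
    builder.create(env_path)
    return env_path


if __name__ == "__main__":
    # A temporary directory is used so the example cleans up after itself.
    with tempfile.TemporaryDirectory() as tmp:
        env = create_project_env(Path(tmp) / "ds-env")
        # pyvenv.cfg records which Python installation the environment is based on.
        print((env / "pyvenv.cfg").read_text())
```

In day-to-day use the same result usually comes from running python -m venv on the command line, then activating the environment and installing packages with pip.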

1. Jupyter

And in the conclusion that everyone saw coming, I present to you:

Jupyter

Of course Jupyter likely needs no introduction, but it tops my list because it wouldn’t make any sense not to have it there. Jupyter lets you execute code cell by cell in a kernel, and that kernel can run inside a Conda virtual environment. It’s simply a must for data science, but of course you likely already knew that.

Also worth noting is that Jupyter is cross-platform and can be used anywhere. Jupyter also supports additional kernels for many languages, making it a tool that you can use with R, Scala, Julia, and C in addition to Python. I can’t tell you how many times a day I jump into Jupyter to debug something, or to test a function before I use it. The quick, easy setup is also a plus.

Conclusion

There are many tools that make a data scientist’s job just a hair easier. These are my top five based on how often I use each one, but I would be very interested to see what software other data scientists love to use, and which of my picks they also enjoy. Feel free to share that below, as I’d absolutely love to know.

I’m so excited to see where software centered around DS will go in the next couple of years and beyond. With the significant jumps in the past year already, I imagine we’ll be seeing some really cool stuff in the coming years!
