When you know nothing, Big Data are not enough.

We are all learning that having a huge amount of data doesn’t always correspond to having useful knowledge


Piero Paialunga

3 years ago | 3 min read

Photo by Philippe Katzenberger on Unsplash

A wonderful quote that one of the professors in my Physics Bachelor’s degree used to repeat to us was the following:

“In God we trust; all others must bring data.”

This is both fascinating and true. As we all know, science is not a democracy. If we disagree about something, but you bring data, and the data confirm your theory, then you’re right and I’m wrong. It doesn’t matter if I’m a PhD student and you are unemployed. Fortunately, we have loads of data to work with, and loads of data that can confirm our theories, about everything! And the wonderful part is that we are able to study, analyse, and apply powerful Machine Learning algorithms to huge datasets on our laptops: you don’t even need a high-end setup to do that!

So we have a lot of data. Sometimes they are messy and dirty, but we still love them just the way they are. And sometimes we get high on data, and we think that scientific models are no longer necessary and that knowledge is overrated. Why bother to find out how an algorithm decided to output a certain result? Programmers even have a motto about it: “If it works, don’t touch it”. We feel so powerful with all the data we have that sometimes we think we are able to “skip the knowledge” to “get to the point”.

But sometimes, you just can’t “skip the knowledge”.

Sometimes people are dying from a virus you can’t scientifically comprehend. Sometimes this huge quantity of data is so messy, incomplete, and incoherent that it is impossible to “get to the point”. Sometimes, even if you had perfect, clean data, you wouldn’t know where to start in solving your huge problem.

In days like these, it’s absolutely clear that we have no idea what’s going on, even though we have a lot of data about it. We don’t know how long this pandemic will last, we don’t know how to face it, and we have loads of drugs but none of them seems really helpful. In a way, we are all realising that we are as far from a super-intelligence that will kill us all as we are from a super-intelligence that could find a cure for us.

The coronavirus is changing much of our lives, and it will keep doing so for a long time, even after it stops killing people (hopefully soon).

The economy will change, our social interactions will change, sport will change, and science will change too.

Scientists (at least those who are not so closed-minded as to think otherwise despite all the evidence) know the relevance of Artificial Intelligence. Mathematicians apply artificial intelligence tools in their studies (especially the more geometrical ones) to verify their intuition. Physicists use Machine Learning, complex networks, and neural networks to study complex systems, where the number of variables is too high to treat problems analytically. A new frontier in medicine is working on a new approach to treatment, mapping the human genome and prescribing drugs tailored to the specific patient. And the list goes on.

At the same time, this virus has taught us that it is extremely important to base our scientific progress on knowledge. We need to be able to work side by side with the machines, taking advantage of the results they provide, and work hard to gain knowledge. We need to make them perform better, feed them with new data, and keep them as a useful instrument for building a scientifically consistent model.

Of course, technology has to work in other ways. The miracle that happens when we search for a page on Google and never need to look at page 2 is just one of the powerful tools we use daily, but it is enough to convince us of that. When we build software for industry, we don’t need to overthink every small working detail; we need to create a fast, efficient, smart algorithm that does the task and “gets to the point”.

As a Big Data and Physics student, I tend to think of myself in two different ways. When I use different Machine Learning algorithms and make them compete against each other, I feel like an artist who is using colours, shades and curves to complete his work of art. On the other hand, when I have a scientific paper in front of me, I have to be precise, sceptical, hungry for knowledge and emotionally distant from the problem.

The difference is not sharp, of course, but it is there. And it needs to be there to make everything complete and consistent. This is the proper way to get “scientifically” to the point.

