This piece will delve into some popular and lesser-known terms and technologies used in AI and Machine Learning.

Conventional software programs are coded by developers with specific instructions about tasks that the programs must carry out. While this works well in most situations that can be defined very precisely, limitations are encountered beyond a certain level of complexity.

A human programmer can’t take every possible future use case into account when writing the code. If the environment changes, the programs won’t be able to attain the required performance or the desired objectives, given the significantly changed background conditions.

The development of machine learning began as a way around this problem: it is based on adaptive algorithms that can learn from data without being reliant on rule-based programming. The system can detect patterns, make associations, and gain insights from the data.

So, machine learning is about creating generally meaningful connections between input and output. The prerequisites for this type of learning process are high processing power and a sufficiently large amount of data. Both have only become widely available in the past few years, the latter thanks to big data, and it is therefore unsurprising that machine learning has made tremendous progress in recent years.

Of course, human intelligence is much more differentiated, but the focus on five key cognitive capacities resulted in a great leap forward in AI research. Instead of attempting to program “general intelligence,” as found in humans, the focus shifted to precisely-defined tasks.

AI is now no longer at the primary research stage but has become part of our everyday lives. Whether it is speech recognition algorithms as used by Apple’s Siri and Amazon’s Alexa, or the interpretation of findings in the medical sector, facial recognition in CCTV footage, or pharmacological research, working without AI technologies has now become unthinkable in many industries.

Machine Learning

Machine learning is a term for various processes used to determine previously unknown inter-relationships between input and output data. In addition to traditional applications such as regression, cluster formation, time-series analysis, and factor analysis, it integrates more sophisticated methods such as neural networks, evolutionary approaches, and support vector machines.

In its basic form, a machine learning algorithm is fed with information which it must analyze and recognize to obtain a specific result. One example is the classification of spam, as used by the majority of big e-mail providers: The program is presented with thousands of e-mails that are classified as either “spam” or “not spam.” In this way, the algorithm “learns” to identify spam by identifying certain elements whose presence distinguishes spam from legitimate e-mails.
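The spam example can be sketched in a few lines. This is only an illustration under stated assumptions: the six training messages are made up, and the naive Bayes word-count approach is one common technique, not necessarily what any e-mail provider actually uses.

```python
import math
from collections import Counter

# Tiny hand-made training set (hypothetical messages, not real e-mail data).
TRAINING = [
    ("win money now", "spam"),
    ("cheap pills win big", "spam"),
    ("claim your free prize now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch with the project team", "not spam"),
    ("please review the attached report", "not spam"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability (naive Bayes)."""
    vocab = set(word_counts["spam"]) | set(word_counts["not spam"])
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, label_counts = train(TRAINING)
print(classify("win a free prize", word_counts, label_counts))
print(classify("agenda for the team meeting", word_counts, label_counts))
```

The "certain elements" the algorithm learns here are simply words that occur more often under one label than the other. A real filter would need far more data and richer features.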

So, the algorithms attempt to detect patterns in existing databases, to classify data, or to make predictions. Some examples are music or buying recommendations in the case of online platforms, and the optimization of marketing campaigns or customer service. Tracing the right patterns is of crucial importance here, as patterns can almost always be found in databases and links made to other events. However, whether these will be meaningful when it comes to solving a specific problem is another question.

Supervised Learning

Supervised learning is by far the most common type of machine learning. The algorithm is presented with an input (for example, images), together with the desired output (label). Example: If a computer is to distinguish between images of cats and dogs, it is presented with hundreds of images of cats and dogs from an extensive range of angles, each with the appropriate label “CAT” or “DOG.”

By this means, the algorithm should be able to develop a rule that it can use to make a clear distinction between dogs and cats in the future. The learning algorithm also needs to be able to abstract or generalize from the training data. “Overfitting” occurs when the algorithm fits the training examples too closely, including their noise and quirks, and therefore performs poorly on new data. Supervised learning has already been in use for many years in applications where a camera, rather than a conventional laser scanner, is used to detect and evaluate barcodes or data matrix codes.
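To make the ideas of labeled training data and generalization concrete, here is a minimal sketch. It assumes a single made-up feature (body weight in kilograms) instead of images, and uses a k-nearest-neighbor classifier as a stand-in for "the rule" the algorithm develops. All numbers are hypothetical.

```python
from collections import Counter

# Labeled training data: (weight in kg, label). One dog is unusually light.
TRAIN = [
    (3.0, "CAT"), (4.0, "CAT"), (5.0, "CAT"), (4.5, "DOG"),
    (20.0, "DOG"), (25.0, "DOG"), (30.0, "DOG"),
]
# Held-out examples the classifier never saw during training.
HELD_OUT = [(3.5, "CAT"), (4.4, "CAT"), (4.6, "CAT"), (22.0, "DOG"), (28.0, "DOG")]

def predict(weight, k):
    """Majority vote among the k training examples closest in weight."""
    neighbors = sorted(TRAIN, key=lambda example: abs(example[0] - weight))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def accuracy(examples, k):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(weight, k) == label for weight, label in examples)
    return correct / len(examples)

for k in (1, 3):
    print(k, accuracy(TRAIN, k), accuracy(HELD_OUT, k))
```

With `k=1` the classifier effectively memorizes the training set, so it scores perfectly on the data it has seen but is misled by the atypical light dog when judging new animals of similar weight. With `k=3` it makes one mistake on the training data yet generalizes better to the held-out examples, which is the trade-off that the term overfitting refers to.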

Unsupervised Learning

Unsupervised learning tries to detect appropriate patterns simply by looking at the input data, and thereby to reduce the huge quantity of real-world data in terms of dimensions and complexity without noticeable losses. This is a more complex task because there are no labels to define the desired output.
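Clustering is one common way of finding patterns without labels. The sketch below runs a plain k-means loop on six made-up 2D points. The data, the choice of two clusters, and the starting centroids are all assumptions made for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternately assign points to the nearest centroid
    and move each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabeled data: two loose groups of points.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids, clusters = kmeans(POINTS, centroids=[POINTS[0], POINTS[3]])
print(centroids)
print(clusters)
```

No label is ever supplied: the grouping emerges from the distances between the points alone. Replacing each point by its cluster's centroid is also a crude form of the data reduction mentioned above.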

Semi-Supervised Learning

Semi-supervised learning is a mixture of the two processes described above: typically, a small amount of labeled data is combined with a larger amount of unlabeled data. This means that, in addition to the task of recognizing labeled data, the job of reducing the dimensions for quicker, more efficient, and potentially more robust recognition is simultaneously carried out.

Reinforcement Learning

With reinforcement learning, an algorithm independently learns the best strategy to attain an objective in the future. The algorithm is not presented with the result but is given an indication (reinforcement) as to what extent the algorithm is nearing its objective or moving away from it.

Reinforcement learning algorithms based on deep neural networks (“deep reinforcement learning”) have proven to be particularly successful here. The neural network detects patterns in the data and, from these, develops a model of the part of the world that is described by the data. The training data are not presented to the algorithm; instead, it obtains these data through interaction with and feedback from the environment. This method is especially well suited to problems of classification, prediction, and generation.
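The feedback loop can be illustrated with tabular Q-learning, a much simpler technique than the deep variant mentioned above. The environment here is an invented five-cell corridor in which the agent starts on the left and is rewarded only on reaching the rightmost cell. The learning rate, discount factor, and episode count are arbitrary choices.

```python
import random

N_STATES = 5           # corridor cells 0..4; the goal is the last cell
GOAL = N_STATES - 1
ACTIONS = (-1, +1)     # step left or step right
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Environment: move within the corridor; reward 1.0 only at the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def train(episodes=500, seed=0):
    """Learn action values purely from interaction with the environment."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.choice(ACTIONS)  # explore at random
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Nudge the estimate toward reward plus discounted future value.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q = train()
# Greedy policy: in each non-goal cell, pick the action with the higher value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The agent is never told that "move right" is correct. It only receives the reward signal, and the learned values end up favoring the step to the right in every cell.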

Deep Learning in Neural Networks

For a long time, tasks that even a child would have no problem with, such as detecting image content or voice recognition, were a stumbling block for machines. This has changed over recent years thanks to deep learning, an approach based on neural network technology. In this context, the term “deep” refers to the number of hidden layers in the network. Networks based on deep residual learning, currently among the most complex methods used for object recognition, may contain a thousand or more such layers. Machine learning can help solve classification, prediction, and generation problems.

Neural Networks and Deep Learning: Fundamental Principles

Neural networks are based on the model of the human brain and aim to solve problems in a similar way to humans. They are networks of closely connected processing elements, the neurons, which are also known as “nodes.” These receive information from the environment or other neurons, process it, and pass it on to other nodes or the environment.

Only the outer layers, i.e., the input layer and the output layer, are accessible to the observer. What happens within the network is invisible, and it is often difficult to do plausibility checks for validation and verification. The artificial neurons are modeled and arranged in multiple layers behind or above one another.

Each layer of the network contributes to attaining the (hopefully) correct output. This extraction of characteristics takes place independently within the individual layers. In turn, the output from each layer then serves as the input for the next layer. Through large volumes of high-quality training data, the network learns to complete specific tasks.

For example, it is relatively difficult for a computer to register the significance of a photo at first glance in the same way that humans do. Discerning an exact shape from a group of pixels is a highly complex task, which is almost impossible to do directly.

In deep learning, the neural network breaks up the image into many partial images, each of which is processed by one layer, for example, edges, corners, contours, etc. So, for example, the first hidden layer might detect edges by comparing the brightness of adjacent pixels and pass this information on to the second hidden layer, which then searches for corners or contours (after all, these are nothing more than a group of edges).

Based on this information, the third hidden layer then looks for groups of corners and edges that are typical of a specific object, etc., until a certain object is finally identified.
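What "comparing the brightness of adjacent pixels" means for the first layer can be shown with a hand-written filter. This is a sketch only: a trained network learns such filters from data rather than having them coded by hand, and the tiny grayscale image and the threshold value are invented for the example.

```python
# Hypothetical 3x5 grayscale image: dark on the left, bright on the right.
IMAGE = [
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
]

def vertical_edges(image, threshold=50):
    """Mark positions where brightness jumps between horizontal neighbors."""
    return [
        [1 if abs(row[x + 1] - row[x]) > threshold else 0
         for x in range(len(row) - 1)]
        for row in image
    ]

for row in vertical_edges(IMAGE):
    print(row)
```

Each output row contains a single 1 at the position where the dark region meets the bright one. A following layer could take this edge map as its input and look for combinations of edges, such as corners.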

Each connection between the nodes is assigned a weighting, and this is modified in the course of the learning process. A positive weighting means that a neuron is exercising an excitatory influence on another neuron, while a negative weighting means that the influence is inhibitory in nature.

Where the weighting is zero, a neuron is not exercising any influence on another neuron. Just as in everyday life, where we might not notice a mosquito bite or a tick bite, neural networks also require an “activation function” that suppresses weak signals so that the network concentrates on the inter-relationships that are actually significant.
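A single artificial neuron with weighted connections can be sketched as follows. The ReLU function used here is just one widely used activation function, chosen as an assumption for the example, and the input and weight values are made up.

```python
def relu(x):
    """Activation function: pass positive signals, suppress everything else."""
    return max(0.0, x)

def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs followed by the activation function."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(total)

inputs = [1.0, 0.5, 2.0]
# Positive weight: excitatory; negative weight: inhibitory; zero: no influence.
weights = [0.8, -0.4, 0.0]

print(neuron(inputs, weights))
print(neuron([1.0, 0.5, 99.0], weights))    # third input is ignored (weight 0)
print(neuron(inputs, [-0.8, -0.4, 0.0]))    # net inhibitory input is suppressed
```

During training, it is the values in `weights` (and `bias`) that get adjusted. The structure of the computation stays the same.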

Cognitive Computing

Some authors cite a further variant of AI: cognitive computing.

This is understood as systems that take over specific tasks or make specific decisions as assistants or in place of humans, for example, in claims management for an insurance company or in diagnostics at a hospital. These systems can handle ambiguity and vagueness and have a high degree of autonomy within their area of knowledge.

Predictive Analytics

Today, big data technologies are elements within an agile supply chain. Only by using these technologies has it become possible to process the vast quantities of data generated, for example by sensors, to depict the real world in real time, and to make sound decisions.

Big data technologies enable forecasts and sophisticated scenario analyses and, in this way, permit precise capacity planning and the optimization of supply chains and inventories. Predictive analytics is based primarily on data mining, a traditional area of use for artificial intelligence. It is a matter of detecting patterns in data volumes.

This involves using statistical calculations, potentially elements of game theory, semantic processes, and operations research methods. The preliminary stage of predictive analytics is known as descriptive analytics; the subsequent stage goes by the name of prescriptive analytics, where the AI makes recommendations for action based on the inter-relationships detected.
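The three stages can be illustrated with a deliberately small example. The weekly demand figures and the stock level are invented, and a straight-line trend fitted by least squares stands in for the far richer statistical methods used in practice.

```python
def linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

weekly_demand = [100.0, 110.0, 120.0, 130.0]  # hypothetical units sold per week
current_stock = 90.0                          # hypothetical units on hand

# Descriptive: what happened?
average_demand = sum(weekly_demand) / len(weekly_demand)

# Predictive: what is likely to happen next week?
slope, intercept = linear_trend(weekly_demand)
forecast = intercept + slope * len(weekly_demand)

# Prescriptive: what should be done about it?
order_quantity = max(0.0, forecast - current_stock)

print(average_demand, forecast, order_quantity)
```

Each stage builds on the previous one: the description summarizes the past, the prediction extrapolates it, and the prescription turns the forecast into a recommended action.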

A high level of data integrity is the prerequisite for a meaningful result. A prediction is only ever as good as the data made available to the system. This integrity is not necessarily guaranteed. In too many cases, data entry is still carried out manually and therefore contains errors, even if these are unintentional.

However, particularly in logistics, the continuously generated data constitute an excellent basis for adapting the system repeatedly to the modified environmental conditions. Although AI systems can compensate well for individual errors in large data volumes, if many minor discrepancies accumulate, there is the risk of obtaining incorrect results. It is, therefore, not merely a matter of collecting data but also of “understanding” it from the outset.

One commonplace example is the use of different measurement units. If some data records work with the measurement unit “meters,” while others use the unit of “feet,” care must be taken to ensure that standardization to a pre-determined standard unit of measurement takes place in advance. For this, suitably qualified and experienced data scientists are required.

This is a relatively new job description for specialists who can handle vast quantities of data and derive the highest possible benefit from these thanks to their up-to-date IT skills, sound knowledge of the mathematical and statistical processes, and accurate information about the client’s technical environment.
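The unit standardization described above might look like this in practice. The record layout and the field names `length` and `unit` are assumptions made for the sketch. The conversion factor is the definition of the international foot.

```python
FEET_TO_METERS = 0.3048  # one international foot in meters, exact by definition

def to_meters(value, unit):
    """Convert a length to meters; reject units we do not know."""
    if unit == "m":
        return value
    if unit == "ft":
        return value * FEET_TO_METERS
    raise ValueError(f"unknown unit: {unit!r}")

# Hypothetical records that mix the two measurement units.
records = [
    {"length": 100.0, "unit": "ft"},
    {"length": 30.0, "unit": "m"},
]

standardized = [
    {"length": to_meters(r["length"], r["unit"]), "unit": "m"} for r in records
]
print(standardized)
```

Raising an error for an unknown unit, rather than silently passing the value through, is a deliberate choice here: a wrong assumption about units is exactly the kind of minor discrepancy that can accumulate unnoticed.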

The 1986 Challenger catastrophe is a prime example of what happens when a problem is not analyzed and recognized in its entirety. Looking only at the failure measurements for the questionable O-ring, which triggered the explosion, did not provide enough information to correctly assess the risk of launching the shuttle at temperatures below 37 °F.

If the data from the failures had been compared in context with the values from the successful tests, the grouping of the poor values in the lower temperature range would have provided great insight. Together with the grouping of the failure-free values in the upper temperature range, it would immediately have led the scientists to recognize that temperature was a critical factor for this component.

These are just some of the terms used in AI and Machine Learning. Feel free to follow me for daily pieces regarding the world of Data Science.