Can AI Algorithms be Biased?
Defining, detecting and avoiding bias
AI algorithms are increasingly being used to make decisions that affect our day-to-day lives, in areas such as recruitment, healthcare, criminal justice and credit risk scoring. They are being used not just by private businesses but also by governments.
One of the supposed benefits of using AI, or machines in general, for decision making is that machines may be impartial and objective: they may not carry the same biases humans do, and hence may be more “fair”. Recent studies, however, have shown that AI systems can be biased as well.
ImageNet, the public image database that feeds into various computer vision applications such as face detection, recently removed 600K images after an art project exposed racial bias in its images. All this has led to increased awareness of bias in AI and has raised basic questions about trust in many AI systems.
What is Bias?
Before we go deeper into this topic, it's important to define bias. Here is a definition from Wikipedia:
Bias is disproportionate weight in favor of or against an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair.
There have been many studies on the kinds of biases humans suffer from. Nobel laureate Daniel Kahneman discusses a variety of biases in human intuition in his excellent book Thinking, Fast and Slow.
An example of a bias humans commonly suffer from is confirmation bias: the tendency to search for, interpret and recall information in a way that confirms or strengthens one's pre-existing beliefs. While one might think that the advent of the internet and the explosion of information (data) available to us would lead us to the “truth”, confirmation bias is what keeps people selectively interpreting information and hinged to what they already believe.
Case Studies of AI Bias
A famous case study in AI bias is the COMPAS system, which is used by US courts to assess the likelihood of a defendant becoming a recidivist. An investigation of the software found that, while the system was designed to maximize overall accuracy, the false positive rate for African Americans was twice that for Caucasians.
Another popular case study is Amazon's AI recruitment tool. Amazon had developed a tool to help screen resumes, trained on 10 years of data from Amazon's internal recruitment decisions. Amazon later decided to scrap the tool when it was discovered to be more likely to pick men over women.
Mathematical Definition of Bias
The definitions of bias and fairness have a long history of debate in law, social science and philosophy, and it is difficult to reach a consensus on their exact meaning. What makes this even trickier in the context of AI models is that we need to define them in mathematical terms, as AI systems only understand numbers and mathematical operations.
It's important to distinguish biases from random errors. It's not uncommon for AI systems to make errors, because they are simplifications of a complex real world. Biases, however, are systematic errors, which occur in a somewhat predictable way. Biases may cause a system to act unfairly against an individual or a group.
Some researchers have attempted to define the absence of bias in AI systems as the balancing of false positive and false negative rates in predictions across groups (e.g. gender or race). But this definition is not generic enough to cover the various forms in which bias may manifest itself; frankly, it's still an open research area.
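The error-rate-balancing idea above can be made concrete in a few lines of code. This is a minimal sketch with made-up labels and predictions (not data from any real system), showing how one would compute false positive and false negative rates per group and compare them:

```python
# Illustrative sketch: "absence of bias" framed as matching false positive
# and false negative rates across two groups. All data here is hypothetical.

def rates(y_true, y_pred):
    """Return (false_positive_rate, false_negative_rate)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fp / (fp + tn), fn / (fn + tp)

# Hypothetical true labels and model predictions for two groups
group_a_true, group_a_pred = [0, 0, 1, 1], [0, 1, 1, 1]
group_b_true, group_b_pred = [0, 0, 1, 1], [0, 0, 0, 1]

fpr_a, fnr_a = rates(group_a_true, group_a_pred)  # FPR 0.5, FNR 0.0
fpr_b, fnr_b = rates(group_b_true, group_b_pred)  # FPR 0.0, FNR 0.5
balanced = abs(fpr_a - fpr_b) < 0.1 and abs(fnr_a - fnr_b) < 0.1
print(balanced)  # False: error rates differ sharply between the groups
```

Note that even this simple criterion involves choices (which rates to compare, what tolerance counts as "balanced"), which is part of why no single mathematical definition has won out.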
How does Bias creep into AI systems?
A natural question is: how do AI systems become biased? Can they be biased even if they are not explicitly coded to be? Here are some of the ways bias can creep into an AI system:
- Choice of Attributes: If one uses attributes such as age, gender or ethnicity in an algorithm, the algorithm can learn the relationship between these attributes and the target, which may cause it to become biased.
- Data Collection / Sampling: Data may be collected or sampled in a way that over- or under-represents a group, which can lead to bias. There is a famous case study about the Street Bump app, built by the city of Boston to detect potholes, which suggested wealthier neighborhoods had more bumps simply because people living in affluent neighborhoods were more likely to use the app.
- Implicit Bias in Data: Most AI systems learn patterns from the data provided to train them. Often that data is generated by humans, with their built-in biases. If this is the case, the resulting AI system will reflect those biases, as an AI system is only as good as the data used to train it. The ImageNet incident mentioned earlier is a good example of this type.
- Optimization Metric: AI systems typically train to maximize or minimize a certain metric, for example the error rate on the whole dataset. While the AI system is busy doing that, there is no guarantee that it will avoid decisions we as humans would deem “unfair”.
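The last point is worth making concrete. Below is a toy sketch with invented scores: we pick the decision threshold that minimizes overall error, then look at the false positive rate per group. Because group A dominates the data, the overall-optimal threshold fits A perfectly while flagging every negative in the minority group B:

```python
# Illustrative sketch with made-up scores: minimizing *overall* error
# says nothing about how errors are distributed across groups.

# (model score, true label, group) triples; group "A" is the majority
data = [
    (0.1, 0, "A"), (0.2, 0, "A"), (0.3, 0, "A"), (0.4, 0, "A"),
    (0.6, 1, "A"), (0.65, 1, "A"), (0.7, 1, "A"), (0.8, 1, "A"), (0.9, 1, "A"),
    (0.6, 0, "B"), (0.7, 0, "B"),  # minority group "B"
]

def errors(threshold):
    """Total misclassifications when predicting 1 iff score >= threshold."""
    return sum(1 for s, y, _ in data if (s >= threshold) != bool(y))

# Brute-force the threshold with the fewest total misclassifications
best = min([0.05, 0.35, 0.5, 0.66, 0.75, 0.95], key=errors)

def group_fpr(group):
    """False positive rate for one group at the chosen threshold."""
    neg = [s for s, y, g in data if g == group and y == 0]
    return sum(1 for s in neg if s >= best) / len(neg)

print(best)            # 0.5 -- the overall-error-minimizing threshold
print(group_fpr("A"))  # 0.0
print(group_fpr("B"))  # 1.0 -- every negative in group B is flagged
```

Only two of eleven examples are misclassified overall, yet those two errors fall entirely on the minority group.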
How can we avoid AI bias?
While it's still an open research area, here are some steps you could take to mitigate bias in your AI applications:
- Use Representative Data: Actively look for sources of bias in the data and/or the way it was collected and sampled. Look carefully into how the data was annotated, and into the motivations and incentives of the people annotating it.
- Do an Audit of the Model: Scrutinize the predictions of your model; compare and contrast the false positive and false negative rates of your predictions across various sub-groups. Legally protected sub-groups could be a good place to start.
- Focus on Model Explainability: A lot of AI systems can be akin to black boxes, with limited insight into why they make a certain prediction. There is a lot of research happening on model explainability; getting a better handle on why a model makes a certain prediction may help in detecting and removing biases.
- Third Party Tools: Various third-party tools can help you assess bias at different stages of the AI life cycle. Look at IBM's AI Fairness 360 open source toolkit as an example.
- Hire Diverse Teams: Having diversity in your team may help bring diverse perspectives to the search for sources of bias.
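The audit step above can be turned into a small reusable helper. This is one possible sketch (the data and group names are hypothetical): given records tagged with a sub-group, it tabulates false positive and false negative rates per group so they can be compared side by side:

```python
# Sketch of a simple model audit: per-subgroup error rates from a list of
# (group, true_label, predicted_label) records. All data is hypothetical.
from collections import defaultdict

def audit_by_group(records):
    """records: iterable of (group, y_true, y_pred); returns {group: metrics}."""
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, y_true, y_pred in records:
        if y_true == 0:
            counts[group]["fp" if y_pred else "tn"] += 1
        else:
            counts[group]["tp" if y_pred else "fn"] += 1
    return {
        group: {
            "fpr": c["fp"] / max(c["fp"] + c["tn"], 1),
            "fnr": c["fn"] / max(c["fn"] + c["tp"], 1),
        }
        for group, c in counts.items()
    }

# Hypothetical predictions tagged with a protected attribute
records = [
    ("men", 0, 0), ("men", 0, 1), ("men", 1, 1), ("men", 1, 1),
    ("women", 0, 0), ("women", 0, 0), ("women", 1, 0), ("women", 1, 1),
]
print(audit_by_group(records))
# {'men': {'fpr': 0.5, 'fnr': 0.0}, 'women': {'fpr': 0.0, 'fnr': 0.5}}
```

A real audit would slice by every protected attribute available and look at more than two rates, but even a table like this surfaces gross disparities quickly.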
AI applications have seen phenomenal growth of late, and there have been genuine instances that have raised questions about the fairness of decisions made by AI systems. It's important to address the bias issue so that AI systems continue to enjoy the trust of organizations and the masses. There is a lot of promising work going on in this area; let's hope it results in fairer AI applications and a world that is a better place for everyone.