Inclusive Machine Learning: addressing model fairness

What is fairness in AI? How can we address it? Implementing a model fairness use case.


Fabricio Pretto

3 years ago | 7 min read

Artificial Intelligence (AI) and Machine Learning (ML) systems are increasingly being used across all sectors and societies.

Alongside this growth, awareness of model fairness has been increasing over the past years. This field aims to assess how fairly a model treats pre-existing biases in data: is it fair that a job-matching system favors male candidates for CEO interviews, because that matches historical data?

Fig. 1: Number of papers published from 2011 to 2017 (image by Moritz Hardt)

In my previous article I addressed ML model interpretability. This time we will take a step further and assess how our trained model treats potentially sensitive (biased) features.

Auditing a model is not always black and white: features that are sensitive in one context may be far less so in another. Few people would argue that gender should determine whether a person gets a job.

However, is it unfair that an insurance company pricing model charges more to men because historical data shows that they have more claims than women? Or is it correctly accounting for their more reckless driving? Certainly it’s at least arguable.

There are dozens of use cases where the fairness definition is not absolutely clear. Identifying appropriate fairness criteria for a system requires accounting for user experience, cultural, social, historical, political, legal, and ethical considerations, several of which may have trade-offs.

In this article we will address model fairness using the FairML library, developed by Julius Adebayo. The entire code used can be found in my GitHub.


  1. Dataset and Model Training
  2. FairML Intuition
  3. Assessing Model Fairness
  4. Recommended practices

1. Dataset and Model Training

The dataset used for this article is the Adult Census Income dataset from the UCI Machine Learning Repository. The prediction task is to determine whether a person makes over $50K a year.

Since the focus of this article is not on the modelling phase of the ML pipeline, only minimal feature engineering was performed before fitting an XGBoost model to the data.

The performance metrics obtained for the model are the following:

Fig. 2: Receiver Operating Characteristic (ROC) curves for Train and Test sets.

Fig. 3: XGBoost performance metrics

The model’s performance seems to be pretty acceptable.

In my previous article we discussed several techniques for addressing model interpretability. Among other libraries, we used SHAP to obtain the feature importances in the model outputs:

Fig. 4: SHAP Feature Importance

There are several features in the dataset that could be considered sensitive to include in the model, some of them more controversial than others. For instance, features like Nationality, Race and Gender are probably the most sensitive ones when determining an individual’s income.

Moreover, even though features like Age and Marital Status may have good predictive power because they act as proxies for other individual attributes, such as years of work experience or education, they could also be considered sensitive.

So, how can we assess the degree to which the model relies on these sensitive features to make its predictions?

2. FairML Intuition

Like most interpretation algorithms, the basic idea behind FairML is to measure how the model’s predictions vary with perturbations made in the inputs. If a small change in a feature dramatically modifies the output, then the model is sensitive to that feature.
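The perturbation idea can be illustrated with a minimal sketch. It is not FairML code: `toy_model` and `perturbation_sensitivity` are hypothetical names introduced only for this example, and shuffling a column is just one possible way to perturb an input.

```python
import numpy as np


def toy_model(X):
    """Hypothetical black-box model: strong dependence on column 0, weak on column 1."""
    return 3.0 * X[:, 0] + 0.1 * X[:, 1]


def perturbation_sensitivity(predict, X, column, rng):
    """Mean squared change in the predictions after shuffling a single column."""
    X_perturbed = X.copy()
    X_perturbed[:, column] = rng.permutation(X_perturbed[:, column])
    return float(np.mean((predict(X) - predict(X_perturbed)) ** 2))


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))  # synthetic, uncorrelated inputs

sensitivity = [perturbation_sensitivity(toy_model, X, col, rng) for col in range(2)]
print(sensitivity)  # the first value should be much larger than the second
```

Because the two synthetic columns are independent, this naive measure works well here; the next paragraphs explain why correlated features need extra care.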

However, if the features are correlated, the indirect effects between them might still not be accounted for in the interpretation model. FairML addresses this multicollinearity problem using orthogonal projection.

Orthogonal Projection

Fig. 5: Orthogonal projection of vector a on vector b

An orthogonal projection is a type of vector projection that decomposes a vector into two parts relative to another vector: one parallel to it and one orthogonal (perpendicular) to it. If a vector a is projected onto a vector b (in Euclidean space), the component of a that lies in the direction of b is obtained: a1 = (a·b / b·b) b.

This concept is very important in FairML since it allows us to completely remove the linear dependence between features. If two nonzero vectors are orthogonal to each other, then no scalar multiple of one can produce the other. The component of a orthogonal to b can be calculated as a2 = a - a1, where a1 is the projection of a onto b.
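As a small worked example of this decomposition, the following NumPy sketch computes both components for two arbitrary vectors chosen purely for illustration and checks that the residual is orthogonal to b.

```python
import numpy as np

# Arbitrary example vectors (not taken from the dataset)
a = np.array([3.0, 4.0])
b = np.array([2.0, 0.0])

# Component of a in the direction of b (the projection of a onto b)
a1 = (a @ b) / (b @ b) * b

# Component of a orthogonal to b
a2 = a - a1

print(a1)      # [3. 0.]
print(a2)      # [0. 4.]
print(a2 @ b)  # 0.0, so a2 carries no linear information about b
```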

Orthogonal projection guarantees that there will be no hidden linear collinearity effects. It is important to note that this is a linear transformation, so it does not account for non-linear dependencies between features. To address this, FairML uses basis expansion and a greedy search over such expansions.

FairML Process

Fig. 6: FairML methodology for addressing model fairness (image by Julius Adebayo)

If F is a model trained with 2 features x1 and x2, to calculate the dependence of F on x1, first x2 is made orthogonal to x1 to remove all dependencies between the two.

Secondly, the variation in the model output is analyzed using the orthogonal component of x2 and making perturbations in x1. The change in output between the perturbed input and the original input indicates the dependence of the model on x1. The dependence of F on x2 can be estimated in the same way.
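The two steps can be sketched on synthetic data. This is a simplified illustration under stated assumptions, not FairML's implementation: the model `F`, the helper `orthogonalize`, the choice of zeroing x1 as the perturbation and the mean squared difference as the distance are all hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.5, size=n)  # x2 is correlated with x1


def F(x1, x2):
    """Hypothetical model that never reads x1 directly, only x2."""
    return 2.0 * x2


def orthogonalize(target, reference):
    """Remove from `target` its linear component along `reference` (both centered)."""
    t = target - target.mean()
    r = reference - reference.mean()
    return t - (t @ r) / (r @ r) * r


baseline = F(x1, x2)

# Naive audit: perturb x1 only. F ignores x1, so the output does not change.
naive_dependence = float(np.mean((baseline - F(np.zeros(n), x2)) ** 2))

# Projection-based audit: perturb x1 and also strip x1's linear influence from x2.
x2_orth = orthogonalize(x2, x1)
projected_dependence = float(np.mean((baseline - F(np.zeros(n), x2_orth)) ** 2))

print(naive_dependence)      # 0.0: the indirect effect of x1 stays hidden
print(projected_dependence)  # clearly positive: x1 matters through x2
```

The contrast between the two numbers is the point of the method: a feature the model never reads directly can still drive its output through a correlated proxy.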

3. Assessing Model Fairness

Now that we know how FairML works, let’s use it to evaluate our model. Firstly, we will install the Python package and import the required modules.

# FairML install
pip install
# Import modules
from fairml import audit_model
from fairml import plot_dependencies

Secondly, we will audit the model. The audit_model method receives 2 required and 5 optional inputs:


  • predict_function: black-box model function that has a predict method.
  • input_dataframe: dataframe with shape (n_samples, n_features)


  • distance_metric: one of [‘mse’, ‘accuracy’] (default=‘mse’)
  • direct_input_pertubation_strategy: refers to how to zero out a single variable. Options = [‘constant-zero’ (replace with a random constant value), ‘constant-median’ (replace with median constant value), ‘global-permutation’ (replace all values with a random permutation of the column)].
  • number_of_runs: number of runs to perform (default=10).
  • include_interactions: flag to enable checking model dependence on interactions (default=False).
  • external_data_set: data that did not go into training the model, but that you’d like to see what impact that data has on the black box model (default=None).
# Model Audit
importances, _ = audit_model(clf_xgb_array.predict, X_train)
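
To make the three direct_input_pertubation_strategy options more concrete, here is a rough sketch of what each one could do to a single column. It is based only on the option names and descriptions listed above, not on the library's source code: `perturb_column` is a hypothetical helper, and treating ‘constant-zero’ as a literal zero is an assumption.

```python
import numpy as np


def perturb_column(X, column, strategy, rng):
    """Return a copy of X with one column perturbed (illustrative only, not FairML code)."""
    X_new = X.copy()
    if strategy == "constant-zero":
        # Assumption: replace every value with the constant 0
        X_new[:, column] = 0.0
    elif strategy == "constant-median":
        # Replace every value with the column median
        X_new[:, column] = np.median(X[:, column])
    elif strategy == "global-permutation":
        # Shuffle the column, which preserves its marginal distribution
        X_new[:, column] = rng.permutation(X[:, column])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return X_new


rng = np.random.default_rng(0)
X_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])  # made-up values

for strategy in ("constant-zero", "constant-median", "global-permutation"):
    print(strategy, perturb_column(X_demo, 0, strategy, rng)[:, 0])
```

The constant strategies erase all variation in the feature, while the permutation keeps realistic values but breaks their link to each row.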

The audit_model method returns a dictionary where keys are the column names of the input dataframe (X_train) and values are lists containing model dependence on that particular feature. These lists are of size number_of_runs.
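
A dictionary of this shape can be summarized by hand, for example by taking the median over runs and ranking features by absolute dependence. The sketch below uses a plain Python dictionary with made-up values standing in for a real audit result; the feature names and numbers are purely illustrative.

```python
import statistics

# Hypothetical audit output: feature name -> model dependence for each run
# (values invented for illustration, number_of_runs = 3)
example_importances = {
    "age": [0.10, 0.12, 0.11],
    "sex_Male": [0.30, 0.28, 0.35],
    "hours_per_week": [-0.05, -0.04, -0.06],
}

# Collapse the runs into one number per feature
median_dependence = {
    feature: statistics.median(values)
    for feature, values in example_importances.items()
}

# Rank features by the magnitude of their dependence, keeping the sign for display
ranked = sorted(median_dependence.items(), key=lambda item: abs(item[1]), reverse=True)

for feature, value in ranked:
    print(f"{feature:>15}: {value:+.2f}")
```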

The process carried out for each feature is as described in the previous section. One drawback of this methodology is that it is computationally expensive to run when the number of features is high.

FairML allows us to plot the dependence of the output on each feature (excluding the effect of the correlation with the other predictors):

# Plot Feature Dependencies
plot_dependencies(importances.median(),
                  title="FairML Feature Dependence")

Fig. 7: FairML Feature Dependence

Red bars indicate that the feature contributes to an output 1 (Income > 50K), while light blue bars indicate that it contributes to an output 0 (Income <= 50k).

It is observed that this algorithm, by removing the dependence between features through orthogonal projection, identifies that the model has a high dependence on sensitive features such as race_White, nac_United-States and sex_Male. In other words, according to the trained model, a white man born in the United States will have a higher probability of having an income greater than USD 50k, which constitutes a very strong bias.

It is very important to notice the relevance of the orthogonal projection in the algorithm, since features such as race_White and nac_United-States did not appear to be so relevant in SHAP’s Feature Importance or in the other interpretation algorithms. This is probably because the effects of these are hidden in other features. By removing multicollinearity and evaluating the individual dependence on each feature, it is possible to identify the intrinsic effects of each one.

4. Recommended practices

Fairness in AI and ML is an open area of research. As a main contributor to this field, Google AI recommends some best practices for addressing this issue:

  • Design your model using concrete goals for fairness and inclusion: engage with social scientists, humanists, and other relevant experts for your product to understand and account for various perspectives.
  • Use representative datasets to train and test your model: identify prejudicial or discriminatory correlations between features, labels, and groups.
  • Check the system for unfair biases: while designing metrics to train and evaluate your system, also include metrics to examine performance across different subgroups (use diverse testers and stress-test the system on difficult cases).
  • Analyze performance: even if everything in the system is carefully crafted to address fairness issues, ML-based models rarely operate with 100% perfection when applied to real, live data. When an issue occurs in a live product, consider whether it aligns with any existing societal disadvantages, and how it will be impacted by both short- and long-term solutions.
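
The recommendation to examine performance across subgroups can be sketched in a few lines. The labels, predictions and group memberships below are invented for illustration, and `subgroup_report` is a hypothetical helper; accuracy and positive prediction rate are just two of many metrics one might compare.

```python
import numpy as np

# Made-up labels, predictions and group membership, for illustration only
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 0])
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])


def subgroup_report(y_true, y_pred, group):
    """Accuracy and positive prediction rate computed separately for each group."""
    report = {}
    for g in np.unique(group):
        mask = group == g
        report[str(g)] = {
            "accuracy": float(np.mean(y_true[mask] == y_pred[mask])),
            "positive_rate": float(np.mean(y_pred[mask])),
        }
    return report


report = subgroup_report(y_true, y_pred, group)
print(report["A"])  # {'accuracy': 0.75, 'positive_rate': 0.5}
print(report["B"])  # {'accuracy': 0.5, 'positive_rate': 0.25}
```

Large gaps between groups on metrics like these are a signal to investigate further, not a verdict on their own.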


This article is meant to help data scientists get a better understanding of how their machine learning models treat pre-existing biases in data.

We have presented the intuition behind how FairML addresses this issue and implemented a fairness evaluation of an XGBoost model trained on the Adult Census Income dataset. Finally, we summarized some of the best practices Google AI recommends in this growing field.

As an ending note, I would like to leave a quote from Google’s responsible AI practices:

AI systems are enabling new experiences and abilities for people around the globe. Beyond recommending books and television shows, AI systems can be used for more critical tasks, such as predicting the presence and severity of a medical condition, matching people to jobs and partners, or identifying if a person is crossing the street.

Such computerized assistive or decision-making systems have the potential to be fairer and more inclusive at a broader scale than decision-making processes based on ad hoc rules or human judgments. The risk is that any unfairness in such systems can also have a wide-scale impact.

Thus, as the impact of AI increases across sectors and societies, it is critical to work towards systems that are fair and inclusive for all.

I hope this article serves its purpose as a general guide to addressing fairness in black-box models and contributes a grain of sand toward a fairer and more inclusive use of AI. The entire code can be found in my GitHub.

