Understanding Fairness

All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

Article I in the Universal Declaration of Human Rights

To lay the ground for algorithmic bias, we first ask, "What does fairness mean?" And boy is this a big one. There are tons of definitions, so how do we know which one to pick? Why can't we all just agree on one?

This section acts as a primer to fairness, covering a few key concepts. It tries to answer the following questions:

What is a widely used framework for fairness?
How can we quantify fairness?
Can't we just combine all of the fairness definitions?
How do we design for fairness without context?
How do we learn more about the context?

Disparate Treatment, Disparate Impact

Let’s begin with a not-so-mathematical idea. A common paradigm for thinking about fairness in US labor law is disparate treatment and disparate impact.

Both terms refer to practices that cause a group of people sharing protected characteristics to be disproportionately disadvantaged. The phrase “protected characteristics” refers to traits such as race, gender, age, physical or mental disabilities, where differences due to such traits cannot be reasonably justified. Ideally, we should have a set of sensitive traits that we can check against. But in reality, what constitutes “protected characteristics” varies by context, culture and country. Next, the phrase “disproportionately disadvantaged” dismisses differences in treatment due to statistical randomness. To be frank, this is really vague but we will try to go into details in the next section.

The difference between disparate treatment and disparate impact can be summarized as explicit intent. Disparate treatment is explicitly intentional, while disparate impact is implicit or unintentional.

Disparate treatment vs disparate impact.

What does this mean for AIS?

Let's use Amazon's Prime Free Same-Day service as an example . The Free Same Day service is a fantastic mind-blowing innovation that provides free same-day delivery. Since it's in its early stages, Amazon wants to trial the service before rolling it out to everyone. Suppose Amazon implements a model that decides which lucky neighborhoods should get first dibs on the Prime Free Same-Day service.

Disparate Treatment

Using race to decide who should get this service is certainly unjustified. So if Amazon had explicitly used racial composition of neighborhoods as an input feature for the model, that would be disparate treatment. In other words, disparate treatment occurs when protected characteristics are used as input features.

Obviously, disparate treatment is relatively easy to spot and resolve once we determine the set of protected characteristics. We just have to make sure none of protected characteristics is explicitly used as an input feature.

Disparate Impact

On the other hand, Amazon might have been cautious about racial bias and deliberately excluded racial features for their model. In fact, we can quote Craig Berman, Amazon’s vice president for global communications, on this:

Amazon, he says, has a “radical sensitivity” to any suggestion that neighborhoods are being singled out by race. “Demographics play no role in it. Zero.”

Amazon says its plan is to focus its same-day service on ZIP codes where there’s a high concentration of Prime members, and then expand the offering to fill in the gaps over time.

Amazon Doesn’t Consider the Race of Its Customers. Should It? - David Ingold and Spencer Soper, 2016

Focusing on ZIP codes with high density of Prime members makes perfect business sense. But what if the density of Prime members correlates with racial features? The images below from the 2016 Bloomberg article by David Ingold and Spencer Soper shows a glaring racial bias in the selected neighborhoods.

Amazon Doesn’t Consider the Race of Its Customers. Should It? - David Ingold and Spencer Soper, 2016

Despite not using any racial features, the resulting model appears to make recommendations that disproportionately exclude predominantly black ZIP codes. This unintentional bias can be seen as disparate impact.

In general, disparate impact occurs when protected characteristics are not used as input features but the resulting outcome still exhibits disproportional disadvantages.

Disparate impact is more difficult to fix since it can come from multiple sources, such as:

  • A non-representative dataset
  • A dataset that already encodes unfair decisions
  • Input features that are proxies for protected characteristics

More on sources of bias in Understanding Bias II.

Okay, but how do we know how much disparity is unfair?

To answer that question, we have to review what we meant earlier by “disproportionately disadvantaged”. In general, this has been rather hand-wavy, with good reason! What is unfair in one case might be justified in another, depending on the specific circumstances. And there are just so many factors to consider:

Let's say an insurance company uses an AIS that predicts whether an insuree will get into an accident within the next year. Insurees predicted as accident-prone could be charged higher premiums.

  • If the model excessively predicts males as accident-prone, are males disproportionately disadvantaged?
  • If the accuracies are different between age groups, are the age groups with worse accuracies disproportionately disadvantaged?
  • What if the model overestimates accident-likelihood for certain races and underestimates it for other races? This means the first group pays higher premiums than they should, while the second group underpays. Then do we say the former group is disproportionately worse off and the latter is disproportionately better off?

On the other hand, there have been many attempts at trying to formalize and quantify fairness. Especially now that we have more computer scientists getting in on the game. The next section looks at some of these fairness metrics and features a small explorable!

What is a widely used framework for fairness?

The terms "disparate treatment" and "disparate impact" are commonly used in US labor law, dividing discrimination into intentional and unintentional. Avoiding disparate treatment entails removing protected characteristics from the input features to the AIS. Avoiding disparate impact is slightly more complicated and we will discuss this in a later section (Understanding Bias II).

A Fair Fat Pet Predictor

Fat pet predictor.

Suppose for a moment that our company organizes diet boot camps for overweight cats and dogs. We want to develop an AI system to help owners diagnose if a pet is overweight. Pets diagnosed as fat are then sent to our boot camps, which means less food and no treats boohoo. Furthermore, we know that dogs are more likely to be fat, as compared to cats. In fact, cats only have a 40% chance of being overweight, while dogs have a 60% chance of being overweight.

40% of cats are overweight.

Some Basics Before We Start

You can skip this section if you understand what are TP, FP, TN and FN. Otherwise, click on this nifty button on the right.

Tuning Our Model for Fairness

In the charts below, we can tune our AI system’s accuracy for cats and dogs (if only it was so easy!). The charts on the left and right represent the resulting predictions for cats and dogs respectively. (Hover over yellow keywords to see the tooltips on the charts.)

Many More Metrics

The experiment above introduced five fairness metrics:

  • Group Fairness
  • Equalised Odds
  • Conditional Use Accuracy Equality
  • Overall Accuracy Equality
  • Treatment Equality

In addition to these, there are plenty more fairness metrics enumerated by Verma and Rubin and Narayanan . Some notable metrics include:


This goes beyond true or false predictions and considers the score assigned by the model. For any predicted score, all sensitive groups should have the same chance of actually being positive.

Suppose our fat pet predictor predicts a fatness score from 0 to 1 where 1 is fat with high confidence. If a cat and a dog are both assigned the same score, they should have the same probability of being actually fat.

Well-calibration is a stricter form of calibration, with the added condition where the chance of being actually positive is equal to the score.

For our fat pet predictor to be well-calibrated, the predicted fatness score has to be equal to the probability of actually being fat. For example, if a cat and a dog are both assigned the a score of 0.8, they should both have an 80% chance of being actually fat.

Fairness Through Awareness

This fairness metric is based on an intuitive rule - “treating similar individuals similarly”. Here, we first define distance metrics to measure the difference between individuals and difference between their predictions. An example of a distance metric could be the sum of absolute differences between normalized features. Then, this metric states that for a model to be fair, the distance between predictions should be no greater than the distance between the individuals.

Unfortunately, this leaves the difficult question of how to define appropriate distance metrics for the specific problem and application.

There are a ton of metrics to measure fairness.

Is it Justified?

The awesome thing about these metrics is that they can be put into a loss function. Then we can train a model to optimize the function and voilà we have a fair model. Except, no it doesn’t work like that.

A major issue with these metrics (besides the question of how to pick one) is that they neglect the larger context. In the previous section, we explained:

The phrase “protected characteristics” refers to traits such as race, gender, age, physical or mental disabilities, where differences due to such traits cannot be reasonably justified.

Suppose an Olympics selection trial requires applicants to run 10km in 40 minutes. This selection criterion seems reasonably justified. Running speed tends to be an appropriate measure of athleticism. But the ability to run that fast is probably negatively correlated with age. Someone looking at the data alone might flag a bias against very elderly applicants. Without understanding the context, it is difficult to see how this bias might be reasonably justified.

The fairness metrics can be a systematic way to check for bias, but they are only a piece of the puzzle. A complete assessment for fairness needs us to get down and dirty with the problem at hand.

How can we quantify fairness?

Most of the fairness metrics focus on equality in the rates of true positives, true negatives, false positives, false negatives, or some combination of these. But remember that these metrics are insufficient when they exclude the larger context of the AIS and neglect contextual justifications.

For more comprehensive reviews of existing metrics, check out Narayanan and Verma et al. .

The Impossibility Theorem

Some fairness definitions are mutually exclusive.

For our fictional fat pet predictor, we had complete control over the system’s accuracy. Even so, you may have noticed that it was impossible to fulfill all five fairness metrics at the same time. This is sometimes known as the Impossibility Theorem of Fairness.

In ProPublica's well-known article Machine Bias , the subtitle reads:

There’s software used across the country to predict future criminals. And it’s biased against blacks.

ProPublica's article documented the "significant racial disparities" found in COMPAS, a recidivism prediction model sold by NorthPointe. But in their response, Northpointe disputed ProPublica's claims. Later on, we would discover that NorthPointe and ProPublica had different ideas about what constituted fairness. Northpointe used Conditional Use Accuracy Equality, while ProPublica used Treatment Equality (see previous demo for details). Northpointe's response can be found here.

Turns out, it is impossible to satisfy both definitions of fairness, given populations with different base rates of recidivism . This is similar to our previous example of fat pets. Now, different base rates of recidivism do not mean that certain individuals are more prone to re-offending by virtue of race. Instead of racial predisposition, such trends are more likely due to unequal treatment and circumstances from past and present biases. In our fat pets example, dogs might have a higher base rate for obesity not because dogs have fat genes but because dog owners tend to be overly enthusiastic about feeding their pets.

So fairness is impossible?

So fairness is impossible?

The point of all these is not to show that fairness does not make sense or that it is impossible. After all, notions of fairness are heavily based on context and culture. Different definitions that appear incompatible simply reflect this context-dependent nature.

But this also means that it is super critical to have a deliberate discussion about what constitutes fairness. This deliberate discussion must be nested in the context of how and where the AIS will be used. For each AIS, the AI practitioners, their clients and users of the AIS need to base their conversations on the same definition of fairness. We cannot assume that everyone has the same idea of fairness. While it could be ideal for everyone to have a say in what definition of fairness to use, sometimes this can be difficult. At the very least, AI practitioners should be upfront with their users about fairness considerations in the design of the AIS. This includes what fairness definition was used and why, as well as potential shortcomings.

Can't we just combine all of the fairness definitions?

Nope, sorry. In some cases, certain fairness metrics are actually mutually exclusive. AI practitioners have to be careful about which one they use. There is no easy answer since it all depends on the context. Finally, because it is so important and subjective, be open about which metric was used and how it was chosen!

Context-Free Fairness

Computer scientists might often prefer general algorithms that is agnostic to context and application. The agnostic nature of unstructured deep learning is often cited as a huge advantage compared to labor-intensive feature engineering. So the importance of context in understanding fairness can be a bane to computer scientists, who might like to “[abstract] away the social context in which these systems will be deployed” .

Abstraction" is when we make a complex thing simple by ignoring all the unnecessary details.

But as Selbst et al. write in their work on fairness in sociotechnical systems:

Fairness and justice are properties of social and legal systems like employment and criminal justice, not properties of the technical tools within. To treat fairness and justice as terms that have meaningful application to technology separate from a social context is therefore to make a category error, or as we posit here, an abstraction error. [emphasis mine]

On a similar note, in Peter Westen’s The Empty Idea of Equality , he writes:

For [equality] to have meaning, it must incorporate some external values that determine which persons and treatments are alike […]

In other words, the treatment of fairness, justice and equality cannot be separated from the specific context of the problem at hand.

Five Failure Modes

In their work, Selbst et al. identify what they term “five failure modes” or “traps” that might ensnare the AI practitioner trying to build a fair AIS. What follows is a summary of the failure modes. We strongly encourage all readers to conduct a close reading of Selbst et al.’s original work. A copy can be found on co-author Sorelle Friedler’s website here.

Framing Trap

Failure to model the entire system over which a social criterion, such as fairness, will be enforced

A fair AIS must take into account the larger sociotechnical context in which the AIS might be used, otherwise it is meaningless. For example, an AIS to filter job applicants should also consider how its suggestions would be used by the hiring manager. The AIS might be “fair” in isolation but subsequent “post-processing” by the hiring manager might distort and undo the “fairness”.

Portability Trap

Failure to understand how repurposing algorithmic solutions designed for one social context may be misleading, inaccurate, or otherwise do harm when applied to a different context

This refers to our earlier observation that computer scientists often prefer general algorithms agnostic to context and application, which Selbst et al. refer to as “portability”. The authors contend that the quality of portability must sacrifice aspects of fairness because fairness is unique to time and space, unique to cultures and communities, and not readily transferable.

Formalism Trap

Failure to account for the full meaning of social concepts such as fairness, which can be procedural, contextual, and contestable, and cannot be resolved through mathematical formalisms

This trap stems from the computer science field’s preference for mathematical definitions, such as the many definitions of fairness that we have seen earlier. The authors suggest that such mathematical formulations fail to capture the intrinsically complex and abstract nature of fairness, which is, again, nested deeply in the context of the application.

Ripple Effect Trap

Failure to understand how the insertion of technology into an existing social system changes the behaviors and embedded values of the pre-existing system

This is related to the Framing Trap in that the AI practitioner fails to properly account for “the entire system”, which in this case includes how existing actors might be affected by the AIS. For instance, decision-makers might be biased towards agreeing with the AIS’s suggestions (a phenomenon known as automation bias) or the opposite might be true and decision-makers might be prone to disagreeing with the AIS’s suggestions. Again, this stems from designing an AIS in isolation without caring enough about the context.

Solutionism Trap

Failure to recognize the possibility that the best solution to a problem may not involve technology

Hence we crowned the most important question in this entire guide as, “When is AI not the answer?” (mentioned here). AI practitioners are naturally biased towards AI-driven solutions, which could be an impedement when the ideal solution might be far from AI-driven.

How do we design for fairness without context?

Nope we can't. Gotcha that was a trick question. The same decision can be both fair and unfair depending on the larger context, so context absolutely matters. As such, it is difficult to give advice on how to pick a fairness metric without knowing what is the context. Check out the next section for some questions to help with understanding the context.

Learning about the Context

By the time you read this, “context” should have been burned into your retina. But just in case you cheated and came straight here without reading any of the previous sections:


So here is a list of questions and prompts to help you learn more about the sociotechnical context of your application. Don’t be limited to these though, go beyond them to understand at much about the problem as you can. Also, these prompts should be discussed as a group rather than answered in isolation. Involve as many people as you can!

Involve as many people as you can!

General Context

  • What is the ultimate aim of the application?
  • What are the pros and cons of an AIS versus other solutions?
  • How is the AIS supposed to be used?
  • What is the current system that the AIS will be replacing?
  • Create a few user personas - the technophobe, the newbie etc. - and think about how they might react to the AIS across the short-term and long-term.
  • Think of ways that the AIS can be misused by unknowning or malicious actors.

About Fairness

  • What do false positives and false negatives mean for different users? Under what circumstances might one be worse than the other?
  • Try listing out some examples of fair and unfair predictions. Why are they fair/unfair?
  • What are the relevant protected traits in this problem?
  • Which fairness metrics should we prioritize?
  • When we detect some unfairness with our metrics - is the disparity justified?

Bonus Points!

  • Find a bunch of real potential users and ask them all the prompts above.
  • Post all of your answers online and iterate it with public feedback
  • Ship your answers with the AIS when it is deployed
Involve as many people as you can!
How do we learn more about the context?

See above. Most of all, take a genuine interest in your application and its users!