Resources
Offline Guide
An offline copy of this guide. Interactive components, such as the explorable, are left out, but it has been nicely formatted for printing on sheets of dead tree matter.
Download the offline guide
Other Websites and Guides
Useful or interesting links related to algorithmic bias.
- Survival of the Best Fit - a game about algorithmic bias in hiring
- Google’s People + AI Guidebook and Inclusive ML Guide
- The Financial Modelers’ Manifesto, written by Emanuel Derman and Paul Wilmott for quants and financial engineers amidst the fallout of the subprime mortgage crisis; its lessons are very applicable to today’s AI engineers
The Modelers' Hippocratic Oath
- I will remember that I didn't make the world, and it doesn't satisfy my equations.
- Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.
- I will never sacrifice reality for elegance without explaining why I have done so.
- Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.
- I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.
Organizations and Conferences
- The AI Now Institute is working actively on AI ethics and has many great publications
- FAT ML and ACM FAT* are two of the main conferences in AI ethics - check out the conference websites for related publications
Datasets
Datasets for the more bias-aware; a short sketch after this list shows one simple way to audit representation in datasets like these.
- Gapminder’s Dollar Street images, which was used by DeVries et al. [2] in Does Object Recognition Work for Everyone? and comprises over 16,000 images from 60 different countries across 138 categories - a downloadable set can be found via my GitHub repository
- Google’s Open Images Extended - Crowdsourced. Google has also provided some notes on possible biases in this dataset, retrieved from the Kaggle FAQ:
While we have targeted specific geographical locations in the collection of the Challenge Stage 1 dataset, it does have some particular areas of over and under representation that we found in preliminary analysis and wish to describe briefly here. These include:
- Images of people tend to under-represent people who appear to be elderly.
- Images tagged Child tend to be seen mostly in the context of play.
- Some Person-related categories, including Bartender, Police Officer, and several sports related tags, appear to be predominantly (but by no means entirely) male.
- Some Person-related categories, including Teacher, appear to be predominantly (but by no means entirely) female.
- Images with people seem to be taken predominantly in urban rather than rural areas.
- Images of people in traditional locale-specific dress, such as saris in India, are relatively under-represented in this Challenge Stage 1 data set.
- In images tagged Wedding, there does not appear to be representation of same-sex marriages.
- Joy Buolamwini’s Gender Shades dataset [1] can be requested here
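As promised above, here is a minimal sketch of one way to audit representation in a dataset like these, for example counting images per country in the Dollar Street collection. It assumes a hypothetical metadata CSV with image_id and country columns; the real metadata schemas for these datasets will differ, so treat this only as an illustration of the idea.

```python
# Minimal sketch (not any dataset's official tooling): count images per country
# from a hypothetical metadata CSV with "image_id" and "country" columns.
import csv
from collections import Counter

def country_counts(metadata_csv):
    """Return a Counter mapping country name to number of images."""
    counts = Counter()
    with open(metadata_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row["country"].strip()] += 1
    return counts

if __name__ == "__main__":
    # "dollar_street_metadata.csv" is a placeholder filename, not the real one.
    counts = country_counts("dollar_street_metadata.csv")
    total = sum(counts.values())
    for country, n in counts.most_common():
        print(f"{country:20s} {n:6d}  ({100 * n / total:.1f}%)")
```

Sorting by count makes over- and under-represented countries immediately visible; the same pattern works for any categorical attribute (region, income bracket, category label) recorded in a dataset’s metadata.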
Tools
Tools for diagnosing and mitigating algorithmic bias, complete with detailed tutorials.
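To give a flavor of the diagnostics such tools provide, here is a minimal sketch, not taken from any particular toolkit, of two common group fairness metrics: the demographic parity difference and the disparate impact ratio, computed from binary predictions and a binary protected attribute. All variable names and the toy data are illustrative assumptions.

```python
# Minimal sketch of two group fairness diagnostics over binary predictions.
import numpy as np

def selection_rate(y_pred, group_mask):
    """Fraction of positive predictions within one group."""
    return y_pred[group_mask].mean()

def demographic_parity_difference(y_pred, protected):
    """Difference in selection rates between group 0 and group 1."""
    return selection_rate(y_pred, protected == 0) - selection_rate(y_pred, protected == 1)

def disparate_impact_ratio(y_pred, protected):
    """Ratio of group 1's selection rate to group 0's (1.0 means parity)."""
    return selection_rate(y_pred, protected == 1) / selection_rate(y_pred, protected == 0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    protected = rng.integers(0, 2, size=1000)  # 0 = privileged, 1 = unprivileged (toy data)
    # A deliberately biased toy classifier: selects group 1 less often than group 0.
    y_pred = (rng.random(1000) < 0.5 - 0.1 * protected).astype(int)
    print("Demographic parity difference:", demographic_parity_difference(y_pred, protected))
    print("Disparate impact ratio:", disparate_impact_ratio(y_pred, protected))
```

Dedicated toolkits compute many more metrics (and mitigation algorithms) than this, but most of their diagnostics reduce to comparisons of per-group rates in exactly this spirit.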
Readings
Academic publications related to algorithmic bias that I found useful.
- Do Artifacts have Politics? (Winner, 1980) [4]
- Bias in Computer Systems (Friedman and Nissenbaum, 1996) [3]
- Technologies of Humility (Jasanoff, 2007) [5]
- Big Data’s Disparate Impact (Barocas and Selbst, 2016) [6]
- Inherent Trade-offs in the Fair Determination of Risk Scores (Kleinberg et al., 2016) [7]
- Interventions over Predictions: Reframing the Ethical Debate for Actuarial Risk Assessment (Barabas et al., 2017) [8]
- Fairness Definitions Explained (Verma and Rubin, 2018) [9]
- Fairness and Abstraction in Sociotechnical Systems (Selbst et al., 2019) [10]
- A Framework for Understanding Unintended Consequences of Machine Learning (Suresh and Guttag, 2019) [11]
References
[1] Buolamwini, J. and Gebru, T., 2018. Gender shades: Intersectional accuracy disparities in commercial gender classification. Conference on Fairness, Accountability and Transparency, pp. 77-91. [link]
[2] DeVries, T., Misra, I., Wang, C. and van der Maaten, L., 2019. Does Object Recognition Work for Everyone? arXiv preprint arXiv:1906.02659. [PDF]
[3] Friedman, B. and Nissenbaum, H., 1996. Bias in computer systems. ACM Transactions on Information Systems (TOIS), Vol 14(3), pp. 330-347. ACM. [link]
[4] Winner, L., 1980. Do artifacts have politics? Daedalus, pp. 121-136. JSTOR. [PDF]
[5] Jasanoff, S., 2007. Technologies of humility. Nature, Vol 450(7166), pp. 33. Nature Publishing Group. [PDF]
[6] Barocas, S. and Selbst, A.D., 2016. Big data's disparate impact. Calif. L. Rev., Vol 104, pp. 671. HeinOnline. [PDF]
[7] Kleinberg, J., Mullainathan, S. and Raghavan, M., 2016. Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807. [PDF]
[8] Barabas, C., Dinakar, K., Ito, J., Virza, M. and Zittrain, J., 2017. Interventions over predictions: Reframing the ethical debate for actuarial risk assessment. arXiv preprint arXiv:1712.08238. [PDF]
[9] Verma, S. and Rubin, J., 2018. Fairness definitions explained. 2018 IEEE/ACM International Workshop on Software Fairness (FairWare), pp. 1-7. [link]
[10] Selbst, A.D., Boyd, D., Friedler, S.A., Venkatasubramanian, S. and Vertesi, J., 2019. Fairness and abstraction in sociotechnical systems. Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 59-68. [PDF]
[11] Suresh, H. and Guttag, J.V., 2019. A Framework for Understanding Unintended Consequences of Machine Learning. arXiv preprint arXiv:1901.10002. [PDF]