Artificial Intelligence (AI) systems trained on biased data can reinforce and amplify that bias, so fairness must be addressed during training. Experts generally treat AI fairness as a dataset problem specific to each machine learning model. AI fairness is a newly recognized challenge, and the big cloud providers are in the process of developing and announcing tools to help address it.
In May 2018, Facebook announced that it was developing internal software tools to search for bias in training datasets. Since then, Amazon, Microsoft, Google and, most recently, IBM have all weighed in, with Microsoft, Google and IBM releasing open source tools to examine bias and fairness in trained models.
Here’s what these tools are designed to do, where they stand with respect to each other, and why IBM’s trust and transparency announcement is important.
The AI Fairness Challenge
The core challenge for AI is that deep learning models are “black boxes”. It is very difficult—and often simply not possible—for mere humans to understand how individual training data points influence each output classification (inference) decision. The term “opaque” is also used to describe this hidden classification behavior. It’s hard to trust a system when you can’t understand how it makes decisions.
In the machine learning developer community, the opposite of opaque is “transparent”. Transparent deep learning models would expose their classification process in an understandable fashion, but research into creating transparent models is still in its very early stages.
In January 2018, a large group of Chinese organizations contributed to the White Paper on Artificial Intelligence Standardization. The white paper acknowledges ethical issues in AI, without yet offering remedies, saying:
“We should also be wary of AI systems making ethically biased decisions. For example, if universities use machine learning algorithms to assess admissions, and the historical admissions data used for training (intentionally or not) reflect some bias from previous admission procedures (such as gender discrimination), then machine learning may exacerbate these biases during repeated calculations, creating a vicious cycle. If not corrected, biases will persist in society in this way.”
White paper contributors include: Alibaba Cloud (Aliyun), Baidu, China Telecom, Huawei, IBM (China), Intel (China), Tencent and many more. I believe these organizations are also working to address bias and discrimination in training AI systems, but none have publicly announced tools yet.
The State of AI Fairness
Facebook identified only one of its internal anti-bias software tools by name in its May 2018 announcement. “Fairness Flow” measures how a model interacts with specific groups of people. Facebook’s team worked with several schools and institutes to develop its tool. Facebook has not yet publicly released its Fairness Flow tool.
AWS published a blog post in July 2018 that frames machine learning fairness in terms of accuracy and of false positive and false negative rates. But AWS has not yet released developer tools for assessing fairness in other aspects of model training.
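The error-rate framing is easy to make concrete: a model can look accurate overall while its false positive and false negative rates differ sharply between groups. Here is a minimal, self-contained sketch of that per-group comparison (the labels, predictions, and group names are invented for illustration, not taken from the AWS post):

```python
def group_rates(y_true, y_pred, groups):
    """Compute accuracy, false positive rate, and false negative
    rate separately for each group in `groups`."""
    stats = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        t = [y_true[i] for i in idx]
        p = [y_pred[i] for i in idx]
        tp = sum(1 for a, b in zip(t, p) if a == 1 and b == 1)
        tn = sum(1 for a, b in zip(t, p) if a == 0 and b == 0)
        fp = sum(1 for a, b in zip(t, p) if a == 0 and b == 1)
        fn = sum(1 for a, b in zip(t, p) if a == 1 and b == 0)
        stats[g] = {
            "accuracy": (tp + tn) / len(idx),
            "fpr": fp / (fp + tn) if (fp + tn) else 0.0,  # false positive rate
            "fnr": fn / (fn + tp) if (fn + tp) else 0.0,  # false negative rate
        }
    return stats

# Hypothetical labels and model predictions for two groups, A and B.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
rates = group_rates(y_true, y_pred, groups)
print(rates)
```

In this toy data, group B's false negative rate is zero while group A's is 0.5: a gap a single overall accuracy number would hide, which is exactly the kind of disparity these framings are meant to surface.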
Microsoft Research published a paper in July 2018 describing a fairness algorithm for binary classification systems, along with an open source Python library implementing the algorithm. Microsoft’s work incorporates pre-processing training data and post-processing model output predictions. However, it is not implemented as a high-level developer tool; it is for Python developers familiar with deep learning code.
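To see what post-processing a model's output predictions means in practice, consider this toy sketch: adjust each group's decision threshold so that both groups are selected at the same rate. This is an illustration of the general idea only, not Microsoft's actual algorithm or library API; the scores, groups, and target rate are all invented:

```python
def equalize_selection_rate(scores, groups, target_rate):
    """Pick a per-group score threshold so that each group's
    positive (selected) rate is approximately target_rate."""
    thresholds = {}
    for g in set(groups):
        g_scores = sorted(
            (s for s, grp in zip(scores, groups) if grp == g),
            reverse=True,
        )
        k = max(1, round(target_rate * len(g_scores)))
        thresholds[g] = g_scores[k - 1]  # select the top-k scorers in each group
    return thresholds

# Hypothetical model scores for two groups, A and B.
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.6, 0.5, 0.2]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
thr = equalize_selection_rate(scores, groups, target_rate=0.5)
preds = [int(s >= thr[g]) for s, g in zip(scores, groups)]
print(thr, preds)
```

Note that the two groups end up with different thresholds (0.8 for A, 0.6 for B): post-processing trades some raw-score consistency for equal treatment rates, which is the core tension fairness algorithms negotiate.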
In September 2018, Google’s People + AI Research (PAIR) initiative went one step further than simply providing a developer library, announcing its “What-If tool.” What-If enables developers to analyze input datasets and trained TensorFlow models visually and includes fairness assessments. Google’s What-If tool is now part of its open source TensorBoard web application.
A week after Google’s What-If announcement, IBM one-upped Google by announcing visual developer tools that work with any machine learning model. IBM’s branded AI OpenScale tools enable developers to analyze any machine learning model using any integrated development environment (IDE). IBM also open sourced its machine learning fairness tools as the AI Fairness 360 toolkit. IBM containerized its machine learning tool chain using Kubernetes orchestration and can run it in any public cloud (unsurprisingly, its AI OpenScale tutorials run in Watson Studio on IBM Cloud).
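One of the standard metrics included in toolkits like AI Fairness 360 is disparate impact: the ratio of favorable-outcome rates between an unprivileged and a privileged group, where ratios below the "four-fifths rule" threshold of 0.8 are commonly flagged. Here is a minimal pure-Python sketch of that metric (the outcome data is invented; this is not the AI Fairness 360 API itself):

```python
def disparate_impact(outcomes, groups, unprivileged, privileged):
    """Ratio of favorable-outcome rates: unprivileged group / privileged group.
    A value well below 1.0 suggests the unprivileged group receives
    favorable outcomes less often."""
    def rate(g):
        vals = [o for o, grp in zip(outcomes, groups) if grp == g]
        return sum(vals) / len(vals)
    return rate(unprivileged) / rate(privileged)

# Hypothetical loan decisions (1 = approved) for two groups.
outcomes = [1, 0, 0, 0, 1, 1, 1, 0]
groups   = ["U", "U", "U", "U", "P", "P", "P", "P"]
ratio = disparate_impact(outcomes, groups, "U", "P")
print(f"disparate impact: {ratio:.2f}")
```

In this toy data the ratio is 0.25/0.75, or about 0.33, far below the 0.8 rule of thumb, so a fairness audit would flag the model for review.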
Transparency and Open Source for AI Fairness
Ultimately, the best answer to addressing bias in trained machine learning models will be to build transparent models. But because we don’t know how to do that yet, today’s deep learning models are black boxes. Bias and fairness assessment tools therefore must examine each model’s input dataset and the output inference results. I believe more tools will follow this path.
For the time being, IBM’s open source AI fairness toolkits set a good example by working with any model type on any public cloud.
This article was written for Forbes.com; to view the original article, click here.