In machine learning, understanding why a model reaches a certain conclusion is often just as important as whether that conclusion is correct. For example, a model may correctly predict that a skin lesion is malignant, but it may have done so based on an unrelated blip in a clinical photo.
While tools exist to help experts make sense of a model’s reasoning, these methods typically provide insight into only one decision at a time, and each must be analysed manually. Models are commonly trained on millions of data inputs, making it nearly impossible for a human to review enough decisions to spot patterns.
Now, researchers at MIT and IBM Research have developed a method that lets users aggregate, sort, and rank these individual explanations to rapidly analyse a machine-learning model’s behaviour. Their technique, called Shared Interest, uses quantifiable metrics to assess how well a model’s reasoning matches that of a human.
Shared Interest could help a user easily spot concerning trends in a model’s decision-making: for example, the model may often be confused by distracting, irrelevant features such as background objects in images. By aggregating these insights, the user can quickly and quantitatively determine whether a model is trustworthy and ready to be deployed in a real-world setting.
“In developing Shared Interest, our goal is to be able to scale up this analysis process so that you could understand on a more global level what your model’s behavior is,” says lead author Angie Boggust, a graduate student in the Visualization Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL).
Boggust collaborated on the study with her advisor, Arvind Satyanarayan, an assistant professor of computer science who leads the Visualization Group, along with Benjamin Hoover and senior author Hendrik Strobelt, both of IBM Research. The work will be presented at the Conference on Human Factors in Computing Systems.
Boggust began working on this project during a summer internship at IBM under Strobelt’s supervision. After returning to MIT, Boggust and Satyanarayan expanded the project and continued collaborating with Strobelt and Hoover, who helped deploy the case studies showing how the technique can be used in practice.
Shared Interest builds on saliency methods, popular techniques that show how a machine-learning model made a particular decision. If the model is classifying images, saliency methods highlight the areas of an image that were important to the model when it made its decision. These areas are visualised as a saliency map, a type of heatmap that is typically overlaid on the original image. If the model classified the image as containing a dog and the dog’s head is highlighted, those pixels were important to the model when it decided the image contained a dog.
Shared Interest works by comparing saliency results to ground-truth data. In an image dataset, ground-truth data are typically human-generated annotations that surround the relevant parts of each image. In the previous example, the box would surround the entire dog. When evaluating an image-classification model, Shared Interest compares the model-generated saliency data and the human-generated ground-truth data for the same image to see how well they align.
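At its core, this comparison reduces to overlap scores between two binary masks. The sketch below is illustrative rather than the paper’s implementation: it assumes the saliency map and the annotation have already been thresholded into boolean masks, and the metric names are our own.

```python
import numpy as np

def alignment_metrics(saliency_mask, truth_mask):
    """Compare a binary saliency mask with a binary ground-truth mask.

    Both arguments are boolean NumPy arrays of the same shape, where
    True marks pixels the model (or the human annotator) considered
    relevant. Returns three overlap scores in [0, 1].
    """
    saliency = saliency_mask.astype(bool)
    truth = truth_mask.astype(bool)
    intersection = np.logical_and(saliency, truth).sum()
    union = np.logical_or(saliency, truth).sum()
    return {
        # Overall agreement between the two regions.
        "iou": intersection / union if union else 0.0,
        # How much of the human-annotated region the model used.
        "truth_coverage": intersection / truth.sum() if truth.sum() else 0.0,
        # How much of the model's region lies inside the annotation.
        "saliency_coverage": intersection / saliency.sum() if saliency.sum() else 0.0,
    }
```

Scores near 1 indicate the model and the human focused on the same region; a truth coverage of 0 means the model’s saliency fell entirely outside the annotation.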
The method quantifies the alignment (or misalignment) using several metrics, then sorts each decision into one of eight categories. The categories range from fully human-aligned (the model makes a correct prediction, and the highlighted area in the saliency map is identical to the human-generated box) to completely distracted (the model makes an incorrect prediction and does not use any image features found in the human-generated box).
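One simplified way to picture the sorting step is to combine the prediction’s correctness with the overlap scores. The thresholds and labels below are invented stand-ins, not the paper’s exact eight categories.

```python
def categorize(correct, truth_coverage, saliency_coverage):
    """Sort one decision into a coarse alignment category.

    `correct` is whether the prediction matched the label; the
    coverage arguments are overlap fractions in [0, 1] between the
    saliency region and the human annotation. The thresholds and
    names here are illustrative, not the paper's definitions.
    """
    if truth_coverage >= 0.9 and saliency_coverage >= 0.9:
        overlap = "aligned"      # saliency nearly matches the annotation
    elif truth_coverage == 0.0:
        overlap = "distracted"   # saliency falls entirely outside it
    else:
        overlap = "partial"      # some, but not full, overlap
    prefix = "correct" if correct else "incorrect"
    return f"{prefix}/{overlap}"
```

Applying a label like this to every decision in a dataset is what makes the aggregate sorting and ranking possible.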
“On one end of the spectrum, your model made the decision for the same reason that a person did, while on the other end of the spectrum, your model and the human are making this decision for completely different reasons. You can utilise that quantification to sort through all of the photographs in your dataset if you quantify it for all of them,” Boggust explains.
When working with text-based data, the method highlights essential words rather than image regions.
The researchers used three case studies to show how Shared Interest could be valuable to both nonexperts and machine-learning researchers.
In the first case study, they used Shared Interest to help a dermatologist determine whether to trust a machine-learning model designed to diagnose cancer from photos of skin lesions. Thanks to Shared Interest, the dermatologist could instantly see examples of the model’s correct and incorrect predictions. Ultimately, the dermatologist decided the model could not be trusted because it made too many predictions based on image artefacts rather than actual lesions.
“The value here is that using Shared Interest, we are able to see these patterns emerge in our model’s behavior. In about half an hour, the dermatologist was able to make a confident decision of whether or not to trust the model and whether or not to deploy it,” Boggust says.
In the second case study, they worked with a machine-learning researcher to show how Shared Interest can be used to evaluate a saliency method by uncovering previously unknown flaws. Their technique enabled the researcher to analyse thousands of correct and incorrect decisions in a fraction of the time required by typical manual methods.
In the third case study, they used Shared Interest to dig deeper into a specific image-classification example. By manipulating the image’s ground-truth area, they were able to conduct a what-if analysis to determine which image features were most important for particular predictions.
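One way such a what-if analysis could work (a toy sketch; the masks and sizes here are invented): hold the model’s saliency fixed, edit the ground-truth region, and recompute the overlap to see which part of the annotation the model actually relies on.

```python
import numpy as np

def truth_overlap(saliency_mask, truth_mask):
    """Fraction of the ground-truth region covered by the saliency."""
    inter = np.logical_and(saliency_mask, truth_mask).sum()
    total = truth_mask.sum()
    return inter / total if total else 0.0

# The model's saliency for one prediction stays fixed.
saliency = np.zeros((4, 4), dtype=bool)
saliency[0:2, 0:2] = True

# Original annotation: the whole object.
truth = np.zeros((4, 4), dtype=bool)
truth[0:3, 0:3] = True
before = truth_overlap(saliency, truth)      # partial coverage (4/9)

# What if only the top-left corner of the object is marked relevant?
truth_edit = np.zeros((4, 4), dtype=bool)
truth_edit[0:2, 0:2] = True
after = truth_overlap(saliency, truth_edit)  # full coverage (1.0)
```

Comparing the scores before and after the edit shows which subregion of the annotation the model’s saliency actually covers.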
While the researchers were impressed by how well Shared Interest performed in these case studies, Boggust cautions that the technique is only as good as the saliency methods it is built upon. If those methods contain bias or are inaccurate, Shared Interest will inherit those limitations.
In the future, the researchers hope to apply Shared Interest to other types of data, including tabular data used in medical records. They also want to use Shared Interest to help improve current saliency methods. Boggust hopes this study will spur more research into quantifying machine-learning model behaviour in ways that are understandable to humans.
The MIT-IBM Watson AI Lab, the United States Air Force Research Laboratory, and the United States Air Force Artificial Intelligence Accelerator have all contributed to this research.