Richard M. Karp, Raymond E. Miller
Journal of Computer and System Sciences
Recent work shows unequal performance of commercial face classification services in the gender classification task across intersectional groups defined by skin type and gender. Accuracy on dark-skinned females is significantly worse than on any other group. In this paper, we conduct several analyses to try to uncover the reason for this gap. The main finding, perhaps surprisingly, is that skin type is not the driver. This conclusion is reached via stability experiments that vary an image's skin type via color-theoretic methods, namely luminance mode-shift and optimal transport. A second suspect, hair length, is also shown not to be the driver via experiments on face images cropped to exclude the hair. Finally, using contrastive post-hoc explanation techniques for neural networks, we bring forth evidence suggesting that differences in lip, eye, and cheek structure across ethnicities account for the accuracy gap. Further, lip and eye makeup are seen as strong predictors of a female face, which is a troubling propagation of a gender stereotype.
Sankar Basu
Journal of the Franklin Institute
Salvatore Certo, Anh Pham, et al.
Quantum Machine Intelligence
Hannaneh Hajishirzi, Julia Hockenmaier, et al.
UAI 2011