Cookies on this website

We use cookies to ensure that we give you the best experience on our website. If you click 'Accept all cookies' we'll assume that you are happy to receive all cookies and you won't see this message again. If you click 'Reject all non-essential cookies' only necessary cookies providing core functionality such as security, network management, and accessibility will be enabled. Click 'Find out more' for information on how to change your cookie settings.

It is often thought that traditional recidivism prediction tools used in criminal sentencing, though biased in many ways, can straightforwardly avoid one particularly pernicious type of bias: direct racial discrimination. They can avoid this by excluding race from the list of variables employed to predict recidivism. A similar approach could be taken to the design of newer, machine learning-based (ML) tools for predicting recidivism: information about race could be withheld from the ML tool during its training phase, ensuring that the resulting predictive model does not use race as an explicit predictor. However, if race is correlated with measured recidivism in the training data, the ML tool may ‘learn’ a perfect proxy for race. If such a proxy is found, the exclusion of race would do nothing to weaken the correlation between risk (mis)classifications and race. Is this a problem? We argue that, on some explanations of the wrongness of discrimination, it is. On these explanations, the use of an ML tool that perfectly proxies race would (likely) be more wrong than the use of a traditional tool that imperfectly proxies race. Indeed, on some views, use of a perfect proxy for race is plausibly as wrong as explicit racial profiling. We end by drawing out four implications of our arguments.


Working paper

Publication Date



discrimination, profiling, machine learning, algorithmic fairness, racial bias, practical ethics, crime prediction