Caroline Stedman's Profile
Activity Feed
Thanks for putting together this article, Amina. I enjoyed reading it!
Your comment about the potentially invasive and personal insights an employer could gain by linking health data with the other information it holds on its employees really struck me. While I believe that anonymizing the health data is key, I worry that simply removing names would not fully solve the issue. I keep coming back to the research of Latanya Sweeney, who showed that 87% of people in the United States are uniquely identified by the combination {date of birth, gender, 5-digit ZIP code}. Because employers hold a wealth of information on their employees outside of the health data, it would likely be fairly simple for them to re-identify employees even in the absence of a name. I would therefore hope that employers consider a more rigorous anonymization approach (perhaps Differential Privacy), even if it means slightly less precise analyses of the data.
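To make the concern concrete, here is a minimal Python sketch of both ideas. Everything in it is hypothetical: the records are made up, and the names `unique_fraction` and `noisy_count` are mine, not from any library. The first function measures how many name-free records are still pinned down by {date of birth, gender, ZIP}; the second shows one standard Differential Privacy building block, the Laplace mechanism, applied to a simple count (assuming each person contributes at most one record, so the sensitivity is 1).

```python
import random
from collections import Counter

# Hypothetical, made-up records with names already removed.
records = [
    {"dob": "1985-03-12", "gender": "F", "zip": "02139", "diagnosis": "asthma"},
    {"dob": "1985-03-12", "gender": "F", "zip": "02139", "diagnosis": "migraine"},
    {"dob": "1990-07-01", "gender": "M", "zip": "02139", "diagnosis": "diabetes"},
    {"dob": "1978-11-23", "gender": "F", "zip": "02141", "diagnosis": "hypertension"},
]

QUASI_IDENTIFIERS = ("dob", "gender", "zip")


def unique_fraction(rows, keys=QUASI_IDENTIFIERS):
    """Share of rows whose quasi-identifier combination appears exactly once."""
    if not rows:
        return 0.0
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if combos[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: add Laplace noise with scale sensitivity / epsilon.

    The difference of two independent exponential draws with the same rate
    follows a Laplace distribution centered at zero.
    """
    rate = epsilon / sensitivity
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


print(unique_fraction(records))  # 0.5: two of the four rows are unique

rng = random.Random(0)  # seeded only so the sketch is repeatable
asthma_cases = sum(1 for row in records if row["diagnosis"] == "asthma")
print(noisy_count(asthma_cases, epsilon=1.0, rng=rng))
```

Anyone who also knows an employee's birth date, gender, and ZIP code can link a unique row back to that person, which is exactly what an employer's HR records would allow. The noisy count trades a little precision for a formal privacy guarantee; a smaller `epsilon` means more noise and stronger protection.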
Thank you for laying out all sides of this issue, Paula.
On top of the fact that the COMPAS model was trained on data that is likely rife with historical bias, I also believe that a huge issue with this model is that it was deployed in a context in which there is no ground truth against which to validate it. The model is trying to predict recidivism, but there is no observed recidivism outcome for those who are detained. Therefore, how can we ever know the true relationship between a person's characteristics and their rate of recidivism? I find it very difficult to trust an algorithm that cannot truly be validated against any metric or ground truth.
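The validation gap can be sketched in a few lines of Python. The cases, scores, threshold, and function names below are entirely hypothetical and are not taken from COMPAS or any real dataset; the point is only that outcomes exist for released people and are missing for detained ones, so any accuracy figure describes only the released group.

```python
# Hypothetical cases: "reoffended" is None when the person was detained,
# because no outcome could be observed for them.
cases = [
    {"risk_score": 2, "detained": False, "reoffended": False},
    {"risk_score": 3, "detained": False, "reoffended": True},
    {"risk_score": 4, "detained": False, "reoffended": False},
    {"risk_score": 6, "detained": False, "reoffended": True},
    {"risk_score": 7, "detained": True, "reoffended": None},
    {"risk_score": 8, "detained": True, "reoffended": None},
    {"risk_score": 9, "detained": True, "reoffended": None},
]


def labeled_cases(rows):
    """Keep only the rows for which an outcome was actually observed."""
    return [row for row in rows if row["reoffended"] is not None]


def accuracy_on_labeled(rows, threshold):
    """Accuracy of 'score >= threshold predicts reoffense', labeled rows only."""
    labeled = labeled_cases(rows)
    if not labeled:
        return None
    correct = sum(
        1 for row in labeled
        if (row["risk_score"] >= threshold) == row["reoffended"]
    )
    return correct / len(labeled)


print(len(labeled_cases(cases)), "of", len(cases), "cases have an outcome")
print(accuracy_on_labeled(cases, threshold=5))  # 0.75, released cases only
```

In this toy data the three highest-scoring people have no label at all, so the 0.75 says nothing about whether the model is right about the people it rates as riskiest.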