Our past Technical Chair discussed Florian Tramèr’s upcoming talk, Discovering Unwarranted Associations in Data-Driven Applications with the FairTest Testing Toolkit, at MLconf Seattle, scheduled for May 20th.
Will machine learning fix machine learning? (I mean the ethical side)
FT) Can a machine learn to be fair or just? Answering this question seems to first require an apt and complete definition of what constitutes unethical behavior. For instance, given two models, by what metrics would we compare their “ethicality”? Another challenge is to agree upon what constitutes the ground truth for fair behavior. Researchers from the machine learning and data mining communities have started looking into these questions, but formally defining fairness has proven to be a very difficult task, given the notion’s contextual (and even cultural, see below) aspects.
With FairTest we have set out to first tackle the task of efficiently and systematically detecting spurious inferences made by machine learning algorithms. Our goal is to help point developers to underlying fairness issues, and hopefully provide preliminary paths to fixing them. I am confident that further progress will be made by the community in understanding ethical issues related to machine learning, as well as in fixing or regulating these issues.
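To make the idea of comparing two models’ “ethicality” concrete, here is a minimal illustrative sketch using one simple candidate metric: the demographic-parity gap, i.e. the difference in favourable-outcome rates between groups defined by a protected attribute. This is not FairTest’s actual methodology (FairTest uses richer association measures over subpopulations); the function name and toy data are invented for the example.

```python
# Hedged sketch: compare two hypothetical models by their demographic-parity
# gap. A gap of 0 means both groups receive favourable outcomes at the same
# rate on this data; larger gaps indicate disparity on this (single) metric.

def parity_gap(outcomes, groups):
    """Absolute difference in favourable-outcome rates between two groups.

    outcomes: list of 0/1 decisions produced by a model
    groups:   parallel list of group labels from a protected attribute
    """
    rates = {}
    for g in set(groups):
        members = [o for o, grp in zip(outcomes, groups) if grp == g]
        rates[g] = sum(members) / len(members)
    a, b = rates.values()  # assumes exactly two groups in this toy example
    return abs(a - b)

# Toy data: 1 = favourable decision; first four people in group "A".
groups  = ["A", "A", "A", "A", "B", "B", "B", "B"]
model_1 = [1,   1,   1,   0,   1,   0,   0,   0]   # favours group A
model_2 = [1,   0,   1,   0,   0,   1,   1,   0]   # balanced on this metric

print(parity_gap(model_1, groups))  # 0.5 -> strong disparity
print(parity_gap(model_2, groups))  # 0.0 -> parity (on this metric alone)
```

Note that a single metric like this can be gamed or can conflict with other fairness notions, which is exactly why agreeing on a definition is hard.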
Ethics is something cultural: actions that are unethical in some cultures are ethical in others. So what is the culture of an algorithm? Is it the culture of the author? Most of the recent fallouts did not really reflect the programmers’ views; the algorithms acted spontaneously. Are algorithms allowed to have their own culture?
FT) An algorithm can definitely be influenced by the culture of its author, or the cultural setting underlying its training examples. For instance, a model trained over data mirroring historical social biases (e.g. on race or gender) will perceive such biases as ground truth and likely perpetuate them. The problem is in controlling which aspects of our culture (e.g. its ethics rather than its biases) end up “absorbed” by the algorithm. On a different note, algorithms can also bring us to rethink some of our cultural norms. This is being evidenced, for instance, by debates around privacy, a notion that is being significantly reshaped by the advent of the digital world.
Expert psychologists often resort to manipulative techniques in order to target vulnerable social groups such as kids in order to push them to spend more. Yes, I am a parent and I have felt that a lot of ads try to do that. Why is it fair for a group of humans to do this and not for a group of algorithms?
FT) In principle, I don’t believe there should be a distinction between the two. Something that is deemed unfair for an algorithm should be deemed unfair for a human being, and vice versa. On a side note, in my opinion, some forms of advertising to children (e.g. for junk food) are unethical. However, one factor that absolutely should be taken into account here is scale.
Algorithmic decision making (including targeted advertising) is predicted to have an ever-growing impact on our everyday lives, implying that even minor unfair effects will have the potential to cause tremendous harm. Quoting Stephen Hawking, “the trend seems to be toward […] technology driving ever-increasing inequality”. In particular, the impact of algorithms on the employment market (e.g. hiring processes, automation, etc.) will likely be tremendous.
This will require us to rethink some of our notions and regulations around fairness and ethicality, which were not necessarily defined with internet-level scale in mind. This nuance in scale can perhaps be better understood with an analogy to digital privacy. Compare innocently reading a text over someone’s shoulder to reading everyone’s texts by means of a mass surveillance system. The difference in scale introduced by algorithms may bring a vastly different meaning to unfairness, in the same way that it completely changed our notion of privacy invasion.
When does statistical inference become unethical?
FT) A fundamental tension is that machine learning is, by definition, about discrimination (in the statistical sense): a model is built to learn to classify (i.e. discriminate between) new observations. So when is this discrimination not OK? A general concern is when a pattern or rule is generalized to a non-homogeneous group (perhaps for lack of sufficient data or features).
A prominent example of this phenomenon is the notion of redlining: a service being offered or denied based solely on geographical area, without consideration for relevant discrepancies inside these areas. From a statistical point of view, sensitive features such as race or gender (or features strongly correlated with these) end up being used as proxies for unmeasurable, yet actually relevant, quantities.
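The proxy effect described above can be illustrated with a small sketch: a decision rule that looks only at geographical area can still produce outcomes that split along racial lines when area and race are correlated. All data, labels, and the `zip_rule` function below are invented for illustration; this is not code from FairTest.

```python
# Hedged sketch of the redlining/proxy effect: the rule never sees race,
# yet approval rates differ by race because ZIP code correlates with it.

# Toy population: (zip_code, race, creditworthy)
population = [
    ("10001", "X", True),  ("10001", "X", True),  ("10001", "Y", True),
    ("20002", "Y", False), ("20002", "Y", True),  ("20002", "X", False),
]

def zip_rule(zip_code):
    """Approve solely by area -- no individual assessment at all."""
    return zip_code == "10001"

# Tally approvals by race, even though race is never an input to the rule.
approved = {"X": [], "Y": []}
for zip_code, race, _creditworthy in population:
    approved[race].append(zip_rule(zip_code))

rate_X = sum(approved["X"]) / len(approved["X"])  # 2/3 approved
rate_Y = sum(approved["Y"]) / len(approved["Y"])  # 1/3 approved
print(rate_X, rate_Y)
```

The rule also denies the creditworthy applicant in area 20002: the coarse geographical feature stands in for an unmeasured, actually relevant quantity, which is exactly the statistical shape of redlining.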
Companies are required by law to show that they take preventative measures with regard to ethics and compliance. Could the FairTest toolkit presented at this MLconf become the new ethics regulation for high tech? Or is sentiment analysis monitoring perhaps a better way?
FT) It seems somewhat early to answer such a question, as the foremost challenge to overcome remains awareness, in my opinion. Concerns about the ethics of machine learning are starting to be recognized (as evidenced by this event and others), and tools such as FairTest will help us better understand how prominent and widespread these issues currently are.
However, machine learning methods are increasingly being applied in extremely diverse settings, for which domain-specific regulations will probably be needed. It is thus likely that more than a single tool or method will be necessary to meet the broad requirements for ensuring the ethical use of machine learning.

Florian Tramèr, Researcher, EPFL
Florian Tramèr is a research assistant at EPFL in Switzerland, hosted by Prof. J-P. Hubaux. He received his Master’s in Computer Science from EPFL in 2015 and will be joining Stanford University as a PhD student this fall. During his Master’s thesis, Florian collaborated with researchers at EPFL, Columbia University, and Cornell Tech to design and implement FairTest, a testing toolkit to discover unfair and unwarranted behaviours in modern data-driven algorithms. Florian’s interests lie primarily in cryptography and security, with recent work devoted to the study of security and privacy in the areas of genomics, ride-hailing services, and Machine-Learning-as-a-Service platforms.