By Stéphane Amarsy,
CEO of D-AIM
Artificial intelligence and its decision-making powers are notorious for increasing the risks of discriminatory practices that reproduce or even exacerbate societal biases.
However, it is particularly difficult to obtain proof of individual discrimination, unless we accept investigative procedures that are otherwise prohibited because they are deemed unfair, such as situational or discrimination testing. These tests place subjects in real-life contexts in order to reveal a discriminatory situation. They can involve sensitive data such as ethnic origin, disability, gender, sexual orientation, religion and even union membership. Although these tests do not respect the principle of fair evidence, they are by far the most effective way, if not the only way, to prove discrimination. The simplest case is to compare the behavior of a third party towards two people whose profiles are identical in every relevant characteristic except the one suspected of causing discrimination. Naturally, when discrimination arises indirectly from a combination of several variables, the results must be analyzed in all their complexity.
This method, used by many non-profit organizations, was recognized by the French courts following a ruling by the French Supreme Court in June 2002 in a case of racial discrimination at Pym’s nightclub in Tours. Although the practice is considered unfair, it cannot be ruled out as a means of gathering evidence on that ground alone. One of the best-known uses of this method is the resume test. This involves responding to job offers with two applications whose resumes are identical except for name, gender or age, any of which can be a potential trigger for discrimination. For the test to be valid, one resume and application must be genuine; only the paired comparison resume is modified.
It is equally possible to test algorithms in this manner, and I would strongly recommend it, to assess systematically whether they discriminate. While this approach isn’t technically challenging, it does have to be set up properly. A modified test sample is built simply by swapping the values of a potentially discriminatory variable such as gender or ethnicity. The algorithm is then re-applied to this modified test sample. Discrimination shows up in the individuals for whom the decision has changed, whether positively or negatively. Any discrimination revealed this way reflects what the algorithm learned from the possible biases in its sample, and it can have legal consequences for the organization using the algorithm.
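The swap test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Applicant` record, the deliberately biased `toy_decision` rule and the helper names are all hypothetical, standing in for whatever algorithm and data are actually being audited.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Applicant:
    gender: str            # the potentially discriminatory variable
    years_experience: int  # a legitimate characteristic


def toy_decision(a: Applicant) -> bool:
    """Hypothetical stand-in for the algorithm under test (biased on purpose)."""
    threshold = 3 if a.gender == "M" else 5
    return a.years_experience >= threshold


def flip_gender(a: Applicant) -> Applicant:
    """Return a copy of the applicant with only the gender value swapped."""
    return replace(a, gender="F" if a.gender == "M" else "M")


def flip_test(decide, sample):
    """List (applicant, decision before, decision after) for every changed decision."""
    changed = []
    for a in sample:
        before = decide(a)
        after = decide(flip_gender(a))
        if before != after:
            changed.append((a, before, after))
    return changed


sample = [Applicant("M", 2), Applicant("M", 4), Applicant("F", 4), Applicant("F", 6)]
changed = flip_test(toy_decision, sample)
for applicant, before, after in changed:
    print(applicant, "decision", before, "->", after)
```

With this toy rule, the two applicants with four years of experience see their decision change when gender is swapped, one negatively and one positively; the other two are unaffected.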
This is, of course, only valid if the potentially discriminatory variable is known, which is not always the case. Removing this variable from a model doesn’t necessarily lead to a fair decision; the algorithm may still recover the same information by combining other variables, which makes the bias much harder to identify.
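A minimal sketch of this proxy effect, assuming invented toy data in which a hypothetical `part_time` flag happens to correlate with gender. The decision rule never reads gender, so swapping it changes nothing, yet acceptance rates still differ between groups.

```python
# Each record is (gender, part_time). The data is invented for illustration.
records = [
    ("F", True), ("F", True), ("F", False),
    ("M", False), ("M", False), ("M", True),
]


def decide(gender: str, part_time: bool) -> bool:
    """Hypothetical rule: gender is deliberately ignored, only the proxy is used."""
    return not part_time


def swap(gender: str) -> str:
    return "M" if gender == "F" else "F"


def acceptance_rate(group: str) -> float:
    decisions = [decide(g, pt) for g, pt in records if g == group]
    return sum(decisions) / len(decisions)


# Swapping gender alone changes no decision, because the rule never reads it.
flipped_changes = sum(decide(g, pt) != decide(swap(g), pt) for g, pt in records)

rate_f = acceptance_rate("F")
rate_m = acceptance_rate("M")
print(flipped_changes, round(rate_f, 2), round(rate_m, 2))
```

Here the swap test reports no changed decisions, while women are accepted half as often as men. This is why comparing outcome rates across groups is a useful complement to the swap test.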
What’s more, the law requires proof of discriminatory intent. Algorithmic discrimination is not always intentional, because an algorithm only uses the data available to it.
Use data to create fair algorithms
Algorithmic bias can be detected by several factual methods, even if some subjectivity remains when interpreting their results. Each of them measures the level of dependence between the potentially discriminatory variable and the decision learned by the algorithm. The stronger the link, the stronger the discrimination effect will be. This bias can be mitigated either by changing the decision rule or by changing the learning sample. Modifying the rule means ensuring that the algorithm does not learn this link too strongly; instead, the absence of a link between the prediction and the potentially discriminatory variable is favored. Modifying the sample, on the other hand, means favoring independence between the data and the problematic variable, so that any algorithm learning from the data cannot reproduce a bias with respect to that variable. Understandably, this leads to a loss of information that reduces the model’s predictive power and therefore its quality. A balance must be struck between prediction error and the desired level of non-discrimination.
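Both ideas can be illustrated with a short Python sketch. It rests on two assumptions that do not come from the text above: dependence is measured here with a ratio of selection rates (often called disparate impact), and the sample is modified by reweighing, a known preprocessing technique that gives each record the weight P(group) x P(label) / P(group, label). The data and function names are hypothetical.

```python
from collections import Counter


def selection_rates(groups, decisions):
    """Share of positive decisions within each group."""
    totals, positives = Counter(), Counter()
    for g, d in zip(groups, decisions):
        totals[g] += 1
        positives[g] += int(d)
    return {g: positives[g] / totals[g] for g in totals}


def disparate_impact(groups, decisions, protected, reference):
    """Ratio of selection rates; 1.0 means no measured dependence.
    Assumes the reference group has at least one positive decision."""
    rates = selection_rates(groups, decisions)
    return rates[protected] / rates[reference]


def reweighing_weights(groups, labels):
    """Weight per record that makes group and label independent in the weighted sample."""
    n = len(groups)
    count_g, count_y = Counter(groups), Counter(labels)
    count_gy = Counter(zip(groups, labels))
    return [count_g[g] * count_y[y] / (n * count_gy[(g, y)])
            for g, y in zip(groups, labels)]


def weighted_rate(groups, labels, weights, group):
    """Weighted share of positive labels within one group."""
    num = sum(w * y for g, y, w in zip(groups, labels, weights) if g == group)
    den = sum(w for g, w in zip(groups, weights) if g == group)
    return num / den


groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
labels = [1, 0, 0, 0, 1, 1, 1, 0]  # invented historical decisions

ratio = disparate_impact(groups, labels, protected="F", reference="M")
weights = reweighing_weights(groups, labels)
rate_f = weighted_rate(groups, labels, weights, "F")
rate_m = weighted_rate(groups, labels, weights, "M")
print(round(ratio, 2), round(rate_f, 2), round(rate_m, 2))
```

In this toy sample, women are selected a third as often as men before reweighing, and both groups have the same weighted selection rate afterwards. The cost mentioned above still applies: a model trained on the reweighted sample sees a distorted picture of the original data, so its predictive quality has to be checked again.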