If you are interested in mitigation capabilities, we’re happy to have a conversation about your needs and which approaches would work best for you. Within the Warrior product, we offer postprocessing methods; we do encourage exploring alternate (pre- or in-processing) methods if your data science team has the bandwidth to do so.
We currently have one postprocessing method available for use in the product: the Threshold Mitigator.
This algorithm is an extension of the Threshold Optimizer implemented in the
fairlearn library, which in turn is based on Moritz Hardt’s 2016 paper introducing the method.
The intuition behind this algorithm is that if the underlying black-box model is biased, then different groups have different predicted probability distributions. Let’s call the original predicted probability for some example \(x\) the score \(s_x\). Then, for a fixed threshold \(t\), \(P(s_x > t \mid x \in A) > P(s_x > t \mid x \in B)\), where \(A\) indicates membership in the advantaged group and \(B\) membership in the disadvantaged group. For example, with the default threshold \(t = 0.5\), we predict \(\hat Y = 1\) if \(s_x > 0.5\).
However, we might be able to mitigate bias in the predictions by choosing group-specific thresholds for the decision rule.
So, this algorithm generally proceeds as follows:
Try every possible threshold \(t\) from 0 to 1, where for any \(x\), we predict \(\hat Y = 1\) if and only if \(s_x > t\).
At each of those thresholds, calculate the group-specific positivity, true positive, and false positive rates.
Let’s say we’re trying to satisfy demographic parity, so we want the group-wise positivity rates to be equal: \(P(\hat Y = 1 | A) = P(\hat Y = 1 | B)\). For any given positivity rate, the threshold \(t_a\) that achieves it for group \(A\) may differ from the threshold \(t_b\) that achieves it for group \(B\).
This also hints at why we need the notion of tradeoff curves, rather than a single static threshold: achieving a positivity rate of \(0.3\) requires different \(t_a, t_b\) than achieving a positivity rate of \(0.4\).
Then, once we’ve generated the curves, we pick a specific point on the curve, such as a positivity rate of \(0.3\), and use the corresponding \(t_a, t_b\) to make predictions on future data.
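The steps above can be sketched in a few lines of Python. This is an illustrative sketch only (assuming NumPy; the function names are ours, not the product’s or fairlearn’s API): for each group, we map every candidate threshold to the positivity rate it produces, then pick the per-group threshold closest to a target rate.

```python
import numpy as np

def positivity_rate_thresholds(scores, groups, n_thresholds=1000):
    """For each group, map each candidate threshold to the positivity
    rate it would produce. (Illustrative sketch, not the product code.)"""
    thresholds = np.linspace(0.0, 1.0, n_thresholds + 1)
    curves = {}
    for g in np.unique(groups):
        s = scores[groups == g]
        # Positivity rate at threshold t: fraction of this group's scores above t.
        curves[g] = {float(t): float(np.mean(s > t)) for t in thresholds}
    return curves

def thresholds_for_rate(curves, target_rate):
    """Pick, per group, the threshold whose positivity rate is closest to
    the target -- these thresholds generally differ across groups."""
    return {
        g: min(curve, key=lambda t: abs(curve[t] - target_rate))
        for g, curve in curves.items()
    }
```

If group \(A\)’s scores are systematically higher than group \(B\)’s, the returned \(t_a\) will sit above \(t_b\) for the same target positivity rate, which is exactly the group-specific decision rule described above.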
Fairness constraints: We can generate curves satisfying each of the following constraints (fairness metrics):
Demographic Parity (equal positivity rate)
Equal Opportunity (equal true positive rate)
Equalized Odds (equal TPR and FPR)
Sensitive attributes: This algorithm can handle sensitive attributes that take on any number of discrete values (i.e., it is not limited to binary sensitive attributes). For continuous sensitive attributes, we can specify buckets for the continuous values, then treat them like categorical attributes.
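For the continuous case, bucketing can be as simple as a `pandas.cut` call. A minimal sketch, assuming pandas and hypothetical bucket edges (the edges and labels here are illustrative, not defaults):

```python
import pandas as pd

# Hypothetical continuous sensitive attribute: age in years.
ages = pd.Series([22, 31, 35, 47, 58, 64])

# Bucket into discrete groups so age can be treated like a categorical
# sensitive attribute; right=False makes bins left-inclusive ([0, 35), [35, 120)).
age_groups = pd.cut(ages, bins=[0, 35, 120], labels=["<35", ">=35"], right=False)
```

The resulting categorical column can then be passed to the mitigator like any other discrete sensitive attribute.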
The tradeoff curve
Generating the curve
Each group has its own curve.
To generate a single curve, we generate all possible thresholds (i.e.
[0, 0.001, 0.002, 0.003, ..., 0.999, 1.00]; 1,000 total thresholds by default). As described above, for each of those thresholds we calculate many common fairness metrics for the result if we were to use that threshold to determine predictions.
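The per-threshold sweep for a single group might look like the following sketch (assuming NumPy; function and field names are illustrative, and the product computes more metrics than shown here):

```python
import numpy as np

def metrics_at_thresholds(scores, labels, n_thresholds=1000):
    """At each candidate threshold, compute positivity rate, TPR, FPR,
    and accuracy for one group's scores and binary labels."""
    thresholds = np.linspace(0.0, 1.0, n_thresholds + 1)
    pos = labels == 1   # actual positives
    neg = ~pos          # actual negatives
    rows = []
    for t in thresholds:
        pred = scores > t
        rows.append({
            "threshold": float(t),
            "positivity_rate": float(pred.mean()),
            "tpr": float(pred[pos].mean()) if pos.any() else float("nan"),
            "fpr": float(pred[neg].mean()) if neg.any() else float("nan"),
            "accuracy": float((pred == pos).mean()),
        })
    return rows
```

Running this once per group yields the group-specific metric tables from which the tradeoff curves are drawn.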
Visualizing the curve
Depending on the fairness constraint we’re trying to satisfy, the tradeoff curve will look different. For equal opportunity and demographic parity, where there is a single quantity that must be equalized across the two groups, the tradeoff curve plots the quantity to equalize on the x-axis, and accuracy on the y-axis.
Understanding the curve artifact
The curve artifact can be pulled down according to our docs here. This mitigation approach is somewhat constrained in the sense that we generate a separate set of curves for each sensitive attribute to mitigate on and for each constraint to follow. Each set of curves then contains a single curve for each sensitive attribute value.
Conceptually, the curves are organized something like the below; in the API, however, you’ll be getting a single
< list of (x, y) coordinate pairs that define the curve > at a time, along with additional information about the attribute being mitigated, the constraint being targeted, and the feature value the specific list of pairs is for.
**mitigating on gender -- demographic parity**

- gender == male: < list of (x, y) coordinate pairs that define the curve >
- gender == female: < list of (x, y) coordinate pairs that define the curve >
- gender == nb: < list of (x, y) coordinate pairs that define the curve >

**mitigating on gender -- equalized odds**

- gender == male: < list of (x, y) coordinate pairs that define the curve >
- gender == female: < list of (x, y) coordinate pairs that define the curve >
- gender == nb: < list of (x, y) coordinate pairs that define the curve >

**mitigating on age -- demographic parity**

- age < 35: < list of (x, y) coordinate pairs that define the curve >
- age >= 35: < list of (x, y) coordinate pairs that define the curve >
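To make the shape concrete, a single curve record pulled from the API might look roughly like the dictionary below. Every field name and value here is hypothetical, chosen for illustration; consult the docs linked above for the actual schema.

```python
# Hypothetical shape of one curve record (field names are illustrative,
# not the actual API schema): one (attribute, constraint, value) triple
# plus the (x, y) pairs that define that group's curve.
curve_record = {
    "attribute": "gender",               # sensitive attribute being mitigated
    "constraint": "demographic_parity",  # fairness constraint being targeted
    "attribute_value": "female",         # group this particular curve is for
    "curve": [(0.0, 0.51), (0.1, 0.58), (0.2, 0.63)],  # (x, y) pairs
}
```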
Summary of curves by constraint
Equal opportunity (equal TPR): TPR vs accuracy; accuracy-maximizing solution is the point along the x-axis with the highest accuracy (y-axis) for both groups.
Demographic parity (equal selection rates): selection rate vs accuracy; accuracy-maximizing solution is the point along the x-axis with the highest accuracy (y-axis) for both groups.
Equalized odds (equal TPR and FPR): FPR vs TPR (canonical ROC curve); accuracy-maximizing solution is the point on the curve that is closest to the top-left corner of the graph (i.e. low FPR and high TPR).
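For the equalized-odds case, "closest to the top-left corner" translates directly into a distance computation. A minimal sketch, assuming NumPy and a curve given as (FPR, TPR) pairs (the function name is ours, not the product’s):

```python
import numpy as np

def closest_to_top_left(curve):
    """Given an ROC-style curve of (fpr, tpr) points, return the point
    nearest the top-left corner (fpr=0, tpr=1) by Euclidean distance."""
    pts = np.asarray(curve, dtype=float)
    dists = np.hypot(pts[:, 0] - 0.0, pts[:, 1] - 1.0)
    i = int(np.argmin(dists))
    return (float(pts[i, 0]), float(pts[i, 1]))
```

Other selection rules are possible (e.g. maximizing accuracy directly); this one is simply the geometric reading of the criterion above.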
Choosing a set of thresholds (point on curve)
As mentioned above, a single “point” on the solution curve for a given constraint and sensitive attribute corresponds to a set of thresholds, one per sensitive attribute value. But how do we choose which point on the curve to use?
The default behavior (which is also the default in fairlearn) is to automatically choose the accuracy-maximizing thresholds subject to a particular fairness constraint. This works well in many cases; however, accuracy is not necessarily the only benchmark an end-user cares about. One client, for example, needs to satisfy a particular positivity rate, and is fine with sacrificing accuracy to do so. What our feature introduces beyond what’s available in fairlearn is the ability to try different thresholds, see what the hypothetical predictions would be, and see what the hypothetical results/metrics (e.g. positivity rate, TPR, etc.) would look like if a certain set of thresholds were applied. The ultimate choice of thresholds is up to the data scientist’s needs and constraints.
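That what-if exploration amounts to applying a candidate set of group-specific thresholds and recomputing the metrics. A minimal sketch, assuming NumPy (function and field names are illustrative, not the product API):

```python
import numpy as np

def hypothetical_metrics(scores, labels, groups, thresholds):
    """Apply candidate group-specific thresholds and report what the
    positivity rate, TPR, and accuracy would be for each group."""
    report = {}
    for g, t in thresholds.items():
        mask = groups == g
        pred = scores[mask] > t          # hypothetical predictions for group g
        truth = labels[mask] == 1        # actual positives in group g
        report[g] = {
            "positivity_rate": float(pred.mean()),
            "tpr": float(pred[truth].mean()) if truth.any() else float("nan"),
            "accuracy": float((pred == truth).mean()),
        }
    return report
```

Comparing these reports across several candidate threshold sets is how a data scientist can weigh, say, a required positivity rate against the accuracy cost of meeting it.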