Model Evaluation Functions¶
Metrics derived from binary confusion matrices:
TPR, FPR, TNR, FNR
sensitivity, specificity, precision, recall, f1
accuracy rate, balanced accuracy rate
Regression¶
All regression evaluation metrics follow the request body structure below.
Query Request:
{
"select": [
{
"function": "[rmse|mae|rSquared]",
"alias": "<alias_name> [optional string]",
"parameters": {
"ground_truth_property": "<attribute_name> [string]",
"predicted_property": "<attribute_name> [string]"
}
}
]
}
Query Response:
{
"query_result": [
{
"<function_name/alias_name>": "<evaluation_value> [float]"
}
]
}
RMSE¶
Get the RMSE between a prediction attribute and a ground truth attribute.
Sample Request:
{
"select": [
{
"function": "rmse",
"alias": "error",
"parameters": {
"ground_truth_property": "FICO_actual",
"predicted_property": "FICO_predicted"
}
}
]
}
Sample Response:
{
"query_result": [
{
"error": 0.76
}
]
}
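For reference, RMSE is the square root of the mean squared difference between the ground truth and predicted values. A minimal NumPy sketch of the computation (an illustration, not the service's implementation):
import numpy as np

def rmse(ground_truth, predicted):
    # Root mean squared error: sqrt(mean((gt - pred)^2))
    gt = np.asarray(ground_truth, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((gt - pred) ** 2)))

# rmse([600, 650, 700], [590, 660, 705]) -> 8.66...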
MAE¶
Get the Mean Absolute Error between a prediction attribute and a ground truth attribute.
This function takes an optional parameter aggregation that allows swapping the aggregation from "avg" to either "min" or "max". This can be helpful if you're looking for extremes, i.e. the lowest or highest absolute error, respectively. Additionally, this function supports the optional parameters normalizationMin and normalizationMax, which accept numbers; if both are provided, min/max normalization is applied to the values before aggregation.
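A rough sketch of the normalization-then-aggregation behavior described above (a hypothetical re-implementation for intuition; the service's parameter handling may differ):
import numpy as np

def mae(ground_truth, predicted, aggregation="avg",
        normalization_min=None, normalization_max=None):
    # Absolute errors, optionally min/max-normalized before aggregation.
    gt = np.asarray(ground_truth, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    if normalization_min is not None and normalization_max is not None:
        span = normalization_max - normalization_min
        gt = (gt - normalization_min) / span
        pred = (pred - normalization_min) / span
    errors = np.abs(gt - pred)
    return float({"avg": np.mean, "min": np.min, "max": np.max}[aggregation](errors))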
Query Request:
{
"select": [
{
"function": "mae",
"alias": "<alias_name> [optional string]",
"parameters": {
"predicted_property": "<predicted_property_name> [string]",
"ground_truth_property": "<ground_truth_property_name> [string]",
"aggregation": "[avg|min|max] (default avg, optional)",
"normalizationMin": "<value> [optional number]",
"normalizationMax": "<value> [optional number]"
}
}
]
}
Sample Request:
{
"select": [
{
"function": "mae",
"alias": "error",
"parameters": {
"ground_truth_property": "FICO_actual",
"predicted_property": "FICO_predicted"
}
}
]
}
Sample Response:
{
"query_result": [
{
"error": 0.76
}
]
}
R Squared¶
Get the R Squared value between a prediction attribute and a ground truth attribute.
Sample Request:
{
"select": [
{
"function": "rSquared",
"alias": "rsq",
"parameters": {
"ground_truth_property": "FICO_actual",
"predicted_property": "FICO_predicted"
}
}
]
}
Sample Response:
{
"query_result": [
{
"rsq": 0.94
}
]
}
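For reference, R Squared is the coefficient of determination, 1 - SS_res / SS_tot. A minimal sketch:
import numpy as np

def r_squared(ground_truth, predicted):
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    gt = np.asarray(ground_truth, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    ss_res = np.sum((gt - pred) ** 2)
    ss_tot = np.sum((gt - gt.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)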
Binary Classification¶
Confusion Matrix¶
Calculates the confusion matrix for a classification model. For binary classifiers, users must specify a probability threshold above which a prediction is counted as the positive class.
Query Request:
{
"select": [
{
"function": "confusionMatrix",
"alias": "<alias_name> [optional string]",
"parameters": {
"threshold": "<value [float]> [required only for binary classifiers]"
}
}
]
}
Query Response:
{
"query_result": [
{
"<function_name/alias_name>": {
"true_positive": "<count> [int]",
"false_positive": "<count> [int]",
"true_negative": "<count> [int]",
"false_negative": "<count> [int]"
}
}
]
}
Sample Request: Calculate the confusion matrix for a binary classifier with a threshold of 0.5 (the conventional default).
{
"select": [
{
"function": "confusionMatrix",
"parameters": {
"threshold": 0.5
}
}
]
}
Sample Response:
{
"query_result": [
{
"confusionMatrix": {
"true_positive": 100480,
"false_positive": 100076,
"true_negative": 100302,
"false_negative": 99142
}
}
]
}
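For intuition, the counts above come from thresholding each predicted probability and comparing the result to the ground truth label. A minimal sketch (whether the service compares with >= or > is an assumption here):
def confusion_matrix(probabilities, labels, threshold):
    # labels are 0/1 ground truth values; probabilities are model scores.
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    for prob, label in zip(probabilities, labels):
        predicted = 1 if prob >= threshold else 0  # assumed comparison
        if predicted and label:
            counts["true_positive"] += 1
        elif predicted:
            counts["false_positive"] += 1
        elif label:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts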
Confusion Matrix Rate¶
Calculates the confusion matrix rates for a classification model. For binary classifiers, users must specify a probability threshold above which a prediction is counted as the positive class.
Query Request:
{
"select": [
{
"function": "confusionMatrixRate",
"alias": "<alias_name> [optional string]",
"parameters": {
"threshold": "<value [float]> [required only for binary classifiers]"
}
}
]
}
Query Response:
{
"query_result": [
{
"<function_name/alias_name>": {
"true_positive_rate": "<rate> [float]",
"false_positive_rate": "<rate> [float]",
"true_negative_rate": "<rate> [float]",
"false_negative_rate": "<rate> [float]",
"accuracy_rate": "<rate> [float]"
}
}
]
}
Sample Request: Calculate the confusion matrix rates for a binary classifier with a threshold of 0.5 (the conventional default).
{
"select": [
{
"function": "confusionMatrixRate",
"parameters": {
"threshold": 0.5
}
}
]
}
Response:
{
"query_result": [
{
"confusionMatrixRate": {
"true_positive_rate": 0.5033513340213003,
"false_positive_rate": 0.49943606583557076,
"true_negative_rate": 0.5005639341644292,
"false_negative_rate": 0.4966486659786997
}
}
]
}
Confusion Matrix Variants¶
If you only want a specific metric derived from a confusion matrix, you can use one of the following functions:
truePositiveRate
falsePositiveRate
trueNegativeRate
falseNegativeRate
accuracyRate
balancedAccuracyRate
f1
sensitivity
specificity
precision
recall
For example, to return the truePositiveRate:
{
"select": [
{
"function": "truePositiveRate",
"parameters": {
"threshold": 0.5,
"ground_truth_property":"class_a",
"predicted_property":"ground_truth_a"
}
}
]
}
Response:
{
"query_result": [
{
"truePositiveRate": 0.5033513340213003
}
]
}
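These variants follow the standard definitions derived from the confusion matrix counts. For reference (an illustrative sketch, not the service's code):
def derived_metrics(tp, fp, tn, fn):
    # Standard metrics derived from confusion matrix counts.
    tpr = tp / (tp + fn)   # sensitivity / recall
    tnr = tn / (tn + fp)   # specificity
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    balanced_accuracy = (tpr + tnr) / 2
    f1 = 2 * precision * tpr / (precision + tpr)
    return {"truePositiveRate": tpr, "trueNegativeRate": tnr,
            "falsePositiveRate": fpr, "falseNegativeRate": fnr,
            "precision": precision, "recall": tpr,
            "accuracyRate": accuracy,
            "balancedAccuracyRate": balanced_accuracy, "f1": f1}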
AUC¶
The Area Under the ROC Curve can also be computed for binary classifiers.
Sample Query:
{
"select": [
{
"function": "auc",
"parameters": {
"ground_truth_property":"class_a",
"predicted_property":"ground_truth_a"
}
}
]
}
Response:
{
"query_result": [
{
"auc": 0.9192331426352897
}
]
}
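If you have the raw labels and scores exported, the same value can be reproduced offline, for example with scikit-learn (assuming 0/1 ground truth labels and predicted probabilities):
from sklearn.metrics import roc_auc_score

labels = [0, 0, 1, 1]           # ground truth classes
scores = [0.1, 0.4, 0.35, 0.8]  # predicted probabilities
print(roc_auc_score(labels, scores))  # 0.75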
Multiclass Classification¶
Multiclass Accuracy Rate¶
Calculates the global accuracy rate, i.e. the fraction of inferences whose predicted class matches the ground truth class.
Query Request:
{
"select": [
{
"function": "accuracyRateMulticlass",
"alias": "<alias_name> [optional string]"
}
]
}
Query Response:
{
"query_result": [
{
"accuracyRateMulticlass": "<rate> [float]"
}
]
}
Example:
{
"select": [
{
"function": "accuracyRateMulticlass"
}
]
}
Response:
{
"query_result": [
{
"accuracyRateMulticlass": 0.785
}
]
}
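Conceptually, this is the fraction of exact matches between predicted and ground truth classes; a minimal sketch:
def accuracy_rate(predicted_classes, ground_truth_classes):
    # Fraction of inferences where the predicted class is correct.
    pairs = list(zip(predicted_classes, ground_truth_classes))
    return sum(p == g for p, g in pairs) / len(pairs)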
Multiclass Confusion Matrix¶
Calculates the confusion matrix for a multiclass model with respect to a single class. The predicted attribute and ground truth attribute must be passed as parameters.
Query Request:
{
"select": [
{
"function": "confusionMatrixMulticlass",
"alias": "<alias_name> [optional string]",
"parameters": {
"predicted_property": "<predicted_property_name>",
"ground_truth_property": "<ground_truth_property_name>"
}
}
]
}
Query Response:
{
"query_result": [
{
"<function_name/alias_name>": {
"true_positive": "<count> [int]",
"false_positive": "<count> [int]",
"true_negative": "<count> [int]",
"false_negative": "<count> [int]"
}
}
]
}
Example:
{
"select": [
{
"function": "confusionMatrixMulticlass",
"parameters": {
"predicted_property": "predicted_class_A",
"ground_truth_property": "gt_predicted_class_A"
}
}
]
}
Response:
{
"query_result": [
{
"confusionMatrix": {
"true_positive": 100480,
"false_positive": 100076,
"true_negative": 100302,
"false_negative": 99142
}
}
]
}
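Conceptually, this is a one-vs-rest computation: inferences are counted as positive or negative with respect to the single class named by the parameters. A sketch assuming both attributes are binary indicators for that class (the actual attribute encoding depends on your model schema):
def one_vs_rest_confusion(predicted_is_a, ground_truth_is_a):
    # Each element is truthy when the inference was (or should have
    # been) assigned the class in question.
    tp = fp = tn = fn = 0
    for pred, gt in zip(predicted_is_a, ground_truth_is_a):
        if pred and gt:
            tp += 1
        elif pred:
            fp += 1
        elif gt:
            fn += 1
        else:
            tn += 1
    return {"true_positive": tp, "false_positive": fp,
            "true_negative": tn, "false_negative": fn}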
Multiclass Confusion Matrix Rate¶
Calculates the confusion matrix rates for a multiclass classification model with respect to a single predicted class.
Query Request:
{
"select": [
{
"function": "confusionMatrixRateMulticlass",
"alias": "<alias_name> [optional string]",
"parameters": {
"predicted_property": "predicted_class_A",
"ground_truth_property": "gt_predicted_class_A"
}
}
]
}
Query Response:
{
"query_result": [
{
"<function_name/alias_name>": {
"true_positive_rate": "<rate> [float]",
"false_positive_rate": "<rate> [float]",
"true_negative_rate": "<rate> [float]",
"false_negative_rate": "<rate> [float]",
"accuracy_rate": "<rate> [float]",
"balanced_accuracy_rate": "<rate> [float]",
"precision": "<rate> [float]",
"f1": "<rate> [float]"
}
}
]
}
Example calculating the confusion matrix rates:
{
"select": [
{
"function": "confusionMatrixRateMulticlass",
"parameters": {
"predicted_property": "predicted_class_A",
"ground_truth_property": "gt_predicted_class_A"
}
}
]
}
Response:
{
"query_result": [
{
"confusionMatrixRateMulticlass": {
"true_positive_rate": 0.6831683168316832,
"false_positive_rate": 0.015653220951234198,
"true_negative_rate": 0.9843467790487658,
"false_negative_rate": 0.31683168316831684,
"accuracy_rate": 0.9378818737270875,
"balanced_accuracy_rate": 0.8337575479402245,
"precision": 0.8884120171673819,
"f1": 0.7723880597014925
}
}
]
}
If you only want a specific value from the confusion matrix rate function, you can use one of the following functions:
truePositiveRateMulticlass
falsePositiveRateMulticlass
trueNegativeRateMulticlass
falseNegativeRateMulticlass
For example, to return the truePositiveRateMulticlass:
{
"select": [
{
"function": "truePositiveRateMulticlass",
"parameters": {
"predicted_property": "predicted_class_A",
"ground_truth_property": "gt_predicted_class_A"
}
}
]
}
Response:
{
"query_result": [
{
"truePositiveRate": 0.5033513340213003
}
]
}
Multiclass F1¶
Calculates the components needed to compute an F1 score for a multiclass model.
In this example, the model has 3 classes, class-1, class-2, and class-3, with the corresponding ground truth labels class-1-gt, class-2-gt, and class-3-gt.
Query Request:
{
"select": [
{
"function": "count",
"alias": "count"
},
{
"function": "confusionMatrixRateMulticlass",
"alias": "class-1",
"parameters": {
"predicted_property": "class-1",
"ground_truth_property": "class-1-gt"
}
},
{
"function": "countIf",
"alias": "class-1-gt",
"parameters": {
"property": "multiclass_model_ground_truth_class",
"comparator": "eq",
"value": "class-1-gt"
},
"stage": "GROUND_TRUTH"
},
{
"function": "confusionMatrixRateMulticlass",
"alias": "class-2",
"parameters": {
"predicted_property": "class-2",
"ground_truth_property": "class-2-gt"
}
},
{
"function": "countIf",
"alias": "class-2-gt",
"parameters": {
"property": "multiclass_model_ground_truth_class",
"comparator": "eq",
"value": "class-2-gt"
},
"stage": "GROUND_TRUTH"
},
{
"function": "confusionMatrixRateMulticlass",
"alias": "class-3",
"parameters": {
"predicted_property": "class-3",
"ground_truth_property": "class-3-gt"
}
},
{
"function": "countIf",
"alias": "class-3-gt",
"parameters": {
"property": "multiclass_model_ground_truth_class",
"comparator": "eq",
"value": "class-3-gt"
},
"stage": "GROUND_TRUTH"
}
]
}
Query Response:
{
"query_result": [
{
"count": 7044794,
"class-1-gt": 2540963,
"class-2-gt": 2263918,
"class-3-gt": 2239913,
"class-1": {
"true_positive_rate": 0.4318807475748368,
"false_positive_rate": 0.3060401245073361,
"true_negative_rate": 0.6939598754926639,
"false_negative_rate": 0.5681192524251633,
"accuracy_rate": 0.5994314383074935,
"balanced_accuracy_rate": 0.5629203115337503,
"precision": 0.4432575070302042,
"f1": 0.437495178612114
},
"class-2": {
"true_positive_rate": 0.42177322676881407,
"false_positive_rate": 0.3514795196528837,
"true_negative_rate": 0.6485204803471163,
"false_negative_rate": 0.578226773231186,
"accuracy_rate": 0.5756528863725469,
"balanced_accuracy_rate": 0.5351468535579652,
"precision": 0.3623427088234848,
"f1": 0.38980575845890253
},
"class-3": {
"true_positive_rate": 0.26144274353512836,
"false_positive_rate": 0.2805894672521546,
"true_negative_rate": 0.7194105327478454,
"false_negative_rate": 0.7385572564648716,
"accuracy_rate": 0.5737983254017079,
"balanced_accuracy_rate": 0.4904266381414869,
"precision": 0.3028268576818381,
"f1": 0.2806172238153916
}
}
]
}
With this result, you can calculate the weighted F1 score by multiplying each class's F1 score by its ground truth count, summing, and dividing by the total count. In this example, that is
(class-1.f1 * class-1-gt + class-2.f1 * class-2-gt + class-3.f1 * class-3-gt) / count
and with numbers:
(0.437495178612114 * 2540963 +
0.38980575845890253 * 2263918 +
0.2806172238153916 * 2239913) / 7044794
= 0.3722898785
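The same computation over the response object, as a sketch (assuming result is the single element of query_result above):
def weighted_f1(result, classes=("class-1", "class-2", "class-3")):
    # Weight each class's F1 by its ground truth count, then normalize.
    weighted = sum(result[c]["f1"] * result[c + "-gt"] for c in classes)
    return weighted / result["count"]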
Object Detection¶
Mean Average Precision¶
Calculates Mean Average Precision (mAP), a standard accuracy measure for object detection models. threshold determines the minimum IoU value for a prediction to be considered a match for a label. predicted_property and ground_truth_property are optional parameters naming the predicted and ground truth attributes of the model; they default to "objects_detected" and "label" respectively if not specified.
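For intuition, IoU (Intersection over Union) is the ratio of the overlap area to the union area of a predicted and a ground truth box. A standard computation for axis-aligned boxes (the service's exact box format may differ):
def iou(box_a, box_b):
    # Boxes as (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)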
Query Request:
{
"select": [
{
"function": "meanAveragePrecision",
"alias": "<alias_name> [Optional]",
"parameters": {
"threshold": "<threshold> [float]",
"predicted_property": "<predicted_property> [str]",
"ground_truth_property": "<ground_truth_property> [str]"
}
}
]
}
Query Response:
{
"query_result": [
{
"<function_name/alias_name>": "<result> [float]"
}
]
}
Example:
{
"select": [
{
"function": "meanAveragePrecision",
"parameters": {
"threshold": 0.5,
"predicted_property": "objects_detected",
"ground_truth_property": "label"
}
}
]
}
Query Response:
{
"query_result": [
{
"meanAveragePrecision": 0.78
}
]
}
Bias¶
Bias Mitigation¶
Calculates mitigated predictions based on conditional thresholds, returning 0/1 for each inference. Note that this function returns null for inferences that don't match any of the provided conditions.
Query Request:
{
"select": [
{
"function": "biasMitigatedPredictions",
"alias": "<alias_name> [Optional]",
"parameters": {
"predicted_property": "<predicted_property> [str]",
"thresholds": [
{
"conditions": [
{
"property": "<attribute_name> [string or nested]",
"comparator": "<comparator> [string] Optional: default 'eq'",
"value": "<string or number to compare with property>"
}
],
"threshold": "<threshold> [float]"
}
]
}
}
]
}
Query Response:
{
"query_result": [
{
"<function_name/alias_name>": "<result> [int]"
}
]
}
Example:
{
"select": [
{
"function": "biasMitigatedPredictions",
"parameters": {
"predicted_property": "prediction_1",
"thresholds": [
{
"conditions": [
{
"property": "SEX",
"value": 1
}
],
"threshold": 0.4
},
{
"conditions": [
{
"property": "SEX",
"value": 2
}
],
"threshold": 0.6
}
]
}
}
]
}
Response:
{
"query_result": [
{
"SEX": 1,
"biasMitigatedPredictions": 1
},
{
"SEX": 2,
"biasMitigatedPredictions": 0
},
{
"SEX": 1,
"biasMitigatedPredictions": 0
}
]
}
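The per-inference logic amounts to finding the first threshold entry whose conditions all match, then comparing the predicted score against that entry's threshold. A sketch under those assumptions (only the default 'eq' comparator is shown, and the >= comparison is an assumption):
def bias_mitigated_prediction(inference, predicted_property, thresholds):
    # Return 1/0 using the threshold of the first matching condition
    # set, or None (null in JSON) when no condition set matches.
    for rule in thresholds:
        if all(inference.get(c["property"]) == c["value"]
               for c in rule["conditions"]):
            score = inference[predicted_property]
            return 1 if score >= rule["threshold"] else 0  # assumed >=
    return None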