skeval.metrics¶
comparison module¶
- skeval.metrics.comparison.score_error(real_scores: Mapping[str, float], est_scores: Mapping[str, float], comparator: Callable[[Any, Any], float] | Mapping[str, Callable[[Any, Any], float]] = mean_absolute_error, verbose: bool = False) → Dict[str, float][source]¶
Compares estimated and real scores using a user-defined comparison function.
This function iterates over the metrics present in both the real_scores and est_scores dictionaries and, for each common metric, computes the error between the real and estimated values using the provided comparator function(s).
- Parameters:
real_scores (dict) – A dictionary of scores computed with true labels. Example: {'accuracy': 0.9, 'f1': 0.85}
est_scores (dict) – A dictionary of scores estimated without true labels. Example: {'accuracy': 0.88, 'f1': 0.82}
comparator (callable or dict, default=mean_absolute_error) – The function or dictionary of functions used to compare the real and estimated scores. If a callable, it is applied to all common metrics; if a dict, it maps each metric name to a specific comparator function.
verbose (bool, default=False) – If True, prints the real score, estimated score, and the resulting error for each metric.
- Returns:
A dictionary containing the comparison results (errors) for each common metric.
- Return type:
dict
- Raises:
ValueError – If comparator is not a callable or a dictionary of callables.
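The comparison logic described above can be pictured with a minimal sketch. This is an assumption about the behavior, not the library's implementation; in particular, wrapping each scalar score in a one-element list before calling the comparator and skipping metrics missing from a comparator dict are illustrative guesses.

from collections.abc import Callable, Mapping
from typing import Dict, Union

from sklearn.metrics import mean_absolute_error


def score_error_sketch(
    real_scores: Mapping[str, float],
    est_scores: Mapping[str, float],
    comparator: Union[Callable[..., float], Mapping[str, Callable[..., float]]] = mean_absolute_error,
    verbose: bool = False,
) -> Dict[str, float]:
    """Illustrative sketch of score_error; not the library code."""
    errors: Dict[str, float] = {}
    # Only metrics present in both dictionaries are compared.
    for metric in sorted(real_scores.keys() & est_scores.keys()):
        if isinstance(comparator, Mapping):
            # Assumption: metrics without a dedicated comparator are skipped.
            if metric not in comparator:
                continue
            compare = comparator[metric]
        elif callable(comparator):
            compare = comparator
        else:
            raise ValueError("comparator must be a callable or a dict of callables")
        # sklearn metrics such as mean_absolute_error expect array-likes,
        # so each scalar score is wrapped in a one-element list.
        error = compare([real_scores[metric]], [est_scores[metric]])
        if verbose:
            print(f"[{metric}] Real: {real_scores[metric]}, "
                  f"Estimated: {est_scores[metric]}, Error: {error}")
        errors[metric] = error
    return errors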
Examples
>>> from skeval.metrics.comparison import score_error
>>> from sklearn.metrics import mean_absolute_error
>>> real = {'accuracy': 0.95, 'precision': 0.90, 'recall': 0.85}
>>> estimated = {'accuracy': 0.91, 'precision': 0.92, 'f1_score': 0.88}

>>> # Example 1: Using the default comparator (mean_absolute_error)
>>> errors = score_error(real, estimated)
>>> for metric, error in sorted(errors.items()):
...     print(f"{metric}: {error:.4f}")
accuracy: 0.0400
precision: 0.0200

>>> # Example 2: Using a dictionary of different comparators
>>> from sklearn.metrics import mean_squared_error
>>> comparators = {
...     'accuracy': mean_absolute_error,
...     'precision': mean_squared_error
... }
>>> errors_custom = score_error(real, estimated, comparator=comparators, verbose=True)
[accuracy] Real: 0.95, Estimated: 0.91, Error: 0.040000000000000036
[precision] Real: 0.9, Estimated: 0.92, Error: 0.0004000000000000003
>>> for metric, error in sorted(errors_custom.items()):
...     print(f"{metric}: {error:.4f}")
accuracy: 0.0400
precision: 0.0004
scorers module¶
- skeval.metrics.scorers.make_scorer(func: Callable[..., float], **kwargs: Any) → Callable[[Any, Any], float][source]¶
Wraps a metric function with fixed keyword arguments into a simple scorer.
This utility is useful for creating a unified scorer interface from metric functions that require specific arguments (like average='macro' for f1_score).
- Parameters:
func (callable) – A metric function from a library like scikit-learn, such as accuracy_score, f1_score, etc.
**kwargs (dict) – Keyword arguments to be permanently passed to the metric function whenever the scorer is called.
- Returns:
A new scorer function that accepts only y_true and y_pred as arguments.
- Return type:
callable
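Behaviorally, the returned scorer is just a closure that forwards the fixed keyword arguments on every call, similar to functools.partial. The sketch below is an illustrative assumption, not the library's actual implementation.

from typing import Any, Callable


def make_scorer_sketch(func: Callable[..., float], **kwargs: Any) -> Callable[[Any, Any], float]:
    """Illustrative sketch of make_scorer; not the library code."""
    def scorer(y_true: Any, y_pred: Any) -> float:
        # The captured keyword arguments are passed on every invocation.
        return func(y_true, y_pred, **kwargs)
    return scorer

Note that, unlike sklearn.metrics.make_scorer, which returns an estimator-based scorer called as scorer(estimator, X, y), the scorer documented here operates directly on y_true and y_pred.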
Examples
>>> from skeval.metrics.scorers import make_scorer
>>> from sklearn.metrics import f1_score
>>> import numpy as np

>>> # Ground truth and predictions for a multi-class problem
>>> y_true = np.array([0, 1, 2, 0, 1, 2])
>>> y_pred = np.array([0, 2, 1, 0, 0, 1])

>>> # Create a scorer for F1-score with 'macro' averaging
>>> macro_f1_scorer = make_scorer(f1_score, average='macro')

>>> # Use the new scorer
>>> score = macro_f1_scorer(y_true, y_pred)
>>> print(f"Macro F1 Score: {score:.4f}")
Macro F1 Score: 0.2667

>>> # The result is identical to calling f1_score directly with the argument
>>> direct_call_score = f1_score(y_true, y_pred, average='macro')
>>> print(f"Direct call F1 Score: {direct_call_score:.4f}")
Direct call F1 Score: 0.2667
>>> np.isclose(score, direct_call_score)
True
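The two utilities compose naturally: scorers built with make_scorer can produce the score dictionaries that score_error compares. The snippet below is a hypothetical usage sketch; the hard-coded estimated dictionary merely stands in for scores obtained without true labels, however they are produced in practice.

import numpy as np
from sklearn.metrics import accuracy_score, f1_score

from skeval.metrics.comparison import score_error
from skeval.metrics.scorers import make_scorer

# Scorers with fixed keyword arguments, built once and reused.
scorers = {
    "accuracy": make_scorer(accuracy_score),
    "f1": make_scorer(f1_score, average="macro"),
}

y_true = np.array([0, 1, 2, 0, 1, 2])
y_pred = np.array([0, 2, 1, 0, 0, 1])

# Scores computed with the true labels.
real = {name: scorer(y_true, y_pred) for name, scorer in scorers.items()}

# Placeholder values standing in for scores estimated without true labels.
estimated = {"accuracy": 0.40, "f1": 0.30}

# Per-metric error between real and estimated scores.
errors = score_error(real, estimated, verbose=True)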