| Abbreviation | Full name | Description |
| --- | --- | --- |
| AUC | Area Under the Receiver Operating Characteristic Curve | Quantifies a binary classifier's overall ability to distinguish positive from negative classes as the area under the ROC curve; values range from 0 to 1, where 0.5 corresponds to random guessing and 1 to perfect separation. |
| Precision | Precision | Defined as TP / (TP + FP), the proportion of positive predictions that are actually correct. High precision indicates a low false positive rate. |
| Sensitivity | Sensitivity (Recall, True Positive Rate) | Defined as TP / (TP + FN), the proportion of actual positives correctly identified, reflecting the model's completeness in detecting positives. |
| Specificity | Specificity (True Negative Rate) | Defined as TN / (TN + FP), the proportion of actual negatives correctly identified. Higher specificity implies fewer false positives. |
| DT | Decision Tree | A tree-structured model that splits data on feature thresholds to predict a target variable, using recursive partitioning to maximize information gain or minimize impurity (e.g., Gini or entropy). |
| RF | Random Forest | An ensemble of decision trees trained on bootstrapped subsets with feature randomness; averaging their predictions reduces overfitting and improves generalization. |
| SVM | Support Vector Machine | A supervised classifier that finds the optimal separating hyperplane by maximizing the margin between support vectors; kernel tricks extend it to non-linear decision boundaries. |
| XGBoost | eXtreme Gradient Boosting | An efficient, scalable implementation of gradient boosting that uses second-order derivatives, regularization, and tree pruning for accurate, fast predictive modeling. |
| ANN | Artificial Neural Network | A class of models inspired by biological neurons, composed of layers of interconnected nodes that learn hierarchical representations through weighted summation and activation functions. |
| Cost-NN | Cost-Sensitive Neural Network | A neural network trained with misclassification cost weights that penalize minority-class errors more heavily; often used with imbalanced data. |
| dFusionModel | RF-based fusion of XGBoost submodels | A meta-classifier that combines the outputs of RF and XGBoost submodels via majority voting or weighted averaging to improve robustness and accuracy. |
| LASSO LR | LASSO Logistic Regression | Logistic regression with L1 regularization, which shrinks coefficients to zero, performing variable selection and preventing overfitting in high-dimensional settings. |
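The confusion-matrix metrics defined above (precision, sensitivity, specificity) and AUC can be sketched directly from their formulas. This is a minimal illustration in plain Python, not code from the original work; AUC is computed here via the rank-based (Mann–Whitney) interpretation, i.e., the probability that a randomly chosen positive is scored above a randomly chosen negative, which is equivalent to the area under the ROC curve.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def precision(tp, fp):
    return tp / (tp + fp)    # TP / (TP + FP)

def sensitivity(tp, fn):
    return tp / (tp + fn)    # TP / (TP + FN), a.k.a. recall / TPR

def specificity(tn, fp):
    return tn / (tn + fp)    # TN / (TN + FP), a.k.a. TNR

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Example: two positives scored 0.9 and 0.4, two negatives scored 0.6 and 0.2
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # → 0.75
```

For production use, library implementations such as scikit-learn's `roc_auc_score` handle edge cases (ties, degenerate classes) and scale better than this quadratic pairwise sketch.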
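The fusion step described for dFusionModel (combining submodel outputs by majority voting) can be sketched as follows. This is a hypothetical illustration of plain majority voting only; the function name and data are illustrative and not taken from the original work.

```python
from collections import Counter

def majority_vote(submodel_preds):
    """Fuse per-sample class predictions from several submodels.

    submodel_preds: list of prediction lists, one per submodel.
    Returns the most frequent label for each sample.
    """
    fused = []
    for sample_preds in zip(*submodel_preds):
        fused.append(Counter(sample_preds).most_common(1)[0][0])
    return fused

# e.g. an RF submodel and two XGBoost submodels voting on four samples
rf_preds   = [1, 0, 1, 0]
xgb1_preds = [1, 1, 1, 0]
xgb2_preds = [0, 1, 1, 0]
print(majority_vote([rf_preds, xgb1_preds, xgb2_preds]))  # → [1, 1, 1, 0]
```

Weighted averaging, the alternative the description mentions, would instead combine each submodel's predicted probabilities with per-model weights before thresholding.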