Supplementary Materials

Supplementary Table 3. Definitions of machine learning and statistical terms.
Abbreviation Full Name Definition
AUC Area Under the Receiver Operating Characteristic Curve AUC quantifies the overall ability of a binary classifier to distinguish positive from negative classes by computing the area under the ROC curve; a value of 0.5 corresponds to chance-level discrimination and 1 to perfect classification.
Precision Precision Defined as TP / (TP + FP), precision indicates the proportion of positive identifications that were actually correct. High precision indicates a low false positive rate.
Sensitivity Sensitivity (Recall, True Positive Rate) Defined as TP / (TP + FN), it measures the proportion of actual positives correctly identified by the model, reflecting the model’s completeness in detecting positives.
Specificity Specificity (True Negative Rate) Defined as TN / (TN + FP), it assesses the proportion of actual negatives correctly identified. A higher specificity implies fewer false positives.
DT Decision Tree A tree-structured model that splits data based on feature thresholds to predict a target variable. It uses recursive partitioning to maximize information gain or minimize impurity (e.g., Gini or entropy).
RF Random Forest An ensemble of decision trees trained on bootstrapped subsets with feature randomness, improving generalization by averaging predictions to reduce overfitting.
SVM Support Vector Machine A supervised classifier that finds the optimal hyperplane separating classes by maximizing the margin between support vectors, applicable in both linear and non-linear settings via kernel tricks.
XGBoost eXtreme Gradient Boosting An efficient and scalable implementation of gradient boosting that uses second-order derivatives, regularization, and tree pruning for accurate and fast predictive modeling.
ANN Artificial Neural Network A class of models inspired by biological neurons, composed of layers of interconnected nodes (neurons) that learn hierarchical representations through weighted summation and activation functions.
Cost-NN Cost-Sensitive Neural Network A neural network trained with misclassification cost weights to penalize minority class errors more heavily, often used in imbalanced data contexts.
dFusionModel RF-based fusion of XGBoost submodels A meta-classifier that combines outputs from RF and XGBoost submodels using majority voting or weighted averaging to enhance robustness and accuracy.
LASSO LR LASSO Logistic Regression Logistic regression with L1 regularization that shrinks coefficients to zero, performing variable selection and preventing overfitting in high-dimensional settings.
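The confusion-matrix metrics defined above (precision, sensitivity, specificity) and AUC can be made concrete with a short sketch. This is illustrative only, not the study's evaluation code; the counts and scores are hypothetical, and AUC is computed here via its rank-statistic interpretation (the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as 0.5).

```python
def precision(tp, fp):
    # Proportion of positive predictions that were correct: TP / (TP + FP)
    return tp / (tp + fp)

def sensitivity(tp, fn):
    # Proportion of actual positives detected: TP / (TP + FN)
    return tp / (tp + fn)

def specificity(tn, fp):
    # Proportion of actual negatives correctly identified: TN / (TN + FP)
    return tn / (tn + fp)

def auc(labels, scores):
    # AUC as the Mann-Whitney statistic: fraction of (positive, negative)
    # score pairs in which the positive is ranked higher (ties count 0.5).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical confusion-matrix counts: TP=8, FP=2, FN=2, TN=8
print(precision(8, 2), sensitivity(8, 2), specificity(8, 2))
# Hypothetical labels and classifier scores
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice these quantities are usually obtained from library routines (e.g., a ROC/AUC function in a statistics or ML package) rather than hand-rolled, but the formulas match the definitions in the rows above.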
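The dFusionModel row describes combining submodel outputs by majority voting or weighted averaging. A minimal sketch of the majority-voting case is below; this is a generic illustration of the voting step under the assumption of hard (class-label) predictions from each submodel, not the authors' implementation, and the submodel predictions shown are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    # predictions: one list of class labels per submodel (e.g., RF and
    # XGBoost submodels), all of equal length. For each sample, the
    # fused prediction is the most frequent label across submodels.
    fused = []
    for votes in zip(*predictions):
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused

# Hypothetical hard predictions from three submodels on three samples
rf_pred   = [1, 0, 0]
xgb1_pred = [1, 1, 0]
xgb2_pred = [0, 0, 1]
print(majority_vote([rf_pred, xgb1_pred, xgb2_pred]))
```

With probabilistic outputs, weighted averaging of predicted class probabilities (followed by thresholding) is the natural alternative mentioned in the same row.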