Table 4. Synthesis of individual studies focusing exclusively on performance prediction. |
| Study | General Aim | Outcomes Predicted | Key Performance Metrics | Interpretability / Key Insights | Main Results & Conclusions |
| (Cornforth et al., 2015) | Performance Prediction | Prediction of in-game performance in elite Australian football players using pre-match HRV measures (time, frequency, nonlinear domains) plus environmental/field data. | Best correlations with GA wrapper + regression algorithms: Walk r=0.76, Jog r=0.75, Cruise r=0.73, Player Load r=0.72, Match Distance r=0.73. PCA improved slightly over the all-variables approach, but the GA wrapper yielded the highest predictive performance (mean r=0.60 vs. 0.49–0.53). | Highlighted the value of advanced regression (esp. SMOreg, Gaussian Processes) combined with feature selection. Identified HRV-derived features (esp. nonlinear measures) plus environmental conditions (temperature, field size) as significant contributors to match performance. | Authors conclude that sophisticated regression models can predict match performance from HRV and environmental data with correlations above 0.70. Potential to support player selection decisions and training load adjustments tailored to field dimensions and match-day conditions. Early demonstration of sport informatics potential in team sport. |
| (Duncan et al., 2024) | Performance Prediction | Dribbling skill (UGent dribbling test, skill differential with/without ball). | Initial accuracy: linear ~57%, ridge ~48%, lasso ~34%, RF ~68%, boosted ~66%. When stratified by age band: RF 98.6%, boosted trees 96.1%, lasso 94.1%, linear 91.9%. | Feature importance: FMS score most influential, followed by coach overall rating, years of playing experience, and APHV. Birth quartile and chronological age least important. | ML showed technical skills can be predicted with high accuracy from multidimensional inputs, especially FMS. Supports theory that broad motor skill competence underpins technical soccer ability. Coaches should emphasize FMS training before sport-specific drills. Suggests a shift away from over-reliance on physical testing alone. |
| (Sandamal et al., 2024) | Performance Prediction | Prediction of soccer players’ performance in field-based tests: Dribbling Shuttle Test (DSt), Goal Accuracy Test (GAt), and Yo-Yo Intermittent Recovery Test Level 1 (YYIRT1). | XGBoost consistently outperformed RF and KNN across tests (highest R² and lowest error). RF showed moderate accuracy, KNN lowest. Performance varied between cohorts, with Karakalpakstan athletes showing reduced predicted fitness values. | SHAP global explanations: anthropometric (sitting height, meso breadth), hematological, and hormonal markers (E2, IGF-1, cortisol, testosterone) emerged as top predictors. LIME local explanations confirmed hormonal differences: E2, IGF-1, cortisol strongly impacted fitness in environmentally exposed group, while testosterone was more influential in controls. | Authors conclude explainable ML (esp. XGBoost + SHAP/LIME) offers accurate and interpretable fitness prediction in young soccer players. Results highlight negative effects of environmental degradation (Aral Sea region) on hormonal balance and physical performance. Study demonstrates value of explainable AI for screening and tailoring training in vulnerable populations. Limitations: relatively small cohorts, region-specific findings, no external validation. |
| (Sanjaykumar et al., 2024) | Performance Prediction | Prediction of on-court performance based on demographic and physical attributes (age, height, weight, fat %, muscle mass, bone mass, BMI). | RF: R²=0.9418, accuracy=94.18%, RMSE=2.67. XGBoost: R²=0.9276, accuracy=92.76%, RMSE=2.98. Linear Regression weaker: R²=0.7531, accuracy=75.31%, RMSE=5.51. | Correlation analysis: Height (r=0.879), muscle mass (r=0.653), bone mass (r=0.622) strongly positively related to performance. BMI not significant (r=0.04). RF captured nonlinearities best; XGBoost close. | Authors conclude ML (especially Random Forest) provides accurate and objective prediction of volleyball performance from physical attributes. Supports more data-driven talent ID, moving beyond subjective scouting. Future work: integrate skill and psychological factors, extend to diverse populations. |
|
ACC = Accuracy; AUC = Area Under the Curve; APHV = Age at Peak Height Velocity; BMI = Body Mass Index; DSt = Dribbling Shuttle Test; FMS = Fundamental Movement Skills; GA = Genetic Algorithm; GAt = Goal Accuracy Test; HRV = Heart Rate Variability; IGF-1 = Insulin-like Growth Factor 1; KNN = K-Nearest Neighbors; LASSO = Least Absolute Shrinkage and Selection Operator; LIME = Local Interpretable Model-agnostic Explanations; PCA = Principal Component Analysis; R² = Coefficient of Determination; RF = Random Forest; RMSE = Root Mean Squared Error; SHAP = SHapley Additive exPlanations; SMOreg = Sequential Minimal Optimization regression; XGBoost = Extreme Gradient Boosting; YYIRT1 = Yo-Yo Intermittent Recovery Test, Level 1. |
|
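The studies in Table 4 report three regression metrics that are not interchangeable: Pearson's r, R², and RMSE. The following is a minimal Python sketch of how each is conventionally defined, offered only as a reading aid. The function names and the `observed`/`predicted` values are illustrative assumptions and are not taken from any of the cited studies, whose exact evaluation procedures may differ.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the outcome."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical values for illustration only (not study data).
# The predictions track the trend but sit about one unit too high.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [12.0, 12.0, 16.0, 16.0]

print(f"r    = {pearson_r(observed, predicted):.3f}")  # r    = 0.894
print(f"R^2  = {r_squared(observed, predicted):.3f}")  # R^2  = 0.600
print(f"RMSE = {rmse(observed, predicted):.3f}")       # RMSE = 1.414
```

In this toy case r stays high (about 0.89) while R² drops to 0.60, because r is insensitive to a constant offset in the predictions whereas R² and RMSE penalize it. This is one reason the correlation-based results and the R²-based results in the table should be compared with caution.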