| (Abidin and Erdem, 2025) | Selection Prediction | Stage 1: Admission (Pass/Fail). Stage 2: Branch allocation (Football, Basketball, Volleyball, Athletics, Other). | Stage 1: 98.9% accuracy (SDL). Stage 2: 97.4% accuracy, MCC 96.6% (SCM-DL, 6 features). | Feature selection revealed 6 key features spanning device tests and coach ratings; novel SCM-DL architecture captured hierarchical relations. | Authors conclude that SCM-DL outperforms classical ML, can generalize to hierarchical datasets, and helps coaches prioritize features. External validity remains untested. |
| (Altmann et al., 2024) | Selection Prediction | Selection vs. deselection to the next age group (U12–U19) in an elite German youth soccer academy across 7 years. | Best model XGBoost: ROC-AUC 0.69, F1-score 0.84. Models more sensitive to “selected” than “deselected.” | Physical and physiological factors (linear sprint, COD sprint, CMJ, aerobic speed reserve) and soccer-specific skill most influential. Psychological measures of medium importance; health, age, and position-related variables inconsistent. | Authors conclude that physical and skill-related measures are most decisive in selection/deselection, with psychological factors as moderate contributors. They suggest focusing academy monitoring on speed, power, endurance, and soccer-specific skill. Limitations: internal validation only, moderate discriminative ability (AUC < 0.70). |
| (Brown et al., 2024) | Selection Prediction & Profiling | Differences between selected vs. non-selected youth male cricketers (U14–17) and between White British (WB) vs. British South Asian (BSA) selected players in County Age Group (CAG) programmes. | Not accuracy-based: model estimated probability shifts. Positive predictors of selection: athleticism, wellbeing/cohesion, birth in Q2–Q3, older brothers. Negative predictors: higher psych. scores, antisocial behaviour, younger brothers/older sisters. Ethnic group differences observed in athleticism, wellbeing, distress, antisocial behaviour. | Multidimensional input: 104 characteristics across 5 domains (physiological, perceptual-cognitive, psychological, participation history, socio-cultural). Analysis identified interaction between family structure, socio-cultural factors, and selection outcomes. | Authors conclude that both athletic and socio-cultural variables play significant roles in selection. They highlight disparities: despite high BSA participation in grassroots cricket, under-representation persists at selection level. They suggest systemic bias may influence CAG selection. Findings exploratory; sample small (N=82). |
| (Craig and Swinton, 2021) | Selection Prediction | Whether anthropometric measures (height, mass, BMI) and physical performance tests (20 m sprint, CMJ, Yo-Yo IR1) predict awarding of professional contracts in an elite Scottish soccer academy over 10 years. | Despite significant mean differences (successful players taller, faster, higher CMJ), predictive accuracy was near random: error proportion 0.43 (train), 0.45 (test) vs. 0.50 for random guessing. | Relative age effect (RAE) very strong: 50% of successful contracts born in Q1. CMJ, stature, and sprint had small associations but high overlap with non-successful players. No reliable case-level prediction possible. | Authors conclude that anthropometric and physical performance profiling alone cannot predict professional contract success among already talented academy players. They recommend that data be used to guide training, not selection, and suggest holistic models integrating technical, tactical, psychological, and sociocultural variables, plus coach expertise. They stress the need to address RAE bias (e.g., bio-banding, scout education). |
| (Formenti et al., 2022) | Selection Prediction | Classification of female junior volleyball players as regional vs. provincial level based on volleyball-specific skills, physical performance, and cognitive functions. | Decision Tree: Precision 93%, Recall 73%, F1 = 0.83. Other models (LD, LR, SVM) performed worse (Precision 47–63%, Recall 57–73%). | DT identified passing and spiking technique plus cognitive task response times (Flanker congruent/incongruent, Visual search 10/15 items) as key discriminators. Physical tests (COD, CMJ) contributed less. | Authors conclude that higher-level players outperform lower-level peers across volleyball skills, COD, CMJ, and cognitive functions. ML results emphasize the role of cognitive functions and technical skills (passing, spiking) in discriminating competitive level. Practical recommendation: include training of both volleyball-specific techniques and executive/perceptual skills in youth development. |
| (Jauhiainen et al., 2019) | Selection Prediction | Detection of potential elite youth soccer players (academy contracts) from a large dataset of junior players (N=951, age 14). | Best performance with “phys large” dataset (N=951, 16 physical test variables): AUC-ROC = 0.763 (±0.007), AUC-PR = 0.960, Sensitivity = 0.80, Specificity = 0.61. Smaller sets (“phys+quest”, “quest”) performed worse (AUC-ROC 0.58–0.66). | Demonstrated utility of anomaly detection for imbalanced TID problems (14 academy vs. 937 non-academy). Physical tests (jump, sprint, agility) more predictive than questionnaire/self-assessment. Nonlinear SVM outperformed linear baseline. | Authors conclude that one-class SVM can moderately identify future academy players but specificity remains limited (many false positives). Results promising but not sufficient for stand-alone selection. They recommend larger datasets, longitudinal validation, and integration of multidimensional variables. |
| (Jennings et al., 2024) | Selection Prediction | Drafted vs. not-drafted players in the AFL National Draft (2021) using physical, GPS (in-game movement), and technical involvement data. | Neural networks consistently outperformed logistic regression: NN specificity = 79 ± 13%, sensitivity = 61 ± 24%, accuracy = 76 ± 8% vs. LR specificity = 73 ± 15%, sensitivity = 29 ± 14%, accuracy = 66 ± 11%. At draft-rate threshold (15%) and convergence threshold (35%), NN classified more drafted players in 88% of comparisons. | Neural networks handled unfactored, high-dimensional inputs better than LR, capturing nonlinear relationships. Logistic regression benefited only when data were factored (dimensionality reduction). Key insight: sensitivity (identifying drafted players) is paramount, and NN achieved superior balance of sensitivity and specificity. | Authors conclude that NN models are more effective than logistic regression for predicting draft outcome, particularly when identifying drafted players (sensitivity). Practical implications: clubs may apply NN-based models to complement subjective scouting and reduce bias. Limitations: data restricted to one state league, psychosocial variables absent, career success beyond draft not considered. |
| (Owen et al., 2022) | Selection Prediction | Selection vs. non-selection to regional U16 and U18 rugby squads based on 21 physiological and 47 psychosocial factors. Analyses run for all players, forwards, and backs. | Physiological models: 67.6% (all), 70.1% (forwards), 62.5% (backs). Psychosocial models: 62.3% (all), 73.7% (forwards), 60.4% (backs). Specificity higher than sensitivity in all cases. | Key physiological predictors: greater hand grip strength, faster 10 m and 40 m sprints, higher power and momentum. Key psychosocial predictors: lower burnout, reduced exhaustion, lower reduced sense of accomplishment, lower life stress (forwards), and lower difficulty describing feelings (forwards). For backs, lower introjected regulation and lower burnout were key features. | Authors conclude that physiological factors (strength, speed, power) are more predictive of rugby selection than psychosocial ones, but psychosocial variables (especially lower burnout and stress) also play a significant role. Position-specific differences exist (e.g., emotional regulation markers more relevant for forwards). They recommend holistic, position-tailored selection frameworks including psychosocial screening alongside physiological testing. |
| (Theagarajan and Bhanu, 2021) | Selection Support | Classification of students’ sports-specific talent category (basketball, volleyball, football, athletics, kabaddi, weightlifting) based on anthropometric and physical fitness attributes. | Random Forest highest: 96.2% accuracy; SVM 95.5%; KNN 95.2%; Decision Tree 92.6%; Naïve Bayes 89.8%. | Feature importance analysis showed attributes like height, weight, speed, and endurance strongly influenced classification. Models could allocate students to most likely successful sport pathway. | Authors conclude that ML, especially RF and SVM, can reliably classify school-level athletes into suitable sports, providing data-driven support for talent identification and allocation. Limitations: small, single-institution dataset; attributes mostly physical, excluding psychological/technical. They recommend broader variables and longitudinal validation. |
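
The studies above report a mix of accuracy, sensitivity, specificity, F1, and MCC, which makes rows hard to compare directly. As a rough aid to reading the table, the sketch below shows how these metrics follow from a single binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from any of the reviewed studies; MCC is returned on the usual -1 to 1 scale, whereas some rows report it as a percentage.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common classification metrics from confusion-matrix counts.

    tp/fn count the positive class (e.g. "selected"), tn/fp the negative class.
    """
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not data from the reviewed studies).
m = binary_metrics(tp=40, fp=10, tn=30, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because accuracy and F1 can look strong on imbalanced selection data even when one class is poorly identified, rows that report only one of these metrics should be read with the class balance of the sample in mind.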