Table 5. Synthesis of individual studies focusing on playing position/team formation prediction.
| Study | General Aim | Outcomes Predicted | Key Performance Metrics | Interpretability / Key Insights | Main Results & Conclusions |
|---|---|---|---|---|---|
| (Abidin, 2021) | Selection Prediction & Team Formation | Player position classification (Defender, Midfielder, Forward) and lineup formation for U13 Altınordu Football Academy players. Compared ML lineups with coach’s ideal lineup and 20 match lineups. | RF best at 93.9% accuracy, κ=0.91; MLP 92.6%, LMT 90.5%. Adding Hit/it training data improved accuracy across all algorithms vs. baseline (e.g., RF 81.8% → 93.9%). For team formation, lineups of SMO & SimpleCART closest to coach (Pearson r≈0.975). Lineup similarity with match lineups averaged 89.36%. | Demonstrated importance of combining coach evaluation + training device (Hit/it) data. Synthetic data generation addressed small sample. Lineup similarity analysis showed ML can approximate coach/team decisions without using match data. | Authors conclude ML models (esp. RF, MLP, LMT) can reliably support player selection and lineup formation, potentially integrated into weekly coaching tools. Hit/it data deemed essential to boost predictive accuracy. External generalizability remains untested beyond single academy. |
| (Razali et al., 2017) | Selection Support & Team Formation | Prediction of most suitable playing position (10 outfield roles: sweeper, backs, midfielders, wingers, forwards; GK excluded) based on physical, mental, and technical ratings. | Bayesian Networks: 99% accuracy; Decision Tree: 98%; KNN: 97%. | Framework combined coach-rated attributes (1–10 scale across physical, mental, technical skills) with ML classifiers. Developed a Football Talent Identification Site for practical deployment. Expert evaluation (20 coaches/managers) confirmed ease of use and relevance. | Authors conclude ML classifiers can assign players to their optimal positions with very high accuracy, reducing subjective bias in coach decisions. Prototype system was well-received (75–80% strongly agreed on usability, suitability). Limitations: small single-school dataset, manual skill ratings subjective, no external validation. |
| (Woods et al., 2018b) | Team Formation & Position Classification | Classification of elite junior Australian football players (U18) into 4 playing positions (defender, forward, midfield, ruck) based on 12 technical skill indicators from national championships. | LDA: 56.8% accuracy (per-position error from 19.6% for midfielders to 75% for rucks). Random Forest: 51.6% accuracy (per-position error from 27.8% for midfielders to 100% for rucks). PART decision list: 70.1% accuracy (per-position error from 14.4% for midfielders to 100% for rucks). | Rule induction (PART) generated 6 classification rules, mainly leveraging disposals, contested/uncontested possessions, kicks, and inside 50s. Showed defenders and forwards overlapped heavily; midfielders most distinct; rucks poorly classified due to small sample. | Authors conclude that existing commercial technical indicators provide limited discriminatory power for position classification, with high homogeneity across roles. PART offered relatively better accuracy but overfitting risk noted. Practical implication: recruiters should use more position-specific technical indicators and design competitions/training environments that allow players to demonstrate role-specific attributes. Reliance solely on standard technical stats may obscure positional differences and complicate objective recruitment. |
GK = Goalkeeper; KNN = k-Nearest Neighbors; κ = Cohen’s Kappa (agreement statistic); LDA = Linear Discriminant Analysis; LMT = Logistic Model Tree; ML = Machine Learning; MLP = Multilayer Perceptron; PART = Partial Decision List (rule-based classifier); RF = Random Forest; SMO = Sequential Minimal Optimization; SimpleCART = Classification and Regression Tree (simplified); U13/U18 = Under-13 / Under-18 age category.
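
For readers less familiar with the two headline metrics in Table 5, the following minimal sketch shows how classification accuracy and Cohen's κ can be derived from a confusion matrix. It is illustrative only: the function name `accuracy_and_kappa` and the three-class counts are hypothetical and are not taken from any of the reviewed studies, whose own computation pipelines are not described here.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of players whose true position is class i
    and whose predicted position is class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix contains no observations")

    # Observed agreement: share of players on the diagonal (this is accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts for three positions (defender, midfielder, forward).
# These numbers are invented for illustration and do not come from Table 5.
toy_confusion = [
    [18, 2, 0],
    [1, 17, 2],
    [0, 1, 19],
]

acc, kappa = accuracy_and_kappa(toy_confusion)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Because κ subtracts the agreement expected by chance, it is lower than raw accuracy whenever chance agreement is non-zero, which is why it helps to read the two values together when classes are imbalanced.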