ABSTRACT
Talent identification (TID) in team sports is complex, influenced by biological, technical, psychological, and socio-cultural factors. Machine learning (ML) offers tools to integrate high-dimensional data, yet its applications in youth TID remain underexplored. The objective of this review was to systematically examine ML approaches applied to youth talent identification in team sports, with emphasis on data domains, algorithms, validation strategies, and interpretability. Eligible studies included peer-reviewed quantitative research applying ML to youth athletes (≤21 years) in team sports for TID outcomes. Searches were conducted in PubMed, Scopus, and Web of Science, supplemented by reference and citation screening. Extracted data items included input data domains (anthropometric, physical, technical, perceptual–cognitive, psychological, socio-cultural, and multi-domain), ML approach, validation methods, performance metrics (e.g., accuracy, AUC, F1-score), and interpretability techniques. Risk-of-bias assessment was conducted using PROBAST. From 228 records, 27 studies met inclusion criteria. Soccer was most studied (n = 13), with others covering rugby, basketball, cricket, volleyball, and Australian football. Sample sizes ranged from 21 to 13,876 athletes, predominantly male. Supervised algorithms (Random Forest, gradient boosting, neural networks, penalized regression) were most common; some studies used unsupervised clustering. Validation practices varied, with few employing nested cross-validation or external testing. Reported discrimination metrics ranged from modest to excellent (ROC-AUC ≈ 0.58-0.96, depending on model and context), yet calibration performance (e.g., Brier score, calibration slope) was rarely reported, and external validation was uncommon. Across studies, predictive accuracy was moderate to high internally but rarely externally confirmed. Risk of bias was high in 59% of studies, mainly due to inadequate analysis and limited generalizability. Overall, ML shows potential to complement, not replace, traditional TID approaches - acting as a decision-support and hypothesis-generation tool that can assist practitioners in early screening, individualized progression modeling, and evidence-based talent forecasting. To strengthen translational impact, future research should emphasize transparent reporting, calibration assessment, and external validation to ensure robust, applicable ML models for sport talent systems.
Key words: Youth athletes, talent development, predictive modeling, sports analytics, artificial intelligence
Key Points
- Machine learning (ML) can identify complex talent patterns across physical, technical, and psychological data, but it should complement—not replace—expert judgment.
- Most studies show moderate accuracy but lack external validation, making their generalizability and real-world reliability limited.
- Current research is constrained by small samples and bias, highlighting the need for larger, multi-sport, and longitudinal datasets with standardized reporting and validation.
Talent identification and development (TID) can be conceptualized as a complex, non-linear, and adaptive system arising from the continuous interaction of multiple constraints (e.g., biological, technical, psychological, environmental, and sociocultural), consistent with ecological dynamics (Vaeyens et al., 2008; Seifert et al., 2017; 2022). This perspective treats athletes and teams as complex adaptive systems in which performance emerges from performer–environment couplings rather than from any single determinant, helping explain variability and divergent pathways to expertise (Seifert et al., 2017; 2022). These same principles inform the use of machine learning (ML), as algorithms trained on representative, context-rich data can better capture the functional - rather than merely descriptive - aspects of performance. Incorporating contextualized variables such as opponent positioning, temporal constraints, or perceptual–motor demands enables ML models to infer how athletes adapt to dynamic environments, thereby aligning data-driven modeling with the ecological validity of real performance contexts (Reis et al., 2024; Cordeiro et al., 2025). TID outcomes can be operationally defined as measurable indicators of athlete progression, including selection (the identification or nomination of athletes for higher-level squads, academies, or representative teams) (Larkin and O’Connor, 2017), advancement (continued inclusion or promotion within developmental pathways across time), or retention (sustained participation or non-deselection within structured development systems) (Güllich, 2014). TID processes are central pillars of performance pathways in team sports, yet they remain challenging due to the multifactorial and long-term nature of sporting excellence (Vaeyens et al., 2008). In soccer and other team sports, early reviews already emphasized that no single anthropometric, physiological, or psychological attribute uniquely determines future elite status, underscoring the need for multidimensional assessment (Williams and Reilly, 2000). Accordingly, comprehensive, multidisciplinary test batteries have been advocated to distinguish performance levels in youth players, integrating technical, physical, and perceptual–cognitive factors (Reilly et al., 2000). However, conventional selection practices can be biased by structural and developmental factors (Till and Baker, 2020). Across sports, annual age-grouping systematically produces relative age effects that distort participation and attainment, with robust meta-analytic evidence showing substantial over-representation of relatively older athletes (Cobley et al., 2009). These biases also affect women’s sport, where relative age effects are prevalent and can shape pathway opportunities (Smith et al., 2018). In parallel, differences in growth and biological maturation are particularly salient in adolescence, where earlier-developing youth may temporarily appear superior in test batteries, complicating prognostic judgments in talent pathways (Malina et al., 2015). Longitudinal work further suggests that while some anthropometric and running measures show short-term stability, predictability erodes as the follow-up window lengthens, cautioning against early deterministic selection (Deprez et al., 2015). From a systems perspective, team sports exhibit properties of complex adaptive systems in which performance emerges from interacting constraints across performers, tasks, and environments, challenging linear prediction (Seifert et al., 2017).
This lens encourages practitioners to design representative learning environments and assess adaptable skill, rather than isolated traits alone (Woods et al., 2020). Concurrently, the proliferation of player monitoring - such as global positioning system (GPS) and inertial technologies - has generated high-volume, multi-source data that can complement traditional scouting in talent pathways (Ravé et al., 2020). For youth programs in particular, such data-rich approaches may help disentangle transient growth effects from underlying skill and potential, if analyzed with appropriate modeling strategies. ML methods are well suited to model high-dimensional, nonlinear relationships and to fuse heterogeneous data streams, and have transformed predictive analytics across biomedicine in analogous problems (Topol, 2019). Within sport, researchers have highlighted the growing role of artificial intelligence (AI) and ML for decision support across performance and recruitment domains (Chmait and Westerbeek, 2021). Indeed, soccer-specific syntheses now document rapid expansion of ML applications, signaling both opportunity and methodological variability that warrant careful appraisal (Rico-González et al., 2023; Beato et al., 2025). Within this growing landscape, ML applications in TID can be conceptually grouped into four potentially overlapping roles. First, predictive modeling seeks to forecast future selection, progression, or performance based on multidimensional athlete data, aligning with conventional supervised learning paradigms (Altmann et al., 2024). Second, clustering and representation learning use unsupervised methods to identify latent groupings or archetypes of players, informing talent grouping and developmental profiling (Contreras-García et al., 2024; Haan et al., 2025). Third, longitudinal monitoring leverages sequential or temporal models to track developmental trajectories and maturation dynamics, offering insight into non-linear growth patterns (Chmait and Westerbeek, 2021). Finally, decision-support systems integrate these analytic layers into practical tools that complement coach judgment by providing interpretable, data-informed recommendations (Chmait and Westerbeek, 2021). ML applications in youth talent identification are beginning to emerge, directly targeting selection and advancement decisions within academies and development squads (Nassis et al., 2023). Recent work in elite youth soccer used supervised algorithms (e.g., gradient-boosted trees) to predict selection versus de-selection across age groups, identifying contributions from speed, change of direction, countermovement jump, aerobic speed reserve, and technical skill (Altmann et al., 2024). A growing line of inquiry also examines how socio-biological factors, particularly the relative age effect and maturation status, may influence data-driven decision-making (Finnegan et al., 2024). ML offers a means to quantify, and potentially mitigate, these entrenched selection biases - depending on how data are sampled, labelled, and validated - thus serving as a test case for fairness and transparency in predictive modelling (Reis et al., 2024). Multidisciplinary approaches have also combined psychosocial and physiological measures with ML to predict youth rugby union selections, illustrating the value of integrating non-physical determinants (Owen et al., 2022).
Beyond supervised prediction, unsupervised learning has been explored to derive role-agnostic player groupings from match running data, offering alternative structures for evaluation and development planning (Haan et al., 2025). At the position-specific level, ML classifiers have been applied to discriminate performance tiers in professional goalkeepers, demonstrating how algorithmic profiling can inform specialized talent evaluation (Jamil et al., 2021). Yet, translating these advances into dependable youth talent decisions requires vigilance about methodological pitfalls common to prediction research (de Jong et al., 2021). Small sample sizes and inadequate validation inflate estimated performance, highlighting the importance of robust procedures such as nested cross-validation and strict separation of training and testing (Vabalas et al., 2019). Data leakage - through feature selection on the full dataset, reusing individuals across folds, or inadvertent temporal contamination - can markedly overstate model accuracy and undermine reproducibility (Kapoor and Narayanan, 2023). Evaluation must also account for class imbalance and choose metrics judiciously, given differing sensitivities of ROC and precision–recall analyses under skewed outcomes typical of selection tasks (Richardson et al., 2024). For clinical-style prediction problems, independent external validation remains essential to estimate generalizability prior to deployment in new cohorts or clubs (Gallitto et al., 2025). Aligned with broader prediction-model science, contemporary reporting guidance (TRIPOD+AI) and risk-of-bias tools (e.g., PROBAST) provide structured expectations for transparency, reproducibility, and appraisal of ML-based models (Wolff et al., 2019; Collins et al., 2024). Several narrative and systematic reviews have synthesized traditional and methodological approaches to TID in team sports, but without a specific emphasis on ML techniques and their unique challenges (Barraclough et al., 2022). Other reviews focus on ML in soccer broadly or on injury risk prediction, rather than on youth talent identification across multiple team sports and data modalities (Nassis et al., 2023; Leckey et al., 2025). Likewise, sport-specific talent identification syntheses in football underscore multidimensional determinants but do not evaluate the distinct promises and pitfalls of ML for selection decisions across team sports (Sarmento et al., 2018). Therefore, the purpose of this systematic review is to map, critically appraise, and synthesize applications of ML to youth talent identification in team sports, with attention to data sources, model classes, validation strategies, interpretability, and risk of bias, consistent with contemporary prediction-model guidance. Conceptually, this review also examines whether multidomain ML models - integrating physical, technical, perceptual–cognitive, and psychosocial indicators - capture developmental potential more effectively than single-domain approaches, thereby addressing how the multidimensional nature of athlete development can be represented within predictive frameworks. Analytically, we also quantify the use of nested versus non-nested cross-validation procedures to provide a transparent overview of model evaluation rigor and guide the reproducibility of the synthesis process.
Specifically, we aim to catalog the types of athlete data and ML methods used to predict selection and advancement in team sports, evaluate methodological quality, reporting, and validation practices, summarize model performance, calibration, and generalizability, and identify evidence gaps and practical implications for programs and practitioners seeking to integrate ML into selection and development processes.
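To make the validation safeguards discussed above concrete, the following minimal sketch shows nested cross-validation with leakage control using scikit-learn. The data are synthetic and every parameter choice is an illustrative assumption, not taken from any reviewed study; the key design point is that scaling and feature selection sit inside the pipeline, so they are refitted on each training fold only.

```python
# Minimal sketch of nested cross-validation with leakage control.
# X and y are synthetic stand-ins for a tabular athlete dataset with
# imbalanced binary selection labels; all settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=30, weights=[0.8, 0.2],
                           random_state=42)  # ~20% "selected", like TID data

# Preprocessing lives INSIDE the pipeline, so it is re-fitted on each
# training fold only -- this prevents feature-selection data leakage.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif)),
    ("clf", RandomForestClassifier(random_state=42)),
])

param_grid = {"select__k": [5, 10, 20], "clf__n_estimators": [100, 300]}

inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)  # tuning
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)  # estimation

search = GridSearchCV(pipe, param_grid, cv=inner, scoring="roc_auc")
nested_auc = cross_val_score(search, X, y, cv=outer, scoring="roc_auc")
print(f"Nested CV ROC-AUC: {nested_auc.mean():.2f} ± {nested_auc.std():.2f}")
```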
The review was conducted and reported in accordance with PRISMA 2020 recommendations to ensure transparent and reproducible synthesis (Page et al., 2021b). The protocol was registered on OSF (osf.io/yn895; October 15, 2025).
Eligibility criteria
PICO criteria
Studies were considered eligible if they addressed the use of ML methods for TID in team sports. Eligibility was defined using a modified PICO framework as follows: Population (P): Youth athletes (≤21 years) engaged in organized team sports (e.g., soccer, basketball, rugby, hockey, handball, volleyball, American football, baseball, and other team sports). Studies were eligible regardless of competitive level (grassroots, academy, sub-elite, or elite youth), and no restrictions were imposed on sex. Studies focusing exclusively on adult/professional-only cohorts or on individual sports were excluded. We defined “youth” as athletes ≤21 years to align with established competitive tiers and developmental transition points in team sports. In football and other codes, U21 is the terminal youth category preceding senior squads; research shows that experience and performance at U21 best predict subsequent senior participation compared with earlier youth levels, situating age 21 as the practical boundary of the youth pathway (Herrebrøden and Bjørndal, 2022). More broadly, youth-athlete development reviews describe late adolescence and emerging adulthood (late teens–early 20s) as the period when maturation, psychosocial development, and role transitions converge - precisely the window spanned by the U21 tier - supporting the conceptual placement of ≤21 as the end of the formative, pre-senior phase (Varghese et al., 2022). In studies that included both youth (≤21 years) and adult athletes, inclusion was contingent on whether youth-specific results could be clearly identified or disaggregated. Intervention/Exposure (I): Application of ML algorithms (supervised, unsupervised, reinforcement, or hybrid approaches) to support talent identification or selection processes (e.g., prediction of selection vs. deselection, progression to higher competitive levels, role-agnostic player clustering, or position-specific profiling in youth athletes). Studies limited to traditional statistical analyses without ML components were excluded. Comparators (C): Comparator groups were not mandatory. Where applicable, comparators could include traditional scouting, expert coach assessment, or alternative analytic approaches (e.g., regression, rule-based classification). Outcomes (O): Eligible studies had to report at least one youth TID-related outcome, such as predictive accuracy of selection, identification of key features contributing to progression, classification of athlete profiles, or algorithmic discrimination of performance tiers within youth cohorts. Studies were excluded if ML was applied exclusively to non-TID outcomes (e.g., injury prediction, workload monitoring, or tactical analysis), if ML was applied only in adult/professional samples, or if results were not disaggregated to allow extraction of youth TID-specific findings.
Study design and setting
All quantitative empirical studies employing ML algorithms for TID were included, regardless of design (cross-sectional, longitudinal, retrospective, or prospective). Proof-of-concept studies, validation studies, and applied analyses in real-world settings were all eligible. Qualitative studies, narrative commentaries, editorials, opinion pieces, and reviews were excluded, though their reference lists were screened for potential eligible primary studies.
Report characteristics
Only peer-reviewed journal articles were included to ensure methodological rigor. Grey literature, preprints, conference abstracts, theses, and unpublished reports were excluded due to limitations in methodological detail and peer review. Only studies published in English were considered eligible. No restrictions were placed on the year of publication.
Information sources
The literature search was conducted across three major bibliographic databases to ensure coverage of relevant studies: PubMed, Scopus, and the Web of Science Core Collection. No restrictions were applied with respect to publication year, study design, or participant age at the search stage. The final searches of all databases were completed on October 15, 2025. To complement the electronic database searches, the reference lists of all studies meeting the eligibility criteria were manually examined to identify additional articles not retrieved in the initial search. Reference lists of previous systematic and narrative reviews relevant to talent identification, sports analytics, or the application of machine learning in sport were also screened. Furthermore, backward and forward citation searches were conducted using the Web of Science Core Collection for all included studies to capture any additional eligible publications. No study registers, trial registries, organizational repositories, or grey literature sources were searched. Only peer-reviewed journal publications retrieved through the databases and reference list searches were included for screening.
Search strategy
The search strategy was designed to capture all available studies addressing the use of ML for TID in team sports. The strategy combined controlled vocabulary terms and free-text words related to "machine learning," "artificial intelligence," and "talent identification" with sport-specific terms, following iterative piloting and refinement to balance sensitivity and specificity. The conceptual structure of the strategy was based on a modified PICO approach, focusing on the population of team sport athletes and the intervention or exposure of machine learning applications for talent identification outcomes. The following search strategy was employed: ("machine learning" OR "artificial intelligence" OR "deep learning" OR "supervised learning" OR "unsupervised learning" OR "neural network*" OR "support vector machine*" OR "random forest*" OR "gradient boosting" OR "learning algorithms" OR "bayesian logistic regression" OR "random forest" OR "random forests" OR "trees" OR "elastic net" OR "ridge" OR "lasso" OR "boosting" OR "predictive modeling") AND (talent* OR "talent identification" OR "talent detection" OR "talent development" OR "player selection" OR "athlete selection" OR "talent promotion") AND ("team sport*" OR "soccer" OR "football" OR "basketball" OR "rugby" OR "handball" OR "volleyball" OR "hockey" OR "baseball" OR "softball" OR "lacrosse" OR "water polo").
Selection process
All records identified through database searching were imported into an Excel sheet, and duplicates were removed prior to screening. Two reviewers independently assessed the eligibility of studies against the predefined inclusion and exclusion criteria in title/abstract screening and then in full-text screening. Disagreements between reviewers were resolved through discussion. The reasons for excluding studies at the full-text stage were documented and reported.
Data collection process
Two reviewers independently extracted data from each study. The extracted information was subsequently compared, and any discrepancies were resolved through discussion. No automation tools or machine learning–based systems were used for data collection. Only information explicitly reported in tables, text, or graphs was included.
Data items
The domain of interest was the performance of machine learning models applied to talent identification in youth team sports. Within this domain, data were extracted on predictive or classification performance metrics reported by each study. These included, where available, overall accuracy, sensitivity (recall), specificity, precision, F1-score, area under the receiver operating characteristic curve (AUC-ROC), and area under the precision–recall curve (AUC-PR). When studies reported multiple metrics, all available values were collected to allow for a comprehensive synthesis. Other domains included talent-related predictions and classifications such as selection versus deselection, progression to higher competition levels, clustering of players into performance profiles, and position- or role-specific identification. Where studies reported longitudinal prediction outcomes, all time points were collected, and no restrictions were applied to the follow-up period. In cases where results were presented using different analysis strategies (e.g., cross-validation folds, test set performance, external validation), all eligible outcomes were extracted, with priority given to independent test set or external validation results when synthesizing evidence. No changes were made during the review process to the inclusion or definition of outcome domains. All outcome domains compatible with TID were considered equally relevant at the data extraction stage. However, in the interpretation of findings, external validation performance and transparent reporting of prediction quality were considered most critical, as these outcomes are directly aligned with the review’s objectives of evaluating methodological robustness and generalizability. In addition to outcomes, other variables were extracted from each study to support subgroup analyses and contextual interpretation. Study characteristics included publication year and country of origin. Participant characteristics comprised sample size, sex distribution, age range, competitive context (e.g., grassroots, academy, or elite youth), and where available, indicators of biological maturation. Sport type was also recorded. Data characteristics included the domain of features used (e.g., anthropometric, physical, technical, perceptual–cognitive, psychosocial, or multi-domain) and the methods of data acquisition (e.g., field-based tests, questionnaires, match-derived tracking data). Machine learning–related variables included the class of algorithms applied (e.g., supervised, unsupervised, ensemble, deep learning), model development strategies (e.g., feature selection, dimensionality reduction), training and validation procedures (e.g., cross-validation, independent test set, external validation), and performance metrics reported. Where available, reporting of interpretability approaches (e.g., feature importance, SHapley Additive exPlanations, Local Interpretable Model-agnostic Explanations) was also extracted. When information was missing or unclear, we recorded it as “not reported” without making assumptions.
Study risk of bias assessment
The methodological quality and risk of bias of all included studies were assessed using the Prediction model Risk Of Bias Assessment Tool (PROBAST, version 1.0), which is specifically designed for evaluating studies that develop, validate, or update predictive models (de Jong et al., 2021). PROBAST was chosen because machine learning applications in talent identification constitute predictive modeling studies, and the tool allows systematic evaluation across relevant domains. To complement this formal appraisal, we also considered a broader construct of practical trustworthiness - the extent to which a model’s reported performance can be reasonably trusted for real-world decision support. This concept integrates three key safeguards: (i) external validation on independent data to test generalizability; (ii) calibration assessment to ensure probabilistic predictions correspond to observed outcomes; and (iii) data-leakage control, referring to methodological steps that prevent overlap between training and test information. The PROBAST framework consists of four domains (Wolff et al., 2019): (i) participants, assessing whether the study sample is representative and appropriate for the intended target population; (ii) predictors, evaluating the definition, measurement, and availability of input variables; (iii) outcomes, assessing whether outcome definitions, timing, and measurement were appropriate; and (iv) analysis, focusing on modeling methods, handling of overfitting, missing data, validation, and performance reporting. Each domain includes signaling questions that guide judgments of “low,” “high,” or “unclear” risk of bias. An overall risk of bias judgment was made for each study by aggregating across domains, with studies classified as “low risk” only if all domains were rated low. If one or more domains were judged as high risk, the overall classification was high; if one or more were unclear with none rated high, the overall classification was unclear. Two reviewers independently performed the risk of bias assessment for each included study. Discrepancies in judgments were resolved through discussion. All judgments were based exclusively on information reported in the published articles. Given the particularities of machine learning research, special attention was given to signaling questions within the analysis domain, including handling of class imbalance, prevention of data leakage, adequacy of validation strategies, and transparency of reporting model performance metrics.
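As a concrete illustration of the calibration safeguard described above, the sketch below computes a Brier score, a logistic-recalibration slope, and a reliability curve. The variables y_test and p_hat are hypothetical stand-ins for a study's held-out labels and predicted probabilities; here they are simulated.

```python
# Minimal sketch of calibration assessment for a binary selection model.
# y_test and p_hat are synthetic stand-ins for held-out outcomes and
# predicted probabilities; all values are illustrative.
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(0)
p_hat = rng.uniform(0.05, 0.95, size=500)   # stand-in predictions
y_test = rng.binomial(1, p_hat)             # stand-in observed outcomes

# Brier score: mean squared error of probabilistic predictions (lower = better).
print(f"Brier score: {brier_score_loss(y_test, p_hat):.3f}")

# Calibration slope: logistic recalibration of outcomes on logit(p_hat).
# A slope near 1 suggests good calibration; << 1 suggests overfitting.
logit_p = np.log(p_hat / (1 - p_hat)).reshape(-1, 1)
recal = LogisticRegression(C=1e6)  # very large C approximates an unpenalized fit
recal.fit(logit_p, y_test)
print(f"Calibration slope: {recal.coef_[0][0]:.2f}")

# Reliability curve: observed vs predicted event rates in probability bins.
obs, pred = calibration_curve(y_test, p_hat, n_bins=10)
for o, p in zip(obs, pred):
    print(f"predicted {p:.2f} -> observed {o:.2f}")
```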
Effect measures
For the outcome domain - predictive performance of machine learning models for talent identification in team sports - we extracted and reported all performance metrics provided by the original studies. Given the diversity of machine learning methods and outcome definitions, no single effect measure was imposed a priori. Instead, the following effect measures were prioritized based on their frequency of use and interpretability in predictive modeling research. For binary classification outcomes (e.g., selected vs. deselected, progressed vs. not progressed), the principal effect measures were overall accuracy, sensitivity (recall), specificity, precision (positive predictive value), F1-score, and the area under the receiver operating characteristic curve (AUC-ROC). Where reported, the area under the precision–recall curve (AUC-PR) was also extracted to account for class imbalance, which is common in talent identification contexts. For multi-class or clustering outcomes (e.g., player profiles, position-specific categories), measures such as overall classification accuracy, macro- and micro-averaged F1-scores, and adjusted Rand index were extracted. For continuous outcomes (e.g., predictive regression of performance scores or advancement probabilities), effect measures included mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). Where multiple metrics were presented for the same model, all were recorded, but in synthesis greater emphasis was placed on metrics reflecting generalizability, particularly those derived from independent test sets or external validation cohorts. No thresholds for minimally important differences were defined a priori, as such benchmarks do not currently exist for talent identification in team sports. Instead, results were interpreted with reference to established conventions in machine learning research (e.g., AUC-ROC values of 0.50 indicating no discrimination, 0.70-0.80 acceptable, 0.80-0.90 excellent, and >0.90 outstanding performance) while acknowledging the limitations of applying generic thresholds to heterogeneous sporting contexts. No re-expression of results into alternative effect measures was required, as extracted metrics were analyzed in their originally reported form. The choice to retain multiple performance measures was justified by the heterogeneous reporting practices in the field and by the need to provide a transparent overview of predictive model performance rather than privileging a single effect measure.
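The binary-classification measures prioritized above can be computed directly with scikit-learn. The sketch below uses synthetic, imbalanced labels to show sensitivity, specificity, precision, F1-score, ROC-AUC, and PR-AUC at one fixed operating point; all data and thresholds are illustrative.

```python
# Minimal sketch computing the discrimination metrics listed above.
# y_true and scores are synthetic stand-ins for a study's test-set outputs.
import numpy as np
from sklearn.metrics import (average_precision_score, confusion_matrix,
                             f1_score, roc_auc_score)

rng = np.random.default_rng(1)
y_true = rng.binomial(1, 0.2, size=400)             # ~20% selected (imbalanced)
scores = np.clip(y_true * 0.3 + rng.uniform(0, 0.7, 400), 0, 1)
y_pred = (scores >= 0.5).astype(int)                # one fixed operating point

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"Sensitivity (recall): {tp / (tp + fn):.2f}")
print(f"Specificity:          {tn / (tn + fp):.2f}")
print(f"Precision:            {tp / (tp + fp):.2f}")
print(f"F1-score:             {f1_score(y_true, y_pred):.2f}")

# Threshold-free measures: ROC-AUC can look optimistic under imbalance,
# so the precision-recall AUC (average precision) is reported alongside it.
print(f"ROC-AUC: {roc_auc_score(y_true, scores):.2f}")
print(f"PR-AUC:  {average_precision_score(y_true, scores):.2f}")
```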
Synthesis methods
Data from included studies were extracted into structured evidence tables designed to enable consistent cross-study comparison. Extraction focused on: (i) study identification details (sport, competitive level, and sample characteristics); (ii) input data domains (e.g., anthropometric, physical, technical, perceptual–cognitive, psychosocial, or multi-domain); (iii) machine learning approach (e.g., supervised classification, regression, ensemble learning, clustering, or deep learning methods); (iv) type of outcome predicted (e.g., selection vs. deselection, progression, positional classification, performance prediction, profiling, or maturation); (v) validation strategy and performance metrics; (vi) interpretability analyses or insights reported by authors; and (vii) main results and conclusions. If studies tested multiple algorithms, results were extracted for each model, though synthesis tables emphasized the best-performing or most interpretable approach. No data transformations, imputations, or re-analyses were performed; where performance metrics or validation details were missing, these were reported as “not reported.” To facilitate synthesis, studies were grouped according to their primary analytic aim rather than by sport or algorithm. Each table followed a standardized column structure (General Aim, Outcomes Predicted, Key Performance Metrics, Interpretability/Key Insights, and Main Results & Conclusions). To improve clarity, abbreviation glossaries were provided for each table, and narrative overviews were written to introduce and contextualize the included studies. Given the heterogeneity of sports, data modalities, machine learning methods, and outcome definitions, statistical pooling or meta-analysis was not feasible. Instead, a structured narrative synthesis was undertaken. This narrative integrated the tabular evidence with cross-cutting themes, focusing on: (i) recurring methodological patterns; (ii) relative strengths and limitations of different ML approaches; (iii) the role of interpretability in practical application; and (iv) conceptual insights into how ML has been used in talent identification and development.
Study selection
A total of 228 records were identified through database searches (PubMed, n = 28; Scopus, n = 128; Web of Science, n = 72). After removal of 83 duplicates, 145 records were screened by title and abstract, of which 63 were excluded. The remaining 82 reports were retrieved in full text, with none unretrievable. Following detailed eligibility assessment, 55 reports were excluded, primarily due to population not meeting inclusion criteria (n = 53) or intervention/outcomes not relevant (n = 2). Ultimately, 27 studies fulfilled all criteria and were included in the systematic review (Figure 1).
Study characteristics
Across the 27 studies included in this review, most (n = 13) focused exclusively on football (soccer), reflecting its global prominence in youth talent pathways (Zhao et al., 2019; Jauhiainen et al., 2019; Abidin, 2021; Owen et al., 2022; Kelly et al., 2022; Abidin and Erdem, 2025). Other team sports examined included Australian Rules Football (Woods et al., 2018a; 2018b; Gogos et al., 2020; Jennings et al., 2024), rugby (Woods et al., 2018a; Owen et al., 2022), basketball (Ge, 2024; Contreras-García et al., 2024), cricket (Brown et al., 2024), and volleyball (Formenti et al., 2022; Sanjaykumar et al., 2024). Sample sizes varied considerably, from very small academy samples such as n = 21 (Abidin, 2021) or n = 22 (Formenti et al., 2022) to large federated datasets such as n = 13,876 (Altmann et al., 2024) or n = 2,222 (Abidin and Erdem, 2025). While most studies reported male-only samples, some included both sexes (de Almeida-Neto et al., 2023; Ge, 2024) or were female-focused (Formenti et al., 2022; Sanjaykumar et al., 2024). Reporting of biological maturation was inconsistent: some studies reported maturation explicitly (de Almeida-Neto et al., 2023; Brown et al., 2024; Duncan et al., 2024), whereas many academy datasets did not report it (Altmann et al., 2024; Abidin and Erdem, 2025). In terms of data domains, studies frequently combined anthropometric and physical performance measures (Craig and Swinton, 2021; de Almeida-Neto et al., 2023; Ge, 2024), but increasingly incorporated technical, psychological, perceptual–cognitive, or socio-cultural variables (Owen et al., 2022; Formenti et al., 2022; Brown et al., 2024). Supervised ML approaches predominated, with common algorithms including Random Forest (Abidin, 2021; Owen et al., 2022), Support Vector Machines (Razali et al., 2017; Abidin, 2021), penalized regression (Craig and Swinton, 2021; Kelly et al., 2022), and neural networks (de Almeida-Neto et al., 2023; Jennings et al., 2024). A smaller subset used unsupervised or hybrid approaches for clustering or anomaly detection (Jauhiainen et al., 2019; Ge, 2024; Contreras-García et al., 2024). Validation practices varied: while some employed robust strategies such as nested cross-validation (Altmann et al., 2024) or prospective external testing (Jennings et al., 2024), others relied only on internal resampling or leave-one-out (Razali et al., 2017; Formenti et al., 2022). Reporting of interpretability methods was uneven: some studies (Retzepis et al., 2024; Altmann et al., 2024) applied SHAP values, while others (Woods et al., 2018b; Abidin, 2021) relied on simpler feature rankings, and many did not address interpretability at all (Theagarajan and Bhanu, 2021; Sanjaykumar et al., 2024). Figure 2 summarizes the distribution of methodological rigor across the machine learning approaches used in youth-focused talent identification and development research. The chart highlights that most studies employed supervised, non-deep learning models with cross-validation as the primary evaluation method, while nested, temporal, or external validation approaches were rare.
Risk of bias in studies
Across the 27 included studies, the PROBAST assessment (Table 2) showed that 19 studies (70.4%) were rated Low risk of bias for Participants, and 20 studies (74.1%) for Predictors. In contrast, 13 studies (48.1%) were rated Unclear for Outcomes, and 13 studies (48.1%) were judged High risk in Analysis. Overall, 16 studies (59.3%) were assessed as having High risk of bias. Regarding applicability, 11 studies (40.7%) raised Some concern for Participants, 15 studies (55.6%) were rated Low concern for Predictors, and 11 studies (40.7%) were judged as having High concern for Outcomes.
Synthesis of studies
Table 3 synthesizes studies that focus primarily on selection prediction within talent identification systems, where ML models were used to determine whether athletes would be admitted, retained, or promoted at different stages of development. These works investigated diverse sports and settings, including youth soccer academies (Jauhiainen et al., 2019; Craig and Swinton, 2021; Altmann et al., 2024), cricket county programmes (Brown et al., 2024), rugby union regional selection (Owen et al., 2022), and Australian football drafts (Jennings et al., 2024). Studies also included models for admission and branch allocation in sport schools (Abidin and Erdem, 2025), as well as selection support tools for school athletes (Theagarajan and Bhanu, 2021). Table 4 summarizes studies that applied ML to predict technical or physiological performance outcomes in sport. One study (Cornforth et al., 2015) revealed that regression models using pre-match heart rate variability (HRV) and environmental data could predict in-game outputs in Australian football. More recent studies employed ML to model skill-specific outcomes in youth soccer, such as dribbling performance (Duncan et al., 2024) and test-based fitness under environmental stressors (Sandamal et al., 2024). Similarly, Sanjaykumar et al. (2024) showed that Random Forest and XGBoost could accurately predict volleyball performance from anthropometric and body composition data. Table 5 compiles studies exploring the use of ML for team formation and playing position classification, where algorithms aim to replicate or optimize decisions traditionally made by coaches. One study (Abidin, 2021) tested multiple ML models for both position assignment and lineup generation in youth soccer, demonstrating high concordance with coach decisions. Another study (Razali et al., 2017) developed a prototype system to classify football players into positional roles using physical, mental, and technical ratings, validated by expert coach evaluation. Finally, a study (Woods et al., 2018b) examined positional classification in elite junior Australian football using technical skill indicators, highlighting the limitations of conventional statistics for discriminating playing roles. Table 6 includes studies that address broader or emerging applications of ML in talent identification and development, spanning orientation, specialization, profiling, maturation, and scouting support. Examples include orientation of youth into appropriate sports using morphological and neuromuscular profiles (de Almeida-Neto et al., 2023), detection of premature specialization in basketball (Contreras-García et al., 2024), fitness assessment with deep learning (Ge, 2024), and forecasting AFL career outcomes (Gogos et al., 2020). Other studies investigated multidimensional predictors of progression (Kelly et al., 2022), latent factor modeling of youth soccer assessments (Kilian et al., 2023), and scouting frameworks in women’s and men’s football (Venkataraman et al., 2024; López-De-Armentia, 2024). One study (Retzepis et al., 2024) applied explainable ML to maturation prediction, while another (Woods et al., 2018a) compared gameplay profiles of youth vs. senior rugby league. Finally, a study (Zhao et al., 2019) demonstrated cross-sport profiling with anthropometric and physiological tests.
This systematic review synthesized evidence on the application of ML methods in sport TID and development. Across the included studies, ML was employed for diverse purposes, ranging from predicting selection and performance outcomes to supporting team formation, profiling, maturation assessment, and scouting. The findings highlight the challenges of applying ML in this domain: on one hand, advanced algorithms can capture complex, multidimensional patterns that traditional statistical approaches may overlook; on the other, the heterogeneity of data types, small sample sizes, and lack of external validation continue to limit their translational value. This capacity to model multidimensional structure aligns closely with the ecological dynamics view of talent development, in which performance emerges from interaction-dominant rather than variable-dominant processes. ML’s real strength lies not merely in detecting correlations among isolated predictors but in uncovering higher-order patterns that emerge from the interaction of biological, psychological, and environmental constraints (Reis et al., 2024). Accordingly, future research should prioritize feature sets and modeling approaches that represent these interdependent relationships - such as contextual, temporal, and relational variables - thereby aligning computational design with the ecological nature of athlete development.
Selection prediction
The synthesis of selection-focused studies demonstrates that ML models can capture important physical, technical, psychological, and socio-cultural factors associated with advancement or deselection in talent pathways. Models such as XGBoost, neural networks, and one-class SVMs achieved moderate to high predictive validity in academy soccer (Jauhiainen et al., 2019; Jennings et al., 2024; Altmann et al., 2024), while decision trees and hybrid deep learning architectures produced high accuracy in school-based settings (Theagarajan and Bhanu, 2021; Abidin and Erdem, 2025). Several studies emphasized that physical and skill-related variables (e.g., sprinting ability, countermovement jump, ball control) remain consistently influential in selection decisions, while psychological characteristics such as coping under pressure and emotional regulation also emerged as critical predictors (Owen et al., 2022; Kelly et al., 2022). Importantly, socio-cultural and relative age effects were shown to influence outcomes, underscoring that selection is not solely determined by athletic performance (Craig and Swinton, 2021; Brown et al., 2024). Nevertheless, these studies highlight important limitations. Predictive accuracies often fell below thresholds typically required for decision-making in practice (e.g., AUC < 0.70; Altmann et al., 2024), while external validation was rare, raising concerns about generalizability across sports, contexts, and samples. This pattern underscores a crucial conceptual distinction between apparent validity - performance measured within the development sample - and transportable validity, which reflects how well a model generalizes to independent, real-world contexts. For example, a model predicting academy selection may achieve high internal accuracy (AUC ≈ 0.85) through resampling or cross-validation, yet when applied to a different club, season, or cohort, its performance may degrade to AUC ≈ 0.65. Such declines are not merely statistical artifacts but manifestations of the context-bound, dynamic nature of athlete development, where the distribution of constraints and opportunities shifts across settings. Recognizing this difference reinforces that external validation is not only a methodological requirement but a theoretical test of whether the modeled relationships capture genuine developmental regularities rather than local sampling patterns. Many models also relied heavily on physical test data, which limits interpretability when predicting long-term success within already selected elite groups (Craig and Swinton, 2021). Small sample sizes and imbalance between selected and deselected athletes further restrict model robustness (Jauhiainen et al., 2019). These findings emphasize that ML should not replace expert judgment but instead complement existing scouting frameworks. Moreover, the dominance of soccer-based studies likely shapes the implicit model priors in this field, since features that are salient in invasion games (e.g., intermittent high-speed running, rapid change of direction, spatial–temporal awareness, and transition behaviors) are overrepresented in training data and outcome labels. As a result, ML models - and the feature-engineering conventions they normalize - may capture sport-specific regularities that do not readily transfer to sports with different task dynamics.
This concentration can narrow ecological validity, as the performer–environment couplings and constraint sets underpinning soccer differ from those governing performance in sports such as volleyball. Expanding the evidence base beyond invasion games and encouraging cross-sport external validation would therefore strengthen the domain generalizability of ML applications in TID.
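The apparent-versus-transportable distinction drawn above can be made concrete with a small simulation: a model is cross-validated within a development cohort and then scored on a second cohort whose feature distribution has shifted. The two "clubs" below are synthetic assumptions; only the pattern of degradation is the point, not the specific numbers.

```python
# Minimal sketch contrasting apparent (internal) and transportable (external)
# validity. Cohorts A and B are synthetic stand-ins for two clubs; cohort B
# receives an artificial covariate shift to mimic contextual differences.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X, y = make_classification(n_samples=800, n_features=15, n_informative=8,
                           random_state=10)
X_a, y_a = X[:500], y[:500]                          # development cohort ("club A")
X_b = X[500:] + rng.normal(0.5, 0.75, size=X[500:].shape)  # shifted cohort ("club B")
y_b = y[500:]

model = GradientBoostingClassifier(random_state=0)

# Apparent validity: cross-validated AUC within the development cohort.
internal = cross_val_score(model, X_a, y_a, cv=5, scoring="roc_auc")
print(f"Apparent (internal CV) AUC:   {internal.mean():.2f}")

# Transportable validity: fit on cohort A, score the untouched cohort B.
model.fit(X_a, y_a)
external = roc_auc_score(y_b, model.predict_proba(X_b)[:, 1])
print(f"Transportable (external) AUC: {external:.2f}")
```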
Performance prediction
Studies applying ML to performance prediction showed promising results in linking physiological and technical markers with skill-based and in-game outcomes. Early work (Cornforth et al., 2015) demonstrated that heart rate variability and environmental data could moderately predict match loads in Australian football. More recent studies (Duncan et al., 2024; Sandamal et al., 2024) expanded to youth skill assessment, where ML algorithms predicted soccer dribbling ability and test-based fitness with high accuracy when including multidimensional features such as fundamental motor skills, anthropometry, and hormonal profiles. Random Forest and XGBoost emerged as strong performers, offering predictive power and capturing non-linear relationships in volleyball performance from anthropometric data (Sanjaykumar et al., 2024). Despite these advances, performance prediction studies also exhibit challenges. The use of laboratory or field-test performance outcomes raises questions about ecological validity for predicting actual match performance. Furthermore, over-reliance on physiological data may neglect tactical, cognitive, and psychosocial contributors to performance. While explainable ML techniques provide useful insights into feature importance, few studies validated whether these insights align with real-world coaching expertise. To enhance translation, future work should integrate multimodal data sources and conduct prospective validation in competitive environments.
Team formation & position classification
The reviewed studies demonstrate that ML can approximate, and in some cases outperform, coach-derived decisions regarding position classification and team formation. For example, Random Forest and Multilayer Perceptrons achieved >90% accuracy in predicting player positions and generating lineups closely resembling coaches’ choices in youth soccer (Abidin, 2021). Bayesian and tree-based models also assigned players to suitable positions with very high accuracy when using multidimensional skill ratings (Razali et al., 2017). Even when accuracy was lower, as in Australian football positional classification (Woods et al., 2018b), ML revealed meaningful patterns, such as the overlap between defenders and forwards, or the distinctiveness of midfielders. However, most models were trained on small or academy-level datasets, limiting their generalizability across contexts. For instance, in the Australian football study (Woods et al., 2018b), poor classification of rucks highlighted that some roles remain underrepresented or difficult to capture with standard performance indicators. External or longitudinal validation of team formation models is virtually absent, and practical adoption will require integration with real-time data streams rather than retrospective or synthetic datasets. Thus, while ML shows strong potential in complementing coaching decisions, its utility remains contingent on larger, multi-sample validation and the inclusion of richer, role-specific features.
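As a minimal sketch of the position-classification task discussed above, the example below trains a Random Forest on synthetic multiclass data and inspects per-class precision and recall, which is where underrepresented roles (such as rucks) typically surface. The position labels and features are illustrative assumptions, not data from any reviewed study.

```python
# Minimal sketch of multiclass playing-position classification.
# Features and the four position labels are synthetic, illustrative stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=12, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=3)
positions = np.array(["defender", "midfielder", "forward", "ruck"])[y]

X_tr, X_te, y_tr, y_te = train_test_split(X, positions, test_size=0.25,
                                          stratify=positions, random_state=3)
clf = RandomForestClassifier(n_estimators=300, random_state=3).fit(X_tr, y_tr)

# Per-class precision/recall exposes roles the model struggles to separate
# (cf. the poorly classified rucks reported above).
print(classification_report(y_te, clf.predict(X_te)))
```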
Profiling, development, scouting & maturation
Studies beyond direct selection and performance prediction illustrate the expanding scope of ML in talent identification and development. Morphological and neuromuscular profiling models showed value for orienting youth into appropriate sports (de Almeida-Neto et al., 2023), while cluster and outlier analyses revealed concerning early specialization patterns in basketball compared with professional norms (Contreras-García et al., 2024). Deep learning models integrating autoencoders and Gaussian mixtures provided accurate classification of youth fitness levels (Ge, 2024), while explainable ML approaches accurately predicted biological maturation status (Retzepis et al., 2024). Studies on scouting systems in women’s and men’s football (Venkataraman et al., 2024; López-De-Armentia, 2024) highlight the growing use of ML and automated data collection in expanding recruitment pipelines, particularly where resources are scarce. These findings underline ML’s versatility in supporting orientation, development monitoring, and scouting beyond narrow predictive tasks. Nevertheless, several limitations constrain the translation of these broader applications. Many studies remain proof-of-concept, conducted with small or single-institution datasets (de Almeida-Neto et al., 2023; Retzepis et al., 2024), or descriptive case studies without predictive validation (Venkataraman et al., 2024). External generalizability is especially limited where region-specific environmental effects or sample-specific datasets dominate (Sandamal et al., 2024; Contreras-García et al., 2024).
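A hedged sketch of the unsupervised profiling approach described above: a Gaussian mixture is fitted to synthetic, standardized athlete features, with the number of latent profiles chosen by BIC rather than fixed a priori. The two simulated profiles are purely illustrative assumptions.

```python
# Minimal sketch of unsupervised athlete profiling with a Gaussian mixture.
# The three features are synthetic stand-ins for standardized fitness or
# anthropometric measures; the two latent profiles are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
# Two hypothetical latent profiles (e.g., earlier vs later developers).
profiles = np.vstack([rng.normal([0, 0, 0], 1.0, size=(60, 3)),
                      rng.normal([2, 1.5, -1], 1.0, size=(40, 3))])
X = StandardScaler().fit_transform(profiles)

# Choose the number of profiles by BIC rather than fixing it in advance.
bics = {k: GaussianMixture(n_components=k, random_state=7).fit(X).bic(X)
        for k in range(1, 5)}
best_k = min(bics, key=bics.get)
gmm = GaussianMixture(n_components=best_k, random_state=7).fit(X)
print(f"Selected {best_k} profiles; cluster sizes:",
      np.bincount(gmm.predict(X)))
```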
Limitations on ML reporting
Across the included studies, the analysis domain emerged as the most frequent source of high risk of bias, primarily due to small samples, reliance on internal validation, or use of synthetic/augmented data without adequate safeguards against optimism. For example, a study (Abidin, 2021) relied on only 21 real players supplemented with synthetic augmentation, producing very high accuracies but at the expense of validity. Similarly, another study (Abidin and Erdem, 2025) reported accuracies above 97% but did so without external validation and with imbalanced data, leaving open the possibility of overfitting. Even in larger, better-resourced settings (Altmann et al., 2024), while participants and predictors were appropriately defined, the lack of calibration and external testing led to an overall “unclear” rating in the analysis domain. These aspects suggest that although predictive modeling is advancing in youth TID research, methodological rigor in handling imbalance, avoiding leakage, and validating models externally is still uncommon. A second recurrent issue relates to applicability of predictors and outcomes, especially where subjective or indirect measures were used. For instance, studies using coach-rated assessments as input variables (Abidin, 2021; Abidin and Erdem, 2025) faced concerns that these subjective scores could embed bias or even overlap with the outcome being predicted. Another study (de Almeida-Neto et al., 2023) used cross-sport orientation outcomes rather than within-sport selection, which limited the direct applicability of their findings to talent identification in team sports. In contrast, where predictors were standardized and outcomes were objectively defined (Craig and Swinton, 2021), the risk of bias was lower, even if model performance was weak. Overall, most included studies were judged at least “some concern” for applicability, underscoring that future work should prioritize transparent, objective measures aligned closely with actual selection or progression outcomes.
Limitations of this systematic review, future research and practical applications
This review has limitations that should be acknowledged. Despite a comprehensive search and systematic screening process, it is possible that relevant studies were missed, particularly those published in grey literature (e.g., technical reports, theses). The exclusion of grey literature was a deliberate methodological choice to maintain peer-reviewed quality standards; however, it introduces the possibility of publication bias, as studies reporting weaker or non-significant results are less likely to appear in indexed journals. Consequently, the synthesized evidence may overrepresent positive findings and potentially overestimate ML model performance. This limitation may be important, as it reflects a broader tendency within data-driven research toward selective visibility of success - a phenomenon that underscores the need for greater transparency, data sharing, and preregistration in ML-based sports science. Moreover, the heterogeneity of sports, outcome measures, and machine learning approaches precluded meta-analysis and restricted the synthesis to a structured narrative. The reliance on published results also meant that incomplete reporting of performance metrics or validation methods could not be clarified or supplemented, further limiting interpretability. Finally, as many included studies were exploratory, single-sample, or lacked external validation, the evidence base summarized here represents an emerging rather than mature field. Interpretability emerged as one of the least consistently addressed dimensions across studies, yet it represents a continuum of conceptual transparency rather than a binary property. At the most basic level, interpretability can involve global feature importance or coefficient-based rankings that indicate which variables most influence predictions. More advanced methods, such as SHAP (SHapley Additive exPlanations) or LIME (Local Interpretable Model-agnostic Explanations), allow for instance-level attribution, showing how specific inputs contribute to individual outcomes. At the highest tier, counterfactual reasoning provides actionable insight by simulating how changes in certain features might alter selection probabilities or developmental trajectories. Viewing interpretability hierarchically underscores that transparency in ML is scalable - from descriptive feature inspection to causal exploration - and that its depth should align with the practical stakes of decision-making in TID. Looking ahead, future research should prioritize larger, longitudinal, and multi-sport datasets that allow for robust model development and both statistical and ecological external validation. In addition to conventional hold-out or cross-cohort testing, ecological external validation involves evaluating model performance across different clubs, regions, and competition levels to ensure contextual robustness and ecological realism. Such cross-setting validation helps determine whether predictive patterns reflect genuine developmental principles or context-specific artifacts, bridging methodological rigor with the complex, adaptive nature of sport environments. Standardized reporting of ML pipelines - including feature engineering, calibration assessment, validation strategies, and interpretability methods - would improve transparency and comparability across studies. Greater integration of multidimensional data is also needed to capture the complexity of talent development.
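To ground the attribution tier of the interpretability hierarchy described above, the sketch below applies SHAP to a tree ensemble on synthetic data, reporting a global mean-|SHAP| ranking alongside one athlete-level attribution. It assumes the third-party shap package is installed; feature names, model, and data are hypothetical.

```python
# Minimal sketch of global and instance-level attribution with SHAP.
# Assumes the third-party `shap` package; all data and names are illustrative.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=5)
feature_names = [f"feature_{i}" for i in range(8)]  # hypothetical names
model = GradientBoostingClassifier(random_state=5).fit(X, y)

# TreeExplainer computes exact Shapley values for tree ensembles; for a
# binary sklearn GBM it returns one (n_samples, n_features) array.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global view: mean |SHAP| ranks features across the whole sample.
global_rank = np.abs(shap_values).mean(axis=0)
for name, val in sorted(zip(feature_names, global_rank),
                        key=lambda t: -t[1])[:3]:
    print(f"{name}: mean |SHAP| = {val:.3f}")

# Local view: the attribution behind one athlete's prediction.
print("Contributions for athlete 0:",
      dict(zip(feature_names, shap_values[0].round(3))))
```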
Moreover, collaboration between sport scientists, data scientists, and practitioners will be essential to ensure that models are not only accurate but also interpretable, ethically sound, and practically relevant. By embracing open science practices and methodological rigor, the field can move beyond optimism bias toward a more cumulative, self-correcting body of evidence that meaningfully informs talent identification and development systems. To enhance reproducibility and comparability, future ML studies in talent identification should, at a minimum, clearly describe their data partitioning strategy, including whether splits were performed at the athlete or trial level; outline steps for leakage control to prevent information overlap between training and testing sets; report how class imbalance was handled within validation folds; and include both discrimination and calibration metrics (e.g., AUC, Brier score, calibration slope). In addition, transparency around fairness auditing - such as assessing model performance across relative-age quartiles, sex, or maturation status - will improve interpretability and ethical accountability. Consistent reporting of these elements would substantially strengthen the methodological quality, transparency, and applied trustworthiness of ML research in youth talent identification. To promote equitable predictions across subpopulations, we propose a minimal fairness framework specifying the main covariates that should be recorded, modeled, and audited in youth TID - for example, birth quarter/relative age, biological maturation status (e.g., PHV indicators), and socio-economic background (e.g., school type or deprivation index) - alongside sex and playing context (e.g., region/club resource level). These variables should be (i) pre-specified in protocols, (ii) considered as features or stratification factors where appropriate, and (iii) subjected to subgroup and intersectional audits reporting discrimination, calibration, and error-rate parity at a stated operating point (a minimal sketch of such an audit follows the pathway discussion below). If disparities are detected, studies should apply bias-mitigation procedures (e.g., reweighting, stratified sampling, threshold adjustment, post-hoc recalibration) and re-report subgroup metrics. From a practical standpoint, the findings of this review suggest that ML may have potential to complement, rather than replace, traditional talent identification and development practices. Current evidence indicates that ML models can highlight patterns across large, multidimensional datasets and may assist coaches and scouts in refining their decisions or monitoring athlete development. However, given the frequent limitations of small sample sizes, context-specific data, and limited external validation, these tools should be viewed as exploratory decision-support aids rather than definitive selection instruments. Practitioners are advised to use ML outputs in conjunction with expert judgment, holistic evaluation of athletes, and awareness of potential biases (e.g., relative age, socio-cultural influences). This complementary role can be understood along two interconnected pathways, namely an operational pathway, in which ML assists practitioners with data-driven screening, workload monitoring, and early flagging of developmental trends to enhance decision efficiency, and a discovery pathway, where ML identifies novel, interaction-based patterns among physical, technical, and psychosocial constraints that can inform longitudinal experimentation and theory development.
These pathways illustrate that the value of ML lies not in replacing human expertise but in augmenting it - bridging empirical discovery with applied decision-making in youth talent systems. Careful integration in practice may enhance efficiency and provide additional perspectives, but overreliance on unvalidated models risks reinforcing existing inequalities or producing misleading conclusions. To operationalize these findings, practitioners could adopt tiered decision protocols in which ML models are first used for broad early screening - prioritizing high sensitivity to avoid missing potential talent - followed by structured expert evaluation emphasizing context, adaptability, and psychosocial maturity. Such hybrid frameworks can combine algorithmic efficiency with human interpretive depth, ensuring that automated outputs inform but do not dictate selection. In this way, ML functions as an evidence-based triage tool that supports individualized monitoring, facilitates ongoing re-evaluation, and helps direct coaching resources toward athletes with emerging potential rather than early advantage.

From a practitioner perspective, the implementation of ML in TID can also be conceptualized as a sequential decision pathway encompassing model development, validation, deployment, and monitoring. During development, multidisciplinary teams should ensure data representativeness, apply rigorous leakage control, and use nested cross-validation to optimize model tuning. Validation should progress from internal to independent external testing to evaluate transportability and calibration before any operational use. In deployment, ML outputs should serve as decision-support tools within structured selection frameworks - for instance, as high-sensitivity screening aids that prompt subsequent expert evaluation. Finally, ongoing monitoring is essential to detect model drift, reassess fairness across athlete subgroups, and recalibrate performance metrics as data and populations evolve. This cyclical process ensures that ML models remain methodologically sound, contextually relevant, and ethically aligned with the developmental principles of youth sport.
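As one way to operationalize the high-sensitivity screening tier described above, the following sketch (Python with scikit-learn; the helper name and the toy validation data are hypothetical) selects the highest probability threshold that still meets a target sensitivity on validation predictions, so that the ML stage flags a deliberately broad pool for subsequent expert evaluation.

```python
import numpy as np
from sklearn.metrics import roc_curve

def high_sensitivity_threshold(y_val, p_val, target_sensitivity=0.95):
    """Pick the highest probability threshold whose sensitivity (true positive
    rate) on validation data is at least `target_sensitivity`.

    Illustrative helper (hypothetical, not from the reviewed studies): intended
    for the broad-screening tier, where missing a potential talent (a false
    negative) is costlier than forwarding one extra athlete for expert review.
    """
    fpr, tpr, thresholds = roc_curve(y_val, p_val)
    ok = tpr >= target_sensitivity
    return thresholds[ok].max()  # highest threshold still meeting the target

# Toy validation data standing in for out-of-fold predictions.
rng = np.random.default_rng(3)
y_val = rng.integers(0, 2, 500)
p_val = np.clip(0.5 * y_val + rng.normal(0.3, 0.2, 500), 0, 1)

thr = high_sensitivity_threshold(y_val, p_val)
flagged = p_val >= thr  # athletes forwarded to structured expert evaluation
print(f"threshold={thr:.2f}, flagged {flagged.mean():.0%} of the cohort")
```

In this design the cost asymmetry is explicit: lowering the threshold trades more expert workload for fewer missed prospects, which matches the triage role outlined above and keeps the final decision with human evaluators.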
Conclusion

This systematic review found that research applying ML in sport talent identification remains limited in scope but is expanding. The majority of available studies focused on selection prediction tasks, particularly in soccer and other team sports, where algorithms were used to forecast admission, progression, or draft success. A smaller but growing body of work addressed performance prediction, leveraging physiological, anthropometric, or cognitive markers to estimate test results or in-game performance. Fewer studies explored team formation and positional classification, and an emerging set of contributions examined broader applications such as profiling, maturation, and scouting support. Across domains, Random Forest, gradient boosting methods, and neural networks were the most frequently applied algorithms, often achieving moderate to high internal accuracy. However, very few studies provided external validation, and most were conducted on relatively small, single-sport or academy-specific datasets, limiting generalizability.

The findings suggest that while ML offers clear potential to enrich talent identification and development systems, its current role should be viewed as exploratory and complementary rather than decisive. The predominance of selection-focused studies highlights a narrow evidence base, with underrepresentation of longitudinal designs, female athletes, and diverse sporting contexts. Moreover, interpretability methods - although increasingly adopted - remain inconsistently applied, and socio-cultural or psychological factors are still less frequently integrated than physical and technical measures. Future progress will depend on larger, multi-sample datasets, standardized reporting of algorithms and metrics, and collaborative efforts to embed interpretability and equity within predictive pipelines. Until such methodological and theoretical maturity is achieved, the use of ML in practice should remain cautious, serving as a support to - not a substitute for - expert judgment and holistic athlete evaluation. Ultimately, in youth TID, transparency, transportability, and theoretical coherence are the pillars upon which meaningful ML applications must be built.
| ACKNOWLEDGEMENTS |
This study was supported by the Project of China West Normal University, Project Number: [CWNUJG2024098]. The authors report no actual or potential conflicts of interest. The datasets generated and analyzed in this study are not publicly available but are available from the corresponding author upon reasonable request. All experimental procedures were conducted in compliance with the relevant legal and ethical standards of the country where the study was performed. |
|
| AUTHOR BIOGRAPHY |
|
 |
Qingrong Tang |
| Employment: Geely University of China, Chengdu, 641400, China |
| Degree: PhD |
| Research interests: Artificial intelligence, etc. |
| E-mail: tangqr19@gmail.com |
| |
 |
Xiufang Wei |
| Employment: College of physical education and health, China West Normal University, Nanchong, 637002, China |
| Degree: M.Ed. |
| Research interests: Physical education, etc. |
| E-mail: 15328439955@163.com |
| |
 |
Bo Tan |
| Employment: Geely University of China, Chengdu, 641400, China |
| Degree: PhD |
| Research interests: Sports training, sports psychology, etc. |
| E-mail: 920514879@qq.com |
| |
|
| |
| REFERENCES |
 Abidin D. (2021) A case study on player selection and team formation in football with machine learning. Turkish Journal of Electrical Engineering and Computer Sciences 29, 1672-1691. Crossref
|
 Abidin D., Erdem M. G. (2025) SCM-DL: Split-Combine-Merge Deep Learning Model Integrated With Feature Selection in Sports for Talent Identification. IEEE Access 13, 71148-71172. Crossref
|
 Altmann S., Ruf L., Thiem S., Beckmann T., Wohak O., Romeike C., Härtel S. (2024) Prediction of talent selection in elite male youth soccer across 7 seasons: A machine-learning approach. Journal of Sports Sciences , 1-14. Crossref
|
 Barraclough S., Till K., Kerr A., Emmonds S. (2022) Methodological approaches to talent identification in team sports: A narrative review. Sports 10, 81. Crossref
|
 Beato M., Jaward M. H., Nassis G. P., Figueiredo P., Clemente F. M., Krustrup P. (2025) An educational review on machine learning: A SWOT analysis for implementing machine learning techniques in football. International Journal of Sports Physiology and Performance 20, 183-191. Crossref
|
 Brown T., Cook R., Gough L. A., Khawaja I., McAuley A. B. T., Kelly A. L. (2024) Exploring the multidimensional characteristics of selected and non-selected White British and British South Asian youth cricketers: An exploratory machine learning approach. Youth 4, 718-734. Crossref
|
 Chmait N., Westerbeek H. (2021) Artificial intelligence and machine learning in sport research: An introduction for non-data scientists. Frontiers in Sports and Active Living , 3. Crossref
|
 Cobley S., Baker J., Wattie N., McKenna J. (2009) Annual age-grouping and athlete development: A meta-analytical review of relative age effects in sport. Sports Medicine 39, 235-256. Crossref
|
 Collins G. S., Moons K. G. M., Dhiman P., Riley R. D., Beam A. L., Van Calster B., Ghassemi M., Liu X., Reitsma J. B., van Smeden M., Boulesteix A.-L., Camaradou J. C., Celi L. A., Denaxas S., Denniston A. K., Glocker B., Golub R. M., Harvey H., Heinze G., Hoffman M. M., Kengne A. P., Lam E., Lee N., Loder E. W., Maier-Hein L., Mateen B. A., McCradden M. D., Oakden-Rayner L., Ordish J., Parnell R., Rose S., Singh K., Wynants L., Logullo P. (2024) TRIPOD+AI statement: Updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ , e078378. Crossref
|
 Contreras-García J. M., Courel-Ibáñez J., Piñar-López M. I., Ibáñez S. J. (2024) Early specialization in formative basketball: A machine learning analysis of shooting patterns in U14 and professional players. Journal of Sports Sciences , 1-8. Crossref
|
 Cordeiro M. C., Cathain C. O., Daly L., Kelly D. T., Rodrigues T. B. (2025) A synthetic data-driven machine learning approach for athlete performance attenuation prediction. Frontiers in Sports and Active Living , 7. Crossref
|
 Cornforth D., Campbell P., Nesbitt K., Robinson D., Jelinek H. F. (2015) Prediction of game performance in Australian football using heart rate variability measures. International Journal of Signal and Imaging Systems Engineering 8, 80-88. Crossref
|
 Craig T. P., Swinton P. (2021) Anthropometric and physical performance profiling does not predict professional contracts awarded in an elite Scottish soccer academy over a 10-year period. European Journal of Sport Science 21, 1101-1110. Crossref
|
 de Almeida-Neto P. F., Neto R. B., de Matos D. G., de Medeiros J. A., Bulhões-Correia A., Jeffreys I., Lobato C. H., Aidar F. J., Dantas P. M. S., Cabral B. G. A. T. (2023) Using artificial neural networks to help in the process of sports selection and orientation through morphological and biodynamic parameters: A pilot study. Sport Sciences for Health 19, 929-937. Crossref
|
 de Jong Y., Ramspek C. L., Zoccali C., Jager K. J., Dekker F. W., van Diepen M. (2021) Appraising prediction research: A guide and meta-review on bias and applicability assessment using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). Nephrology 26, 939-947. Crossref
|
 Deprez D., Buchheit M., Fransen J., Pion J., Lenoir M., Philippaerts R. M., Vaeyens R. (2015) A longitudinal study investigating the stability of anthropometry and soccer-specific endurance in pubertal high-level youth soccer players. Journal of Sports Science and Medicine 14, 418-426. Crossref
|
 Duncan M. J., Eyre E. L. J., Clarke N., Hamid A., Jing Y. (2024) Importance of fundamental movement skills to predict technical skills in youth grassroots soccer: A machine learning approach. International Journal of Sports Science & Coaching 19, 1042-1049. Crossref
|
 Finnegan L., van Rijbroek M., Oliva-Lozano J. M., Cost R., Andrew M. (2024) Relative age effect across the talent identification process of youth female soccer players in the United States: Influence of birth year, position, biological maturation, and skill level. Biology of Sport 41, 241-251. Crossref
|
 Formenti D., Trecroci A., Duca M., Vanoni M., Ciovati M., Rossi A., Alberti G. (2022) Volleyball-specific skills and cognitive functions can discriminate players of different competitive levels. Journal of Strength and Conditioning Research 36, 813-819. Crossref
|
 Gallitto G., Englert R., Kincses B., Kotikalapudi R., Li J., Hoffschlag K., Bingel U., Spisak T. (2025) External validation of machine learning models—registered models and adaptive sample splitting. GigaScience , 14. Crossref
|
 Ge C. (2024) Optimization study of a dynamic assessment model of physical fitness for youth basketball training. Applied Mathematics and Nonlinear Sciences , 9. Crossref
|
 Gogos B. J., Larkin P., Haycraft J. A. Z., Collier N. F., Robertson S. (2020) Combine performance, draft position and playing position are poor predictors of player career outcomes in the Australian Football League. PLOS ONE 15, e0234400. Crossref
|
 Güllich A. (2014) Selection, de-selection and progression in German football talent promotion. European Journal of Sport Science 14, 530-537. Crossref
|
 de Haan M., van der Zwaard S., Sanders J., Beek P. J., Jaspers R. T. (2025) Beyond playing positions: Categorizing soccer players based on match-specific running performance using machine learning. Journal of Sports Science and Medicine , 565-577. Crossref
|
 Herrebrøden H., Bjørndal C. T. (2022) Youth international experience is a limited predictor of senior success in football: The relationship between U17, U19, and U21 experience and senior elite participation across nations and playing positions. Frontiers in Sports and Active Living , 4. Crossref
|
 Jamil M., Phatak A., Mehta S., Beato M., Memmert D., Connor M. (2021) Using multiple machine learning algorithms to classify elite and sub-elite goalkeepers in professional men’s football. Scientific Reports 11, 22703. Crossref
|
 Jauhiainen S., Äyrämö S., Forsman H., Kauppi J.-P. (2019) Talent identification in soccer using a one-class support vector machine. International Journal of Computer Science in Sport 18, 125-136. Crossref
|
 Jennings J., Perrett J. C., Wundersitz D. W., Sullivan C. J., Cousins S. D., Kingsley M. I. (2024) Predicting successful draft outcome in Australian rules football: Model sensitivity is superior in neural networks when compared to logistic regression. PLOS ONE 19, e0298743. Crossref
|
 Kapoor S., Narayanan A. (2023) Leakage and the reproducibility crisis in machine-learning-based science. Patterns 4, 100804. Crossref
|
 Kelly A. L., Williams C. A., Cook R., Sáiz S. L. J., Wilson M. R. (2022) A multidisciplinary investigation into the talent development processes at an English football academy: A machine learning approach. Sports 10, 159. Crossref
|
 Kilian P., Leyhr D., Urban C. J., Höner O., Kelava A. (2023) A deep learning factor analysis model based on importance-weighted variational inference and normalizing flow priors: Evaluation within a set of multidimensional performance assessments in youth elite soccer players. Statistical Analysis and Data Mining: The ASA Data Science Journal 16, 474-487. Crossref
|
 Larkin P., O’Connor D. (2017) Talent identification and recruitment in youth soccer: Recruiter’s perceptions of the key attributes for player recruitment. PLOS ONE 12, e0175716. Crossref
|
 Leckey C., van Dyk N., Doherty C., Lawlor A., Delahunt E. (2025) Machine learning approaches to injury risk prediction in sport: A scoping review with evidence synthesis. British Journal of Sports Medicine 59, 491-500. Crossref
|
 López-De-Armentia J. (2024) WTDTool: Women’s talent detection tool. 2024 IEEE International Workshop on Sport, Technology and Research (STAR), 144-149. Crossref
|
 Malina R. M., Rogol A. D., Cumming S. P., Coelho-e-Silva M. J., Figueiredo A. J. (2015) Biological maturation of youth athletes: Assessment and implications. British Journal of Sports Medicine 49, 852-859. Crossref
|
 Nassis G., Verhagen E., Brito J., Figueiredo P., Krustrup P. (2023) A review of machine learning applications in soccer with an emphasis on injury risk. Biology of Sport 40, 233-239. Crossref
|
 Owen J., Owen R., Hughes J., Leach J., Anderson D., Jones E. (2022) Psychosocial and physiological factors affecting selection to regional age-grade rugby union squads: A machine learning approach. Sports 10, 35. Crossref
|
 Page M. J., McKenzie J. E., Bossuyt P. M., Boutron I., Hoffmann T. C., Mulrow C. D., Shamseer L., Tetzlaff J. M., Akl E. A., Brennan S. E., Chou R., Glanville J., Grimshaw J. M., Hróbjartsson A., Lalu M. M., Li T., Loder E. W., Mayo-Wilson E., McDonald S., McGuinness L. A., Stewart L. A., Thomas J., Tricco A. C., Welch V. A., Whiting P., Moher D. (2021a) The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ , n71. Crossref
|
 Page M. J., McKenzie J. E., Bossuyt P. M., Boutron I., Hoffmann T. C., Mulrow C. D., Shamseer L., Tetzlaff J. M., Akl E. A., Brennan S. E., Chou R., Glanville J., Grimshaw J. M., Hróbjartsson A., Lalu M. M., Li T., Loder E. W., Mayo-Wilson E., McDonald S., McGuinness L. A., Stewart L. A., Thomas J., Tricco A. C., Welch V. A., Whiting P., Moher D. (2021b) The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Systematic Reviews 10, 89. Crossref
|
 Ravé G., Granacher U., Boullosa D., Hackney A. C. (2020) How to use global positioning systems (GPS) data to monitor training load in the “real world” of elite soccer. Frontiers in Physiology 11, 944. Crossref
|
 Razali N., Mustapha A., Yatim F. A., Ab Aziz R. (2017) Predicting player position for talent identification in association football. IOP Conference Series: Materials Science and Engineering 226, 012087. Crossref
|
 Reilly T., Williams A. M., Nevill A., Franks A. (2000) A multidisciplinary approach to talent identification in soccer. Journal of Sports Sciences 18, 695-702. Crossref
|
 Reis F. J. J., Alaiti R. K., Vallio C. S., Hespanhol L. (2024) Artificial intelligence and machine learning approaches in sports: Concepts, applications, challenges, and future perspectives. Brazilian Journal of Physical Therapy 28, 101083. Crossref
|
 Retzepis N.-O., Avloniti A., Kokkotis C., Protopapa M., Stampoulis T., Gkachtsou A., Pantazis D., Balampanos D., Smilios I., Chatzinikolaou A. (2024) Identifying key factors for predicting the age at peak height velocity in preadolescent team sports athletes using explainable machine learning. Sports 12, 287. Crossref
|
 Richardson E., Trevizani R., Greenbaum J. A., Carter H., Nielsen M., Peters B. (2024) The receiver operating characteristic curve accurately assesses imbalanced datasets. Patterns 5, 100994. Crossref
|
 Rico-González M., Pino-Ortega J., Méndez A., Clemente F., Baca A. (2023) Machine learning application in soccer: A systematic review. Biology of Sport 40, 249-263. Crossref
|
 Sandamal K., Arachchi S., Erkudov V. O., Rozumbetov K. U., Rathnayake U. (2024) Explainable artificial intelligence for fitness prediction of young athletes living in unfavorable environmental conditions. Results in Engineering 23, 102592. Crossref
|
 Sanjaykumar S., Natarajan S., Lakshmi P. Y., Kalmykova Y., Lobo J., Pavlović R., Setiawan E. (2024) Machine learning analysis for predicting performance in female volleyball players in India. Journal of Human Sport and Exercise 20, 207-215. Crossref
|
 Sarmento H., Anguera M. T., Pereira A., Araújo D. (2018) Talent identification and development in male football: A systematic review. Sports Medicine 48, 907-931. Crossref
|
 Seifert L., Araújo D., Komar J., Davids K. (2017) Understanding constraints on sport performance from the complexity sciences paradigm: An ecological dynamics framework. Human Movement Science 56, 178-180. Crossref
|
 Seifert L., Hacques G., Komar J. (2022) The ecological dynamics framework: An innovative approach to performance in extreme environments: A narrative review. International Journal of Environmental Research and Public Health 19, 2753. Crossref
|
 Smith K. L., Weir P. L., Till K., Romann M., Cobley S. (2018) Relative age effects across and within female sport contexts: A systematic review and meta-analysis. Sports Medicine 48, 1451-1478. Crossref
|
 Theagarajan R., Bhanu B. (2021) An automated system for generating tactical performance statistics for individual soccer players from videos. IEEE Transactions on Circuits and Systems for Video Technology 31, 632-646. Crossref
|
 Till K., Baker J. (2020) Challenges and [possible] solutions to optimizing talent identification and development in sport. Frontiers in Psychology , 11. Crossref
|
 Topol E. J. (2019) High-performance medicine: The convergence of human and artificial intelligence. Nature Medicine 25, 44-56. Crossref
|
 Vabalas A., Gowen E., Poliakoff E., Casson A. J. (2019) Machine learning algorithm validation with a limited sample size. PLOS ONE 14, e0224365. Crossref
|
 Vaeyens R., Lenoir M., Williams A. M., Philippaerts R. M. (2008) Talent identification and development programmes in sport. Sports Medicine 38, 703-714. Crossref
|
 Varghese M., Ruparell S., LaBella C. (2022) Youth athlete development models: A narrative review. Sports Health: A Multidisciplinary Approach 14, 20-29. Crossref
|
 Venkataraman S., Sundharakumar K., Bharathi Malakreddy A., Natarajan S. (2024) YUVA-SQ: A cognitive scouting model for the beautiful game. 2024 5th International Conference on Innovative Trends in Information Technology (ICITIIT), 1-6. Crossref
|
 Williams A. M., Reilly T. (2000) Talent identification and development in soccer. Journal of Sports Sciences 18, 657-667. Crossref
|
 Wolff R. F., Moons K. G. M., Riley R. D., Whiting P. F., Westwood M., Collins G. S., Reitsma J. B., Kleijnen J., Mallett S. (2019) PROBAST: A tool to assess the risk of bias and applicability of prediction model studies. Annals of Internal Medicine 170, 51-58. Crossref
|
 Woods C. T., McKeown I., Rothwell M., Araújo D., Robertson S., Davids K. (2020) Sport practitioners as sport ecology designers: How ecological dynamics has progressively changed perceptions of skill “acquisition” in the sporting habitat. Frontiers in Psychology , 11. Crossref
|
 Woods C. T., Robertson S., Sinclair W. H., Till K., Pearce L., Leicht A. S. (2018a) A comparison of game-play characteristics between elite youth and senior Australian National Rugby League competitions. Journal of Science and Medicine in Sport 21, 626-630. Crossref
|
 Woods C. T., Veale J., Fransen J., Robertson S., Collier N. F. (2018b) Classification of playing position in elite junior Australian football using technical skill indicators. Journal of Sports Sciences 36, 97-103. Crossref
|
 Zhao K., Hohmann A., Chang Y., Zhang B., Pion J., Gao B. (2019) Physiological, anthropometric, and motor characteristics of elite Chinese youth athletes from six different sports. Frontiers in Physiology , 10. Crossref
|