Research article - (2025)24, 565 - 577
DOI: https://doi.org/10.52082/jssm.2025.565
Beyond Playing Positions: Categorizing Soccer Players Based on Match-Specific Running Performance Using Machine Learning
Michel de Haan1, Stephan van der Zwaard1,2, Jurrit Sanders3, Peter J. Beek1, Richard T. Jaspers1
1Department of Human Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
2Department of Cardiology, Amsterdam University Medical Center, Amsterdam, Netherlands
3PSV Eindhoven, Eindhoven, Netherlands

Richard T. Jaspers
✉ Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HZ Amsterdam, Netherlands
Email: r.t.jaspers@vu.nl
Received: 26-06-2024 -- Accepted: 27-06-2025
Published (online): 01-09-2025

ABSTRACT

Soccer players are frequently categorized by playing position, both in the scientific literature and in practice. However, the utility of this approach in evaluating physical match performance and optimizing physical training programs remains unclear. This study compares the effectiveness of categorizing soccer players by playing position versus by unsupervised machine learning based on match-specific running performance. Match-specific running data were collected from 40 young elite male soccer players over two seasons. Thirty-one of these players completed a 20-meter sprint test and a maximal incremental treadmill test to measure maximal oxygen uptake. Players were categorized both by playing position and by subgroups derived through k-means clustering based on match-specific running performance. Differences in sprint capacity, endurance capacity, and match-specific running performance were compared between and within playing positions, as well as between and within clusters. The two categorization methods were further compared for variance within subgroups and standardized differences between subgroups for total distance (TD), low-intensity running (LIR), moderate-intensity running (MIR), high-intensity running (HIR), and sprint distance during matches. Match-specific running performance differed between playing positions, despite notable inter-individual differences in running intensities within playing positions. Clustering based on match-specific running performance revealed less variance within groups (TD: P = 0.049, LIR: P = 0.032, HIR: P = 0.033) and larger standardized differences between groups (LIR: P = 0.037, MIR: P = 0.041, HIR: P = 0.035, Sprint: P = 0.018) compared to grouping by playing position. Moreover, 20-meter sprint speed differed between the sprint and high-intensity endurance clusters (25.22 vs 23.75 km/h, P = 0.012), but not between playing positions.
Using unsupervised machine learning to categorize soccer players improves the identification of player groups with similar match-specific running performance, thereby supporting performance evaluation and contributing to the optimization of physical training.

Key words: Clustering, football, artificial intelligence, physiology, sprint speed, V̇O2max

Key Points
  • There is considerable inter-individual variation in match-specific running performance within positional groups.
  • Studying the physical capacities, designing training programs, or evaluating match-specific running performance of soccer players based on their playing positions is suboptimal.
  • In particular, grouping players as forwards, midfielders, and defenders should be avoided when evaluating match-specific running performance.
  • Identifying subgroups based on match-specific running performance using clustering analysis seems a promising alternative for categorizing soccer players.
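
The comparison described in the abstract, grouping by playing position versus k-means clustering on match-specific running performance, can be illustrated with a minimal sketch. This is not the authors' pipeline: the player values below are synthetic and invented for illustration, the choice of three clusters is an assumption (the abstract does not state the number of clusters), and the function names are hypothetical. The sketch standardizes three running metrics, clusters the players with a plain k-means, and compares the within-group sum of squares of the two categorizations.

```python
import random
from statistics import mean, pstdev

# Synthetic, illustrative per-player match averages (NOT data from the study):
# (position label, total distance in m, high-intensity running in m, sprint distance in m)
PLAYERS = [
    ("defender", 9800, 520, 110),
    ("defender", 10900, 840, 150),
    ("defender", 9700, 500, 230),
    ("midfielder", 11200, 880, 140),
    ("midfielder", 11000, 860, 120),
    ("midfielder", 9900, 540, 100),
    ("forward", 10100, 600, 260),
    ("forward", 10000, 580, 250),
    ("forward", 11100, 850, 160),
]


def zscore_columns(rows):
    """Standardize each feature column to mean 0 and SD 1."""
    stats = [(mean(col), pstdev(col)) for col in zip(*rows)]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Feature-wise mean of a list of points."""
    return [mean(dim) for dim in zip(*points)]


def within_group_ss(points, labels):
    """Sum of squared distances of each point to its own group centroid."""
    total = 0.0
    for group in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == group]
        center = centroid(members)
        total += sum(sq_dist(p, center) for p in members)
    return total


def kmeans(points, k, n_init=10, max_iter=100):
    """Plain k-means (Lloyd's algorithm); keeps the best of n_init seeded restarts."""
    best_labels, best_ss = None, float("inf")
    for seed in range(n_init):
        centers = random.Random(seed).sample(points, k)
        labels = None
        for _ in range(max_iter):
            # Assignment step: each point goes to its nearest center.
            new_labels = [
                min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
            ]
            if new_labels == labels:
                break  # assignments stopped changing
            labels = new_labels
            # Update step: move each center to the mean of its members.
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                if members:
                    centers[j] = centroid(members)
        ss = within_group_ss(points, labels)
        if ss < best_ss:
            best_labels, best_ss = labels, ss
    return best_labels


positions = [p[0] for p in PLAYERS]
features = zscore_columns([p[1:] for p in PLAYERS])
cluster_labels = kmeans(features, k=3)  # k=3 is an assumption for this sketch

position_ss = within_group_ss(features, positions)
cluster_ss = within_group_ss(features, cluster_labels)
print(f"within-group SS by position: {position_ss:.2f}")
print(f"within-group SS by cluster:  {cluster_ss:.2f}")
```

Because the synthetic players were constructed so that running profiles cut across positions, the cluster-based grouping yields the smaller within-group sum of squares here. With real match data the size of that gap is an empirical question, which is what the study tests.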







