Soccer players are frequently categorized by playing position, both in the scientific literature and in practice. However, the utility of this approach for evaluating physical match performance and optimizing physical training programs remains unclear. This study compares the effectiveness of categorizing soccer players by playing position versus by unsupervised machine learning based on match-specific running performance. Match-specific running data were collected from 40 young elite male soccer players over two seasons. Thirty-one of these players completed a 20-meter sprint test and a maximal incremental treadmill test to measure maximal oxygen uptake. Players were categorized both by playing position and by subgroups derived through k-means clustering of match-specific running performance. Sprint capacity, endurance capacity, and match-specific running performance were compared between and within playing positions, as well as between and within clusters. The two categorization methods were further compared for variance within subgroups and standardized differences between subgroups for total distance (TD), low-intensity running (LIR), moderate-intensity running (MIR), high-intensity running (HIR), and sprint distance during matches. Match-specific running performance differed between playing positions, despite notable inter-individual differences in running intensities within playing positions. Compared to grouping by playing position, clustering based on match-specific running performance yielded less variance within groups (TD: P = 0.049, LIR: P = 0.032, HIR: P = 0.033) and larger standardized differences between groups (LIR: P = 0.037, MIR: P = 0.041, HIR: P = 0.035, Sprint: P = 0.018). Moreover, 20-meter sprint speed differed between the sprint and high-intensity endurance clusters (25.22 vs 23.75 km/h, P = 0.012), but not between playing positions.
Using unsupervised machine learning to categorize soccer players improves the identification of player groups with similar match-specific running performance, thereby supporting performance evaluation and contributing to the optimization of physical training.
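
To make the clustering step concrete, the following is a minimal sketch of k-means applied to per-player running profiles (TD, LIR, MIR, HIR, and sprint distance). It is not the authors' implementation. The player values are invented for illustration, the number of clusters is set to two only for brevity, and the z-score standardization and the deterministic farthest-point initialization are assumptions made here for reproducibility. The helper names (`zscore_columns`, `sq_dist`, `kmeans`) are hypothetical.

```python
import math

def zscore_columns(rows):
    """Standardize each column to mean 0 and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with farthest-point initialization (an assumption here)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    labels = None
    for _ in range(n_iter):
        # Assignment step: each player goes to the nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical per-match averages in meters: TD, LIR, MIR, HIR, sprint distance.
players = [
    [9800, 7000, 1500, 800, 500],
    [9900, 7050, 1520, 820, 510],
    [9700, 6950, 1480, 780, 490],
    [11500, 7200, 2500, 1500, 300],
    [11600, 7250, 2520, 1520, 310],
    [11400, 7150, 2480, 1480, 290],
]

labels, centroids = kmeans(zscore_columns(players), k=2)
print(labels)
```

Standardizing first keeps total distance, which is numerically much larger than sprint distance, from dominating the Euclidean distances. In practice the number of clusters would be chosen from the data rather than fixed in advance.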