Review article - (2025) 24, 543-554
DOI: https://doi.org/10.52082/jssm.2025.543
Impact of Potential Moderating Factors on Absolute Test-Retest Reliability of Grip Strength Measurements in Healthy Populations: A Systematic Review with Meta-Analysis
Takashi Abe1,2, Jun Seob Song3, Scott J. Dankel4, Ricardo B. Viana5, Akemi Abe2, Jeremy P. Loenneke6
1Graduate School of Health and Sports Science, Institute of Health and Sports Science & Medicine, Juntendo University, Chiba, Japan
2Division of Children’s Health and Exercise Research, Institute of Trainology, Fukuoka, Japan
3Department of Counseling, Health and Kinesiology, Texas A&M University-San Antonio, San Antonio, TX, USA
4Department of Health and Exercise Science, Rowan University, Glassboro, NJ, USA
5Human Anatomy Laboratory, Institute of Physical Education and Sport, Federal University of Ceará, Fortaleza, Ceará, Brazil
6Department of Health, Exercise Science, and Recreation Management, Kevser Ermin Applied Physiology Laboratory, The University of Mississippi, Oxford, MS, USA

Takashi Abe
✉ Institute of Health and Sports Science & Medicine, Juntendo University, 1-1 Hirakagakuendai, Inzai-shi, Chiba 270-1695, Japan
Email: t12abe@gmail.com
Received: 18-04-2025 -- Accepted: 25-06-2025
Published (online): 01-09-2025

ABSTRACT

Grip strength, a biomarker, can be measured at any age; however, its values vary daily for each individual, which impacts the assessment. Absolute test-retest reliability (i.e., minimal difference, MD) is commonly defined as the variation in absolute values of measurements taken by a single person or instrument on the same item under identical conditions. Nevertheless, the potential moderators of absolute repeatability in grip strength measurements have not yet been fully elucidated. We conducted a systematic review with meta-analysis to examine the influence of potential moderating factors on the absolute test-retest repeatability of grip strength measurements in healthy populations. PubMed, Scopus, and SPORTDiscus databases were searched up to January 2025 following the PRISMA guidelines, and 48 studies were included in this review. Age, test-retest interval, and device were used as potential moderating factors; however, sex and sports experience were excluded due to the limited number of published articles. We found considerable variation among studies reporting MD and percentage of MD to measured value (%MD) across each age group. The mean MD (%MD) values were 1.9 kg (25.4%) in young children (<7 years old), 2.5 kg (13.8%) in children (7-10 years old), 4.2 kg (17.1%) in adolescents (10-18 years old), 4.0 kg (11.6%) in young adults (18-35 years old), and 4.7 kg (16.7%) in older adults (>60 years old). Neither age [effect size [ES]: 0.015 (95% confidence interval [CI]: -0.004, 0.035; p = 0.113) for MD and ES: -0.025 (95% CI: -0.089, 0.039; p = 0.439) for %MD], test-retest interval [ES: 0.006 (95% CI: -0.002, 0.013; p = 0.143) for MD and ES: 0.022 (95% CI: -0.001, 0.046; p = 0.065) for %MD] nor handgrip device (p = 0.752 for MD and p = 0.334 for %MD) served as significant moderators of MD and %MD reliability. Due to the limited number of studies, sex and sports experience were excluded from the analysis; as a result, their impacts remain unknown.

Key words: Dynamometer, handgrip, reproducibility, peak muscular strength

Key Points
  • In previous studies, the intraclass correlation coefficient (ICC) is often the preferred method for reporting measurement reliability.
  • One major limitation with reporting ICC values is that they are entirely dependent on the heterogeneity of the sample included in the reliability assessment.
  • While there are certainly instances where relative reliability may be necessary, often absolute reliability is preferred and more useful.
  • We found considerable variation among studies reporting absolute test-retest reliability of grip strength tests, such as minimal differences (MD) and the percentage of MD to the measured value across each age group.
  • Neither age, test-retest interval, nor handgrip device served as a significant moderator of MD and percentage of MD reliability.
INTRODUCTION

Grip (or handgrip) strength is an extensively used biomarker in research and clinical practice within the health, sports science, nutrition, and medical fields (Abe et al., 2022; Bohannon, 2015; Bohannon, 2019; Norman et al., 2011). An online literature search (i.e., grip strength as a keyword) using PubMed identified over 48,000 publications at a rate of over 3,500 per year in the last five years. These publications include scientific literature and guidelines discussing grip strength's reference values in each age group in both sexes (Abe et al., 2016; Hanten et al., 1999; Ramirez-Velez et al., 2021) and its association with current and future health (Celis-Morales et al., 2017; Peralta et al., 2023; Rantanen et al., 2003). For example, grip strength increases dramatically from preschool children to young adults, maintains stability in middle age, and then declines in old age (Abe et al., 2024; Loenneke et al., 2024; Stenholm et al., 2012). In children and adolescents, grip strength may be a valuable indicator of bone health that improves with growth (Saint-Maurice et al., 2018). Grip strength is also used as a criterion for diagnosing sarcopenia in middle-aged and older men and women (Cruz-Jentoft et al., 2019). However, individual grip strength varies daily, and the degree of these changes may differ depending on age, sex, device type, and sports experience (i.e., athletes). For example, assuming similar daily variation in each individual, the absolute test-retest reliability of grip strength measurements is expected to differ, being lower in children and older adults with low grip strength levels than in younger adults. The same is true for both men and women in adolescence and beyond. Additionally, athletes in sports may be better equipped to consistently exert maximum muscle strength. Thus, measurement error should be considered when comparing the measured grip strength values with the evaluation or diagnosis criteria.

As is the case with many studies assessing reliability in the exercise science literature, the intraclass correlation coefficient (ICC) appears to be the preferred method for reporting the reliability of grip strength measurements (Bobos et al., 2020; Bohannon, 2017). One major limitation of reporting ICC values is that they are entirely dependent on the heterogeneity of the sample included in the reliability assessment (i.e., between-subject variability), given that the ICC, in its general form, is calculated with the following formula:

ICC = between-subject variability ÷ (between-subject variability + error)

Thus, if the recruited sample is very homogeneous (low between-subject variability), the ICC values will likely be small, suggesting poor reliability, even if the absolute test-retest reliability is good. Conversely, if the recruited sample is very heterogeneous (high between-subject variability), the ICC values will likely be high, suggesting good reliability, even if the absolute test-retest reliability is poor (Weir, 2005). While there are certainly instances where relative reliability may be important (e.g., epidemiologic studies or correlations between variables), absolute reliability (i.e., standard error of measurement (SEM) and minimal difference (MD)) is often preferred and more useful (Weir, 2005). For example, whenever repeated measures are used (e.g., training interventions), individuals are only compared to themselves, so factoring in between-subject variability is not important. However, good absolute test-retest reliability will reduce the variability (i.e., error) among repeated measures, improving the ability to detect true changes by reducing the denominator in the test statistic. The same holds true for detecting differences between groups, where better absolute test-retest reliability within each group is preferred to lower the pooled standard error. Despite the importance of establishing absolute test-retest reliability of grip strength, it is unclear to what extent it varies with age, other physical factors, and measurement methods (e.g., time interval, device used). Therefore, this systematic review with meta-analysis examined the impact of potential moderators on absolute test-retest repeatability of grip strength measurements in healthy populations.
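The dependence of the ICC on sample heterogeneity can be illustrated with a minimal sketch. It assumes the simplified variance-ratio form of the ICC (between-subject variance divided by between-subject plus error variance), and every number in it is hypothetical rather than taken from an included study. The function names are illustrative only.

```python
import math


def icc_from_variances(between_sd: float, error_sd: float) -> float:
    """Simplified ICC: between-subject variance divided by total variance."""
    between_var = between_sd ** 2
    error_var = error_sd ** 2
    return between_var / (between_var + error_var)


def minimal_difference(sem: float) -> float:
    """MD = SEM x 1.96 x sqrt(2)."""
    return sem * 1.96 * math.sqrt(2)


# Hypothetical values (kg): the measurement error is identical in both samples.
ERROR_SD = 1.5
SAMPLES = {"homogeneous": 2.0, "heterogeneous": 10.0}  # between-subject SD

for label, between_sd in SAMPLES.items():
    icc = icc_from_variances(between_sd, ERROR_SD)
    # In this simplified model the SEM equals the error SD.
    md = minimal_difference(ERROR_SD)
    print(f"{label}: ICC = {icc:.3f}, MD = {md:.2f} kg")
```

Under these assumed values, the ICC moves from 0.640 to 0.978 while the MD stays at about 4.16 kg in both samples, which is the pattern the paragraph above describes: relative reliability changes with the sample, absolute reliability does not.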

METHODS

We conducted this systematic review according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Page et al., 2021). The study was pre-registered (January 8, 2025) in the International Prospective Register of Systematic Reviews (PROSPERO) (CRD42025635760).

English-language searches of the electronic databases PubMed, Scopus, and SPORTDiscus were conducted from inception to January 9, 2025, by a researcher (J.S.S.). Relevant articles were retrieved from electronic databases combining the following terms: (handgrip OR grip) AND (reliability OR retest OR reproducibility OR repeatability). Initially, all files were extracted from databases in either RIS (Scopus and SPORTDiscus) or NBIB (PubMed) format. The files were then uploaded into Rayyan software, where duplicates were eliminated. Subsequently, two reviewers (T.A. and J.S.S.) independently checked the titles and abstracts of identified articles for relevance. The reviewers then independently reviewed the full text of potentially eligible papers. Any disagreements between the reviewers on inclusion were resolved by a consensus between both reviewers. Additional articles were identified via hand-searching and reviewing the reference list of relevant papers. The study selection process is summarized using the PRISMA flow diagram (Figure 1).

To be included in this systematic review, studies were required to fulfill the following criteria: (1) a published original study written in English; (2) healthy participants with no restrictions on age, sex, and physical activity/training status; (3) measured maximum handgrip strength using a standardized handgrip dynamometer at both test and retest, with the same investigator conducting the measurements (i.e., intra-observer); and (4) reported absolute reliability (i.e., MD) or provided data needed to calculate absolute reliability (e.g., standard error of measurement, standard deviation of test-retest mean difference). Studies were included if they targeted healthy participants based on the title and abstract of the articles, but were excluded if they targeted participants with any diseases. When reliability information was not available in the title and abstract, we examined the characteristics of the participants in the articles. If there was no mention of chronic diseases among a study's participants, they were considered "healthy individuals." If a study did not report absolute reliability, we calculated the MD from either the standard deviation of the test-retest mean difference (SDd) (Equation 1) or the SEM (Equation 2).

Equation 1: MD = SDd × 1.96

Equation 2: MD = SEM × 1.96 × √2

If a study reported the test-retest pooled SD with intraclass correlation (ICC), we first calculated the SEM using Equation 3. The calculated SEM was then used to determine the MD using Equation 2. For studies that did not report the test-retest pooled SD, we used the SD of test (pre-test), assuming that test-retest pooled SD and test SD would be similar, as both tests were completed by the same individuals.

Equation 3: SEM = test & retest pooled SD (or test SD) × √ (1 – ICC)

The percentage MD (%MD) was calculated following Equation 4. When the study did not report test-retest pooled values, the mean grip strength was calculated as average using the test and retest values.

Equation 4: %MD = MD ÷ mean test & retest grip strength × 100
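Equations 1 to 4 can be sketched as a few small functions. This is a minimal illustration rather than the authors' analysis code: the function names are hypothetical, and the example study values (SD, ICC, test and retest means) are invented for demonstration.

```python
import math

Z_95 = 1.96  # multiplier used in Equations 1 and 2


def md_from_sd_diff(sd_diff: float) -> float:
    """Equation 1: MD from the SD of the test-retest differences (SDd)."""
    return sd_diff * Z_95


def md_from_sem(sem: float) -> float:
    """Equation 2: MD from the standard error of measurement."""
    return sem * Z_95 * math.sqrt(2)


def sem_from_icc(sd: float, icc: float) -> float:
    """Equation 3: SEM from the pooled (or test) SD and the ICC."""
    return sd * math.sqrt(1 - icc)


def percent_md(md: float, mean_grip: float) -> float:
    """Equation 4: MD as a percentage of mean test-retest grip strength."""
    return md / mean_grip * 100


# Hypothetical study reporting only SD, ICC, and test/retest means (kg).
sd, icc = 8.0, 0.95
test_mean, retest_mean = 35.0, 36.0

sem = sem_from_icc(sd, icc)
md = md_from_sem(sem)
mean_grip = (test_mean + retest_mean) / 2  # average of test and retest values
print(f"SEM = {sem:.2f} kg, MD = {md:.2f} kg, %MD = {percent_md(md, mean_grip):.1f}%")
```

With these assumed inputs the MD comes out at roughly 4.96 kg, or about 14.0% of the mean. Equations 1 and 2 agree whenever SDd = SEM × √2, so `md_from_sd_diff(sem * math.sqrt(2))` returns the same value as `md_from_sem(sem)`.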

The following study characteristics were extracted: authors, publication year, participants’ characteristics (age, sex, and health status), sample size, handgrip device, time interval between test and retest, handgrip strength at test and retest (with SD), test and retest mean difference (with SD), ICC, SEM, and MD. Two researchers (T.A. and J.S.S.) extracted these data manually, with disagreement resolved by consensus between both researchers. To standardize grip strength values to kilogram units, those reported in newtons were converted as follows: 1 N = 0.10197 kg. In studies reporting the sum of grip strength values for both the left and right hands, the MD value was divided by 2 for statistical processing, as many studies used grip strength values and MD values for one hand.
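The two standardization steps described above (newtons to kilograms, and halving an MD that was reported for the sum of both hands) amount to the following sketch. The function names and example inputs are hypothetical; only the conversion factor comes from the text.

```python
N_TO_KG = 0.10197  # conversion factor stated in the text: 1 N = 0.10197 kg


def newtons_to_kg(value_n: float) -> float:
    """Convert a grip strength value reported in newtons to kilograms."""
    return value_n * N_TO_KG


def per_hand_md(md_both_hands: float) -> float:
    """Halve an MD that was reported for the sum of left and right hands."""
    return md_both_hands / 2


# Hypothetical inputs for illustration.
print(f"{newtons_to_kg(400.0):.2f} kg")  # a 400 N grip strength value
print(f"{per_hand_md(8.0):.1f} kg")      # an 8 kg MD for summed hands
```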

A modified version of the critical appraisal tool was utilized to evaluate the methodological quality of the studies included in this review (Brink and Louw, 2011), and two researchers (S.J.D. and J.S.S.) independently evaluated the included studies. Seven relevant items were extracted from the modified checklist: 1) Subject characteristics were clearly described, 2) The competence of the raters was explicitly detailed, 3) Raters were blinded to their previous findings, 4) The time interval between repeated measures was appropriate, 5) The execution of the test was described in sufficient detail to allow replication, 6) Study participants' withdrawals were clearly explained, 7) The statistical methods were suitable for the study's objectives. Other items were not included as they were not considered relevant for this review. The score for each item was determined as follows: 1 = yes; 0 = no. Consequently, the maximal possible score was 7.

To account for the dependency of multiple effect sizes nested within individual studies, a multi-level model was employed using the metafor package (version 4.6-0) in RStudio (version: 2024.12.1 + 563) (Assink and Wibbelink, 2016). Three models were computed to assess 1) the MD, 2) the %MD, and 3) systematic bias (test 2 - test 1). Since the effect sizes of interest were computed from variability statistics to assess reliability, each of the studies was weighted based on the sample size as we have done previously (Dankel et al., 2019). Three moderating variables were also assessed for the MD and %MD to determine their influence on reliability: 1) age (continuous: years), 2) time interval between test-retest (continuous: days), and 3) handgrip device used (categorical: Jamar, Takei, other). For systematic bias (calculated as the change from test 1 to test 2), age was used as a moderator to determine if children or younger adults may have experienced a greater learning effect. Sex and sports experience were not used as moderating variables for any of the analyses, given the limited number of studies that assessed males and females separately and the limited number of studies assessing athletes. In summarizing the results (Table 1, Table 2, and Table 3), the following age ranges were used for age categorization: young children (under 7 years old), children (between 7-10 years old), adolescents (between 10-18 years old), young adults (between 18-35 years old), middle-aged adults (between 36-60 years old), and older adults (>60 years old). Statistical significance was set at p < 0.05.
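The idea of sample-size weighting with a continuous moderator can be illustrated with a deliberately simplified sketch. It is not the multi-level metafor model used in this review: it ignores the nesting of effect sizes within studies and produces no confidence intervals or p-values. The function names and all data points are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Mean of `values`, with each value weighted (here, by sample size)."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_slope(
    x: Sequence[float], y: Sequence[float], weights: Sequence[float]
) -> Tuple[float, float]:
    """Weighted least-squares slope and intercept of y on a moderator x."""
    x_bar = weighted_mean(x, weights)
    y_bar = weighted_mean(y, weights)
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for xi, yi, w in zip(x, y, weights))
    sxx = sum(w * (xi - x_bar) ** 2 for xi, w in zip(x, weights))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data points: mean age (years), MD (kg), and sample size.
age = [6.0, 9.0, 15.0, 25.0, 70.0]
md = [2.0, 2.5, 4.0, 4.0, 5.0]
n = [50, 40, 60, 30, 20]

overall_md = weighted_mean(md, n)
slope, intercept = weighted_slope(age, md, n)
print(f"weighted MD = {overall_md:.2f} kg, change per year of age = {slope:.3f} kg")
```

The slope plays the role of the moderator effect size (change in MD per year of age). A full analysis would add random effects for studies and for effect sizes within studies, as the multi-level model described above does.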

RESULTS
Included studies and participant and research protocol characteristics

The original article search yielded 4,233 studies. Three additional studies were identified from the reference lists of the included articles. After removing duplicates and eliminating articles based on the eligibility criteria, 48 studies (Abe et al., 2018; Abe et al., 2019; Abe et al., 2022; Amado-Pacheco et al., 2019; Anstey et al., 1997; Balogun et al., 1991; Beauchamp et al., 2021; Biasini et al., 2023; Bohannon, 2006; Bohannon and Schaubert, 2005; Bohannon et al., 2011; Boshnjaku et al., 2021; Cadenas-Sanchez et al., 2016; Cildan Uysal et al., 2022; Dugdale et al., 2019; Espana-Romero et al., 2010; Essendrop et al., 2001; Fernandez-Santos et al., 2016; Ferreira et al., 2021; Gasior et al., 2020; Gerodimos, 2012; Gerodimos and Karatrantou, 2013; Gil et al., 2022; Hamilton et al., 1992; Jenkins and Cramer, 2017; Karatrantou et al., 2020; Kieser et al., 2025; King-Dowling et al., 2024; Legg et al., 2020; Lemmink et al., 2001; Leszczak et al., 2024; Maurya et al., 2023; O’Keeffe et al., 2020; Ortega et al., 2008; Petersen et al., 2015; Plant et al., 2016; Ramirez-Velez et al., 2015; Sanchez-Delgado et al., 2015; Savva et al., 2013; Suzuki et al., 2019; Svensson et al., 2008; Tan et al., 2001; Trajkovic et al., 2024; Tsang, 2005; Venegas-Carro et al., 2022; Villafane et al., 2016; Walamies and Turjanmaa, 1993; Ward and Adams, 2007) were included in this review. Of those studies, 16 included 42 data points assigned to children and adolescents (852 boys, 294 girls, and 879 mixed) (Abe et al., 2022; Amado-Pacheco et al., 2019; Cadenas-Sanchez et al., 2016; Dugdale et al., 2019; Espana-Romero et al., 2010; Fernandez-Santos et al., 2016; Gasior et al., 2020; Gerodimos, 2012; Gerodimos and Karatrantou, 2013; King-Dowling et al., 2024; O’Keeffe et al., 2020; Ortega et al., 2008; Ramirez-Velez et al., 2015; Sanchez-Delgado et al., 2015; Svensson et al., 2008; Trajkovic et al., 2024) (Table 1), 25 included 50 data points assigned to young and middle-aged adults (98 men, 109 women, and 1,244 mixed) (Abe et al., 2018; Abe et al., 2019; Balogun et al., 1991; Beauchamp et al., 2021; Biasini et al., 2023; Bohannon, 2006; Bohannon et al., 2011; Boshnjaku et al., 2021; Cildan Uysal et al., 2022; Essendrop et al., 2001; Gerodimos, 2012; Gil et al., 2022; Hamilton et al., 1992; Karatrantou et al., 2020; Kieser et al., 2025; Leszczak et al., 2024; Maurya et al., 2023; Petersen et al., 2015; Plant et al., 2016; Savva et al., 2013; Tan et al., 2001; Tsang, 2005; Venegas-Carro et al., 2022; Walamies and Turjanmaa, 1993; Ward and Adams, 2007) (Table 3), and 12 included 23 data points assigned to older adults (166 men, 292 women, and 1,046 mixed) (Abe et al., 2018; Anstey et al., 1997; Beauchamp et al., 2021; Bohannon and Schaubert, 2005; Boshnjaku et al., 2021; Ferreira et al., 2021; Gil et al., 2022; Jenkins and Cramer, 2017; Legg et al., 2020; Lemmink et al., 2001; Suzuki et al., 2019; Villafane et al., 2016) (Table 2). The following studies (Bohannon, 2006; Bohannon et al., 2011; Kieser et al., 2025; Plant et al., 2016; Tsang, 2005) included participants spanning a broad range of adults (young, middle-aged, and older) and were therefore included in Table 3.

The main dynamometers used to measure grip strength were Jamar (Abe et al., 2019; Bohannon and Schaubert, 2005; Bohannon et al., 2011; Boshnjaku et al., 2021; Essendrop et al., 2001; Gasior et al., 2020; Gerodimos, 2012; Gerodimos and Karatrantou, 2013; Hamilton et al., 1992; Jenkins and Cramer, 2017; Karatrantou et al., 2020; Legg et al., 2020; Lemmink et al., 2001; Savva et al., 2013; Trajkovic et al., 2024; Tsang, 2005; Venegas-Carro et al., 2022; Villafane et al., 2016; Ward and Adams, 2007) and Takei (Abe et al., 2018; Abe et al., 2019; Abe et al., 2022; Cadenas-Sanchez et al., 2016; Dugdale et al., 2019; Fernandez-Santos et al., 2016; Gerodimos and Karatrantou, 2013; King-Dowling et al., 2024; O’Keeffe et al., 2020; Ortega et al., 2008; Petersen et al., 2015; Ramirez-Velez et al., 2015; Sanchez-Delgado et al., 2015; Suzuki et al., 2019; Tan et al., 2001; Trajkovic et al., 2024). Other studies used different types of dynamometers, such as Grippit (Svensson et al., 2008), JTECH (Biasini et al., 2023; Plant et al., 2016), and MicroFET (O’Keeffe et al., 2020). Two studies did not report the type of dynamometers (Amado-Pacheco et al., 2019; Beauchamp et al., 2021).

The most commonly used test-retest intervals were 24 hours (Abe et al., 2018; Abe et al., 2019; Bohannon, 2006; Gasior et al., 2020; Gerodimos, 2012; Gerodimos and Karatrantou, 2013; Tan et al., 2001; Ward and Adams, 2007) or 7 days (Abe et al., 2022; Beauchamp et al., 2021; Espana-Romero et al., 2010; Essendrop et al., 2001; Fernandez-Santos et al., 2016; Ferreira et al., 2021; Gil et al., 2022; Hamilton et al., 1992; Lemmink et al., 2001; O’Keeffe et al., 2020; Petersen et al., 2015; Ramirez-Velez et al., 2015; Savva et al., 2013; Svensson et al., 2008; Venegas-Carro et al., 2022; Villafane et al., 2016), with several studies using 2 weeks (Amado-Pacheco et al., 2019; Cadenas-Sanchez et al., 2016; Hamilton et al., 1992; Leszczak et al., 2024; Ortega et al., 2008). In 41 of the 48 studies, the test-retest interval was 2 weeks or less. Nine studies had a range of test-retest intervals that were not consistent, such as within 7 days (Balogun et al., 1991; Biasini et al., 2023; Maurya et al., 2023) or 2-10 days (Kieser et al., 2025).

Assessment of methodological quality

The mean score was 4.2 out of 7 (range: 2-7), indicating a methodological quality rating that varied from low to high (Supplementary Table 1). Seventeen of the 48 studies scored 5 or higher, while 10 received scores of 2 or 3.

Impact of potential moderators on absolute test-retest reliability of grip strength measurements

Nine of the 48 studies did not report the mean age of participants, only an age range (Dugdale et al., 2019; Espana-Romero et al., 2010; Gasior et al., 2020; Plant et al., 2016; Sanchez-Delgado et al., 2015; Suzuki et al., 2019; Svensson et al., 2008; Walamies and Turjanmaa, 1993; Ward and Adams, 2007). There was considerable variation among studies reporting absolute test-retest reliability (MD and %MD) in each age group (Figure 2 and 4). Specifically, the mean MD value for young children (under 7 years old) was 1.9 kg (Abe et al., 2022; Amado-Pacheco et al., 2019; Cadenas-Sanchez et al., 2016; King-Dowling et al., 2024; Sanchez-Delgado et al., 2015; Svensson et al., 2008), while it was 2.5 kg for children aged 7 to 10 (Espana-Romero et al., 2010; Fernandez-Santos et al., 2016; Gerodimos, 2012; Gerodimos and Karatrantou, 2013; Gasior et al., 2020). The mean MD for adolescents (10-18 years old) was 4.2 kg (Dugdale et al., 2019; Espana-Romero et al., 2010; Gasior et al., 2020; Gerodimos, 2012; Gerodimos and Karatrantou, 2013; O’Keeffe et al., 2020; Ortega et al., 2008; Ramirez-Velez et al., 2015; Svensson et al., 2008; Trajkovic et al., 2024), which is similar to young adults (18-35 years old; 4.0 kg) (Balogun et al., 1991; Biasini et al., 2023; Boshnjaku et al., 2021; Cildan Uysal et al., 2022; Gerodimos, 2012; Gil et al., 2022; Hamilton et al., 1992; Karatrantou et al., 2020; Leszczak et al., 2024; Maurya et al., 2023; Petersen et al., 2015; Savva et al., 2013; Venegas-Carro et al., 2022). Middle-aged (36-60 years old) and older (>60 years old) adults had an MD of approximately 5-6 kg. On the other hand, the mean %MD values in young children and adolescents were approximately 25% and 17%, respectively, while those in young and older adults were about 12% and 17%, respectively (Table 4).

Mean weighted reliability statistics

The overall weighted MD was 4.463 kg (95% confidence interval [CI]: 3.926, 4.999; p < 0.001). As there was significant heterogeneity (Q = 36,484.970; p < 0.001) that could be attributed to both within-study (33.9%) and between-study (65.7%) variance, potential moderators were examined. Neither age [effect size [ES]: 0.015 (95% CI: -0.004, 0.035; p = 0.113)], test-retest interval [ES: 0.006 (95% CI: -0.002, 0.013; p = 0.143)], nor handgrip device (p = 0.752) was a significant moderator of reliability. The overall weighted %MD was 16.307% (95% CI: 14.529, 18.085; p < 0.001). As with the absolute MD, neither age [ES: -0.025 (95% CI: -0.089, 0.039; p = 0.439)], test-retest interval [ES: 0.022 (95% CI: -0.001, 0.046; p = 0.065)], nor handgrip device (p = 0.334) was a significant moderator of reliability. There was also no apparent systematic bias [ES: 0.162 (95% CI: -0.139, 0.464; p = 0.291)], and the presence of systematic bias was not moderated by age [ES: 0.005 (95% CI: -0.006, 0.017; p = 0.380)], suggesting that there was no learning effect and that this did not differ based on age.

Four studies reported absolute reliability for both boys and girls (Amado-Pacheco et al., 2019; Cadenas-Sanchez et al., 2016; Ortega et al., 2008; Ramirez-Velez et al., 2015; Sanchez-Delgado et al., 2015), while three studies focused on adult men and women (Bohannon, 2006; Karatrantou et al., 2020; Maurya et al., 2023). The mean MD value was 3.0 kg for boys and 2.5 kg for girls, with %MD values of 24.2% and 23.4%, respectively. In adults, the mean MD values were 3.9 kg for women and 5.4 kg for men, with %MD values of 17.3% for women and 15.3% for men (Table 4).

No studies have compared the absolute reliability of grip strength measurements between participants with and without sports experience. However, studies have been done on pre-pubertal, pubertal, and young adult basketball players (Gerodimos, 2012), pre-pubertal and pubertal wrestlers (Gerodimos and Karatrantou, 2013), youth soccer players (Dugdale et al., 2019), and middle-aged ten-pin bowlers (Tan et al., 2001). Moreover, twenty-two studies measured grip strength in both the left and right hands; half (11 studies) compared dominant and non-dominant hands, and the remaining 11 studies were able to compare right and left hands (Table 4).

DISCUSSION

The current manuscript investigated the impact of potential moderating factors on the absolute test-retest reliability of grip strength measurements in a healthy population. This systematic review with meta-analysis included 48 studies involving 4,980 healthy participants (i.e., 2,025 children and adolescents, 1,451 young and middle-aged adults, and 1,504 older adults). Our findings demonstrated that (1) there was considerable variation among studies reporting MD and %MD across each age group; (2) the mean MD (%MD) values were 1.9 kg (25.4%) in young children (<7 years old), 2.5 kg (13.8%) in children (7-10 years old), 4.2 kg (17.1%) in adolescents (10-18 years old), 4.0 kg (11.6%) in young adults (18-35 years old), and 4.7 kg (16.7%) in older adults (>60 years old); (3) no studies have compared the MD and %MD between participants with and without sports experience; (4) neither age, test-retest interval, nor handgrip device served as significant moderators of MD and %MD reliability.

In this study, our meta-analysis found no evidence that the MD and %MD in test-retest reliability for grip strength measurements were influenced by age. One possible reason is the considerable variation in MD and %MD among studies within each age group (Figure 2 and Table 4). Nonetheless, the mean %MD for the reliability of test-retest grip strength measures in each age group is distinctive and partly similar to the %MD observed for muscle strength measures other than grip strength. For instance, maximal voluntary isokinetic muscle strength is a standard outcome measure for assessing knee joint function. A study measured knee extension and flexion peak torque at an angular velocity of 60 degrees per second across two sessions, 7 days apart, involving 22 children (10 boys and 12 girls) with a mean age of 8.8 years (Fagher et al., 2016). The MD and %MD calculated from the SEM of the test-retest reliability reported by the authors were 15.2 Nm and 30.9% for knee extension and 9.7 Nm and 36.1% for knee flexion, respectively. Another study (Santos et al., 2013) also assessed the test-retest reliability (7 days apart) of knee extension and flexion peak torque (60 degrees per second) in children with a mean age of 8.5 years. The %MD calculated from SEM was 29.3% for the dominant leg, 33.1% for the non-dominant leg for knee extension, and 46.2% and 32.6%, respectively, for knee flexion. The %MD values in these studies were similar to those observed in young children (< 7 years old) for grip strength measurements. Considering that the mean %MD of grip strength measurements in children of the same age group (7-10 years old) was 13.8%, the %MD of the isokinetic strength measure may appear high. 
In addition, a study (Maffiuletti et al., 2007) investigating the reproducibility (7 days apart) of knee extension and flexion peak torques under the same conditions (60 degrees per second) in young adults found that the %MD values are in the same range (10.7% for knee extension and 8.6% for knee flexion) as those observed in grip strength measurements in young adults (11.6%).

Eighty-five percent (41 studies) of the 48 included studies had a test-retest interval of two weeks or less, with 7 days being the most common (16 studies). This narrow range may explain why the test-retest interval did not affect the MD and %MD of grip strength measurements. These results suggest that, at least within two weeks, the length of the test-retest interval may not significantly affect the MD and %MD of grip strength measurements. Three included studies reported test-retest reliability at two different intervals: 7 days vs. 9 weeks (Venegas-Carro et al., 2022), 24 hours vs. 1 year (Abe et al., 2018), and 12 weeks vs. 24 weeks (Jenkins and Cramer, 2017). For instance, Venegas-Carro et al. (2022) reported that MD and %MD values nearly doubled at a 9-week interval (5.7 kg and 11.8%) compared with a 7-day interval (3.1 kg and 6.6%). Abe et al. (2018) observed that, although this test was performed on a different population, the MD and %MD values were greater at a 1-year interval (6.4 kg and 21.1%) than at a 24-hour interval (3.97 kg and 10.1%). However, Jenkins and Cramer (2017) reported similar MD values at 12- and 24-week intervals, making it unclear whether and at what point extending the interval affects grip strength reproducibility. Grip strength is one part of a physical fitness test taken annually by children and adolescents. Future studies may clarify the impact of extending the test-retest interval on the reproducibility of grip strength measurements.

About 70% of the included studies used either the Jamar hand dynamometer, regarded as the gold standard, or the Takei dynamometer. Although the standardized measurement conditions differ between the Jamar and the Takei (sitting vs. standing, elbows at 90 degrees vs. extended, five grip widths vs. adjustments for hand size), the type of device did not affect MD and %MD in grip strength measurements. Furthermore, several studies examining the measurement accuracy of different handgrip dynamometers using the Jamar as a benchmark also reported a good correlation between the devices (Cildan Uysal et al., 2022; Hamilton et al., 1992; Trajkovic et al., 2024). However, the mechanical systems differ between the Jamar (hydraulic) and the Takei (Smedley, spring-type), and the values measured with the Takei have been observed to differ from those measured with the Jamar in participants with high grip strength (Abe et al., 2019). Studies reporting test-retest reliability of grip strength measurement in young children are limited to those using the Takei dynamometer.

In this study, sex and sports experience were not used as moderating variables in meta-analyses due to the limited number of studies that assessed boys and girls or men and women separately and the limited number of studies that assessed athletes. The difference in grip strength between boys and girls in children under 10 is less than that observed in adult men and women (Ramirez-Velez et al., 2021). Thus, MD may be similar in younger children when %MD is identical in both sexes. In adults, there is a clear sex difference in average grip strength, and when the %MD is the same for both sexes, it is expected that the MD of women with lower grip strength will be smaller than that of men. The results of included studies reporting MD and %MD separately for boys and girls or men and women suggest this possibility (Table 4), although future studies are needed. In addition, no studies have compared MD and %MD between athletes and non-athletes. Further research may be required to clarify whether sports experience affects the test-retest reliability of grip strength measurements.

CONCLUSION

The data analyzed from the collected studies found considerable variation among studies reporting MD and the percentage of MD to measured value (%MD) across each age group. Neither age, test-retest interval nor handgrip device served as significant moderators of MD and %MD reliability. Due to the limited number of studies, sex and sports experience were excluded from the analysis; as a result, their impacts remain unknown.

ACKNOWLEDGEMENTS

The authors have no conflict of interest to declare. This study received no specific grants, fellowships, or material gifts from any funding agency in the public, commercial, or not-for-profit sectors. The datasets generated and/or analyzed during the current study are not publicly available but are available from the corresponding author, who organized the study.

AUTHOR BIOGRAPHY

Takashi Abe
Employment: Institute of Health and Sports Science & Medicine, Juntendo University
Degree: PhD
Research interests: Exercise physiology and adaptation by exercise training
E-mail: t12abe@gmail.com

Jun Seob Song
Employment: Department of Counseling, Health and Kinesiology, Texas A&M University-San Antonio
Degree: PhD
Research interests: Skeletal muscle physiology
E-mail: jsong1@tamusa.edu

Scott J. Dankel
Employment: Department of Health and Exercise Science, Rowan University
Degree: PhD
Research interests: Exercise physiology and statistics
E-mail: dankel47@rowan.edu

Ricardo B. Viana
Employment: Institute of Physical Education and Sport, Federal University of Ceará
Degree: PhD
Research interests: Exercise physiology and human anatomy
E-mail: vianaricardoborges@ufc.br

Akemi Abe
Employment: Division of Children’s Health and Exercise Research, Institute of Trainology
Degree: BS
Research interests: Active play and health in children
E-mail: amyabe3379@gmail.com

Jeremy P. Loenneke
Employment: Department of Health, Exercise Science, and Recreation Management, The University of Mississippi
Degree: PhD
Research interests: Skeletal muscle physiology
E-mail: jploenne@olemiss.edu

REFERENCES
Abe T., Thiebaud R. S., Loenneke J. P. (2016) Age-related change in handgrip strength in men and women: Is muscle quality a contributing factor? Age 38, 28.
Abe T., Dankel S. J., Buckner S. L., Jessee M. B., Mattocks K. T., Mouser J. G., Bell Z. W., Loenneke J. P. (2018) Short term (24 hours) and long term (1 year) assessments of reliability in older adults: Can one replace the other? Journal of Aging Research and Clinical Practice 7, 82-84.
Abe T., Loenneke J. P., Thiebaud R. S., Loftin M. (2019) The bigger the hand, the bigger the difference? Implications for testing strength with 2 popular handgrip dynamometers. Journal of Sport Rehabilitation 28, 278-282.
Abe T., Thiebaud R. S., Ozaki H., Yamasaki S., Loenneke J. P. (2022) Children with low handgrip strength: A narrative review of possible exercise strategies to improve its development. Children (Basel) 9, 1616.
Abe T., Sanui R., Sasaki A., Ishibashi A., Daikai N., Shindo Y., Abe A., Loenneke J. P. (2022) Optimal grip span for measuring maximum handgrip strength in preschool children. International Journal of Clinical Medicine 13, 479-488.
Abe T., Machida S., Nakamura M., Kohmura Y., Suzuki K., Abe A., Nakano M., Loenneke J. P., Naito H. (2024) Tracking handgrip strength in Kendo athletes from university to middle and older adulthood. American Journal of Human Biology 36, e24082.
Amado-Pacheco J. C., Prieto-Benavides D. H., Correa-Bautista J. E., Garcia-Hermoso A., Agostinis-Sobrinho C., Alonso-Martinez A. M., Izquierdo M., Ramirez-Velez R. (2019) Feasibility and reliability of physical fitness tests among Colombian preschool children. International Journal of Environmental Research and Public Health 16, 3069.
Anstey K. J., Smith G. A., Lord S. (1997) Test-retest reliability of a battery of sensory, motor and physiological measures of aging. Perceptual and Motor Skills 84, 831-834.
Assink M., Wibbelink C. J. (2016) Fitting three-level meta-analytic models in R: A step-by-step tutorial. The Quantitative Methods for Psychology 12, 154-174.
Balogun J. A., Adenlola S. A., Akinloye A. A. (1991) Grip strength normative data for the Harpenden dynamometer. Journal of Orthopaedic & Sports Physical Therapy 14, 155-160.
Beauchamp M. K., Hao Q., Kuspinar A., D’Amore C., Scime G., Ma J., Mayhew A., Bassim C., Wolfson C., Kirkland S., Griffith L., Raina P. (2021) Reliability and minimal detectable change values for performance-based measures of physical functioning in Canadian Longitudinal Study on Aging. Journal of Gerontology A Biological Science and Medical Science 76, 2030-2038.
Biasini N. R., Pellegrino M., Switzer-McIntyre S., Kasawara K. T. (2023) Reliability and validity of shoulder and handgrip strength testing. Physiotherapy Canada 75, 65-71.
Bobos P., Nazari G., Lu S., MacDermid J. C. (2020) Measurement properties of the hand grip strength assessment: A systematic review with meta-analysis. Archives of Physical Medicine and Rehabilitation 101, 553-565.
Bohannon R. W. (2015) Muscle strength: Clinical and prognostic value of hand-grip dynamometry. Current Opinion in Clinical Nutrition & Metabolic Care 18, 465-470.
Bohannon R. W. (2017) Test-retest reliability of measurements of hand-grip strength obtained by dynamometry from older adults: A systematic review of research in the PubMed Database. Journal of Frailty & Aging 6, 83-87.
Bohannon R. W. (2019) Grip strength: An indispensable biomarker for older adults. Clinical Interventions in Aging 14, 1681-1691.
Bohannon R. W. (2006) Test-retest reliability of the MicroFET 4 hand-grip dynamometer. Physiotherapy Theory and Practice 22, 219-221.
Bohannon R. W., Schaubert K. L. (2005) Test-retest reliability of grip-strength measures obtained over a 12-week interval from community-dwelling elders. Journal of Hand Therapy 18, 426-427.
Bohannon R. W., Bubela D. J., Magasi S., Gershon R. C. (2011) Relative reliability of three objective tests of limb muscle strength. Isokinetics and Exercise Science 19, 77-81.
Boshnjaku A., Bahtiri A., Feka K., Krasniqi E., Tschan H., Wessner B. (2021) Test-retest reliability data of functional performance, strength, peak torque and body composition assessments in two different age groups of Kosovan adults. Data in Brief 36, 106988.
Brink Y., Louw Q. A. (2011) Clinical instruments: Reliability and validity critical appraisal. Journal of Evaluation in Clinical Practice 18, 1126-1132.
Cadenas-Sanchez C., Martinez-Tellez B., Sanchez-Delgado G., Mora-Gonzalez J., Castro-Pinero J., Lof M., Ruiz J. R., Ortega F. B. (2016) Assessing physical fitness test in preschool children: Feasibility, reliability and practical recommendations for the PREFIT battery. Journal of Science and Medicine in Sport 19, 910-915.
Celis-Morales C. A., Lyall D. M., Anderson J., Iliodromiti S., Fan Y., Ntuk U. E., Mackay D. F., Pell J. P., Sattar N., Gill J. M. R. (2017) The association between physical activity and risk of mortality is modulated by grip strength and cardiorespiratory fitness: Evidence from 498135 UK-Biobank participants. European Heart Journal 38, 116-122.
Cildan Uysal S., Tonak H. A., Kitis A. (2022) Validity, reliability and test-retest study of grip strength measurement in two positions with two dynamometers: Jamar® Plus and K-Force® Grip. Hand Surgery and Rehabilitation 41, 305-310.
Cruz-Jentoft A. J., Bahat G., Bauer J., Boirie Y., Bruyere O., Cederholm T., Cooper C., Landi F., Rolland Y., Sayer A. A., Schneider S. M., Sieber C. C., Topinkova E., Vandewoude M., Visser M., Zamboni M. (2019) Sarcopenia: Revised European consensus on definition and diagnosis. Age and Ageing 48, 16-31.
Dankel S. J., Kang M., Abe T., Loenneke J. P. (2019) A meta-analysis to determine the validity of taking blood pressure using the indirect cuff method. Current Hypertension Reports 21, 11.
Dugdale J. H., Arthur C. A., Sanders D., Hunter A. M. (2019) Reliability and validity of field-based fitness tests in youth soccer players. European Journal of Sport Science 19, 745-756.
Espana-Romero V., Artero E. G., Jimenez-Pavon D., Cuenca-Garcia M., Ortega F. B., Castro-Pinero J., Sjostrom M., Castillo-Garzon M. J., Ruiz J. R. (2010) Assessing health-related fitness tests in the school setting: Reliability, feasibility and safety; The ALPHA Study. International Journal of Sports Medicine 31, 490-497.
Essendrop M., Schibye B., Hansen K. (2001) Reliability of isometric muscle strength tests for the trunk, hands and shoulders. International Journal of Industrial Ergonomics 28, 379-387.
Fagher K., Fritzson A., Drake A. M. (2016) Test-retest reliability of isokinetic knee strength measurements in children aged 8 to 10 years. Sports Health 8, 255-259.
Fernandez-Santos J. R., Ruiz J. R., Gonzalez-Montesinos J. L., Castro-Pinero J. (2016) Reliability and validity of field-based tests to assess upper-body muscular strength in children aged 6-12 years. Pediatric Exercise Science 28, 331-340.
Ferreira S., Raimundo A., Marmeleira J. (2021) Test-retest reliability of the functional reach test and the hand grip strength test in older adults using nursing home services. Irish Journal of Medical Science 190, 1625-1632.
Gasior J. S., Pawlowski M., Jelen P. J., Rameckers E. A., Williams C. A., Makuch R., Werner B. (2020) Test-retest reliability of handgrip strength measurement in children and preadolescents. International Journal of Environmental Research and Public Health 17, 8026.
Gerodimos V. (2012) Reliability of handgrip strength test in basketball players. Journal of Human Kinetics 31, 25-36.
Gerodimos V., Karatrantou K. (2013) Reliability of maximal handgrip strength test in pre-pubertal and pubertal wrestlers. Pediatric Exercise Science 25, 308-322.
Gil A. W., da Silva R. A., Pereira C., Nascimento V. B., Amorim C. F., Imaizumi M., Teixeira D. C. (2022) Reproducibility of dynamometers in handrail format in evaluating handgrip strength and traction in young and older adults. Medical Engineering & Physics 100, 103749.
Hamilton G. F., McDonald C., Chenier T. C. (1992) Measurement of grip strength: Validity and reliability of the sphygmomanometer and Jamar grip dynamometer. Journal of Orthopaedic & Sports Physical Therapy 16, 215-219.
Hanten W. P., Chen W. Y., Austin A. A., Brooks R. E., Carter H. C., Law C. A., Morgan M. K., Sanders D. J., Swan C. A., Vanderslice A. L. (1999) Maximum grip strength in normal subjects from 20 to 64 years of age. Journal of Hand Therapy 12, 193-200.
Jenkins N. D. M., Cramer J. T. (2017) Reliability and minimum detectable change for common clinical physical function tests in sarcopenic men and women. Journal of American Geriatrics Society 65, 839-846.
Karatrantou K., Xagorari A., Vasilopoulou T., Gerodimos V. (2020) Does the number of trials affect the reliability of handgrip strength measurement in individuals with intellectual disability? Hand Surgery and Rehabilitation 39, 223-228.
Kieser J., Langford M., Stover E., Tomkinson G. R., Clark B. C., Cawthon P. M., McGrath R. (2025) Absolute agreement between subjective hand squeeze and objective handgrip strength in adults. Journal of Strength and Conditioning Research 39, 16-23.
King-Dowling S., Fortnum K., Chirico D., Le T., Kwan M. Y. W., Timmons B. W., Cairney J. (2024) Reliability of field- and laboratory-based assessments of health-related fitness in preschool-aged children. American Journal of Human Biology 36, e23987.
Legg H. S., Spindor J., Dziendzielowski R., Sharkey S., Lanovaz J. L., Farthing J. P., Arnold C. M. (2020) The reliability and validity of novel clinical strength measures of the upper body in older adults. Hand Therapy 25, 130-138.
Lemmink K. A. P. M., Han K., de Greef M. H. G., Rispens P., Stevens M. (2001) Reliability of the Groningen fitness test for the elderly. Journal of Aging and Physical Activity 9, 194-212.
Leszczak J., Pniak B., Druzbicki M., Guzik A. (2024) The reliability of a biometrics device as a tool for assessing hand grip and pinch strength, in a Polish cohort - A prospective observational study. PLoS One 19, e0303648.
Loenneke J. P., Abe A., Yamasaki S., Tahara Y., Abe T. (2024) Sex differences in strength during development: Implications for inclusivity and fairness in sport. American Journal of Human Biology 36, e24152.
Maffiuletti N. A., Bizzini M., Desbrosses K., Babault N., Munzinger U. (2007) Reliability of knee extension and flexion measurements using the Con-Trex isokinetic dynamometer. Clinical Physiology and Functional Imaging 27, 346-353.
Maurya P. S., Sisneros K. P., Johnson E. B., Palmer T. B. (2023) Reliability of handgrip strength measurements and their relationship with muscle power. Journal of Sports Medicine and Physical Fitness 63, 805-811.
Norman K., Stobaus N., Gonzalez M. C., Schulzke J. D., Pirlich M. (2011) Hand grip strength: Outcome predictor and marker of nutritional status. Clinical Nutrition 30, 135-142.
O’Keeffe B. T., Donnelly A. E., MacDonncha C. (2020) Test-retest reliability of student-administered health-related fitness tests in school settings. Pediatric Exercise Science 32, 48-57.
Ortega F. B., Artero E. G., Ruiz J. R., Vicente-Rodriguez G., Bergman P., Hagstromer M., Ottevaere C., Nagy E., Konsta O., Rey-Lopez J. P., Polito A., Dietrich S., Plada M., Beghin L., Manios Y., Sjostrom M., Castillo M. J. (2008) Reliability of health-related physical fitness tests in European adolescents. The HELENA Study. International Journal of Obesity (Lond) 32, S49-S57.
Page M. J., McKenzie J. E., Bossuyt P. M., Boutron I., Hoffmann T. C., Mulrow C. D., Shamseer L., Tetzlaff J. M., Moher D. (2021) Updating guidance for reporting systematic reviews: Development of the PRISMA 2020 statement. Journal of Clinical Epidemiology 134, 103-112.
Peralta M., Dias C. M., Marques A., Henriques-Neto D., Sousa-Uva M. (2023) Longitudinal association between grip strength and the risk of heart diseases among European middle-aged and older adults. Experimental Gerontology 171, 112014.
Petersen N., Thieschafer L., Ploutz-Snyder L., Damann V., Mester J. (2015) Reliability of a new test battery for fitness assessment of the European Astronaut corps. Extreme Physiology & Medicine 4, 12.
Plant C. E., Parsons N. R., Edwards A., Rice H., Denninson K., Costa M. L. (2016) A comparison of electronic and manual dynamometry and goniometry in patients with fracture of the distal radius and healthy participants. Journal of Hand Therapy 29, 73-80.
Ramirez-Velez R., Rincon-Pabon D., Correa-Bautista J. E., Garcia-Hermoso A., Izquierdo M. (2021) Handgrip strength: Normative reference values in males and females aged 6-64 years old in a Colombian population. Clinical Nutrition ESPEN 44, 379-386.
Ramirez-Velez R., Rodrigues-Bezerra D., Correa-Bautista J. E., Izquierdo M., Lobelo F. (2015) Reliability of health-related physical fitness tests among Colombian children and adolescents: The FUPRECOL study. PLoS One 10, e0140875.
Rantanen T., Volpato S., Ferrucci L., Heikkinen E., Fried L. P., Guralnik J. M. (2003) Handgrip strength and cause-specific and total mortality in older disabled women: Exploring the mechanism. Journal of American Geriatrics Society 51, 636-641.
Saint-Maurice P. F., Laurson K., Welk G. J., Eisenmann J., Gracia-Marco L., Artero E. G., Ortega F., Ruiz J. R., Moreno L. A., Vicente-Rodriguez G., Janz K. F. (2018) Grip strength cutpoints for youth based on a clinically relevant bone health outcome. Archives of Osteoporosis 13, 92.
Sanchez-Delgado G., Cadenas-Sanchez C., Mora-Gonzalez J., Martinez-Tellez B., Chillon P., Lof M., Ortega F. B., Ruiz J. R. (2015) Assessment of handgrip strength in preschool children aged 3 to 5 years. Journal of Hand Surgery 40, 966-972.
Santos A. N., Pavao S. L., Avila M. A., Salvini T. F., Rocha N. A. C. F. (2013) Reliability of isokinetic evaluation in passive mode for knee flexors and extensors in healthy children. Brazilian Journal of Physical Therapy 17, 112-120.
Savva C., Karagiannis C., Rushton A. (2013) Test-retest reliability of grip strength measurement in full elbow extension to evaluate maximum grip strength. Journal of Hand Surgery (European Volume) 38, 183-186.
Stenholm S., Tiainen K., Rantanen T., Sainio P., Heliovaara M., Impivaara O., Koskinen S. (2012) Long-term determinants of muscle strength decline: Prospective evidence from the 22-year Mini-Finland follow-up survey. Journal of American Geriatrics Society 60, 77-85.
Suzuki Y., Kamide N., Kitai Y., Ando M., Sato H., Yoshitake S., Sakamoto M. (2019) Absolute reliability of measurements of muscle strength and physical performance measures in older people with high functional capacities. European Geriatrics Medicine 10, 733-740.
Svensson E., Waling K., Hager-Ross C. (2008) Grip strength in children: Test-retest reliability using Grippit. Acta Paediatrica 97, 1226-1231.
Tan B., Aziz A. R., Teh K. C., Lee H. C. (2001) Grip strength measurement in competitive ten-pin bowlers. Journal of Sports Medicine Physical Fitness 41, 68-72.
Trajkovic N., Rancic D., Ilic T., Herodek R., Korobeynikov G., Pekas D. (2024) Measuring handgrip strength in school children: Inter-instrument reliability between Takei and Jamar. Scientific Reports 14, 1074.
Tsang R. C. C. (2005) Reference values for 6-minute walk test and hand-grip strength in healthy Hong Kong Chinese adults. Hong Kong Physiotherapy Journal 23, 6-12.
Venegas-Carro M., Kramer A., Moreno-Villanueva M., Gruber M. (2022) Test-retest reliability and sensitivity of common strength and power tests over a period of 9 weeks. Sports 10, 171.
Villafane J. H., Valdes K., Buraschi R., Martinelli M., Bissolotti L., Negrini S. (2016) Reliability of the handgrip strength test in elderly subjects with Parkinson disease. Hand 11, 54-58.
Walamies M., Turjanmaa V. (1993) Assessment of the reproducibility of strength and endurance handgrip parameters using a digital analyser. European Journal of Applied Physiology 67, 83-86.
Ward C., Adams J. (2007) Comparative study of the test-re-test reliability of four instruments to measure grip strength in a healthy population. British Journal of Hand Therapy 12, 48-54.
Weir J. P. (2005) Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research 19, 231-240.







