ABSTRACT
Handgrip strength, a component of physical fitness tests and a biomarker of future health, is typically measured annually. However, no studies have examined the consistency of these measurements over time. We investigated the reliability of handgrip strength measurements among university students who take the test annually. Our data included 3,649 students (2,769 males and 880 females) who were tested annually over their four years in university. Results showed a significant difference in absolute errors across the three test-retest intervals (p < 0.001). Specifically, the 1-year longitudinal reproducibility was significantly better than the reproducibility at 2 years and 3 years. There was no difference in longitudinal reproducibility between the 2-year and 3-year time points (p = 0.490). The minimal difference values at the 1-year, 2-year, and 3-year time points were 7.70, 8.33, and 8.35 kg, respectively. When comparing the percent coefficient of variation (%CV) values, the reliability was better in males than in females (p = 0.025). The difference in the %CV between sexes was 0.27% (95% CI: 0.03-0.52%). When examining the results using absolute error, the results were reversed, with females having lower absolute error values than males (p < 0.001). The reliability values were better for the digital handgrip device than for the analog device (p < 0.001). These findings can help evaluate the consistency of handgrip strength measurements made annually. When handgrip strength is measured annually in young adults, a change of at least approximately 20% in the measured value (calculated using %MD in both sexes) is necessary to indicate a change confidently.
Key words:
Absolute reliability, biomarker, grip strength, reproducibility
Key Points
- Absolute reliability, assessed by the standard error of measurement and minimal difference, indicates the consistency of individual scores across multiple measurements.
- Most previous studies have focused on test-retest intervals of less than two weeks; therefore, longer intervals, such as one year, should be considered.
- We evaluated the absolute reliability of handgrip strength measurements in university students who repeated the test annually.
- Our results can help assess handgrip strength measurements conducted annually.
Handgrip strength is a component of physical fitness tests and a biomarker for future health in children and adolescents (Abe et al., 2023a; O’Keeffe et al., 2020; Ortega et al., 2025), typically assessed annually in elementary and/or middle schools in Japan and other countries. It is also used to diagnose sarcopenia and frailty in middle-aged and older adults (Chen et al., 2020; Cruz-Jentoft et al., 2019) and may be measured regularly (e.g., annually or monthly). However, an individual’s handgrip strength can vary daily (i.e., biological variability), and the extent of these changes may differ based on specific moderating factors such as age, sex, test-retest interval, dynamometer type, and physical activity. Therefore, measurement error should be considered when comparing the measured handgrip strength values with the evaluation or diagnostic criteria. The reliability of handgrip strength measurements includes both absolute and relative reliability. Absolute reliability, assessed by the standard error of measurement (SEM) and minimal difference (MD), refers to the consistency of individual scores when measured repeatedly. Conversely, relative reliability, indicated by the intraclass correlation coefficient (ICC), examines how well individuals maintain their relative positions within a group over repeated measurements (Weir, 2005). Absolute reliability is often preferred because it is particularly useful when comparing handgrip strength values against standard values while taking measurement error into account. We recently conducted a systematic review with meta-analysis on the absolute test-retest reliability of handgrip strength measurements (Abe et al., 2025). The findings from this systematic review demonstrated considerable variation among studies reporting MD and the percentage of MD to measured value (%MD) across each age group. Neither age, test-retest interval, nor handgrip device served as a significant moderator of MD and %MD reliability.
However, most studies have focused on test-retest intervals of less than two weeks; therefore, longer intervals (such as one year) should be considered. Since health-related physical fitness tests targeting healthy individuals are conducted annually, test-retest reliability results need to account for the impacts of a one-year interval or longer. Therefore, this study examined the absolute reliability of test-retest handgrip strength measurements when university students repeated the test annually.
Study design and participants
Juntendo University has been conducting physical fitness tests (J-Fit Plus Study), including handgrip strength, as part of its university curriculum. The School of Physical Education (later the School of Sports and Health Science) has compiled the test results into a database since 1973. This study involved a secondary data analysis using the J-Fit Plus Study database and was approved by the University's Institutional Review Board (#HSS-2025-74). The university began as a medical and physical education school, serving only male students until 1991, when it began accepting female students. Privacy measures were maintained throughout the university, and all data were anonymized prior to analysis. We utilized a database of 10,633 individuals who participated in the measurements upon university enrollment. Of these students, 4,029 completed all four annual measurements through their fourth year. Of these, 380 students lacked valid anthropometric and/or handgrip strength data and were excluded. The final dataset included 3,649 students, comprising 2,769 males and 880 females (Table 1).
The data utilized in this study included a history of sports activities during school days. All students who participated in the measurement were asked to list the sports and cultural clubs they had participated in each year at the time of the measurement. No students in the final sample changed their sport affiliation during the period. Students who participated only in cultural clubs or had no activities were classified as the non-sports group. For statistical analysis, each sport and the non-sports group were treated as a separate category.
Assessment of handgrip strength and anthropometry
In the J-Fit Plus Study, a calibrated Smedley hand dynamometer (Takei Kiki, Niigata, Japan) was used to measure the handgrip strength of the left and right hands. Initially, the hand dynamometer was analog, but it was replaced with a digital type (Takei Grip-D) in 1996. Before testing, the grip span (the distance between the dynamometer grip bars) was adjusted to the hand size of each student (so that the index finger formed a right angle). All students were instructed to follow the standard protocol and maintain an upright standing position with their arms down by their sides, holding the dynamometer without squeezing. The students were then asked to squeeze the handle of the dynamometer as hard as possible. The measurement was completed twice for each side, with about a 1-minute break in between (alternating sides; the right side was tested first). The highest values from the right and left sides were averaged for data analysis. Standing height was measured barefoot using a stadiometer (YG-200, Yagami, Nagoya, Japan). Body mass was measured with minimal clothing (i.e., shorts) using a digital scale (WB-150, Tanita, Tokyo, Japan), and body mass index (BMI) was calculated as body mass divided by height squared (kg/m²).
Statistical analysis
To assess reliability, we calculated three statistics: the absolute error, the MD, and the percent coefficient of variation (%CV). The absolute error and %CV values were calculated at the individual level to allow for computing inferential statistics. The absolute error was calculated as the absolute value of the difference score between the two handgrip tests. The %CV was calculated as the standard deviation of all 4 handgrip test values divided by their mean, multiplied by 100 to express it as a percentage. A one-way repeated measures ANOVA was computed to determine how the absolute errors changed as the test-retest duration increased. We used absolute errors for this comparison because all tests were performed by the same individuals, so differences in absolute strength did not need to be accounted for. We also reported the minimal difference values at the group level, which were calculated as 1.96 × the standard deviation of the change score between the two tests. To assess the influence of sex and handgrip device on reliability, independent t-tests were computed on the %CV values and absolute errors. To assess the influence of sports involvement, a one-way ANOVA was computed on the %CV values and absolute errors across sports. To assess whether there was any systematic bias between the two handgrip devices, independent t-tests were computed on the mean handgrip strength values between devices.
Descriptive statistics are expressed as mean (standard deviation), and inferential statistics are expressed as mean [95% confidence interval (CI)].
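The three reliability statistics described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in the study: the function names and the example scores are hypothetical, the sample standard deviation (n − 1) is assumed because the text does not state which form was used, and each interval is assumed to be measured against the first-year test.

```python
from statistics import mean, stdev


def absolute_error(test_a, test_b):
    """Absolute value of the difference score between two handgrip tests (kg)."""
    return abs(test_b - test_a)


def percent_cv(values):
    """Individual %CV: SD of all test values divided by their mean, times 100.

    Assumption: sample SD (n - 1); the text does not state which form was used.
    """
    return stdev(values) / mean(values) * 100.0


def minimal_difference(baseline, retest):
    """Group-level MD: 1.96 x the SD of the change scores between two tests."""
    changes = [after - before for before, after in zip(baseline, retest)]
    return 1.96 * stdev(changes)


# Hypothetical scores (kg): one row per student, one column per annual test.
scores = [
    [40.0, 42.0, 41.0, 43.0],
    [30.0, 29.0, 31.0, 30.0],
    [50.0, 53.0, 52.0, 55.0],
]

# Assumption: every interval is compared against the first-year measurement.
year1 = [row[0] for row in scores]
for interval in (1, 2, 3):
    retest = [row[interval] for row in scores]
    errors = [absolute_error(a, b) for a, b in zip(year1, retest)]
    print(f"{interval}-year interval: mean absolute error = {mean(errors):.2f} kg, "
          f"MD = {minimal_difference(year1, retest):.2f} kg")

for row in scores:
    print(f"%CV = {percent_cv(row):.2f}%")
```

Computing the absolute error and %CV per student keeps one value per individual, which is what allows the inferential tests (ANOVA, t-tests) to be run on them, whereas the MD is a single group-level value per interval.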
Descriptive statistics
A total of 3,649 individuals (2,769 males and 880 females) completed the study. Means for age, height, and body mass at the time of the first measurement were 18.1 (0.5) years, 170.4 (8.2) cm, and 63.8 (9.5) kg, respectively (Table 1). A total of 2,671 individuals completed the handgrip test using the digital handgrip device, and 978 individuals used the analog device. Among males, the most common sports were soccer (n = 442), long-distance track and field events (n = 293), volleyball (n = 251), and baseball (n = 211). Among females, the most common sports were basketball (n = 124), volleyball (n = 110), and soccer (n = 98).
Test-retest interval
There was a significant difference in absolute errors across the three test-retest intervals (p < 0.001) (Table 2). Specifically, the 1-year longitudinal reproducibility was significantly better than the reproducibility at 2 years and 3 years. There was no difference between the longitudinal reproducibility at the 2-year and 3-year time points (p = 0.490). The MD values at the 1-year, 2-year, and 3-year time points were 7.70, 8.33, and 8.35 kg, respectively. Results were not different when assessing reproducibility as the absolute error or %CV.
Difference between males and females
When comparing the %CV values, the reliability was better in males than in females (p = 0.025). The difference in the %CV between sexes was 0.27% (95% CI: 0.03-0.52%). When examining the results using absolute error, the results were reversed, with females having lower absolute error values than males (p < 0.001) (Table 2).
Handgrip device
The reliability values were better for the digital handgrip device than for the analog device (p < 0.001) (Table 2). The difference in the %CV between devices was 0.57% (95% CI: 0.34-0.80%). The results were not different when examining reliability using the %CV or absolute error.
Sports participation
There was no evidence for an influence of sports participation on the reliability of handgrip strength (p = 0.140).
This study examined the absolute longitudinal reproducibility of handgrip strength measurements involving 3,649 male and female university students who repeated the tests four times at annual intervals. We used absolute error, %CV, and minimal difference (MD) as indicators of absolute reliability and obtained the following results: 1) longitudinal reproducibility in the first year was significantly better than the reproducibility in the second and third years, 2) females had lower absolute error values than males, while %CV was reversed, 3) the reproducibility values were better for the digital handgrip device compared to the analog device, and 4) there was no impact of sports participation on handgrip strength reliability.
Impact of test-retest interval
Considering that health-related physical fitness tests are often administered annually to healthy individuals in schools and public places, corresponding reliability studies may be necessary. However, as mentioned earlier, most studies on the test-retest reliability of handgrip strength measurements have reported relatively short-term results, with the test interval being less than two weeks (Abe et al., 2025). The findings of this study indicated that the long-term absolute reliability of handgrip strength measurements was better in the first year than in the second and third years. For instance, the MD in the first year was 7.7 kg, which corresponds to approximately 18% of the measured value when calculating the percentage of MD (%MD). The MD in the second and third years was 8.3 kg, resulting in a %MD of approximately 20%. In contrast to the yearly reliability results from this study, studies of short-term reliability (mainly with intervals within two weeks) reported, on average, 4.0 kg and 11.6% for the MD and %MD in young adults (Abe et al., 2025). These results suggest that the reliability of handgrip strength measurements may decrease as the test-retest interval increases. To date, one study has compared test-retest reliability between short and long (one year) intervals (Abe et al., 2018); it observed that MD and %MD values were higher at a 1-year interval (6.4 kg and 21.1%) than at a 24-hour interval (3.97 kg and 10.1%), although the tests were performed on different participants. On the other hand, there is variability in the MD and/or %MD among studies for young adults (Abe et al., 2025), with some studies indicating values comparable to those in the present study (Biasini et al., 2023; Maurya et al., 2023; Savva et al., 2013). Notably, the participants in this study were university students with a background in sports.
Details of training and nutrition for each sport were not recorded; however, some study participants competed at national and/or international levels. Handgrip strength in adults has been reported to be less affected by physical activity and sports, including resistance training (Abe et al., 2023b; Labott et al., 2019). Furthermore, the effect of nutritional intake on handgrip strength is limited (Hanach et al., 2019; Nunes et al., 2022). Taking these factors into consideration, the long-term reliability results of handgrip strength measurements obtained in this study may serve as reference values for measurements conducted on an annual basis.
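One way the reported MD could be applied in practice is sketched below: a year-to-year change is flagged only when it exceeds the group-level MD. The function names and the example readings are hypothetical, and using the 1-year MD of 7.70 kg as a fixed threshold for every individual is an assumption made for illustration, not a rule stated in this study.

```python
# Group-level MD at the 1-year interval, as reported in this study (kg).
MD_ONE_YEAR_KG = 7.70


def exceeds_minimal_difference(previous_kg, current_kg, md_kg=MD_ONE_YEAR_KG):
    """Return True when the change between two annual tests is larger than the MD."""
    return abs(current_kg - previous_kg) > md_kg


def percent_change(previous_kg, current_kg):
    """Change between two annual tests, as a percentage of the earlier value."""
    return (current_kg - previous_kg) / previous_kg * 100.0


# Hypothetical readings (kg) from two consecutive annual tests.
for previous, current in [(40.0, 45.0), (40.0, 48.0)]:
    flag = exceeds_minimal_difference(previous, current)
    print(f"{previous} -> {current} kg: change = {percent_change(previous, current):.1f}%, "
          f"exceeds MD: {flag}")
```

Changes smaller than the threshold cannot be distinguished from measurement error and biological variability under this criterion, which is the sense in which a change of roughly 20% is needed before it can be interpreted confidently.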
Difference between males and females
We found that reliability results differed between males and females, depending on whether the measures were relative or absolute. This difference mainly results from variations in handgrip strength values by sex. Our results showed that males had approximately 55% greater handgrip strength than females (Table 1). Previous studies have reported similar findings in normative young adults of both sexes (Abe et al., 2016; Ramirez-Velez et al., 2021). If the relative (percentage) measurement error is the same for males and females, males, who have greater handgrip strength, will show larger absolute error values (i.e., lower absolute stability). This was evident in the results, which indicated that absolute errors were larger in males than in females. However, what remains puzzling is that the relative values of the absolute reliability (%CV) differ between sexes, with females showing higher values than males. This difference is likely the result of lower handgrip values present in females (Table 2). In this study, we did not measure any direct variables to explain why the relative value of absolute reliability was higher in females.
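The argument that an identical relative error yields a larger absolute error in the stronger group can be checked with a short calculation. The mean values and the shared %CV below are hypothetical and chosen only to reflect the approximately 55% difference in strength mentioned above; they are not values from Table 1 or Table 2.

```python
def absolute_sd_from_cv(mean_kg, cv_percent):
    """Convert a %CV back into an absolute standard deviation (kg)."""
    return mean_kg * cv_percent / 100.0


# Hypothetical group means: males roughly 55% stronger than females.
female_mean_kg = 29.0
male_mean_kg = female_mean_kg * 1.55

# Hypothetical identical relative variability in both groups.
shared_cv_percent = 5.0

female_sd_kg = absolute_sd_from_cv(female_mean_kg, shared_cv_percent)
male_sd_kg = absolute_sd_from_cv(male_mean_kg, shared_cv_percent)

print(f"female SD = {female_sd_kg:.2f} kg, male SD = {male_sd_kg:.2f} kg")
print(f"ratio of absolute variability (male/female) = {male_sd_kg / female_sd_kg:.2f}")
```

Under these assumptions the absolute variability scales with the mean, so the male-to-female ratio of absolute variability equals the ratio of the group means.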
Impact of device types
Our results indicated that the digital dynamometer is more reliable than the analog one. In the J-Fit Plus Study, the dynamometer switched from analog to digital in 1996. Consequently, data from the analog device were collected before 1996, while data from the digital device were collected afterward. The study's data do not include students who had their handgrip strength measured with both types of dynamometers. Analog devices used before 1996 differ in materials and shape from current digital devices (Yoshimura and Hayashi, 2016). In a previous systematic review (Abe et al., 2025), however, we noted that the device did not impact the overall longitudinal reproducibility of handgrip strength measurements, despite the differences in shape and material between the Jamar and Takei models. Therefore, the reasons for the differences observed in measurement reliability between analog and digital devices remain unclear. One possible explanation is that the analog display presents results in 1 kg increments, whereas the digital display shows results in 0.1 kg increments. Currently, analog devices are made of the same materials and have the same shape as digital devices.
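The display-resolution explanation can be illustrated with a small sketch, assuming that each reading is rounded to the nearest display increment. The function names and the example force values are hypothetical, and the sketch shows only the rounding component of the error, not any other difference between the devices.

```python
def quantize(value_kg, increment_kg):
    """Round a force value to the nearest display increment (kg)."""
    return round(value_kg / increment_kg) * increment_kg


def worst_case_difference_error(increment_kg):
    """Largest rounding error in the difference between two readings.

    Each reading can be off by up to half an increment, so the difference
    between two readings can be off by up to one full increment.
    """
    return 2 * (increment_kg / 2.0)


# Hypothetical true forces (kg) at two annual tests.
true_first, true_second = 42.46, 44.54

for label, increment in (("analog, 1 kg", 1.0), ("digital, 0.1 kg", 0.1)):
    shown_first = quantize(true_first, increment)
    shown_second = quantize(true_second, increment)
    print(f"{label}: displayed change = {shown_second - shown_first:.1f} kg, "
          f"worst-case rounding error = {worst_case_difference_error(increment):.1f} kg")
```

With a 1 kg increment the rounding step alone can shift a test-retest difference by up to 1 kg, compared with 0.1 kg for a 0.1 kg increment.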
Sports participation
We expected that athletes would have a better ability to consistently exert maximal strength, which might be reflected in the long-term reliability of handgrip strength measurements. However, the expected results were not observed in this study. Thus, no evidence was found to suggest that sports experience confers better longitudinal stability in handgrip strength measurements.
Since handgrip strength tests are typically performed annually, this study assessed the longitudinal stability of handgrip strength measurements in university students. Our findings revealed differences in long-term reliability in terms of retest interval, sex, and device, but sports experience did not impact reliability. In particular, reliability (as assessed by absolute error, %CV, and MD) was better in the first year than in the second and third years. Females had lower absolute errors and slightly higher %CV than males. Modern digital devices have better reliability compared with analog devices. These results may help evaluate handgrip strength measurements when performed on an annual basis.
ACKNOWLEDGEMENTS
We acknowledge all the students who participated in this study. The authors declare no conflicts of interest. This work was supported by the Japan Society for the Promotion of Science to TA (grant number: 22K11610). The experiments comply with the current laws of the country in which they were performed. The datasets generated and/or analyzed during the current study are not publicly available but are available from the corresponding author, who was an organizer of the study.
AUTHOR BIOGRAPHY

Takashi Abe
Employment: Institute of Health and Sports Science & Medicine, Juntendo University
Degree: Ph.D.
Research interests: Exercise physiology and adaptation by exercise training
E-mail: t12abe@gmail.com

Scott J. Dankel
Employment: Department of Health and Exercise Science, Rowan University
Degree: Ph.D.
Research interests: Exercise physiology and statistics
E-mail: dankel47@rowan.edu

Yoshimitsu Kohmura
Employment: Graduate School of Health and Sports Science, Juntendo University
Degree: Ph.D.
Research interests: Sports science
E-mail: ykoumura@juntendo.ac.jp

Jeremy P. Loenneke
Employment: Department of Health, Exercise Science, and Recreation Management, The University of Mississippi
Degree: Ph.D.
Research interests: Skeletal muscle physiology
E-mail: jploenne@olemiss.edu

Koya Suzuki
Employment: Graduate School of Health and Sports Science, Juntendo University
Degree: Ph.D.
Research interests: Human growth and development in sport
E-mail: ko-suzuki@juntendo.ac.jp

REFERENCES
Abe T., Dankel S. J., Buckner S. L., Jessee M. B., Mattocks K. T., Mouser J. G., Bell Z. W., Loenneke J. P. (2018) Short term (24 hours) and long term (1 year) assessments of reliability in older adults: Can one replace the other? Journal of Aging Research and Clinical Practice 7, 82-84.
Abe T., Kohmura Y., Suzuki K., Someya Y., Loenneke J. P., Machida S., Naito H. (2023a) Handgrip strength and healthspan: Impact of sports during the developmental period. Juntendo Medical Journal 69, 400-404.
Abe T., Song J. S., Dankel S. J., Viana R. B., Abe A., Loenneke J. P. (2025) Impact of potential moderating factors on absolute test-retest reliability of grip strength measurements in healthy populations: A systematic review with meta-analysis. Journal of Sports Science and Medicine 24, 543-554.
Abe T., Thiebaud R. S., Loenneke J. P. (2016) Age-related change in handgrip strength in men and women: is muscle quality a contributing factor? Age 38, 28.
Abe T., Viana R. B., Dankel S. J., Loenneke J. P. (2023b) Different resistance exercise interventions for handgrip strength in apparently healthy adults: A systematic review. International Journal of Clinical Medicine 14, 552-581.
Biasini N. R., Pellegrino M., Switzer-McIntyre S., Kasawara K. T. (2023) Reliability and validity of shoulder and handgrip strength testing. Physiotherapy Canada 75, 65-71.
Chen L.-K., Woo J., Assantachai P., Auyeung T. W., Chou M. Y., Iijima K., Jang H. C., Kang L., Kim M., Kim S., Kojima T., Kuzuya M., Lee J. S. W., Lee S. Y., Lee W. J., Liang C. K., Lim J. Y., Lin C. H., Meguro K., Nagai A., Nakakubo S., Ng T. P., Ninomiya T., Ogawa Y., Oyanagi E., Peng L. N., Satake S., Suzuki T., Ubaida-Mohien C., Won C. W., Yamada M., Yamamoto K., Yoshida H., Akishita M. (2020) Asian Working Group for Sarcopenia: 2019 Consensus Update on Sarcopenia Diagnosis and Treatment. Journal of the American Medical Directors Association 21, 300-307.
Cruz-Jentoft A. J., Bahat G., Bauer J., Boirie Y., Bruyère O., Cederholm T., Cooper C., Landi F., Rolland Y., Sayer A. A., Schneider S. M., Sieber C. C., Topinkova E., Vandewoude M., Visser M., Zamboni M. (2019) Sarcopenia: revised European consensus on definition and diagnosis. Age and Ageing 48, 16-31.
Hanach N. I., McCullough F., Avery A. (2019) The impact of dairy protein intake on muscle mass, muscle strength, and physical performance in middle-aged to older adults with or without existing sarcopenia: A systematic review and meta-analysis. Advances in Nutrition 10, 59-69.
Labott B. K., Bucht H., Morat M., Morat T., Donath L. (2019) Effects of exercise training on handgrip strength in older adults: A meta-analytical review. Gerontology 65, 686-698.
Maurya P. S., Sisneros K. P., Johnson E. B., Palmer T. B. (2023) Reliability of handgrip strength measurements and their relationship with muscle power. Journal of Sports Medicine and Physical Fitness 63, 805-811.
Nunes E., Colenso-Semple L., McKellar S. R., Yau T., Ali M. U., Fitzpatrick-Lewis D., Sherifali D., Gaudichon C., Tome D., Atherton P. J., Robles M. C., Naranjo-Modad S., Braun M., Landi F., Phillips S. M. (2022) Systematic review and meta-analysis of protein intake to support muscle mass and function in healthy adults. Journal of Cachexia, Sarcopenia and Muscle 13, 795-810.
Ortega F. B., Zhang K., Cadenas-Sánchez C., Tremblay M. S., Jurak G., Tomkinson G. R., Ruiz J. R., Keller K., Delisle Nyström C., Sacheck J. M., Pate R., Weston K. L., Kidokoro T., Poon E. T., Wachira L.-J. M., Ssenyonga R., Gomes T. N. Q. F., Cristi-Montero C., Fraser B. J., Niessner C., Onywera V. O., Liu Y., Liang Li-Lin, Prince S. A., Lubans D. R., Lang J. J. (2025) The Youth Fitness International Test (YFIT) battery for monitoring and surveillance among children and adolescents: A modified Delphi consensus project with 169 experts from 50 countries and territories. Journal of Sport and Health Science 14, 101012.
O’Keeffe B. T., Donnelly A. E., MacDonncha C. (2020) Test-retest reliability of student-administered health-related fitness tests in school settings. Pediatric Exercise Science 32, 48-57.
Ramirez-Velez R., Rincon-Pabon D., Correa-Bautista J. E., Garcia-Hermoso A., Izquierdo M. (2021) Handgrip strength: Normative reference values in males and females aged 6-64 years old in a Colombian population. Clinical Nutrition ESPEN 44, 379-386.
Savva C., Karagiannis C., Rushton A. (2013) Test-retest reliability of grip strength measurement in full elbow extension to evaluate maximum grip strength. Journal of Hand Surgery (European Volume) 38, 183-186.
Weir J. P. (2005) Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research 19, 231-240.
Yoshimura H., Hayashi Y. (2016) History of hand and back-strength dynamometers used in Japan and their existing products. Japan Journal of Test and Measurement in Health and Physical Education 15, 33-42. [Japanese with English Abstract]