|
RELIABILITY OF PHYSIOLOGICAL, PSYCHOLOGICAL AND COGNITIVE VARIABLES
IN CHRONIC FATIGUE SYNDROME AND THE ROLE OF GRADED EXERCISE
|
School of Human Movement and Exercise Science, The University of Western
Australia, Crawley, Western Australia
| Received: 22 March 2005 | Accepted: 16 September 2005 | Published: 01 December 2005 |
© Journal of Sports Science and Medicine (2005) 4, 463-471
| ABSTRACT |
| The
objective of this study was to assess variability in symptoms and
physical capabilities in chronic fatigue syndrome (CFS) participants
both before and after a graded exercise intervention. Sixty-one CFS
subjects participated in a 12-week randomized controlled trial of
either graded exercise (n = 32) or relaxation/stretching therapy (n
= 29). Specific physiological, psychological and cognitive variables
were assessed once weekly over a four-week period both prior to and
after the intervention period. All scores were assessed for reliability
using an intraclass correlation coefficient (ICC). Apart from mental
and physical fatigue, baseline ICC scores for all variables assessed
were moderately to highly reliable, indicating minimal variability.
Baseline scores for mental and physical fatigue were of questionable
reliability, indicating a fluctuating nature to these symptoms (R1
= 0.64 and 0.60, respectively). Variability in scores for mental fatigue
was reduced after graded exercise to an acceptable classification
(R1 = 0.76). Results from this study support a variable
nature to the symptoms of mental and physical fatigue only. Consequently,
in order to more accurately report the nature of mental and physical
fatigue in CFS, future studies should consider using repeated-measures
analysis when assessing these symptoms. Graded exercise resulted in
the reclassification of scores for mental fatigue from questionable
to acceptable reliability.
KEY WORDS: Fluctuating symptoms, repeated measures, single session
measures, repeatability.
|
| INTRODUCTION |
|
The
term chronic fatigue syndrome was chosen by the Centers for Disease
Control and Prevention (CDC; Atlanta, USA) to describe a syndrome
that consisted of chronic debilitating fatigue, which could not
be explained by any known chronic medical or psychological condition
(Holmes et al., 1988).
As a result of this debilitating fatigue, it is not uncommon for
sufferers to also complain of depression, anxiety and a reduced
physical capacity (Wessely and Edwards, 1993).
Attempts to determine the etiology for this disorder have involved
intensive research without the benefit of conclusive results. To
date there is no known cure.
An intriguing aspect of chronic fatigue syndrome (CFS) that has
important research implications is that sufferers typically report
a fluctuating nature to their symptoms and physical capabilities
(Evengard et al., 1999;
Wilson et al., 1994).
This is represented by periods of wellness, where near normal activity
levels are often resumed, and then periods of relapse that often
require bed rest (Evengard et al., 1999;
Wilson et al., 1994).
Unfortunately, little research has investigated this feature of CFS
to date. Failing to account for possible variation in symptoms and
physical capabilities when conducting studies in this population can
lead to the reporting of results that do not accurately reflect the
true nature of CFS or the efficacy of an intervention. This conjecture
is supported by various researchers
who note that the reporting of single-session measures may yield
inconclusive results, as levels of fatigue and effort can vary over
extended periods of time (Fuentes et al., 2001;
Gantz and Holmes, 1990;
Kane et al., 1997).
This is particularly pertinent if a study cohort does not meet power
requirements and where symptom fluctuations in a small group of
subjects may significantly influence research outcomes. Use of single-session
measurements in order to assess baseline and post-intervention variables
is common in CFS research (Fulcher and White, 1997;
Powell et al., 2001;
Weardon et al., 1998),
as are small cohorts (Cannon et al., 1999;
Paul et al., 1999;
Clapp et al., 1999).
To date, studies that have assessed the fluctuating nature of CFS
have reported symptom variation over time periods ranging from 24
hours to 3.5 years (Fuentes et al., 2001;
Heinman, 1995;
Hill et al., 1999;
Cabane et al., 2000).
These results suggest that single-session assessment of variables
previously shown to be reliable may not be sufficient in a CFS population.
Reliability can be referred to as the amount of variation that occurs
in results between trials (MacDougall et al., 1991),
or the consistency of measurements from trial to trial (Safrit and
Wood, 1989).
Further to this, Atkinson and Nevill (1998)
note that the term reliability is interchangeable with the terms
'repeatability', 'reproducibility', 'stability', 'agreement' and
'concordance', while the phrase 'stability reliability' is defined
specifically as relating to the day-to-day variability in measurements.
Atkinson and Nevill (1998)
further note that in some circumstances more time than a day may
be needed between measurement sessions in order to allow for recovery
from exercise tests.
If there is a fluctuating nature to symptoms and capabilities in
CFS, then graded exercise may have a role in reducing this variability.
Symptoms and physical capabilities in CFS sufferers have consistently
been shown to be improved by the use of graded exercise therapy
(Fulcher and White, 1997;
Powell et al., 2001;
Wallman et al., 2004;
Weardon et al., 1998).
It is therefore plausible to postulate that graded exercise may
also reduce the range in any variation found in these symptoms and
physical capacities.
The aim of our study was twofold. Firstly, we aimed to assess the
stability and reliability of scores recorded for specific physical,
psychological and cognitive variables that were measured once a
week over a four-week period in a CFS population. Measurements were
made on a weekly rather than a daily basis in order to allow CFS
participants time to recover fully from their previous exercise
session. Reliability scores were then categorised according to guidelines
proposed by Vincent (1995)
as being either highly reliable, moderately reliable or of questionable
reliability (see the statistical procedures section). A second aim
of our study was to determine whether a twelve-week program of graded
exercise therapy was capable of reducing any variability found in
any of the measures analysed during the initial assessment. It was
hypothesized that all baseline variables assessed would be of questionable
reliability (i.e. highly variable) and that a twelve-week graded
exercise program would reduce this variability, resulting in the
reclassification of intraclass correlation coefficient (ICC) scores
to more reliable categories.
|
| METHODS |
|
Eighty-two
chronic fatigue syndrome participants (aged 16-74 years)
were recruited from advertisements placed in local newspapers and
from notices placed in medical surgeries. Prior to participation
in this study, written confirmation of a CFS diagnosis was required
from each subject's medical doctor. To receive a CFS diagnosis,
participants needed to meet the working case definition of CFS
as defined by Fukuda et al. (1994).
This definition requires two major criteria and four of eight minor
criteria to be met. The major criteria consist of the existence
of severe fatigue that persists for six months or longer and that
cannot be explained by any other chronic medical or psychological
disorder. The minor criteria consist of a variety of minor symptoms
that include impaired memory, sore throat, tender lymph nodes, muscle
pain, multi-joint pain, new headaches, unrefreshing sleep and post-exertional
malaise. This requirement resulted in 14 participants being excluded
due to non-compliance or for not meeting the CFS criteria. The remaining
68 patients were randomized to a graded exercise or relaxation/
flexibility intervention. Randomization involved the use of a random
number table and was performed by an independent investigator. Six
participants withdrew from the study prior to baseline testing for
personal reasons, while one participant was unable to perform the
cycle test. This left 32 CFS participants in the graded exercise
group (27 females and 5 males), and 29 CFS participants in the relaxation/flexibility
group (20 females and 9 males; refer to Figure 1). These subjects
participated in all testing sessions held
before and after the intervention. All testing sessions were performed
in a university performance laboratory. Participants taking medication
needed to have been on this medication for at least six weeks prior
to the commencement of baseline trials, while changes made to existing
medication during the trials resulted in the deletion of the participant's
data prior to analysis. Ethics approval for this project was granted
by the University of Western Australia Human Research Ethics Committee,
and all participants completed consent forms prior to entering the
trials.
Participants were required to attend weekly testing sessions that
were held at the same time and on the same week day over a four-week
period, both before and after the 12-week intervention program.
The same exercise physiologist conducted all tests. On arrival for
the first testing session, age, height and illness duration were
recorded for all participants.
In order to encourage more incapacitated CFS sufferers to participate
in our study, as well as increase the likelihood of participants
returning for repeat testing, exercise testing involved a sub-maximal
cycle test called the Aerobic Power Index test (Telford et al.,
1989).
This test is a modification of the PWC170 (physical work
capacity at a heart rate of 170 bpm; Wahlund, 1948)
exercise test and has been shown to be reliable in a CFS population
(intraclass correlation coefficient = 0.97 which equated to high
reliability; Wallman et al., 2003).
Prior to the commencement of the first testing session, a target
heart rate (THR) was determined based on the formula: THR = (220
- age) x 0.75 (Telford et al., 1989).
Additionally, individual body-mass was recorded prior to each exercise
test using Sauter scales (August Sauter GmbH D-7470 Albstadt 1 Ebingen,
West Germany). Participants were then fitted with a Polar Beat HR
monitor (Polar Electro Oy, Kempele, Finland), and seated on a front-entry
Exertech Ex-10 cycle ergometer (Repco Cycle Company, Huntingdale,
Victoria). Seat height was adjusted and recorded along with resting
HR. The exercise protocol required participants to pedal at a rate
of 25 Watts (W) for one minute, with this rate increasing by 25
W every subsequent minute. The exercise test was terminated at the
end of the minute in which the individual THR was reached, and an interpolation
procedure (as described in Telford et al., 1989)
was used in order to equate individual THR with the power output
(W·kg⁻¹) achieved. If a participant was unable to reach
their individual THR, then the Watts achieved during the last full
minute of exercise were recorded as peak W·kg⁻¹. Heart
rate was recorded at the end of each minute of the cycle test, while
ratings of perceived exertion (RPE), as measured by the Borg scale
(Borg, 1982),
were recorded 55 seconds into each minute of the exercise test. A
similar interpolation procedure to that used to determine power
output at THR was employed to determine RPE, oxygen uptake (ml·kg⁻¹·min⁻¹)
and respiratory exchange ratio (RER) values that equated to each
individual's THR. Reliability of RPE and oxygen uptake (ml·kg⁻¹·min⁻¹)
values recorded during this exercise test has been previously demonstrated
in a CFS population (ICC = 0.87 and 0.91, which equated to moderate
and high reliability respectively; Wallman et al., 2003).
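The target heart rate calculation and the incremental protocol described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure of Telford et al. (1989): it reads the THR formula as 75% of age-predicted maximum heart rate (220 minus age), and it assumes a simple linear interpolation between the two end-of-minute heart rates that bracket THR, since the interpolation details are not given in this text. All function and parameter names are hypothetical.

```python
def target_heart_rate(age_years, fraction=0.75):
    """THR read as a fraction of age-predicted maximum HR (220 - age)."""
    return (220 - age_years) * fraction


def stage_power(minute, step_watts=25.0):
    """Workload in a given minute (1-based): 25 W, then +25 W each minute."""
    return step_watts * minute


def power_at_thr(end_of_minute_hr, thr, body_mass_kg, step_watts=25.0):
    """Estimate power output (W per kg) at THR.

    Assumption: linear interpolation between the two end-of-minute heart
    rates that bracket THR. If THR is never reached, the workload of the
    last full minute is returned, as the text describes.
    """
    prev_hr, prev_watts = None, None
    for minute, hr in enumerate(end_of_minute_hr, start=1):
        watts = stage_power(minute, step_watts)
        if hr >= thr:
            if prev_hr is None:
                # THR already reached in the first minute: nothing to interpolate.
                return watts / body_mass_kg
            # Here prev_hr < thr <= hr, so the denominator is positive.
            frac = (thr - prev_hr) / (hr - prev_hr)
            return (prev_watts + frac * (watts - prev_watts)) / body_mass_kg
        prev_hr, prev_watts = hr, watts
    if prev_watts is None:
        raise ValueError("no heart-rate data supplied")
    return prev_watts / body_mass_kg
```

Under this reading, a 40-year-old participant would have a THR of (220 - 40) x 0.75 = 135 bpm. The same interpolation idea could be applied to RPE, oxygen uptake and RER by replacing the workload values with the corresponding end-of-minute readings.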
During the exercise test, oxygen consumption was analysed by a metabolic
cart which consisted of a computerised on-line system. The volume
of inspired air was analysed by a Morgan Ventilometer Mark II 225A
(P.K. Morgan, UK), while expired air was continuously sampled and
recorded every 15 seconds by Applied Electrochemistry S-3A O₂
and CD-3A CO₂ analysers (Pittsburgh, PA, USA). The Morgan
ventilometer and the O₂ and CO₂ sensors were
calibrated prior to and after each test. All data were corrected
for any gas or ventilatory drift. Analysis of data involved averaging
each minute of data and then assessing the last two minutes of these
data.
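The per-minute averaging step can be illustrated as follows. This is a hedged sketch rather than the laboratory software used in the study: it assumes four 15-second samples per minute and assumes that any trailing incomplete minute is discarded, neither of which is stated explicitly above. The function names are hypothetical.

```python
def minute_means(samples, samples_per_minute=4):
    """Average consecutive 15 s samples into one mean per complete minute.

    Assumption: a trailing partial minute is dropped.
    """
    n_full = len(samples) // samples_per_minute
    means = []
    for i in range(n_full):
        chunk = samples[i * samples_per_minute:(i + 1) * samples_per_minute]
        means.append(sum(chunk) / samples_per_minute)
    return means


def last_two_minutes(samples, samples_per_minute=4):
    """Return the mean values for the final two complete minutes."""
    means = minute_means(samples, samples_per_minute)
    if len(means) < 2:
        raise ValueError("need at least two complete minutes of data")
    return means[-2:]
```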
Current activity levels were assessed using the Older Adult Exercise
Status Inventory (OA-ESI; O'Brien-Cousins, 1996).
The OA-ESI is a seven-day self-report inventory that assesses the
frequency, duration, and level of intensity of a wide range of work
and physical activities that are commonly undertaken by people both
young and old. Test-retest reliability scores from two separate
studies for the OA-ESI were r = 0.756 and r = 0.771, while concurrent
validity was determined by examining the correlations of weekly
exercise with other previously validated activity indicators, and
resulted in scores that ranged from r = 0.411 to r = 0.491 (O'Brien-Cousins,
1996).
Mental and physical fatigue were assessed using the Chalder Fatigue
scale (an 11-item, self-report scale; Chalder et al., 1993).
Split-half reliability studies resulted in r = 0.861 and r = 0.847 (n
= 274), while validation coefficients for the original 14-item Chalder
fatigue scale resulted in scores of 75.5% for sensitivity and 74.5%
for specificity (Chalder et al., 1993).
As a consequence of this result, three items were eliminated from
the Chalder fatigue scale leaving 11 items, which resulted in a
Cronbach's alpha for the revised version of 0.89 (Chalder et al.,
1993).
The Hospital Anxiety and Depression Scale (HADS; Zigmond and Snaith,
1983)
is a 14-item self-report questionnaire used to rate anxiety and
depression levels. Reliability, validity and psychometric properties
have been established for HADS (Zigmond and Snaith, 1983). Both questionnaires
required participants to rate how they felt during the previous
week, including the day of testing.
Cognitive functioning was assessed using a computerised version
of the modified Stroop Color Word test (MacLeod, 1991).
This visual attention test assesses the level of interference caused
by irrelevant stimuli. Participants were required to complete two
levels of the test that differed by speed of presentation of the
stimuli. Only the second and more difficult level (requiring 95
responses in a two-minute time period) was included for analysis,
as the first level served as a warm-up session. Scoring was based
on the number of correct responses given.
Initial exercise duration was based upon each participant's current
activity level and ranged from 5-15 minutes, while exercise intensity
(HR, in beats per minute; bpm) was based upon the mean HR
value (bpm) achieved midway through the sub-maximal exercise tests.
Each participant was supplied with a Polar HR monitor in order to
help them attain the required HR intensity. Graded exercise was
aerobic in nature and consisted of swimming, cycling or walking.
Exercise was home-based and was attempted every second day, unless
a relapse occurred. If a relapse occurred, then participants were
advised either to avoid exercise or to reduce the duration and/or
intensity of the exercise until the participant felt that they could
recommence the prescribed program again. Participants were supplied
with a small laminated Borg scale and were required to rate their
sense of effort on completion of each exercise session. Details
relating to HR, RPE and duration of each exercise session were recorded
in a diary by the participant. Every second week, participants were
contacted by phone in order to review their progress and to determine
the duration of the exercise session for the next fortnight. If
RPE values recorded over the fortnight were either stable or decreasing,
then exercise duration was increased by approximately five minutes
for the following fortnight. When exercise duration reached 30 minutes,
then intensity was increased by raising the target heart rate by
approximately two bpm each month.
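The progression rule described above can be written as a small decision procedure. The sketch below is illustrative only and rests on several assumptions that are not stated in the text: "stable or decreasing" RPE is judged by comparing the mean of the later diary entries with the mean of the earlier ones, duration is capped at 30 minutes, and the monthly intensity increase is applied whenever the cap has been reached. The names `Prescription`, `fortnightly_review` and `monthly_review` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prescription:
    duration_min: float
    target_hr_bpm: float


def rpe_stable_or_decreasing(rpe_values):
    """Assumed criterion: later-half mean RPE does not exceed earlier-half mean."""
    if len(rpe_values) < 2:
        return False  # too little diary data to judge a trend
    half = len(rpe_values) // 2
    early = sum(rpe_values[:half]) / half
    late = sum(rpe_values[half:]) / (len(rpe_values) - half)
    return late <= early


def fortnightly_review(p, rpe_values, step_min=5.0, cap_min=30.0):
    """Add about five minutes if RPE was stable or decreasing (capped at 30)."""
    if p.duration_min < cap_min and rpe_stable_or_decreasing(rpe_values):
        return Prescription(min(p.duration_min + step_min, cap_min), p.target_hr_bpm)
    return p


def monthly_review(p, step_bpm=2.0, cap_min=30.0):
    """Once duration has reached 30 minutes, raise target HR by about 2 bpm."""
    if p.duration_min >= cap_min:
        return Prescription(p.duration_min, p.target_hr_bpm + step_bpm)
    return p
```

For example, a participant prescribed 10 minutes at 110 bpm whose diary RPE values drifted downwards over the fortnight would move to 15 minutes at the same heart rate, while a participant with rising RPE would stay at the current prescription.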
Relaxation/flexibility therapy was used in this study as a non-exercise
control intervention. Participants in this group were required to
listen to a 20-minute relaxation tape and to perform simple stretching
exercises every second day over 12 weeks. The number of stretches
performed increased gradually each fortnight from an initial number
of 4 to a total of 20 in week ten. All participants kept a diary
recording details of their sessions. Every second week, participants
were contacted by phone in order to review their progress and to
discuss the stretching prescription for the following fortnight.
The same exercise physiologist worked with both groups and a concerted
effort was made to spend the same amount of time on the
phone with all participants in both therapies. Participants
in the relaxation/stretching group were asked not to participate in any extra
physical activity while they were enrolled in the study.
Statistical analyses
An independent samples t-test was used in order to compare age,
height, body-mass, activity levels, and illness duration between
the two groups. An intraclass correlation coefficient (ICC; Winer,
1971)
was employed as the primary outcome measure in order to assess the
variability of data collected weekly over the four-week periods,
both before and after the intervention. According to Vincent (1995),
an ICC is the most appropriate method for assessing the reliability
of repeated measures as an ICC is a univariate statistic that is
sensitive to changes in both the order and the magnitude of these
repeated values. Additionally, an ICC is the recommended method
for assessing the reliability of physiological measures by the Australian
Institute of Sport Laboratory Standards Assistance Scheme. Intraclass
correlation coefficient values were calculated using Version 11
of the Statistical Package for the Social Sciences (SPSS) and incorporated
an ANOVA. Classification of reliability for physiological and cognitive
measures followed the guidelines proposed by Vincent (1995),
with R1 (ICC reliability) scores above 0.90 categorized as highly
reliable, values between 0.80 and 0.89 considered as moderately
reliable, while values below 0.80 were considered to be of questionable
reliability. Further to this, ICC values below 0.70 for the self-report
measures were considered to be of questionable reliability (Vincent,
1995).
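To make the reliability analysis concrete, the sketch below computes a single-measure ICC from an ANOVA decomposition of a subjects-by-weeks table and then applies the classification thresholds given above. It should be read as an assumption-laden illustration: the exact ICC model behind the R1 values reported here is not specified in this text, so the code uses one common form, (MS_subjects - MS_error) / (MS_subjects + (k - 1) x MS_error), which may differ from the SPSS settings the authors used. The treatment of a value of exactly 0.90 as highly reliable, and the label "acceptable" for self-report values from 0.70 up to 0.80, are also assumptions inferred from how the results are described.

```python
def icc_single(scores):
    """Single-measure ICC from a table with one row per subject and one
    column per weekly trial, using a two-way ANOVA decomposition."""
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects x 2 trials, equal row lengths")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_trials = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_trials

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denom = ms_subjects + (k - 1) * ms_error
    if denom == 0:
        raise ValueError("no variance in the data; ICC is undefined")
    return (ms_subjects - ms_error) / denom


def classify(icc, self_report=False):
    """Apply the reliability categories described in the text."""
    if icc >= 0.90:  # assumption: exactly 0.90 counts as highly reliable
        return "highly reliable"
    if icc >= 0.80:
        return "moderately reliable"
    if self_report and icc >= 0.70:
        return "acceptable"
    return "questionable"
```

With these thresholds, a self-report ICC of 0.64 is classed as questionable and 0.76 as acceptable, while a physiological ICC of 0.87 is moderately reliable and 0.97 is highly reliable, which matches the classifications quoted in this paper.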
|
| RESULTS |
|
Independent
samples t-tests confirmed no significant differences between the
two groups for age (p = 0.45), height (p = 0.37), body-mass (p =
0.76), current activity levels (p = 0.67), and length of illness
(p = 0.92). On completion of the trials, participants from both
groups reported that there were no adverse events associated with
either intervention.
Baseline and post-intervention data for both groups can be found
in Table 1. Baseline ICC scores
were moderately to highly reliable (indicating minimal variability)
for all variables assessed, except for scores for mental and physical
fatigue, which were of questionable reliability.
Post-intervention ICC scores for resting variables were similar
to baseline scores for both groups in that scores for HR and systolic
blood pressure (SBP) were rated as highly reliable, while scores
for diastolic BP were moderately reliable. In relation to the exercise
test, post-intervention ICC scores for peak oxygen uptake, peak
RPE/peak W·kg⁻¹, and peak power (W·kg⁻¹) were
all similar to baseline values (i.e. highly reliable), except for
ICC scores for RPE recorded at the end of the first minute of exercise,
which improved from moderately to highly reliable for both groups
after the intervention. Post-intervention reliability scores for
respiratory exchange ratio were similar to baseline values for the
exercise group (i.e. moderately reliable), yet changed from moderately
reliable to that of questionable reliability in the relaxation/stretching
group.
In relation to psychological variables, scores for anxiety and depression
improved from moderately to highly reliable in the exercise group
after the intervention. Post-intervention scores for anxiety also
improved from a moderate to a highly reliable ranking for the relaxation/stretching
group; however, reliability scores for depression in this group did
not change over the course of the therapy. Additionally, while reliability
scores for mental fatigue were similar to baseline values after
relaxation/stretching therapy (i.e. of questionable reliability),
reliability scores for this symptom increased by 0.12 in the exercise
group resulting in an acceptable classification. Post-intervention
ICC scores for physical fatigue were still highly variable in both
groups (i.e. of questionable reliability).
Post-intervention scores achieved on the cognitive test improved
in both groups from moderately to highly reliable, while post-intervention
scores for activity levels were similar to baseline values and remained
classified as highly reliable.
|
| DISCUSSION |
|
Chronic
fatigue syndrome sufferers commonly report a variable nature to
their symptoms and physical capabilities; however, to date there
have been few studies that have investigated this proposed phenomenon.
In order to accurately assess interventions and report results in
CFS studies, this area needs further examination. Our study investigated
symptom variability in specific physiological, psychological and
cognitive variables that were assessed weekly over a four-week period,
both before and after a graded exercise intervention. All instruments
used to assess these variables had been previously shown to be reliable.
While it was hypothesised that all baseline variables would be found
to be of questionable reliability, results demonstrated that this
was only true for scores for mental and physical fatigue. High variability
in scores for mental and physical fatigue also supports previous
studies that reported a fluctuating nature to the symptom of fatigue
in CFS participants (Fuentes et al., 2001;
Hill et al., 1999;
Cabane et al., 2000).
Causes proposed as the basis for high symptom variability in CFS
are varied and include: ion channel dysfunction (Chaudhuri and Behan,
2000);
the reactivation of viruses (Patarca-Montero, 2002);
sleep deprivation, social disruption and reduced physical activity
(Williams et al., 1996);
as well as intermittent physical and/or emotional stress that may
trigger an abnormal neuroendocrine function (Demitrack, 1994).
Additionally, Tomoda et al. (2001)
suggest that biological rhythm disturbances demonstrated in some
CFS participants could be a consequence of changes in cerebral blood
flow or metabolism.
Post-intervention results for all resting and physiological variables
were similar to baseline scores in both groups, once again demonstrating
minimal variation in these measures. However, post-intervention
reliability scores for psychological variables showed that graded
exercise resulted in reduced variability, and hence the reclassification
of all psychological variables to a higher reliability category,
except for physical fatigue scores which were less variable, but
not enough to be reclassified. Reduced variability in psychological
scores may be related to reported improvement in these symptoms
after a graded exercise intervention (Fulcher and White, 1997;
Powell et al., 2001;
Wallman et al., 2004;
Weardon et al., 1998).
It is feasible to presume that if the sensation of a symptom is
reduced, then this is likely to minimize the number of times that
the sufferer notices it. Of interest is that post-intervention
reliability scores for anxiety also improved in the relaxation/stretching
group. This is not surprising as relaxation and stretching techniques
have been reported to reduce stress and consequently anxiety (Freidberg,
1995;
Lewis et al., 1994),
which may in turn reduce variability in this symptom over time.
Finally, improvement in reliability for post-intervention scores
recorded on the Stroop Color Word test was noted in both groups.
Graded exercise has been suggested by Blackwood et al. (1998)
to improve automaticity in physical movement, which may result in
the freeing up of attentional processes that can then be diverted
to cognitive function. This could consequently minimise variation
in cognitive processing. Improved reliability in cognitive scores
after relaxation/stretching therapy could be due to the ability
of this particular therapy to reduce stress, which may in turn improve
cognitive function and subsequently reduce variability.
A paradox exists between the results shown in this study and the
fluctuating nature of symptoms and physical capabilities commonly
reported by CFS sufferers. An explanation could be that as mental
and physical fatigue represent the defining symptoms of CFS, any
variation felt by some sufferers in these symptoms may be transferred
in a general sense to other symptoms. Additionally, variations in
the sensation of fatigue may also contribute to global feelings
of being well or unwell. Further to this, when CFS sufferers are
specifically required to isolate and rate sensations, they may be
able to then differentiate between any variances in these symptoms.
Another explanation could be that symptoms and physical capabilities
in CFS may fluctuate over a 24-hour period. If this is the case,
then the protocol used in this study was not designed to monitor
and record these changes. Further to this, questionnaires used to
record feelings of anxiety, depression and fatigue required participants
to report how they felt in the previous week, including the day
of testing. While it is likely that responses would have mostly
reflected subjective feelings on the day of testing, it would be
better in future studies to require participants to record how they
were feeling at the exact time of testing only.
Further research that involves regular repeated assessment of commonly
reported symptoms in CFS over a period of time longer than four
weeks, using different assessment instruments, could provide more
insight into the complaint of high symptom variability in this disorder.
|
| CONCLUSIONS |
| Questionable
reliability scores recorded for mental and physical fatigue prior
to the commencement of the interventions support a variable nature
to these symptoms in CFS. Conversely, scores for all other measures
assessed were shown to vary minimally over a four-week period. Results
from this study suggest that future studies, particularly those where
participant numbers do not meet power requirements, should consider
employing a repeated-measures analysis when assessing the symptoms
of mental and physical fatigue in CFS participants and report the
averaged results. Further to this, a twelve-week intervention of graded
exercise was shown to reduce symptom variation in mental fatigue,
as well as to improve the reliability of scores related to anxiety,
depression, attention, and RPE at the end of the first minute of exercise. |
| KEY POINTS |
- Chronic fatigue syndrome sufferers often report a fluctuating nature to
their symptoms and physical capabilities.
- Weekly assessment over a four-week period of psychological, physiological
and cognitive variables demonstrated that only mental and physical
fatigue were of questionable reliability.
- A 12-week graded exercise intervention resulted in the improvement
of ICC scores for mental fatigue to that of acceptable reliability.
|
| AUTHORS BIOGRAPHY |
Karen E. WALLMAN
Employment: Lecturer, University of Western Australia.
Degree: PhD.
Research interests: Chronic fatigue syndrome, obesity.
E-mail: kwallman@cyllene.uwa.edu.au |
|
Alan MORTON
Employment: Emeritus Professor, University of Western Australia.
Degree: PhD, Ed.D, FACSM.
Research interests: Exercise induced asthma, chronic
fatigue syndrome and cardio-respiratory response to acute and
chronic exercise.
E-mail: amorton@cyllene.uwa.edu.au |
|
Carmel GOODMAN
Employment: Lecturer, Sports Physician, University of Western
Australia.
Degree: MD.
Research interests: Varied and includes chronic fatigue
syndrome.
E-mail: cgoodman@cyllene.uwa.edu.au |
|
Robert GROVE
Employment: Professor, University of Western Australia.
Degree: PhD.
Research interests: Social psychology of exercise, health
and sport.
E-mail: Bob.Grove@uwa.edu.au |
|