The objective of this paper is to use data from the highest level
in men's tennis to assess whether there is any evidence to reject
the hypothesis that the two players in a match have a constant probability
of winning each set in the match. The data consists of all 4883 Grand
Slam men's singles matches played over the 10-year period from 1995 to 2004.
Each match is categorised by the sequence of sets won (W) or lost (L)
by the eventual winner (in set 1, set 2, set 3, ...). Thus, there are ten
categories of matches, from WWW to LLWWW. The methodology involves
fitting several probabilistic models to the frequencies of these
ten categories. One four-set category is observed to occur significantly
more often than the other two. Correspondingly, a couple of the five-set
categories occur more frequently than the others. This pattern is
consistent when the data is split into two five-year subsets. The
data provides significant statistical evidence that the probability
of winning a set within a match varies from set to set. The data supports
the conclusion that, at the highest level of men's singles tennis,
the better player (not necessarily the winner) lifts his play in certain
situations at least some of the time.
KEY WORDS: Data analysis, independence in tennis, constant probabilities,
psychological development.
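
As an illustration of the hypothesis being tested, the following minimal sketch enumerates the ten categories and computes their probabilities under the simplest constant-probability model: one player wins every set independently with the same probability p. The function names and the value p = 0.6 are illustrative assumptions, not taken from the paper, and this is not necessarily one of the models the paper fits. Under this assumption a category with k lost sets has probability p^3 q^k + q^3 p^k (with q = 1 - p), so all three four-set categories are equally likely, as are all six five-set categories; unequal observed frequencies within a group are therefore evidence against the hypothesis.

```python
from itertools import permutations


def categories():
    """List the ten best-of-five set sequences, seen from the eventual winner."""
    cats = []
    for losses in range(3):
        # The winner always takes the final set, so arrange the other
        # two wins and `losses` losses in the sets before it.
        prefixes = sorted(set(permutations("WW" + "L" * losses)), reverse=True)
        cats.extend("".join(prefix) + "W" for prefix in prefixes)
    return cats


def category_probs(p):
    """Category probabilities when one player wins each set with constant
    probability p, independently of all other sets (illustrative assumption)."""
    q = 1.0 - p
    probs = {}
    for cat in categories():
        k = cat.count("L")
        # Either player may be the eventual winner: the first term is the
        # player with set-win probability p, the second is his opponent.
        probs[cat] = p**3 * q**k + q**3 * p**k
    return probs


if __name__ == "__main__":
    for cat, prob in category_probs(0.6).items():  # p = 0.6 is hypothetical
        print(f"{cat:<6}{prob:.4f}")
```

The probabilities depend only on the number of sets lost, not on their order, which is exactly what makes the order-based categorisation a useful test of the constant-probability hypothesis.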
