Winning at Tennis

Powerful serves, sharply angled shots, soaring lobs, sneaky drop shots, lengthy baseline rallies, unforced errors, and disputed points are all elements of professional tennis matches.

Intriguingly, mathematical models suggest that the chances of winning a game, set, or match in tennis come down to one kind of quantity: the probability that each player wins a rally when he or she serves.

In the April Studies in Applied Mathematics, Paul K. Newton of the University of Southern California and Joseph B. Keller of Stanford University provide formulas for computing a tennis player's chances of winning and, in effect, for predicting the outcome of tennis tournaments. In the context of their model, Newton and Keller also prove that the probability of winning a set or match doesn't depend on which player serves first.

It isn't enough to know just the rankings of the two players in a tennis match. There's no obvious, unambiguous way to translate a ranking into a probability of winning. Instead, it turns out that the key factor is the probability that each player wins a rally against the other player when he or she serves. Modelers typically assume that these probabilities stay the same throughout a match.

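Newton and Keller developed formulas, based on these serve probabilities, for the probability of winning a game, a set, or a match. When their theory is compared with data from the 2002 U.S. Open singles matches, the model and the "real thing" agree closely. For the four women's semifinalists listed below, for example, the predicted probability of winning a game on serve is within 0.03 of the fraction of service games each player actually won.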

Data for the semifinalists in the 2002 U.S. Open tennis tournament (women's singles):

Player	A	B	C	D	E	F	G
S. Williams	240	349	52	57	0.69	0.91	0.89
V. Williams	270	428	56	70	0.63	0.80	0.79
L. Davenport	206	301	45	53	0.68	0.85	0.88
A. Mauresmo	287	457	58	75	0.63	0.77	0.79
Column A: points won on serve; B: total points served; C: games won on serve; D: total games served; E: fraction of rallies won on serve (A/B); F: fraction of games won on serve (C/D); G: theoretical probability of winning a game on serve, given E.
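
Column G can be recomputed from column E with a short calculation. The sketch below is a minimal Python illustration, assuming that every rally on a player's serve is won independently with the same probability p. The function name game_win_prob is ours, and the closed-form expression is the standard result for a game under that assumption; Newton and Keller may present their formula differently.

```python
def game_win_prob(p):
    """Probability that the server wins a game, if each rally on serve is
    won independently with probability p (assumed to satisfy 0 < p < 1)."""
    q = 1.0 - p
    # Win before deuce: four rallies won, with 0, 1, or 2 lost along the way.
    before_deuce = p**4 * (1 + 4*q + 10*q**2)
    # Reach deuce (three rallies each, 20 possible orders), then win two
    # rallies in a row before the receiver does.
    from_deuce = 20 * (p*q)**3 * p**2 / (p**2 + q**2)
    return before_deuce + from_deuce

# Column E values from the table above (already rounded to two places there).
rally_fraction = {"S. Williams": 0.69, "V. Williams": 0.63,
                  "L. Davenport": 0.68, "A. Mauresmo": 0.63}
for player, p in rally_fraction.items():
    print(f"{player}: {game_win_prob(p):.2f}")
```

With the rounded column E values as input, the results agree with column G to two decimal places. Using the unrounded ratio A/B instead could shift the last digit.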

The formulas developed by Newton and Keller can be used, for example, to predict the tournament champion after the quarterfinal round, based on the accumulated data through this round.
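
Once a match-win probability is available for every possible pairing, the step from four semifinalists to a champion is simple bookkeeping. The following Python sketch shows only that bookkeeping. The players P1 to P4, the pairwise probabilities, and the helper names champion_probs and complete are hypothetical placeholders, not data or methods from Newton and Keller's paper, and the sketch does not show how they derive match probabilities from serve statistics.

```python
def champion_probs(match_win, semifinals):
    """match_win[x][y] is the probability that x beats y in a match.
    semifinals is ((a, b), (c, d)): a plays b, and c plays d."""
    (a, b), (c, d) = semifinals
    probs = {}
    for player, rival, x, y in ((a, b, c, d), (b, a, c, d),
                                (c, d, a, b), (d, c, a, b)):
        reach_final = match_win[player][rival]
        # The final is against whichever of x and y wins the other semifinal.
        win_final = (match_win[x][y] * match_win[player][x]
                     + match_win[y][x] * match_win[player][y])
        probs[player] = reach_final * win_final
    return probs

def complete(pairs):
    """Fill in both directions of each pairing, so that
    match_win[x][y] + match_win[y][x] == 1."""
    table = {}
    for (x, y), p in pairs.items():
        table.setdefault(x, {})[y] = p
        table.setdefault(y, {})[x] = 1.0 - p
    return table

# Hypothetical numbers for four unnamed semifinalists, not real data.
match_win = complete({("P1", "P2"): 0.60, ("P1", "P3"): 0.55,
                      ("P1", "P4"): 0.70, ("P2", "P3"): 0.45,
                      ("P2", "P4"): 0.65, ("P3", "P4"): 0.60})
probs = champion_probs(match_win, (("P1", "P2"), ("P3", "P4")))
for player, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{player}: {p:.3f}")
```

Because each pairing's two probabilities sum to 1, the four championship probabilities also sum to 1, which is a quick check on the arithmetic.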

The researchers also prove that, in theory, the probability of a player winning a set when serving first is equal to his or her probability of winning the set when receiving first.
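
This result can be checked numerically. The Python sketch below is our own illustration, not the authors' derivation. It assumes constant, independent rally probabilities p_a and p_b on each player's serve (both strictly between 0 and 1), a six-game set with a tiebreak at 6-6, and the usual tiebreak serving order of one point, then two points each in turn. The function names are hypothetical.

```python
from functools import lru_cache

def game_win_prob(p):
    """Server's chance of winning a game with rally probability p."""
    q = 1.0 - p
    return p**4 * (1 + 4*q + 10*q**2) + 20 * (p*q)**3 * p**2 / (p**2 + q**2)

def tiebreak_win_prob(p_a, p_b, a_serves_first):
    """Chance that A wins a first-to-7, win-by-2 tiebreak."""
    # From 6-6 on, each pair of points has one serve by each player.
    level = p_a * (1 - p_b) / (p_a * (1 - p_b) + (1 - p_a) * p_b)

    @lru_cache(maxsize=None)
    def win(a, b):
        if a >= 7 and a - b >= 2:
            return 1.0
        if b >= 7 and b - a >= 2:
            return 0.0
        if a == b == 6:
            return level
        # Serving order: first server, then two points each in turn.
        first_server_due = ((a + b + 1) // 2) % 2 == 0
        a_serves = first_server_due == a_serves_first
        p = p_a if a_serves else 1 - p_b  # chance that A wins this point
        return p * win(a + 1, b) + (1 - p) * win(a, b + 1)

    return win(0, 0)

def set_win_prob(p_a, p_b, a_serves_first):
    """Chance that A wins a set, with a tiebreak played at 6-6."""
    g_a, g_b = game_win_prob(p_a), game_win_prob(p_b)

    @lru_cache(maxsize=None)
    def win(a, b):
        if a >= 6 and a - b >= 2:
            return 1.0
        if b >= 6 and b - a >= 2:
            return 0.0
        if a == b == 6:
            # After 12 games, the set's first server starts the tiebreak.
            return tiebreak_win_prob(p_a, p_b, a_serves_first)
        # Serve alternates every game, starting with the first server.
        a_serves = ((a + b) % 2 == 0) == a_serves_first
        g = g_a if a_serves else 1 - g_b  # chance that A wins this game
        return g * win(a + 1, b) + (1 - g) * win(a, b + 1)

    return win(0, 0)

serving_first = set_win_prob(0.69, 0.63, True)
receiving_first = set_win_prob(0.69, 0.63, False)
print(serving_first, receiving_first)
```

Under these assumptions the two printed values should match up to floating-point rounding, whatever pair of serve probabilities is supplied.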

The assumption that the probability of winning a rally on serve stays constant doesn't quite hold throughout an actual match. For example, a study by Franc Klaassen of the University of Amsterdam and Jan Magnus of Tilburg University has shown evidence of a "first game effect." More specifically, the first game of a match is the hardest one in which to break serve.

So, the probabilities of winning a point on serve can vary a bit from game to game. They may even depend on the specific pair of players in a given match. And there's evidence that winning the previous point, game, or set increases the chances of winning the next.

However, except perhaps in head-to-head matches between particular individuals, such factors by themselves tend to have only a small effect on outcomes.

Klaassen and Magnus have developed a model, depending largely on serve probabilities and on player rankings, that allows a computer program to calculate the probability of winning a given tennis match—a probability that's updated as the match progresses point by point! See http://www.stms.nl/december2003/artikel7.htm.
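
To give a feel for what updating point by point involves, here is a small Python sketch restricted to a single game. It is not the Klaassen and Magnus model, which also draws on rankings and covers the whole match. It simply assumes that the server wins each rally independently with a fixed probability p, and the function name is hypothetical.

```python
from functools import lru_cache

def game_win_prob_from(p, server_pts, receiver_pts):
    """Server's chance of winning the current game from a given score,
    counted in points won (0, 1, 2, 3, ...), assuming 0 < p < 1."""
    q = 1.0 - p
    # From deuce, the server must win two rallies in a row before the
    # receiver does.
    deuce = p * p / (p * p + q * q)

    @lru_cache(maxsize=None)
    def win(s, r):
        if s >= 4 and s - r >= 2:
            return 1.0
        if r >= 4 and r - s >= 2:
            return 0.0
        if s == r and s >= 3:
            return deuce
        return p * win(s + 1, r) + q * win(s, r + 1)

    return win(server_pts, receiver_pts)

p = 0.69  # column E value for S. Williams in the table above
for score, (s, r) in {"0-0": (0, 0), "40-0": (3, 0), "0-40": (0, 3),
                      "deuce": (3, 3)}.items():
    print(f"{score}: {game_win_prob_from(p, s, r):.3f}")
```

Each new point changes the score passed to the function, and the estimate moves with it. A full match forecast repeats the same idea one level up, for games within a set and sets within a match.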

A program that provides a running estimate of who will win would add a new element for TV spectators, going beyond the current score, the percentage of first serves in, the number of aces, and the other statistics that commentators normally offer for insight into the game.

References:

Klaassen, F.J.G.M., and J.R. Magnus. Preprint. How to reduce the service dominance in tennis? Empirical results from four years at Wimbledon. Available at http://dare.uva.nl/document/1742.

______. 2003. On the probability of winning a tennis match. Medicine and Science in Tennis (December).

______. 2003. Forecasting the winner of a tennis match. European Journal of Operational Research 148:257-267. See http://ideas.repec.org/p/dgr/kubcen/200138.html.

Klaassen, F.J.G.M., and J.R. Magnus. 2001. Are points in tennis independent and identically distributed? Evidence from a dynamic binary panel data model. Journal of the American Statistical Association 96(June):500-509.

MacPhee, I., J. Rougier, and G.H. Pollard. 2004. Server advantage in tennis matches. Journal of Applied Probability 41(April):1182-1186.

Magnus, J.R., and F.J.G.M. Klaassen. 1999. On the advantage of serving first in a tennis set: Four years at Wimbledon. The Statistician 48(July):247-256.

______. 1999. The final set in a tennis match: Four years at Wimbledon. Journal of Applied Statistics 26(May):461-468.

______. Preprint. Testing some common tennis hypotheses: Four years at Wimbledon. See http://ideas.repec.org/p/dgr/kubcen/199673.html.

Newton, P.K., and J.B. Keller. 2005. Probability of winning at tennis I: Theory and data. Studies in Applied Mathematics 114(April):241-269. Abstract available at http://dx.doi.org/10.1111/j.0022-2526.2005.01547.x.

Peterson, I. 1997. Getting slammed in tennis. Science News Online (Nov. 29). Available at http://www.sciencenews.org/pages/sn_arc97/11_29_97/mathland.htm.

Information about the International Tennis Federation congresses on tennis science and technology can be found at http://www.itftennis.com/.