Despite the high exposure of businesses to rainfall risk, most weather derivatives traded to date have been based on temperature indices. Nevertheless, many end users, such as farmers or hydroelectric generators, are sensitive to rainfall magnitude and frequency. The consequences of too much or too little rain are widespread and, directly or indirectly, we all suffer from abnormal rainfall.

A typical difficulty with rainfall-based risk analysis is that the magnitude and frequency of rainfall depend strongly on the site where they are measured, whereas temperature is a more ambient measure. For example, if the temperature at Heathrow is 20°C, you can assume that the temperature in central London is close to 20°C. However, even if it rains heavily at Heathrow, you cannot assume that it is raining simultaneously in central London. There is a high probability that it is, but it is not 100%.

Until now, studies of the rainfall process have been limited either to cumulative rain over a given period, mainly monthly or yearly, or to the rainfall process over very short time steps, such as every five or six minutes. The first approach is quite limited and of little interest for weather derivatives. The second is very detailed but irrelevant for market transactions. Instead of analysing the rain statistically, it may be preferable to simulate it.

Only Heathrow data are used from now on.

**Frequency**

The first step in building a stochastic rainfall process is to understand the probability with which rainfall occurs. Figure 1 presents the historical frequency of rainfall at Heathrow.

The frequency curve is not symmetric and a sinusoidal function might be inadequate to fit it. A smoothing process can be used to correct erratic values (smooth line). Subsequent results are strongly dependent on these probabilities; thus the extraction must be done cautiously.
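The extraction and smoothing of the daily rain probabilities can be sketched as follows. This is only an illustration under stated assumptions: the article does not say which smoothing process was used, so a centred moving average that wraps around the year end is assumed here, and the function names, the record format and the 15-day half-window are all hypothetical.

```python
from collections import defaultdict


def daily_rain_frequency(records):
    """Empirical probability of rain for each day of a 365-day year.

    records: iterable of (day_of_year, rained) pairs, with day_of_year
    in 1..365 and rained a bool (hypothetical input format).
    """
    wet = defaultdict(int)
    total = defaultdict(int)
    for day, rained in records:
        total[day] += 1
        wet[day] += int(rained)
    # A day with no observation at all raises ZeroDivisionError here.
    return [wet[d] / total[d] for d in range(1, 366)]


def circular_smooth(freq, half_window=15):
    """Centred moving average that wraps around the end of the year.

    The window size is an assumption, not a value taken from the article.
    """
    n = len(freq)
    width = 2 * half_window + 1
    return [
        sum(freq[(i + k) % n] for k in range(-half_window, half_window + 1)) / width
        for i in range(n)
    ]
```

Wrapping the window around the year end keeps late December and early January consistent with each other, which matters if the smoothed probabilities are later reused cyclically in a simulation.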

**The Rainfall Persistence**

The probability that it rains depends on the day of the year. But does it also depend on the past? In the case of Heathrow it clearly does: when a day is dry, the following day is more likely to be dry than rainy, and vice versa. So the probability that it rains is conditional on the past. But the probability that it rains also depends on the time of year. As a consequence, two consecutive rainy days are more likely in wintertime than in summertime, and two consecutive dry days are more likely in summertime. A natural autocorrelation is therefore created, which interferes with any true autocorrelation.
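The natural autocorrelation described above can be illustrated with a small experiment. The sketch below draws rainy days independently of each other, but with a probability that varies over the year, and then measures the lag-1 autocorrelation of the pooled series. The seasonal shape (a cosine with mean 0.45 and amplitude 0.15) is a hypothetical choice for illustration, not the Heathrow curve.

```python
import math
import random


def seasonal_probability(day, mean=0.45, amplitude=0.15):
    # Hypothetical seasonal curve: wetter around the turn of the year,
    # drier in mid-year. Not fitted to any real data.
    return mean + amplitude * math.cos(2 * math.pi * day / 365)


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return cov / var


rng = random.Random(0)
years = 200
# Each day is drawn independently of the previous one.
series = [
    int(rng.random() < seasonal_probability(d))
    for _ in range(years)
    for d in range(365)
]
rho = lag1_autocorrelation(series)
print(f"lag-1 autocorrelation of an independent seasonal series: {rho:.3f}")
```

Although no day depends on its predecessor in this construction, the pooled autocorrelation comes out positive, simply because neighbouring days share nearly the same probability of rain. This is why the seasonal effect has to be separated out before any true persistence is estimated.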

The persistence of rain is modelled as follows.

Let X_{t} denote the event "it rains on day t". X_{t} is Bernoulli distributed with parameter p_{t}:

X_{t} = 1 with probability p_{t}, and X_{t} = 0 with probability 1 - p_{t},

where 1 stands for the event "it rains".

We know the historical mean of X_{t} from the frequency study above. However, this is insufficient to model the time series X_{t}. Indeed, if the X_{t} were independent we would have E[X_{t} | X_{t-1}, X_{t-2}, X_{t-3}] = E[X_{t}] = p_{t}, which is different from E_{H}[X_{t}] (where the operator E_{H} denotes the historical mean). Therefore the probability p_{t} has a time-dependent expectation and is conditional on X_{t-1}, X_{t-2}, X_{t-3}, ...

A recurrence is used to estimate the order of this lag dependence. The probability p_{t} is assumed to be given by p_{t} = Prob(X_{t} = 1 | X_{t-1}, X_{t-2}, X_{t-3}, ..., X_{t-k}), with k ∈ ℕ^{*}.

The aim is to find the smallest value of k that produces the best fit to the distribution of the length of rainy periods. Considering a 365-day year, we assume that E[p_{t}] = E[p_{t+365}], which means that the climate does not vary from year to year. We first estimate the conditional probabilities with k=1, then simulate the process and compare the simulated distribution of the length of rainy periods with the smoothed historical one. The method is then repeated with higher values of k until no more information on the probability p_{t} is added.

Assuming independence between successive rainy days, we simulated the rainfall using the historical probability for 38 years (the same length as the period from which the graph below was calculated). With k=1, a good fit is already obtained.
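The k=1 step of the procedure can be sketched as a two-state chain: estimate the probability of rain given yesterday's state, simulate, and tabulate the lengths of the rainy periods for comparison with the historical ones. The sketch makes a simplifying assumption that the article does not make: the transition probabilities are pooled over the whole year instead of being estimated for each day. All function names are hypothetical.

```python
import random
from collections import Counter


def estimate_transitions(series):
    """Return (P(wet | dry yesterday), P(wet | wet yesterday)) from a 0/1 series."""
    counts = {0: [0, 0], 1: [0, 0]}  # counts[previous state][current state]
    for prev, cur in zip(series, series[1:]):
        counts[prev][cur] += 1
    p_wet_after_dry = counts[0][1] / sum(counts[0])
    p_wet_after_wet = counts[1][1] / sum(counts[1])
    return p_wet_after_dry, p_wet_after_wet


def simulate_occurrence(n_days, p_wet_after_dry, p_wet_after_wet, rng, first=0):
    """Simulate a 0/1 rain occurrence series with one day of memory (k=1)."""
    series = [first]
    for _ in range(n_days - 1):
        p = p_wet_after_wet if series[-1] else p_wet_after_dry
        series.append(int(rng.random() < p))
    return series


def wet_spell_lengths(series):
    """Count runs of consecutive wet days, keyed by run length."""
    lengths = Counter()
    run = 0
    for x in series:
        if x:
            run += 1
        elif run:
            lengths[run] += 1
            run = 0
    if run:  # series ends during a wet spell
        lengths[run] += 1
    return lengths
```

A comparison along the lines of the article would call `estimate_transitions` on the historical series, simulate 38 years of days with `simulate_occurrence`, and set `wet_spell_lengths` of the simulated series against that of the historical one. Extending the state to the last k days would follow the same pattern with a larger table of conditional counts.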

**The Magnitude Process**

Once the length of the rainy period is known, the intensity of the rainfall must be evaluated for each day. The previous study did not reveal a dependence of the intensity on the length of the rainy period, but studies of the magnitude itself show that such a dependence exists.

Since the average conditional on the previous day is different, the distribution is certainly different. Under the assumption k=1, only four events can be enumerated for a rainy day t: R_{t-1} / R_{t} / R_{t+1}, NR_{t-1} / R_{t} / R_{t+1}, R_{t-1} / R_{t} / NR_{t+1} or NR_{t-1} / R_{t} / NR_{t+1}, where R_{t} is the event "it rains on day t" and NR_{t} the event "no rain on day t". Four distributions should therefore be estimated for each day of the year. To reduce estimation errors, a 30-day period bracketing the day for which the distributions are worked out is considered.
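The grouping of rainy days into the four events can be sketched as below. The input format (parallel lists of day-of-year and daily rainfall amount), the function names and the half-width of 15 days are assumptions made for illustration; a half-width of 15 gives a 31-day window centred on the target day, which only approximates the 30-day bracket mentioned above.

```python
def circular_day_distance(a, b, year_length=365):
    """Distance in days between two days of the year, wrapping at year end."""
    d = abs(a - b) % year_length
    return min(d, year_length - d)


def conditional_samples(days, rain, centre_day, half_width=15):
    """Collect rainy-day amounts near centre_day, split by the neighbouring days.

    days: day of year for each observation (consecutive calendar days assumed).
    rain: rainfall amount for each observation, 0 meaning a dry day.
    Returns a dict keyed by (rained the day before, rained the day after).
    """
    groups = {(p, n): [] for p in (True, False) for n in (True, False)}
    # The first and last observations are skipped: one neighbour is unknown.
    for t in range(1, len(rain) - 1):
        if rain[t] > 0 and circular_day_distance(days[t], centre_day) <= half_width:
            groups[(rain[t - 1] > 0, rain[t + 1] > 0)].append(rain[t])
    return groups
```

Each of the four lists is then an empirical sample from which the mean, the maximum or a fitted distribution can be worked out for the target day.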

The main characteristics are summarised in this table:

*The four distributions are very different from each other. Between the events R_{t-1} / R_{t} / R_{t+1} and NR_{t-1} / R_{t} / NR_{t+1}, the average (resp. maximum) is divided by approximately 2.5 (resp. 3.5). One can conclude that the distribution of rainfall magnitude depends on the immediate past and future.*

*Doing the same for each day of the year eventually yields all the information required to run the simulations properly. First, all the rainy days are simulated; then the magnitude of rain is randomly generated using the appropriate distribution among the four possible ones for each day.*
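The second stage, drawing a magnitude for each simulated rainy day, can be sketched as follows. The article does not say how the four distributions are represented, so this sketch assumes they are kept as empirical samples and resampled at random. It also uses a single set of four samples for the whole year instead of one set per day, and treats days outside the series as dry. The function name and argument format are hypothetical.

```python
import random


def simulate_magnitudes(occurrence, samples, rng):
    """Attach a rainfall amount to every wet day of a 0/1 occurrence series.

    occurrence: list of 0/1 values from the frequency stage.
    samples: dict keyed by (rained the day before, rained the day after),
             each value a non-empty list of historical amounts (assumed format).
    """
    n = len(occurrence)
    rain = [0.0] * n
    for t, wet in enumerate(occurrence):
        if not wet:
            continue
        # Days before the start or after the end of the series count as dry.
        prev_wet = t > 0 and occurrence[t - 1] == 1
        next_wet = t < n - 1 and occurrence[t + 1] == 1
        rain[t] = rng.choice(samples[(prev_wet, next_wet)])
    return rain
```

Because the whole occurrence series is simulated first, the state of the following day is already known when each magnitude is drawn, which is what allows the conditioning on both the previous and the next day.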


**Conclusion**

*Rainfall magnitude is extremely dependent upon the location where it is measured. This location risk can be as high as 100% in a single month for locations as close as approximately 20 miles apart. Because the probability that it rains is not constant over the year, a natural autocorrelation is created in the process. This pitfall has to be avoided, and the rainfall process can be decomposed into two stages. The first stage is the frequency process and the second stage is the magnitude given the frequency. To know the distribution of the rainfall magnitude, it is just as important to know whether it rained the previous day as whether it will rain the next day.*


*This week's Learning Curve was written by Michael Moreno, director at Speedwell Weather Derivatives in London.*

*Derivatives Week* is now accepting submissions from industry professionals for Learning Curve^{®}, the tutorial for new or potential users of derivatives. For details and guidelines on writing a Learning Curve^{®}, please call Jeremy Carter in London at 44-20-7303-1753, Matthew Tremblay in Hong Kong at 852-2912-8097 or Joseph Nadilo in New York at 212-224-3642.