A POSTERIORI CORRECTION OF FORECAST AND OBSERVATION ERROR
VARIANCES
Leonid Rukhovets
SAIC and Global Modeling and Assimilation Office, NASA/Goddard Space Flight Center,
Greenbelt, MD, USA
1. The proposed method of correcting the total observation
and forecast error variance is based on the assumption
that the “observed-minus-forecast” residuals (O-F) are
normally distributed, where O is an observed value and F
is usually a short-term model forecast. This assumption
can be accepted for several types of observations
(except humidity) that are not grossly in error
(Andersson and Jarvinen 1999, Dharssi et al. 1992,
Hollingsworth et al. 1986, Jarvinen and Unden 1997,
Lorenc and Hammon 1988).
The degree of nearness to the normal distribution can
be estimated by the symmetry, or skewness (lack of
symmetry), $a_3 = \mu_3/\sigma^3$ and the kurtosis
$a_4 = \mu_4/\sigma^4 - 3$.
Here $\mu_i$ is the central moment of order $i$ and
$\sigma$ is the standard deviation. It is well known
that for the normal distribution $a_3 = a_4 = 0$.
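
The two diagnostics above can be computed directly from a sample of residuals. The following is a minimal sketch, assuming the moments are taken about the sample mean and normalized by the sample size; the function name and the synthetic Gaussian sample are illustrative only and are not data from this study.

```python
import random

def skewness_and_kurtosis(values):
    """Return (a3, a4): sample skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (divided by n, an assumption).
    mu2 = sum((v - mean) ** 2 for v in values) / n
    mu3 = sum((v - mean) ** 3 for v in values) / n
    mu4 = sum((v - mean) ** 4 for v in values) / n
    sigma = mu2 ** 0.5
    a3 = mu3 / sigma ** 3          # skewness
    a4 = mu4 / sigma ** 4 - 3.0    # kurtosis, zero for a normal distribution
    return a3, a4

# Synthetic stand-in for O-F residuals (hypothetical, not from the paper).
rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
a3, a4 = skewness_and_kurtosis(residuals)
```

For a sample that is close to normal, both returned values should be near zero.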
Table 1 contains $a_3$ and $a_4$ for the O-F’s of
several types of observations: rawinsonde
heights, winds and mixing ratio, aircraft
winds, cloudtrack winds, and surface heights
(recast as upper air) and winds. Six-hour
model forecasts (F) were obtained using the
Goddard Earth Observing System 4.0.3
assimilation run for October 1-31, 2003.
Figs. 1-7 show O-F histograms for these
observations (without gross errors).
Distributions of O-F’s corresponding to
rawinsonde heights and winds, aircraft
winds, cloudtrack winds, surface heights and
winds are close to the normal distribution.
The rawinsonde mixing ratio distribution
cannot be considered normal. The kurtosis
value for mixing ratio observations is also
very large.
2. If a random variable X has a normal distribution with
zero mean, then the probability that X lies within the
interval $(0, a\sigma)$ is:

$P(0 < X < a\sigma) = F(a) - F(0) = F(a) - 0.5$   (1)

Here

$F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$,

$\sigma$ is the standard deviation of X, $a$ is an
arbitrary constant, and F is the standard normal
distribution function.
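
Equation (1) can be evaluated without a printed table. The sketch below expresses F through the error function from the Python standard library; the function names are illustrative and not part of the original method.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function F(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_within(a):
    """P(0 < X < a*sigma) for a zero-mean normal X, as in equation (1)."""
    return std_normal_cdf(a) - 0.5

# Probability of falling between 0 and two standard deviations.
p2 = prob_within(2.0)
```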
Suppose we have the number (percentage) of
observations within some interval $(0, a\sigma)$.
If this number does not correspond to (1), it
can mean that the $\sigma$ we are using is not the
standard deviation of these observations.
However, this number can be used to correct
the standard deviation.
Table 1. The symmetry $a_3$ and the kurtosis $a_4$.

               $a_3$     $a_4$
rawinhght       0.13      0.87
rawinwind      -0.01     -1.12
rawinhumd      -0.08      6.08
aircrftwind     0.01     -0.67
cldtrkwind     -0.57      0.41
srfheight       0.02     -1.07
srfwind         0.04     -1.74
Corresponding author address:
Dr. Leonid Rukhovets, SAIC and
NASA/GSFC, Global Modeling and
Assimilation Office, Mail Code 610.1, MD
20771, USA
For this purpose it is convenient to apply the
so-called “background check” procedure, which
is often used as a part of quality control in
data assimilation systems (Andersson and
Jarvinen 1999, Dee et al. 1999).
The background check examines all observations
against the short-range (6-hour) forecasts.
Specifically, the following inequality is verified
for each observation:

$(O-F)^2 < a^2 \{(\hat{\sigma}^o)^2 + (\hat{\sigma}^f)^2\}$   (2)

where $a$ is a tolerance parameter, and
$\hat{\sigma}^o$ and $\hat{\sigma}^f$ are the prescribed
observation and forecast error standard deviations,
respectively. If inequality (2) is not fulfilled for
some observation, the observation is marked as suspect.
The prescribed values $\hat{\sigma}^o$ and
$\hat{\sigma}^f$ represent the true observation and
forecast error standard deviations only approximately.
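
The test in (2) can be sketched as follows. This is an illustration only, assuming a single prescribed observation error and a single prescribed forecast error for all residuals; the function names and the sample residuals are hypothetical, and an operational system would use values that vary by level and observation type.

```python
def background_check(o_minus_f, sigma_o_hat, sigma_f_hat, a):
    """Return a list of flags, True where inequality (2) is not fulfilled."""
    threshold = a ** 2 * (sigma_o_hat ** 2 + sigma_f_hat ** 2)
    return [r ** 2 >= threshold for r in o_minus_f]

def suspect_percentage(flags):
    """Percentage M of observations marked as suspect."""
    return 100.0 * sum(flags) / len(flags)

# Hypothetical residuals; threshold is 2^2 * (2.0^2 + 1.5^2) = 25.
flags = background_check([0.5, -1.0, 7.5, 2.0],
                         sigma_o_hat=2.0, sigma_f_hat=1.5, a=2.0)
M = suspect_percentage(flags)
```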
Let $\hat{\sigma} = \{(\hat{\sigma}^o)^2 + (\hat{\sigma}^f)^2\}^{1/2}$ and
$\sigma = \{(\sigma^o)^2 + (\sigma^f)^2\}^{1/2}$, where
$\sigma^o$ and $\sigma^f$ are the true standard
deviations. Instead of (2) we can write:

$|O-F| < a(\hat{\sigma}/\sigma)\sigma$
Suppose M is the percentage of suspect
observations obtained by some background check.
It means:

$P(0 < O-F < a\hat{\sigma}) = 0.5 - M/(2 \cdot 100)$

But according to (1) we have:

$P(0 < O-F < a\hat{\sigma}) = F(a\hat{\sigma}/\sigma) - 0.5$

Then

$F(a\hat{\sigma}/\sigma) = 1 - M/(2 \cdot 100)$   (3)

Using the table of the standard normal
distribution function we can find the value
$a\hat{\sigma}/\sigma = m$ corresponding to
$1 - M/(2 \cdot 100)$. Then we can find

$\sigma = a\hat{\sigma}/m$   (4)
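
Steps (3) and (4) can be sketched in a few lines, with the table lookup replaced by the inverse of the standard normal distribution function from the Python standard library. The function name and the sample numbers are illustrative assumptions, not values from this study.

```python
from statistics import NormalDist

def corrected_sigma(sigma_hat, a, suspect_percent):
    """Corrected total standard deviation from equations (3) and (4).

    sigma_hat is the prescribed total standard deviation, a is the
    tolerance parameter, suspect_percent is the percentage M of suspect
    observations reported by the background check.
    """
    if not 0.0 < suspect_percent < 100.0:
        raise ValueError("suspect_percent must lie strictly between 0 and 100")
    # m is the argument at which F equals 1 - M/(2*100), equation (3).
    m = NormalDist().inv_cdf(1.0 - suspect_percent / 200.0)
    return a * sigma_hat / m  # equation (4)

# Hypothetical case: 10% suspect with a = 2 suggests sigma_hat is too small.
sigma_new = corrected_sigma(sigma_hat=10.0, a=2.0, suspect_percent=10.0)
```

A suspect percentage above the expected tail fraction gives a corrected value larger than the prescribed one, and a lower percentage gives a smaller one.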
3. Consider one example. Let $a = 2$. Suppose
the background check gave a suspect observation
percentage of $M = 4.6$. Then
$1 - M/(2 \cdot 100) = 0.977$.
From (3) and the standard normal distribution
table we have $2\hat{\sigma}/\sigma = 2$, and
$\sigma = \hat{\sigma}$. That is, $\hat{\sigma}$ is
specified correctly. This agrees with the known
statistical rule: about 4.6% of observations should
lie beyond $2\sigma$.
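
The numbers in this example can be checked numerically. The sketch below uses the Python standard library in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

a = 2.0    # tolerance parameter from the example
M = 4.6    # percentage of suspect observations from the example

target = 1.0 - M / 200.0           # right-hand side of equation (3)
m = std_normal.inv_cdf(target)     # replaces the table lookup
ratio = a / m                      # true sigma over prescribed sigma, from (4)

# Two-sided percentage expected beyond two standard deviations.
tail_percent = 200.0 * (1.0 - std_normal.cdf(a))
```

Here `m` comes out very close to 2, so `ratio` is close to 1, which is the sense in which the prescribed standard deviation is specified correctly.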
4. Conclusions:
a) Using the results of a background check,
the prescribed statistic $(\hat{\sigma}^o)^2 + (\hat{\sigma}^f)^2$
can be corrected. Then the background
check can be repeated with the new
$\sigma = \{(\sigma^o)^2 + (\sigma^f)^2\}^{1/2}$.
b) Equation (4), together with the results of an
appropriate background check, can be considered
as a relation between the observation and forecast
error standard deviations. If the true value of
$\sigma^o$ is known, we can calculate $\sigma^f$.
c) Consider the results of two background
checks for the same observation type but for
different instruments. Then using (4) we can
write two equations for the true sigmas:

$(\sigma_1^o)^2 + (\sigma_1^f)^2 = S_1$
$(\sigma_2^o)^2 + (\sigma_2^f)^2 = S_2$

where $S_1$ and $S_2$ are the corrected total variances
obtained from (4). Because $\sigma_1^f = \sigma_2^f$ for
both types of observations, subtracting one equation
from the other gives a relation between the two
observation error standard deviations. If one
of them can be found more easily than the other,
the relation can be used to get the second
one through the first. For example, the
observation error standard deviation for
TOVS heights can be found through the
rawinsonde height observation error
standard deviation. Analogously, the cloud track
wind observation error standard deviation
can be found through the rawinsonde wind
observation error standard deviation.
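
Conclusions (b) and (c) amount to simple arithmetic on variances. The following is a minimal sketch, assuming the total variances have already been corrected with (4) and that the forecast error is the same for both instruments; the function names and the numbers are hypothetical.

```python
import math

def forecast_sigma(total_sigma, sigma_o):
    """Conclusion (b): forecast error from the total and a known obs error."""
    var_f = total_sigma ** 2 - sigma_o ** 2
    if var_f < 0.0:
        raise ValueError("observation error exceeds the total error")
    return math.sqrt(var_f)

def second_obs_sigma(s1, s2, sigma_o1):
    """Conclusion (c): second obs error from two corrected total variances.

    Subtracting the two equations removes the shared forecast variance:
    (sigma_o2)^2 = (sigma_o1)^2 + S2 - S1.
    """
    var_o2 = sigma_o1 ** 2 + s2 - s1
    if var_o2 < 0.0:
        raise ValueError("inputs imply a negative observation variance")
    return math.sqrt(var_o2)

# Hypothetical numbers, chosen only to illustrate the arithmetic.
sigma_f = forecast_sigma(total_sigma=5.0, sigma_o=3.0)
sigma_o2 = second_obs_sigma(s1=25.0, s2=34.0, sigma_o1=3.0)
```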
REFERENCES:
Andersson, E., and H. Jarvinen, 1999:
    Variational quality control.
    Q. J. R. Meteorol. Soc., 125, 697-722.
Dee, D. P., L. Rukhovets, R. Todling, A. M.
    da Silva, and J. W. Larson, 1999: An
    adaptive buddy check for
    observational quality control.
    Q. J. R. Meteorol. Soc., 127, 2451-2471.
Dharssi, I., A. C. Lorenc, and
    N. B. Ingleby, 1992: Treatment of
    gross errors using maximum
    probability theory.
    Q. J. R. Meteorol. Soc., 118, 1017-1036.
Hollingsworth, A., D. B. Shaw,
    P. Lonnberg, L. Illari, K. Arpe, and
    A. Simmons, 1986: Monitoring of
    observation and analysis quality by
    a data assimilation system. Mon.
    Wea. Rev., 114, 861-879.
Jarvinen, H., and P. Unden, 1997:
    Observation screening and first
    guess quality control in the ECMWF
    3D-Var data assimilation system.
    ECMWF Tech. Memo. 236.
Lorenc, A. C., and O. Hammon, 1988:
    Objective quality control of
    observations using Bayesian
    methods. Theory and practical
    implementation.
    Q. J. R. Meteorol. Soc., 114, 515-543.
Figure 1. Histogram of O-F rawinsonde heights. Global. All levels. October 2003.
Number of observations = 457919. Yellow color corresponds to observations that were
marked as suspect by the background check but passed the adaptive buddy check (Dee et
al. 1999). The blue curve shows the Gaussian distribution with the same mean and
standard deviation.
of observations = 85776
Figure 6. As in Fig. 1 but for surface geopotential heights. Number of observations =
463369.
Figure 7. As in Fig. 1 but for surface winds. Number of observations = 123254.