All-sky assimilation of microwave imager observations sensitive to water vapour, cloud and rain

A.J. Geer, P. Bauer, P. Lopez and D. Salmond
European Centre for Medium-Range Weather Forecasts, Reading, UK
Alan.Geer@ecmwf.int

The European Centre for Medium-Range Weather Forecasts (ECMWF) has assimilated rain- and cloud-affected microwave imager observations since 2005. Originally, observations were split into clear and cloudy streams, with the clear observations being used directly in the four-dimensional variational assimilation (4D-Var) and the cloudy observations passing through an initial 1D-Var step, after which a total column water vapour (TCWV) pseudo-observation was assimilated into 4D-Var. A new all-sky system, introduced operationally in March 2009, assimilates all observations together directly into 4D-Var. The radiative effects of clouds and precipitation are simulated where necessary, and for the first time observational information on hydrometeors can be fed directly into the analysed fields. An advantage of the new approach is that the first guess can be corrected where the model is cloudy and the observations are clear. While the all-sky system shows roughly the same forecast performance as the old one, it provides a starting point from which to begin improving the cloud and rain analysis.

1. Introduction

The quality of weather forecasts has been substantially improved in recent decades by making better use of satellite data, especially in the Southern Hemisphere and over the ocean, where other observations are scarce. However, satellite observations are still rarely used in areas of precipitation and heavy cloud. In the hope of improving forecasts further, Numerical Weather Prediction (NWP) centres are currently trying to improve the use of satellite observations in cloudy and rainy areas.
However, this effort faces several difficulties: (a) radiative transfer simulations are more difficult in these regions; (b) compared to clear-sky conditions, rain and cloud modelling is inaccurate, in part because the processes take place on much smaller scales than those of the model grid box; and (c) observation and model operators are far more non-linear than in clear sky.

One approach, which could be considered a development of traditional cloud clearing, is to make best use of temperature and humidity information in clear-sky areas, while removing the effects of cloud. Recent developments along these lines try to infer information above cloud tops whilst leaving the cloud information as a sink variable (Pavelin et al., 2008; McNally, 2009).

A second approach could be considered true rain and cloud assimilation. Observations sensitive to precipitation and cloud are assimilated directly, with the modelled cloud and precipitation fields changing during the assimilation procedure so as to bring the model closer to the observations.

Figure 1: Histograms (number per 1 K bin) of SSM/I first-guess radiance departures [K] for samples where both model and observations contain clouds (dotted) or only clear sky (solid), where the observations contain clouds but the model is cloud-free (dash-dotted) and vice versa (dashed), for channel 19v. Data drawn from a 12-hour assimilation on 1 July 2009 00 UTC.

The European Centre for Medium-Range Weather Forecasts (ECMWF) has assimilated cloud- and precipitation-affected Special Sensor Microwave / Imager (SSM/I) radiance observations since 2005, and similar sensors on other satellites have been included since 2007. SSM/I observations are sensitive to surface properties, total column water vapour (TCWV), cloud ice and water, rain and frozen precipitation. The original approach performed a 1D-Var retrieval of TCWV, which was then assimilated as a pseudo-observation into the 4D-Var system. This technique, known as 1D+4D-Var, was established as a safe and robust way of using rain observations and coping with the nonlinearity of the moist physics operators (Marécal and Mahfouf, 2002, 2003). A major development was an upgrade to the linearised moist physics operators (Tompkins and Janisková, 2004; Lopez and Moreau, 2005), which allowed the operational implementation of 1D+4D-Var in June 2005 (Bauer et al., 2006a,b). In parallel, clear-sky SSM/I observations were being assimilated directly into 4D-Var.

Several years' experience with the 1D+4D-Var approach allowed us to understand its advantages and disadvantages. Kelly et al. (2008) tested the impact of rainy 1D+4D-Var SSM/I assimilation on operational forecasts at ECMWF. When the rainy SSM/I observations were removed from the full operational observing system, the result was a small degradation in the skill of humidity and tropical wind forecasts, indicating that the rainy assimilation provides a small positive contribution.
However, adding 1D+4D-Var assimilation into a low baseline system, from which most other observations had been removed, showed a much larger impact. This is quite typical for any observation type, but it does suggest a redundancy of information between the cloudy/rainy 1D+4D-Var and the rest of the observing system. Geer et al. (2008) showed that the humidity information coming from the 1D+4D-Var TCWV pseudo-observation was in large part duplicating what was already coming from clear-sky temperature and moisture observations. Making better use of the unique rain and cloud information content of the SSM/I observations would require a direct 4D-Var assimilation of the radiances.

Geer et al. (2008) also showed that the sampling in the 1D+4D-Var system was biased, because the separation of observations between the clear stream (direct 4D-Var) and the cloudy stream (1D+4D-Var) was determined from the observations alone. For the set where the observations were cloudy, the first guess (FG) might still be clear, meaning that the model would always appear on average drier than the observations. For a more balanced sampling, the cloudy stream should also treat the cases where the FG was cloudy and the observations were clear. This would, for the first time, allow a clear observation to remove spurious modelled cloud or rain. Figure 1 shows the distribution of FG departures in these different sets, illustrating that for balanced sampling, roughly 60% (or even more, depending on the cloud criterion) of observations would have to go through the cloudy stream.

Figure 2: Representation of the steps and operators needed to minimise the 4D-Var cost function. Bold text (or blue boxes) indicates the components that have been modified for all-sky 4D-Var and are not present in traditional clear-sky data assimilation. Cloud and rain are in italics to show that these are not yet a part of the control vector, but are part of the state vector.

For the new direct 4D-Var assimilation there seemed no point in maintaining a separate clear-sky path. Instead, the new approach passes all observations, whether clear, cloudy or rainy, through exactly the same observation operator, with cloud and rain radiative transfer being activated when necessary. This is known as an all-sky approach.

The assimilation of all-sky SSM/I radiances directly into 4D-Var began operationally in March 2009. Here we examine that system. However, it still has a number of limitations, and these will be addressed by a major revision which it is hoped will become operational in early 2010. This will use a symmetric approach to errors and biases, which is explained briefly in Sec. 4.

2. Method

Figure 2 helps summarise the all-sky system.
In bold are the main areas that have changed compared to a clear-sky 4D-Var assimilation. As mentioned in the introduction, observations are assimilated in all sky conditions. The observation operator, and its tangent linear and adjoint, contain scattering radiative transfer and can deal with cloud and rain. Cost function gradients with respect to cloud are passed back into the adjoint of the forecast model, so 4D-Var has to fit the information on cloud and rain as well as all other observations being assimilated. The only thing lacking from a full cloud and rain assimilation system is that the control variable has not yet been extended to include cloud or precipitation. However, cloud and rain are generated (by the forecast model) if changes are made to the dynamics or moisture fields at the beginning of the assimilation time window.

In the case of cloud and rain, interpolation of model fields to the observation point is considered error-prone. Instead, as for the 1D+4D-Var approach, no interpolation is done and only the nearest observation to the model grid point is taken, with a maximum allowed distance of 10 km. During the inner-loop minimisations, when a low-resolution grid is used, distances may be much larger, so observation error is inflated as a quadratic function of distance from the model grid point.

SSM/I channels 19v, 19h, 22v, 37v and 85h are assimilated actively. Observation errors are increased to very large values in cloudy situations for channels 37v and 85v, where cloud and rain radiative transfer is not as reliable as in the lower frequency channels. Observations are assimilated over sea surfaces only.

3. Performance in the full system

The initial direct 4D-Var approach represents a very cautious assimilation of the observations. The effective error assigned to observations is generally quite high, due to the inflation of observation errors with distance from the grid point. This becomes significant because the true resolution of the analysis is that of the final incremental inner loop, which is only T255 (roughly 80 km). Even where the observation errors are small, quality control (QC) has been set up to remove most of the contentious observations (Fig. 3), i.e. those where cloud or rain is present in either FG or observation but not in the other: these are associated with large brightness temperature departures.

Figure 3: Histogram (frequency, logarithmic scale) of SSM/I channel 19v FG departures [K] from one analysis cycle, 00Z on 1st March 2009, showing the numbers that were actively assimilated (thin line); rejected by BgQC (thick line); rejected by VarQC (dashed line).

Traditional measures for evaluating an NWP system are (i) forecast scores calculated using the analyses as a reference and (ii) fit to other observations.
Without going into detail here, these show that the all-sky direct 4D-Var only just replicates the performance of the old system (e.g. that shown by Kelly et al., 2008). The tropical wind impact is mostly retained, but fits to other observations show that the humidity is not quite so well constrained as in the previous system. This can all be attributed to the very cautious treatment of observation errors and QC.

A final limitation is found equally in the 1D+4D-Var and the all-sky approach. It is far more difficult to get the assimilation system to create cloud than it is to remove cloud. This is seen when comparing FG departures to increments (Fig. 4b). At the beginning of the time window it is even quite difficult to remove cloud (Fig. 4a). These problems are thought to be due in part to the lack of a cloud control variable and in part to an ad-hoc restriction of humidity increments in the control variable transformation (which is intended to prevent supersaturation).

Figure 4: Information transfer into the 4D-Var analyses, based on SSM/I observations from 23rd to 29th August 2007 in an all-sky system. 2D histograms show LWP FG departure [kg m^-2] against LWP increment [kg m^-2]: (a) in the first 1 h 50 min of the assimilation window and (b) in the last 1 h 20 min. LWP is derived from observed or simulated TBs. The 1:1 line is overplotted; if all observational information were transferred into the analyses, points would lie on this line. Contours are in logarithmic steps, starting from the outermost contour: 3, 10, 32, 100, 316 etc.

4. Next version: symmetric errors

As seen in the previous sections, the cautious treatment of observation errors and quality control in the initial all-sky approach resulted in a very limited impact from the observations. Additionally, the initial approach did not include any specific cloud or rain bias correction. Both issues can be better understood by studying the variation of the FG departures with cloud amount. Here, cloud amount is represented by P37, the Petty (1994) 37 GHz normalised polarisation difference. This is a rough estimate of the square of the cloud and rain transmittance at 37 GHz, and it is calculated from the brightness temperatures. Its value is 1 in clear skies, varying to 0 in completely opaque cloud and rain conditions.
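As a rough illustration of the quantity just described, the normalised polarisation difference can be sketched as the ratio of the actual 37 GHz polarisation difference to the one the same scene would show without cloud or rain. This follows the general form of the Petty (1994) definition as we understand it; the function name, the argument names and the example brightness temperatures are illustrative assumptions, not values or code from the ECMWF system.

```python
def p37(tb_v, tb_h, tb_v_clear, tb_h_clear):
    """Normalised polarisation difference at 37 GHz (hypothetical helper).

    tb_v, tb_h             : 37 GHz brightness temperatures [K], vertical and
                             horizontal polarisation, observed or simulated.
    tb_v_clear, tb_h_clear : the same quantities for the same scene with
                             cloud and rain removed (assumed to be available,
                             e.g. from a clear-sky radiative transfer run).
    """
    clear_diff = tb_v_clear - tb_h_clear
    if clear_diff <= 0.0:
        # Over a polarised sea surface the clear-sky difference is positive;
        # anything else makes the ratio meaningless.
        raise ValueError("clear-sky polarisation difference must be positive")
    return (tb_v - tb_h) / clear_diff


# Illustrative (made-up) brightness temperatures in kelvin.
clear_scene = p37(210.0, 150.0, 210.0, 150.0)    # no cloud: full polarisation
opaque_scene = p37(265.0, 265.0, 210.0, 150.0)   # opaque rain: no polarisation
partial_scene = p37(230.0, 200.0, 210.0, 150.0)  # half the clear-sky difference
print(clear_scene, opaque_scene, partial_scene)
```

Because the same function accepts either observed or simulated brightness temperatures, it yields an observed and a modelled cloud amount on an identical scale.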
The advantage of using P37 is that it can be calculated either from the observed or from the modelled brightness temperatures. This is an important point, for as Fig. 5a shows, binning FG departures by either observed or modelled cloud amount will result in a spurious sampling bias. For example, when heavy rain observations are selected, model error is likely to mean that some of the corresponding model points are clear sky instead. Hence, just from this sampling choice, the observations are likely to have more cloud, and higher TBs, than the model. In this case, departures will be large and positive.

Figure 5: (a) Mean and (b) standard deviation of global FG departures [K] binned against model cloud amount (red), observed cloud amount (green) and the mean of observed and modelled cloud amount (black). Sample is SSM/I channel 37v observations from 1st to 4th July 2009. Cloud amount is represented by the 37 GHz normalised polarisation difference (P37), which is 0 where cloud and rain are completely opaque at 37 GHz, and 1 in clear skies.

The way to avoid this sampling bias is to bin FG departures by the mean of observed and modelled cloud amount. Figure 5a shows that biases are very much smaller in this case. The use of mean cloud will be referred to as symmetric sampling. Traditional bias correction approaches use model values as a predictor, but clearly, if modelled cloud were used, a spurious sampling bias would occur. Hence, the symmetric cloud amount should be used instead.

A second very useful property of symmetric sampling, compared to using modelled or observed cloud amount, is that a far greater range of departure standard deviations can be identified (Fig. 5b). Standard deviations are smallest when the symmetric P37 is 1, i.e. both model and observations are clear, and largest around 0.6, where model and observation may be strongly disagreeing. Standard deviations are reduced as symmetric P37 tends to zero, where again the model and observation start to agree, i.e. both are rainy.

Based on a graph like Fig. 5b we can use the symmetric cloud amount to predict the total departure error variance (i.e. the sum of observation and first guess error variance). We use this in two ways. First, it can be used in QC, where departures larger than 2.5 times the expected total error can be discarded.
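The QC use of the symmetric error model can be sketched as follows. The piecewise-linear shape and every number in `ERROR_NODES` are placeholder assumptions chosen only to mimic the qualitative behaviour described for Fig. 5b (small error in clear skies, a peak near a symmetric P37 of 0.6, a reduction towards 0); only the factor of 2.5 comes from the text. All function names are hypothetical.

```python
# Hypothetical control points: (symmetric P37, total FG-departure std. dev. [K]).
# Placeholder values, not fitted to Fig. 5b.
ERROR_NODES = [(0.0, 8.0), (0.6, 18.0), (1.0, 2.0)]

QC_FACTOR = 2.5  # rejection threshold quoted in the text


def symmetric_p37(p37_obs, p37_model):
    """Mean of observed and modelled cloud amount (symmetric sampling)."""
    return 0.5 * (p37_obs + p37_model)


def total_error(p37_sym):
    """Predicted total departure error [K] by piecewise-linear interpolation.

    Values outside the node range are held constant at the end values,
    since P37 can fall slightly outside [0, 1].
    """
    if p37_sym <= ERROR_NODES[0][0]:
        return ERROR_NODES[0][1]
    for (x0, y0), (x1, y1) in zip(ERROR_NODES, ERROR_NODES[1:]):
        if p37_sym <= x1:
            return y0 + (y1 - y0) * (p37_sym - x0) / (x1 - x0)
    return ERROR_NODES[-1][1]


def passes_qc(fg_departure, p37_obs, p37_model):
    """Keep an observation if |departure| <= 2.5 x predicted total error."""
    sigma = total_error(symmetric_p37(p37_obs, p37_model))
    return abs(fg_departure) <= QC_FACTOR * sigma


# A 10 K departure where model and observation are both clear, and the same
# departure where the observation is cloudy but the model is clear.
print(passes_qc(10.0, 1.0, 1.0))
print(passes_qc(10.0, 0.2, 1.0))
```

With these placeholder numbers the 10 K departure is rejected when both model and observation are clear, because the predicted total error is small, but retained when they disagree about cloud, because the predicted total error is large. This is the sense in which a symmetric check can reject far fewer observations in regions of model-observation disagreement.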
Second, by assuming a certain fraction of this error belongs to the observation and a certain fraction to the background, we can use this to predict the observation error. Hence, we can vary observation error between clear and cloudy skies, as a function of the symmetric cloud amount (P37).

A final change for the new system treats the spatial scale of observations in a better way. Nonlinear radiative transfer and the beam-filling effect (e.g. Kummerow, 1998) mean that different degrees of spatial averaging change the behaviour of cloud and rain TBs. Without going into details here, it is clearly important that observation and model representativity scales should be comparable. For the next system, observations will be averaged onto the grid scale representing the final inner-loop minimisation, T255. This means that about 8 SSM/I (or 50 AMSR-E) observations are averaged to give one super-observation for assimilation. As a result, the inflation of observation error with distance can be removed.

With observation errors predicted by the symmetric total error model, and no error inflation with distance from the grid point, the observation errors are typically much smaller, and the observations in general now have a much greater impact on the analyses. Additionally, the use of the symmetric error model for QC means that far fewer observations are rejected, especially where model and observation disagree (observation errors will, however, be large in these areas).

5. Conclusion

Direct 4D-Var assimilation of all-sky microwave imager radiances has been operational at ECMWF since March 2009. This initial implementation is very cautious, assigning large observation errors and using QC to remove the most contentious observations. As a result, it only just matches the forecast performance of the previous split system, which used a 1D+4D-Var approach for cloudy and rainy skies and a direct 4D-Var approach for clear skies. However, the new system represents a starting point from which the influence of these observations can gradually be increased as further modelling and data assimilation improvements are made.

A revised all-sky system will be introduced in early 2010, with a complete change to the way observation errors, quality control and bias correction are handled. These will all now be based on a symmetric model of the total error of FG departures. This shows how total error varies as a function of cloud and rain amount, with small errors in clear skies and larger ones in heavy cloud and rain.
The symmetric approach uses the mean of observed and modelled cloud amount to predict the total error and the bias. Using either modelled or observed cloud amount on its own would result in a false sampling bias.

Using the symmetric error model allows the influence of the observations to be increased very substantially compared to the original all-sky implementation, but relatively large observation errors are still required in areas of cloud and rain. However, these large errors are thought to truly reflect the current situation. A number of experiments, not shown here for space reasons, examined this further. In 4D-Var, the full forecast model is part of the observation operator, but it is unable to simulate cloud and rain structures of the quality required and is subject to synoptically dependent systematic errors. Where observation errors in cloud and rain were reduced, forecast scores actually degraded, because the initial state was, erroneously, being corrected to adjust for the systematic model error.

The presence of synoptically dependent systematic errors in rain and cloud in the forecast model is probably now the main obstacle to improving the impact of cloud and rain assimilation. Options are either simply to improve the moist physics in the model, or to start using a weak-constraint or parameter estimation framework. Of these, the parameter estimation approach might be better, since it can be used to directly address the source of the problems, which is likely the moist physics parametrisations.

The other major data assimilation developments still required are: adding cloud to the control variable; and setting up the control variable transformations to avoid penalising moisture or cloud increases when they are truly required. It is also likely that background error correlation lengths and structures, which are currently optimised for clear sky, are inappropriate for cloudy and rainy areas, and should be revised.

Acknowledgement

Alan Geer's work at ECMWF was funded through a EUMETSAT research fellowship.

References

Bauer, P., P. Lopez, A. Benedetti, D. Salmond, and E. Moreau, 2006a: Implementation of 1D+4D-Var assimilation of precipitation-affected microwave radiances at ECMWF. I: 1D-Var. Quart. J. Roy. Meteorol. Soc., 132, 2277–2306.

Bauer, P., P. Lopez, D. Salmond, A. Benedetti, S. Saarinen, and E. Moreau, 2006b: Implementation of 1D+4D-Var assimilation of precipitation-affected microwave radiances at ECMWF. II: 4D-Var. Quart. J. Roy. Meteorol. Soc., 132, 2307–2332.

Geer, A. J., P. Bauer, and P. Lopez, 2008: Lessons learnt from the operational 1D+4D-Var assimilation of rain- and cloud-affected SSM/I observations at ECMWF. Quart. J. Roy. Meteorol. Soc., 134, 1513–1525.

Kelly, G. A., P. Bauer, A. J. Geer, P. Lopez, and J.-N. Thépaut, 2008: Impact of SSM/I observations related to moisture, clouds and precipitation on global NWP forecast skill. Mon. Weather Rev., 136, 2713–2726.

Kummerow, C., 1998: Beamfilling errors in passive microwave rainfall retrievals. J. Appl. Meteor., 37, 356–370.

Lopez, P. and E. Moreau, 2005: A convection scheme for data assimilation: Description and initial tests. Quart. J. Roy. Meteorol. Soc., 131, 409–436.

Marécal, V. and J.-F. Mahfouf, 2002: Four-dimensional variational assimilation of total column water vapour in rainy areas. Mon. Weather Rev., 130, 43–58.

Marécal, V. and J.-F. Mahfouf, 2003: Experiments on 4D-Var assimilation of rainfall data using an incremental formulation. Quart. J. Roy. Meteorol. Soc., 129, 3137–3160.

McNally, A., 2009: The direct assimilation of cloud-affected satellite infrared radiances in the ECMWF 4D-Var. Quart. J. Roy. Meteorol. Soc., 135, 1214–1229.

Pavelin, E. G., S. J. English, and J. R. Eyre, 2008: The assimilation of cloud-affected infrared satellite radiances for numerical weather prediction. Quart. J. Roy. Meteorol. Soc., 134, 737–749.

Petty, G., 1994: Physical retrievals of over-ocean rain rate from multichannel microwave imagery. Part I: Theoretical characteristics of the normalised polarisation and scattering indices. Meteorol. Atmos. Phys., 54, 79–99.

Tompkins, A. M. and M. Janisková, 2004: A cloud scheme for data assimilation: Description and initial tests. Quart. J. Roy. Meteorol. Soc., 130, 2495–2517.