Numerical weather prediction models are equations describing the evolution of the atmosphere, typically coded into a computer program.
Data assimilation provides many practical methods for bringing these observations into the models.
Because data assimilation developed out of the field of numerical weather prediction, it initially gained popularity within the geosciences.
In fact, one of the most cited publications in all of the geosciences is an application of data assimilation to reconstruct the observed history of the atmosphere.
[1] Classically, data assimilation has been applied to chaotic dynamical systems that are too difficult to predict using simple extrapolation methods.
The difference between the forecast and the observations at that time is called the departure or the innovation (as it provides new information to the data assimilation process).
A weighting factor is applied to the innovation to determine how much of a correction should be made to the forecast based on the new information from the observations.
The best estimate of the state of the system, obtained by correcting the forecast with the weighted innovation, is called the analysis.
Much of the work in data assimilation is focused on adequately estimating the appropriate weighting factor based on intricate knowledge of the errors in the system.
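As a minimal sketch of this cycle, assuming a single scalar variable and known error variances (all names and numbers here are illustrative, not drawn from any operational system):

```python
# One scalar assimilation cycle: forecast -> innovation -> weighted correction.
x_forecast = 12.0    # model forecast (e.g., temperature in deg C; illustrative)
y_observed = 14.0    # observation valid at the forecast time (illustrative)
innovation = y_observed - x_forecast   # the departure, or innovation

# Weighting factor from assumed error variances: the less certain the
# forecast is relative to the observation, the larger the correction.
var_forecast = 1.0   # assumed forecast error variance
var_obs = 0.5        # assumed observation error variance
weight = var_forecast / (var_forecast + var_obs)

x_analysis = x_forecast + weight * innovation   # the analysis
print(x_analysis)    # 12.0 + (2/3) * 2.0 = 13.33...
```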
The measurements are usually made of a real-world system, rather than of the model's incomplete representation of that system, and so a special function called the observation operator (usually depicted by h() for a nonlinear operator or H for its linearization) is needed to map the modeled variable to a form that can be directly compared with the observation.
From this perspective, the analysis step is an application of Bayes' theorem and the overall assimilation procedure is an example of recursive Bayesian estimation.
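In the linear-Gaussian case this Bayesian update has a closed form: the analysis equals the background plus the Kalman gain times the innovation. A minimal sketch, assuming a two-variable state, a linear observation operator H, and illustrative covariances B (background) and R (observation):

```python
import numpy as np

x_b = np.array([12.0, 5.0])    # background (forecast) state
B = np.array([[1.0, 0.3],      # assumed background error covariance
              [0.3, 1.0]])
y = np.array([13.5])           # a single observation of the first variable
H = np.array([[1.0, 0.0]])     # linear observation operator (state -> obs space)
R = np.array([[0.5]])          # assumed observation error covariance

innovation = y - H @ x_b                       # departure in observation space
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # Kalman gain: the weighting factor
x_a = x_b + K @ innovation                     # analysis (posterior mean)
```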
Many optimisation approaches exist, and all of them can be set up to update the model; for instance, evolutionary algorithms have proven effective because they make no assumptions about the cost function, but they are computationally expensive.
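As a sketch of one such derivative-free approach, SciPy's differential evolution optimiser can minimize a model-observation misfit; the toy linear model and all numbers below are assumptions for illustration only:

```python
import numpy as np
from scipy.optimize import differential_evolution

t = np.array([1.0, 2.0, 3.0])       # illustrative observation times
y_obs = np.array([2.1, 3.9, 6.2])   # illustrative observations

def misfit(params):
    a, b = params
    y_model = a * t + b             # toy model standing in for an NWP run
    return np.sum((y_obs - y_model) ** 2)

result = differential_evolution(misfit, bounds=[(-10, 10), (-10, 10)])
print(result.x)   # parameters that minimize the misfit, no gradients required
```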
On land, terrain maps available at resolutions down to 1 kilometer (0.6 mi) globally are used to help model atmospheric circulations within regions of rugged topography, in order to better depict features such as downslope winds, mountain waves and related cloudiness that affects incoming solar radiation.
[4] These observations are irregularly spaced, so they are processed by data assimilation and objective analysis methods, which perform quality control and obtain values at locations usable by the model's mathematical algorithms.
[11][12] Reconnaissance aircraft are also flown over the open oceans during the cold season into systems which cause significant uncertainty in forecast guidance, or are expected to be of high impact from three to seven days into the future over the downstream continent.
[14] Efforts to involve sea surface temperature in model initialization began in 1972 due to its role in modulating weather in higher latitudes of the Pacific.
Using a hydrostatic variation of Bjerknes's primitive equations,[16] Richardson produced by hand a 6-hour forecast for the state of the atmosphere over two points in central Europe, taking at least six weeks to do so.
[17] His forecast calculated that the change in surface pressure would be 145 millibars (4.3 inHg), an unrealistic value incorrect by two orders of magnitude.
The large error was caused by an imbalance in the pressure and wind velocity fields used as the initial conditions in his analysis,[16] indicating the need for a data assimilation scheme.
Originally "subjective analysis" had been used in which numerical weather prediction (NWP) forecasts had been adjusted by meteorologists using their operational expertise.
They introduce into the right-hand side of the model's dynamical equations a term proportional to the difference between the calculated meteorological variable and the observed value.
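A minimal sketch of this nudging (Newtonian relaxation) idea on a toy scalar model; the tendency f, the relaxation coefficient g, and all values are illustrative assumptions:

```python
def f(x):
    return -0.1 * x    # toy dynamics standing in for the model tendency

g = 0.5                # assumed nudging (relaxation) coefficient
x_obs = 8.0            # observed value toward which the state is relaxed
x, dt = 10.0, 0.1      # initial state and time step

for _ in range(100):
    # the extra term g * (x_obs - x) pulls the model toward the observation
    x += dt * (f(x) + g * (x_obs - x))
```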
However, this was (and remains) a difficult task, because the full version requires the solution of an enormous number of additional equations (of order N² ≈ 10¹², where N = Nx·Ny·Nz ≈ 10⁶ is the size of the state vector and Nx ≈ Ny ≈ Nz ≈ 100 are the dimensions of the computational grid).
A significant advantage of the variational approaches is that the meteorological fields satisfy the dynamical equations of the NWP model and at the same time minimize the functional characterizing their difference from the observations.
Fundamental questions also arise in the application of advanced DA techniques, such as whether the computational method converges to the global minimum of the functional to be minimised.
The 4DDA method which is currently most successful[19][20] is hybrid incremental 4D-Var, where an ensemble is used to augment the climatological background error covariances at the start of the data assimilation time window, but the background error covariances are evolved during the time window by a simplified version of the NWP forecast model.
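A sketch of the hybrid covariance idea, assuming the blend is a simple weighted sum of a static matrix and an ensemble sample covariance (the weight, ensemble size, and matrices are illustrative):

```python
import numpy as np

B_clim = np.eye(3)   # static climatological background error covariance (toy)
perturbations = np.random.default_rng(0).normal(size=(20, 3))  # 20-member ensemble
B_ens = np.cov(perturbations, rowvar=False)  # flow-dependent sample covariance

alpha = 0.5          # assumed blending weight
B_hybrid = alpha * B_clim + (1 - alpha) * B_ens
```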
A typical cost function would be the sum of the squared deviations of the analysis values from the observations weighted by the accuracy of the observations, plus the sum of the squared deviations of the forecast fields and the analyzed fields weighted by the accuracy of the forecast.
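In conventional notation (used here for illustration), with observations y and their error covariance R, background x_b and its error covariance B, and observation operator H, such a cost function reads:

```latex
J(\mathbf{x}) =
  (\mathbf{y} - H[\mathbf{x}])^{\mathrm{T}} \mathbf{R}^{-1} (\mathbf{y} - H[\mathbf{x}])
+ (\mathbf{x} - \mathbf{x}_b)^{\mathrm{T}} \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}_b)
```

The first term penalizes deviations of the analysis from the observations, weighted by the observation accuracy; the second penalizes deviations from the forecast, weighted by the forecast accuracy.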
Factors driving the rapid development of data assimilation methods for NWP models include:

In the 1980s and 1990s, data assimilation was used in several HAPEX (Hydrologic and Atmospheric Pilot Experiment) projects for monitoring energy transfers between the soil, vegetation and atmosphere.
For instance:
- HAPEX-MobilHy,[24]
- HAPEX-Sahel,[25]
- the "Alpilles-ReSeDA" (Remote Sensing Data Assimilation) experiment,[26][27] a European project in the FP4-ENV program[28] which took place in the Alpilles region, south-east of France (1996–97).
A flow-chart diagram excerpted from the final report of that project[23] shows how to infer variables of interest such as canopy state, radiative fluxes, environmental budget, and production in quantity and quality from remote sensing data and ancillary information.
[citation needed] Numerical forecast models are reaching ever higher resolution as computational power increases, with operational atmospheric models now running at horizontal resolutions on the order of 1 km (e.g., at the German national meteorological service, the Deutscher Wetterdienst (DWD), and at the Met Office in the UK).