Obstacles and benefits of the implementation of a reduced rank smoother with a high resolution model of the Atlantic ocean

Introduction Conclusions References

models is made possible provided relevant adaptations, such as reduced-order formulation (Parrish and Cohn, 1985;Todling and Cohn, 1994;Evensen, 1994;Fukumori andMalanotte-Rizzoli, 1995 andHoutekamer andMitchell, 1998).The filter provides an optimal estimate of the system state, given a model state (numerical), and past and present observations available up to the analysis time.Thus, it is well indicated to initialize a forecast, what is the historical purpose of data assimilation in geophysics.Though, oceanographic applications of data assimilation get increasingly diversified.In particular, climate studies require accurate reconstructions of the past ocean circulation, reanalyses, as performed for instance in the MyOcean project (http://www.myocean.eu.org/).Such reanalyses are expected to gain accuracy when the observations datasets are used on a four-dimensional way with data assimilation, i.e. when each observation has an influence on past, present and future states of the model solution.In the framework of the Kalman filter, the past influence of observations is introduced through retrospective analyses and leads to a smoothing formulation of the estimation problem.Optimal smoothers in estimation theory may be considered as a four-dimensional extension of the Kalman filter that takes into account future observations.
Several smoother approaches exist (see Cosme et al., 2011, for detailed descriptions).In this paper, we consider the sequential fixed-lag smoother (Cohn et al., 1994;Evensen and van Leeuwen, 2000), well designed for reanalysis purposes.More precisely, we use the reduced rank square root smoother developed by Cosme et al. (2010), which is based on the Singular Evolutive Extended Kalman (SEEK) filter (Pham et al., 1998;Brasseur and Verron, 2006).Actually, the MyOcean reanalysis system is based on the SEEK filter too.This reduced-rank smoother has been tested by Cosme et al. (2010) with a square-box configuration of a high resolution ocean circulation model.In this work, we investigate the application of the reduced-rank smoother with a more complex and realistic tropical Atlantic ocean circulation model in a ¼ • resolution configuration.We strive to identify obstacles and solutions to implement the smoother in such a realistic context.The gain of the smoother over the filter is assessed.Finally, Introduction

Conclusions References
Tables Figures

Back Close
Full to comply with operational constraints, we design here a new flavour of the sequential fixed-lag smoother algorithm, referred to as half-fixed basis, to overcome implementation issues.Our main goal is to determine whether this smoother can improve present reanalysis schemes.In Sect.2, we present the SEEK filter and the SEEK smoother formulations as detailed in Cosme et al. (2010).Some advantages and drawbacks linked to implementation are stressed.Section 3 summarizes the set-up of our twin experiments.The smoother is then implemented according to its theoretical formulation.In Sect. 4 we dwell upon a sensitive step of the implementation of a smoother: the parametrisation and the dynamical propagation of error covariances.In Sect. 5 we expose and examine the main results of a short smoother reanalysis.The filter reanalysis is used as a reference to point out the improvements of a four-dimensional extension of the data assimilation.To deal with implementation issues raised in Sect.4, a half-fixed basis smoother algorithm is developed and tested in Sect.6. Section 7 concludes and gives perspectives.
2 The reduced rank square-root filter and smoother algorithms The Kalman filter and smoother formulations can be found in Anderson and Moore (1979), Simon (2006) and Evensen (2007).For the smoothers, see also Cosme et al. (2011).Here we only provide an overview of the sequential algorithms and an intuitive interpretation of the equations.We also decline these algorithms in their square-root transformation so they can be implemented in large geophysical problems.Finally, we expose the modifications to introduce for the half-fixed basis smoother to be implemented.Introduction

Conclusions References
Tables Figures

Back Close
Full

Notations
Notations are similar to the ones used in Cosme et al. (2010).Superscripts refer to the type of states ("a" for analysed and "f" for forecast).Subscripts are used to specify the transition between two times (for instance k − 1,k is used for the transition matrix between times k −1 and k) or the time of a state and information it contains (for instance k|k − 1 indicates that the state is estimated at time t k , given all observations available up to time t k−1 ).Subscripts 0 represent the initial conditions.

The Kalman filter
The sequential form of the filter is given by a succession of two steps, Forecast and analysis, summarized in Eqs. ( 1)-( 7).Initial conditions must first be prescribed: x a 0 , an initial state, and P a 0 , an initial error covariance matrix.
Initialization x a 0 and P a 0 Forecast step (1)

Conclusions References
Tables Figures

Back Close
Full Observations at time t k are then used to perform the filter analysis step.First the innovation vector d k is computed (Eq.3), providing the difference between observations (y k ) and the model state (x f k|k−1 ) projected into the observation space with the observation operator H k .The innovation error covariance matrix G k is also defined (Eq.4), with R k the observation error matrix.G k is then used to compute the Kalman gain K k|k (Eq.5).Then, the update of the forecast state, x a k|k , is computed (Eq.6), balancing observations and the model estimate thanks to the Kalman gain.Statistics on the residual error, P a k|k , are also estimated through Eq. ( 7).At the end of the analysis step, the Kalman filter provides the best state estimate given all observations available up to time t k .
Analysis and forecast steps are performed successively from the first to the last observed date.

The sequential smoother
The smoother uses observations at a time t k to improve past estimates at times t i , with t i < t k .Thus, an analysed state at time t i is now noted x a i |k , meaning it contains all informations from observations available until time t k .Obviously, i is not unique, meaning observations at time t k can be used by the smoother to analyse several past states (at different times t i ).The set of time indices at which the retrospective analyses are produced is noted Σ k .In our configuration, Σ i = {k − L,...,k − 1}, corresponding to Introduction

Conclusions References
Tables Figures

Back Close
Full a fixed-lag smoother according to the nomenclature used in Cosme et al. (2010).With this configuration, the smoother performs retrospective analysis for the L states previous to time t k only (L being called the lag of the smoother).This can be interpreted as time localisation of the smoother, restricting the past influence of observations.It also limits the numerical storage of the smoother.
Forecast step Analysis step Equations of the optimal linear fixed-lag smoother are summarized in Eqs. ( 8)-( 14).To perform the retrospective analysis, the smoother needs the introduction of crosscovariances matrices, that is, matrices of covariances between state errors at two different times.These matrix are involved in the smoother gain (Eq.11), which is used for the smoother analysis (Eq.12).Thereby, the cross-covariance matrices enable the use of information at time t k to correct a past state at time t i .Note that in the forecast step, the Eq. ( 2) of the filter is splitted into two equations for the smoother (Eqs.9 and 10) to bring out the cross-covariances matrix.Introduction

Conclusions References
Tables Figures

Back Close
Full We now expose algorithms in a reduced-rank form so they can be applied to large geophysics systems.We use the Singular Evolutive Extended Kalman (SEEK) filter that has already been implemented with real systems (e.g.Verron et al., 1999;Testut et al., 2003;Brankart et al., 2003 andCastruccio et al., 2006).A synthesis on the SEEK filter can be found in Brasseur and Verron (2006) or Rozier et al. (2007).The main idea is to use a square-root decomposition of the error covariance matrix so it can be written as: , where S f is a n×n matrix, n being the length of the state vector.The filter equations are then reformulated including the square-root decomposition, as exposed in Eqs. ( 15)-(24).
To be computable at low cost, some assumption are introduced.First, the dimension of S f can be reduced assuming that errors only occur on a low dimension subspace, r, of the state space (with r n).In practice, these directions can be identified keeping the r first Empirical Orthogonal Functions (EOFs) computed from a time series of model states.When considering only the r main directions of the error, S f becomes a n × r matrix, the columns of which are often referred to as error modes.The term I + Γ in the smoother gain is then r × r, and easily inversible if r is small enough.Moreover, the propagation of error covariances (Eq.16) is more affordable (only r model iterations are needed instead of n).The model error term is now noted δ in this equation.
Another assumption is made on R, so it can be easily inversible in the Kalman gain equation: it is considered diagonal here.Note that other assumptions can be used (as presented in Testut et al., 2003, Brankart et al., 2003, or Brankart et al., 2009) to introduce observation error correlations while keeping the matrix diagonal.Initialization Forecast step

Conclusions References
Tables Figures

Back Close
Full Filter analysis step Smoother analysis step With the SEEK formulation of the filter, the sequential smoother implementation becomes straightforward.Only three extra equations are needed (Eqs. 22,23 and 24).The cross-covariance terms are here directly introduced in the smoother gain.The smoother analysis is performed using the smoother gain and the innovation vector computed from the filter.And finally, the smoother analysis covariances are computed.The smoother implementation does not require additional assumption nor significant extra CPU times, with respect to the filter.The only limitation lies in the storage of the smoother covariance matrices, but with the fixed-lag formulation, the number of restrospective analyses, and thus the number of covariance matrices to store for each observation, is limited.The smoother also presents the advantage of being applicable simultaneously to or after the filter (in this case, calculations of the gains must be performed again).Introduction

Conclusions References
Tables Figures

Back Close
Full

Model error and evolutive covariances
A smoother requires accurate cross-covariances between the filter forecast state and past estimates (Eq.22).Thus the filter must provide accurate forecast (and analysis, for the next steps) error modes.
Theoretically speaking, the forecast error modes result from the combination of the dynamical propagation of past analysis modes, which ensures the statistical connection through time, and a white-in-time model error, what makes the short-term crosstemporal decorrelation of modes occur (Eq.16).Hence, both elements (dynamical propagation and model error) play a key role in the behaviour of the smoother.In particular, it is theoretically obvious that a smoother must rely on an evolutive filter (with dynamical propagation of the modes), termed in this way in contrast to a static filter such as Ensemble Optimal Interpolation (Evensen, 2003).And it is self-evident that the model error must be accounted for accurately.
The model error parameterisation in Ensemble filters has long been and still remains a considerable issue, and is still an active topic of research (Houtekamer et al., 2009;Brankart et al., 2010).The covariance inflation approach (Li et al., 2009) is quite popular but inadequate to decorrelate error modes through time (Cosme et al., 2010).Recent efforts has been undertaken for a better, adaptive estimation of the forecast error covariances in the SEEK filter (Brankart et al., 2010(Brankart et al., , 2011) ) but further work is necessary to make them robust and applicable routinely.In order to disconnect the smoother and the model error issues here, we have decided to stick to a perfect model set-up and neglect the model error term δ.This obviously puts limitations to the scope of our experiments, as detailed in Sect. 4. This assumption leads to a new form of Eq. ( 16): The propagative term MS a implies r model iterations, r being the number of columns in S a .To be affordable at a reasonable cost, r must be kept quite small.A remaining problem lies in the divergence of the filter that can occur due to the order reduction, Introduction

Conclusions References
Tables Figures

Back Close
Full i.e. the error estimated by the filter (S f and S a ), can become inconsistent with the true error.The estimation of the errors through time can be controlled thanks to several parameters.This point will be discussed in Sect. 4 with the implementation of an evolutive filter.Finally, note that even if δ is neglected in the covariance propagation, retrospective influence of observations is artificially limited thanks to the fixed-lag form of the smoother.

A half-fixed basis smoother
The model error parameterisation issue, combined with the prohibitive cost of the modes propagation, often leads to use static filters, as various oceanographic operational centers have already turned to (Brasseur et al., 2005;Martin et al., 2007;Oke et al., 2008).It can be expected that these operational centers will have to adapt their assimilation schemes to perform the future reanalyses of the ocean circulation, as it has already been done (Ferry et al., 2010), and as it is the general case now for the atmospheric circulation.For this reason, and in spite of the theoretical concerns mentioned previously, it seems worth trying to design a smoother based on a static filter (minimizing the impacts of the violations in the theory) and testing it numerically.
In this configuration, only Eq. ( 25) needs to be rewritten in a static form: ). Kalman gains of the filter (Eq.18) and the smoother (Eq.22) are also affected, with the terms S f k|k−1 being replaced by S a 0 .Another term in Eq. ( 22) needs to be specify: the analysis error covariance matrix S a i |k−1 .This matrix can be considered as fixed (so the smoother would also be a fixed basis type), but a more theoretically sound approach consists in using the analysis covariance matrix computed from S a 0 with the filter at the appropriate time.This smoother scheme of smoother is referred to as half-fixed basis, because a part of the cross-covariance matrix is kept fixed (S f k|k−1 ) while the other part is time dependent (S a i |k−1 ).We will expose and discuss more in detail this smoother in Sect.6. Introduction

Conclusions References
Tables Figures

Back Close
Full

Twin experiments and assimilation settings
The smoother is implemented and assessed in the framework of twin experiments, but with the concern to remain close to a real problem (through model and simulated observations).The experimental settings are described in this section.The nomenclature of experiments is summarized in Table 2.

Model and configuration
The ocean circulation model NEMO (Nucleus for European Modelling of Ocean, Madec, 2008) is used in a tropical Atlantic configuration (from 61.5 • W to 15 • E, and 15 • S to 17.75 • N) and with a resolution of ¼ degree.This configuration, previously developed and used by Ubelmann et al. (2009), is named TATL4 (Tropical ATLantic ¼ degree).Boundary conditions are extracted from an model simulation spanning the 1958-2007 period (Dussin et al., 2009).An initial condition for TATL4 is extracted on 5 January 1995.Atmospheric forcings (Brodeau et al., 2010) are identical.An interannual simulation is then performed untill the end of 2005.This run is used as a reference for the twin experiment and is called REF.
Two main signals dominate the dynamics of the tropical Atlantic, as illustrated by Fig. 1.On the one hand, Tropical Instability Waves (TIWs) develop during summer and fall at 3 • N.They propagate from east to west and exhibit a strong signal on the sea surface temperature (Legeckis, 1977;Allen et al., 1995).On the other hand, eddies develop and circulate along the north-east coast of the Brazil.These eddies, know as Brazil Rings (Richardson and Walsh, 1986;Garzoli and Katz, 1993), are due to the north-Brazil current retroflection and exhibit a strong signal on velocity (from the surface to 200-500 m depth).Due to their propagative signatures, these two signals are interesting objects of studies for a four-dimensional assimilation experiment.Introduction

Conclusions References
Tables Figures

Back Close
Full

Perturbation
To perform a twin experiment, a perturbed ocean state is generated using the interannual variability.An initial condition from REF on 25 May 2003 is used to restart the model on 25 May 2005.This error on initial condition leads to a perturbed simulation called FALSE, computed until the end of 2005.The data assimilation experiment starts one month after the beginning of FALSE, so that the perturbed simulation can partially adjust with the forcings and damp spurious, high frequency oscillations.The goal of the present data assimilation experiment is then to correct FALSE so it becomes closer to REF.Note that even if a perturbation of the initial condition is not sufficient to assess the performance of the smoother for a long term reanalysis, it allows the careful study of the impact of the smoother analysis and the quality of corrected states, at least at the beginning of the experiment.

Simulated observations
Synthetic observations are extracted from REF following the procedure described by Cosme et al. (2010).Two sets of observations are generated: in situ data of temperature and salinity, and altimetric measurements.
In situ data mimic ARGO profilers network.Every two days, a set of vertical profiles is available, 6 • apart from each other.This pattern is shifted by 2 • from one assimilation step to the next.It results in a 2 • density network every 18 days, close to the average true ARGO network (one profile every 2 • and every 15 days).The simulated in situ data network is illustrated on Fig. 2, for one assimilation cycle (every 2 days) and the full coverage (after 18 days).Altimetric data mimic Jason-type satellite tracks.Every 2 days, a limited number of tracks of Sea Surface Height (SSH) are available.The periodicity of these tracks is 10 days.Figure 3 provides an example of SSH tracks available for one assimilation cycle (every 2 days) and shows the complete coverage obtained in 10 days of consecutive altimetric observations.Introduction

Conclusions References
Tables Figures

Back Close
Full We consider the mesurement error is null here: the observations are not perturbed and can be described as perfect observations.But the matrix R is not zero because of the truncation error.Its parametrisation is presented in Sect. 4.

Initial statistics
As the only source in the initial state error is connected to model's time variations, the statistics S

Localisation
A consequence of the order reduction is that statistics may not be able to faithfully descibe all the correlations, especially long-distance correlations (Houtekamer and Mitchell, 1998).To prevent the spurious influence of distant observations during an analysis step, the filter is used with a localisation scheme (Brankart et al., 2009).This method rules out long-distance corrections, so that only observations in the neighbourhood of an analysed point are used for the correction.The size of the neighbourhood is defined as 15 • zonal and 10 • meridional in length (illustrated by the black boxe on Fig. 1).It is chosen large enough so that at least one observation is available for each analysis.The zonal length is larger than the meridional length because the dynamics in tropical region is mostly zonal, so the correlations are more consistent in this direction.

Conclusions References
Tables Figures

Back Close
Full

Evolutive filter
We now present the results of the SEEK smoother in its original form, i.e. with evolving error modes, as presented in Cosme et al. (2010).For the reasons mentioned in Sect.2.5, we pay particular attention to the accurate estimation of the forecast error modes (defined by Eq. 25 in our configuration).It is well known that due to the truncature of S essentially, the error modes can collapse or become inconsistent with the true error (error on the mean) with time.The most common method to avoid this is to use covariance inflation.But for the reasons given in Sect.2.5, it is not our choice here.Instead, we chose to tune a few parameters controlable in the system: the number of error modes (columns in S), the size of the localisation domain, and the amplitude of the (diagonal) observation error matrix R. The adopted criterion of consistency between errors on the mean and errors estimated by the modes is that the RMS error computed with the mean, (with i = 1 : N the set of the longitudes, j = 1 : N the set of the latitudes, k = 1 : P the set of the vertical levels, x i ,j,k the mean state of the ocean and x ref i ,j,k the true state of the ocean, REF) must be of the same order of magnitude as the square root of the trace of the covariances matrix (tr(SS T )).
The first two parameters, though, are partly constrained by practical considerations.A high number of error modes allows an accurate representation of the error space, but requires more CPU times for modes propagation.Several sensitivity experiments, not detailed here, suggested that 39 was a fair number.The localisation parameters must also be prescribed optimally to avoid spurious corrections due to distant observations and no correction due to absence of close observations.Again, sensitivity experiments showed that having at least one observation for each analysed grid point was a fair choice.

Conclusions References
Tables Figures

Back Close
Full The remaining parameter to be specified is the matrix R, which represents the confidence to the observations and to the resulting analysis.This matrix includes the observation error (including the accuracy of mesurements), but not only.As mentioned in Sect.2.4, this matrix is considered diagonal.This assumption implies that the diagonal terms should be inflated to include the ignored non-diagonal terms.Moreover, the truncation error (due to the reduction order) should also be included in R.This error, as well as the non-diagonal terms of R, are not easy to quantify, and a wrong definition of the diagonal terms of R can lead to inconsistent error statistics with the real error.Precedent studies point out these aspects, as for instance Liu and Rabier (2002), Rabier (2006) or Oke et al. (2008).To parametrise correctly R, several filtering experiments are conducted, with different values for observations error as precised in Table 1.The experiments last 50 days, starting from 25 June considered as day 0 for the results.For each experiment, the true RMS error is compare to the mean estimate error of the filter (tr(SS T )).Results are illustrated with the temperature error levels on Fig. 4. The red curve represents the true error, and the full black curve, the estimated error.The error of FALSE is also shown with the black dotted line.To be consistent, the estimated error (black) should be close to the true error (red).
Results show that for the error statistics to be consistent, prescribed observation errors have to be large.Indeed, the best estimated error is obtained with the largest observations error (ErrObs04 in Table 1).This suggests that the truncature error included in R is ten times larger than a traditional mesurement error.On the other hand, the mean true error (in red) is poorly influenced by this parametrisation.Consequently, large observation errors avoid the collapse of error modes without affecting the quality of the mean state estimate.Also, as a (bad) counterpart of neglecting the model error, the filter converges to an asymptotic level of error, what makes the smoother assessment relevant in the first few weeks of the experiment only.The results are similar for all other assimilated variables (salinity and SSH, not shown).The filter is less efficient for velocities, but still the estimated error on these variables is better with high observation errors, as for the assimilated variables.

Conclusions References
Tables Figures

Back Close
Full The parametrisation of an evolutive filter is difficult.The definition of the error covariance matrix and the control of its time evolution is tricky, and may lead to consider unusual values for certain parameters (such as the observations error in our case).This would be especially true with a more complex system where error sources would not only be linked to the initial condition, and it certainly is a major obstacle for the smoother implementation.A strategy to avoid this problem will be exposed in Sect.6, but first we discuss the smoother effects with the previously defined evolutive filter in the next section.

Smoother's impact
The smoother is tuned using the parameters defined for the SEEK filter in the previous section.The fixed-lag parameter, i.e. the temporal localisation of influence of the observations, is set to 10 days (5 retrospective assimilations).We verified that extending the lag to more than 10 days does not improve the smoother results.This is due to the decrease in amplitude of the cross-covariance matrix S a S f T with time, either because S a and S f T become orthogonal due to the model dynamics working on the latter, or because of the continuous decrease of the former due to the successive retrospective analyses.

Smoothing improvements on error statistics
The smoother impact on the mean error levels is shown on Fig. 5, for two assimilated variables (SSH and temperature) and a non-assimilated variable (zonal velocity).As expected, the error on SSH or velocity is reduced with the smoother, though to a limited extent, due to relatively small initial errors in SSH.In the Tropics, where the large scale dynamics are not dominated by geostrophy, variations in SSH are low in amplitude, therefore hard to control with data assimilation (Ubelmann et al., 2009).The small improvements on velocity is due to the fact that corrections on this variable mostly Introduction

Conclusions References
Tables Figures

Back Close
Full occur near the surface, and errors are computed over the 3 spatial dimensions.The signature of the smoother impact is then mitigate by the 3-D signal.Results are more contrasted on the thermodynamical variables: temperature, and salinity.Compared to the filter analysis, the mean error level of the smoothed states is 10 % to 15 % lower at the beginning of the experiment (before day 10).After day 10, the smoother effects tend to decrease, concomitantly with the filter corrections.This behaviour is satisfactory, and directly related to the settings of our experiment, where the errors are introduced in the initial state only.In such settings, the filter converges with time toward an asymptotic error level, and the innovations tend to be zero.Like the filter correction, the smoother corrections are based on the filter innovation (Eq.19).If the innovation is null, whatever the reason is, the smoother will not bring any correction.A smoother is not meant to counterbalance spurious tuning of the filter, but only to make optimal (4-D) use of observations especially when they are sparse in nature, space, and time.In the case of a reanalysis, the filter is expected to have a significant impact during all the experiment, along with the smoother.As we are interested in the smoother effects here, we now focus on the first part of the experiment, when the smoothed error level is significantly lower than the filtered error level.

The smoother's dynamics estimation
The smoother is able to reduce the error level of the filter analysis.We now examine its capacity to estimate dynamically balanced states of the ocean circulation.Indeed, a Kalman filter analysis estimate results from a trade-off between dynamics (from the background estimate) and statistics (from the correction term).But it is still supposed to represent the state of a dynamical system and as such, must be dynamically balanced, that is, respectful of the dynamics represented by the system's model.A smoother estimate must be, at least in theory, closer to the truth than the filter esti- to the filtered states), we perform a simple test: analysed states (from the filter or the smoother at day 2, i.e. the first analysed date) are used to initialize a free run of the model (respectively FreeFilt and FreeSmooth in Table 2).The implicit idea is that if a state is dynamically inconsistent, this should be reflected on the error evolution, as illustrated by Rozier et al. (2007).
Results for error levels in SSH, temperature and zonal velocity are shown on Fig. 6.First, the error level at day 10 is the same for the filtered run (full black line) and the FreeSmooth (dotted red line), for all variables, and this is a notable point.Indeed, the same observations are used in both cases, but the filter assimilates them sequentially in time while the smoother assimilates them on a same date (here, day 2).It means that the smoother could extract the same information from observations, but to correct a same state.More generally, the evolution of error levels (on both variables) for FreeSmooth is better during the first 50 days.Beyond, errors tend to converge to the same level (though more slowly for the temperature), to become close to the FALSE error level.These results suggest that the smoother corrections are consistent with the dynamics, i.e. the reduction of the error level has a real signification on a dynamical point of view and does not introduce significant instabilities.
A region particularly well-indicated to illustrate this point is the north Brazil zone.Because of its chaotic dynamics, the initial condition strongly affects the subsequent circulation in this region (especially on the Brazil rings).Figure 7 shows the evolution of the error (in velocity) between the free runs and the reference along a transect (between 44 • W and 52 • W) at 5 • N. Figure 8 shows snapshots of the absolute velocity around this same transect, for different times.The alternation between positive and negative anomalies (Fig. 7) is typical of phase shifts affecting the propagative structures, here the Brazil rings.Until day 50, this phase shift is stronger in FreeFilt.If the rings seem quite similar in the filtered state and the smoothed state at day 0 (Fig. 8), they clearly show different evolutions.Rings in FreeSmooth are much close to the rings in REF.For instance, a ring is correctly formed at day 20 or 30 for FreeSmooth, but this same ring is not completely untied from the north-Brazil current with FreeFilt.Thus, even if not Introduction

Conclusions References
Tables Figures

Back Close
Full directly visible in the analysed states (day 0), the dynamics is obviously more coherent with REF in the smoothed state.Still, error levels on the free runs (Fig. 7) and snapshots (Fig. 8) after day 50 show that the dynamics in FreeSmooth tends to become also unphased with REF (though the dynamics is not exactly the same as in FreeFilt).Given all these considerations, it is obvious that the smoothed solution leads to a better estimation of the dynamics, compare to a filtered solution at a same date.The four-dimensional use of the information from the observations has a real impact on analysed states.If the parametrisation linked to the filter is optimal, then the implementation of a smoother gives a better estimated solution, on both statistical and dynamical point of view.In our experiment though, these results are limited to the beginning of the experiment due to the settings of the problem.Despite its benefits, a remaining obstacle to implement a smoother is the parametrisation of the evolutive filter.Even if this formulation is not fully consistent with the smoother theory, we will now investigate the possibility and the efficiency of retrospective analyses in a context of fixed-basis formulation of the filter.These results are exposed in the next section.

Smoother based on a static filter
We present in this section a new formulation for a smoother, based on a static filter, a so-called optimal interpolation scheme (noted OI).As introduced in Sect.2.6, this formulation avoids the problems linked to the evolutive filter (cost and parametrisation), and is close to present systems of operational centers.The OI is implemented with the same parametrisation as the filter, except that Eq. ( 25) is now replaced by the following equation: An hypothesis inherent to the implemention of an optimal interpolation scheme is that the analyses times are distant enough to make the residual errors decorrelate from one step to the next.To implement the OI scheme with our configuration, we define a new observation network with a 10-day frequency.Observations themselves are the same (i.e. they are extracted from REF every 2 days), but are gathered together at the nearest analysis days, and assimilated all together these days.For instance, observations between day 5 and 15 are gathered and assimilated at day 10.Next assimilation step occurs at day 20, with all observations available between day 15 and 25.Day 10 is now the date of the first assimilation step.
Figure 9 presents a diagram of the smoother implementation.At the filter analysis time k, the analysis covariance matrix S a k is computed from the background S a 0 , simultaneously to the analysis state.This is not done usually with an OI scheme.The retrospective analysis at time k using the observation at a time k + 10 is performed (corresponding to a 10 day interval), using the cross-covariances matrix S a k S a 0 T .Thus, the smoother uses informations on residual errors at time k, combined with S a 0 instead of S f k+10|k .If S a 0 is fixed, the matrix S a k depends on the time of the analysis (this matrix is computed every 10 days after an OI analysis step).This form of smoother can be then considered as half-fixed.The large red circle and the blue plus sign on Fig. 9 represent the smoother analysis with a 10 days retrospective analysis.Now, it is also possible to perform smoother analyses at intermediate times, every 2 days for instance.For each of such analyses, the background state is provided by the model simulation during the OI process.The background error modes must be defined appropriately, based on already existing bases.In the following experiments, the modes at times k + 2,...,k + 8 are simply prescribed with S a k , but other sound solutions are possible, such as a linear combination of S

Conclusions References
Tables Figures

Back Close
Full

Smoother's impact on the optimal interpolation solution
Results on RMS error levels for the optimal interpolation and the smoother based on OI are shown on Fig. 10.The smoother impact (red circles) is visible on each variable, assimilated or not.First, the evolution of error levels for the smoothed solution is continous.Discontinuities inherent to sequential assimilation solutions (visible here every 10 days, during OI analysis) are smoothed out.Also, even with a very simple configuration of cross-covariance matrices, the smoothed solution exhibits lower error levels than the filtered one, except for some points (between day 20 and 30 for the velocity).
It is finally worth noticing that this smoothing scheme is almost cost-free and could be applied with current assimilation systems in operational oceanographic centers.
Note that a 2-days configuration of a smoother can also be used with a 10-days frequency evolutive filter.The smoother needs the propagation of error statistics between the beginning of an interval and the day of the retrospective assimilation.This scheme is then similar to the asynchronous Kalman filter described by Hunt et al. (2004).

Dynamics with a half-fixed basis smoother
As in the previous section, we here examine the dynamical balance of the smoother solution.Free runs are initialized at day 10 with the analysed states from OI (FreeOI) or the smoother based on OI (FreeSmoothOI).The evolution of error levels are shown on Fig. 11.Even if initial error levels for OI and the smoother are close (especially on temperature and velocity), the behaviours of both free runs are significantly different.The error level of FreeOI tends to increase (after day 50) for each variable.On the contrary, for each variable, the error level of FreeSmoothOI tends to decrease continously.This is blatant with velocity.Even after day 50, the error level keeps decreasing, suggesting that the dynamics stay consistent and close to REF.

Conclusions References
Tables Figures

Back Close
Full This example shows once again that the improvement of the solution due to the smoother is not only statistical, but has a real dynamical meaning.The smoother can be used with a non-optimal system (fixed-basis) and still exhibits significant improvements, both on the average error level and on the dynamics of the solution.Still, one should keep in mind that the parametrisation of the OI here is not optimal, since it is based on the same parametrisation of an evolutive filter.Moreover, as already discussed, many possibilities can be considered to use a smoother scheme based on OI.We choosed a simple configuration here, but more sophisticated ways to define the cross-covariance error matrix could improve the results.These results also suggest that a smoother scheme could be introduced in current operational centers systems, opening new perspectives for upcoming reanalyses of the ocean circulation.

Conclusions
The reduced-rank sequential smoother (Cosme et al., 2010) based on the SEEK filter has been further developed and implemented in a real ocean circulation model.The main purpose was to study the obstacles to implementing such an assimilation scheme in a real reanalysis system, and to study effects of the smoother on the estimation of the corrected states.The smoother was first implemented as described in its seminal formulation, i.e. with an evolutive filter.In a second step, a simplified scheme was introduced, based on optimal interpolation as many of operational centers use, and its benefits were exposed.A simple twin experiment was set up for this study, based on a perturbation of the initial condition.This configuration limited the benefits of the analysis throughout time, but a focus on the first assimilation steps enabled us to draw relevant conclusions about the smoother behaviour and benefits.
Technically, the implementation of the smoother algorithm is rather straightforward (when a filter algorithm is already implemented).It requires only a few extra calculation compared to the filter, and can be performed simultaneously to or after a filter pass.The smoother can then be considered as an additional layer to the filter algorithm.

Conclusions References
Tables Figures

Back Close
Full A major obstacle to implementing a smoother lies in the parametrisation of an evolutive filter, theoretically needed to define accurately the cross-crovariance error matrices.We used a criterion based on mean error statistics to specify the parameters associated to the filter.We found that, for the mean error statistics of the filter to be consistent with the true error, we had to prescribe large observation errors.Moreover, the filter efficiency was mostly significant during the first 10 days of the experiment.To preserve optimality over a longer period, the filter should be used with adaptative parametrisations (Brankart et al., 2010).
To circumvent this issue, we considered implementing a smoother based on a simpler system of optimal interpolation.To do so, we defined a new algorithm of half-fixed basis smoother.The advantage of such formulation is that it could be implemented almost as is with current assimilation systems of operational centers.This smoother was tested with a 10-day frequency observation network, but we also pointed out the ability of the smoother to provide a solution with a higher frequency (2 days in this case).
The smoother efficiency was studied in both cases: evolutive and fixed error modes.
Overall, the smoother was able to improve the results compare to a 3D assimilation schemes, filter or optimal interpolation.Error levels were reduced with the retrospective assimilation of observations.The smoother impact was mostly significant during the first 10 days of the experiment, just like the filter, underlying the fact that the smoother efficiency is strongly related to the parametrisation of the filter (or the optimal interpolation).
We also could verify that the smoothed solution was dynamically consistent, and in better dynamical adequacy with the reference than the filtered solution.Thus, the corrections introduced with the smoother are not only statistical but also have a dynamical consistency.This point was surprisingly true in the case of the half-fixed basis smoother, what is encouraging in the context of using the smoother to improve the quality of reanalyses.
We suggest that the smoother could be performed with actual reanalysis systems.Indeed, the algorithm could be implemented easily and with a negligible cost, and the Figures smoother has shown encouraging results with an optimal interpolation system.In other terms, there is no reason to not use it.Introduction

Conclusions References
Tables Figures

Back Close
Full  FreeSmooth Run initialized on day 2 of the data assimilation experiment, with the smoothed state.
FreeOI Run initialized on day 10 of the data assimilation experiment, with the state analysed with the optimal interpolation.
FreeSmoothOI Run initialized on day 10 of the data assimilation experiment, with the smoothed state basis on optimal interpolation.Introduction

Conclusions References
Tables Figures

Back Close
Full 17.75 • N) and with a resolution of ¼ degree.This con-assimilation experiment is then comes closer to REF.Note that e initial condition is not sufficient the smoother for a long term rea study of the impact of the smoot of corrected states, at least at th ment.

Simulated observations
Synthetic observations are extrac procedure described by Cosme observations are generated: in s salinity, and altimetric measurem In situ data mimic ARGO pr days, a set of vertical profiles is a other.This pattern is shifted by 2 to the next.It results in a 2 • den close to the average true ARGO 2 • and every 15 days).The sim is illustrated on figure 2, for one   neighbourhood of an analysed point are used for the correction.The size of the neighbourhood is defined as 15 • zonal and 10 • meridional in length (illustrated by the black boxe on figure1).It is chosen large enough so that at least one observation is available for each analysis.The zonal length is larger than the meridional length because the dynamics in tropical region is mostly zonal, so the correlations are more consistent in this direction.

Evolutive filter
We now present the results of the SEEK smoother in its original form, i.e. with evolving error modes, as presented in Cosme et al. (2010).For the reasons mentioned in section flation.But for the reasons given in section 2.5, it is not our choice here.Instead, we chose to tune a few parameters controlable in the system: the number of error modes (columns in S), the size of the localisation domain, and the amplitude of the (diagonal) observation error matrix R. The adopted criterion of consistency between errors on the mean and errors estimated by the modes is that the RMS error computed with the mean,  aspects, as for instance Liu and Rabier (2002), Rabier (2006) or Oke et al. (2008).To parametrise correctly R, several filtering experiments are conducted, with different values for observations error as precised in table 4. The experiments last 50 days, starting from june,25 considered as day 0 for the results.For each experiment, the true RMS error is compare to the mean estimate error of the filter (tr(SS T )).Results are    the filter is optimal, then the implementation of a smoother ives a better estimated solution, on both statistical and dyamical point of view.In our experiment though, these reults are limited to the beginning of the experiment due to e settings of the problem.Despite its benefits, a remaining obstacle to implement a moother is the parametrisation of the evolutive filter.Even this formulation is not fully consistent with the smoother they are extracted from REF every 2 days), but are gathered together at the nearest analysis days, and assimilated all together these days.For instance, observations between day 5 and 15 are gathered and assimilated at day 10.Next assimilation step occurs at day 20, with all observations available between day 15 and 25.Day 10 is now the date of the first assimilation step.
Figure 9 presents a diagram of the smoother implemen- SSH temperature smoother impact (red circles) is visible on each variable, assimilated or not.First, the evolution of error levels for the smoothed solution is continous.Discontinuities inherent to sequential assimilation solutions (visible here every 10 days, during OI analysis) are smoothed out.Also, even with a very simple configuration of cross-covariance matrices, the smoothed solution exhibits lower error levels than the filtered one, except for some points (between day 20 and 30 for the velocity).It is finally worth noticing that this smoothing scheme is almost cost-free and could be applied with current 6.1 Smoother's impact on the optimal interpolation so-smoother impact (red circles) is visible on each variable, similated or not.First, the evolution of error levels for smoothed solution is continous.Discontinuities inheren sequential assimilation solutions (visible here every 10 d during OI analysis) are smoothed out.Also, even wi very simple configuration of cross-covariance matrices, smoothed solution exhibits lower error levels than the tered one, except for some points (between day 20 and 30 the velocity).It is finally worth noticing that this smooth scheme is almost cost-free and could be applied with cur assimilation systems in operational oceanographic center Note that a 2-days configuration of a smoother can also used with a 10-days frequency evolutive filter.The smoo needs the propagation of error statistics between the be ning of an interval and the day of the retrospective ass ilation.This scheme is then similar to the asynchron Kalman filter described by Hunt et al. (2004).

Dynamics with a half-fixed basis smoother
As in the previous section, we here examine the dynam balance of the smoother solution.Free runs are initial at day 10 with the analysed states from OI (FreeOI) or smoother based on OI (FreeSmoothOI).The evolution of ror levels are shown on figure 11.Even if initial error le for OI and the smoother are close (especially on tempera and velocity), the behaviours of both free runs are sig cantly different.The error level of FreeOI tends to incre (after day 50) for each variable.On the contrary, for e variable, the error level of FreeSmoothOI tends to decre continously.This is blatant with velocity.Even after 50, the error level keeps decreasing, suggesting that the namics stay consistent and close to REF. Figure 12 illustr

Conclusions
The reduced-rank sequential smoother (Cosme et al., 2010) based on the SEEK filter has been further developed and im plemented in a real ocean circulation model.The main pur pose was to study the obstacles to implementing such an as similation scheme in a real reanalysis system, and to study effects of the smoother on the estimation of the corrected states.The smoother was first implemented as described in its seminal formulation, i.e. with an evolutive filter.In a second step, a simplified scheme was introduced, based on optimal interpolation as many of operational centers use, and its benefits were exposed.A simple twin experiment was se up for this study, based on a perturbation of the initial condition.This configuration limited the benefits of the analysis throughout time, but a focus on the first assimilation steps enabled us to draw relevant conclusions about the smoother behaviour and benefits.
Technically, the implementation of the smoother algorithm is rather straightforward (when a filter algorithm is already implemented).It requires only a few extra calculation com pared to the filter, and can be performed simultaneously to o after a filter pass.The smoother can then be considered as an additional layer to the filter algorithm.
A major obstacle to implementing a smoother lies in the parametrisation of an evolutive filter, theoretically needed to define accurately the cross-crovariance error matrices.We used a criterion based on mean error statistics to specify the parameters associated to the filter.We found that, for the mean error statistics of the filter to be consistent with the true error, we had to prescribe large observation errors.Moreover the filter efficiency was mostly significant during the first 10 days of the experiment.To preserve optimality over a longer period, the filter should be used with adaptative parametrisa tions (Brankart et al., 2010).• C) and the zonal velocity (m s −1 ), for the optimal interpolation (full black line) and the free runs restarted with: the corrected state at day 10 with OI (full red line), the smoothed state based on OI at day 10 (dotted red line).The RMS error for FALSE is also indicate with the dotted black line.
assimilation methods for geophysics have evolved continuously since their origins in the 70's.In the branch of estimation theory, the Kalman filter (Kalman, 1960) has been widely used in oceanography.Its implementation with large numerical Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | The filter performs a forecast step by propagating x a k−1|k−1 with the dynamical linear model operator M k−1,k (Eq.1), leading to the forecast state x f k|k−1 at time t k .Error covariance matrix P a k−1|k−1 is also propagated with the model operator (equation 2) and a model error Q k−1,k is added to take into account model uncertainties and approximations.Thus, at the end of the forecast step (at time t k ), an estimate of the ocean state x f k|k−1 and error statistics P f k|k−1 associated to this state are provided by the filter, given all observations available up to time t k−1 .
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | initial error covariance are calculated based on model's time variability.270 EOFs are computed from REF, using ocean states in Summer and Fall between 1995 and 2000.Winter and Spring are not considered because the experiment spans the Summer months only and because TIWs or Brazil rings signatures are mostly present in Summer and Fall.The first 39 EOFs, that represent about 95 % of the total variance of the 270 initial EOFs, are selected to form the initial error modes.
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | mate.But it has also undergone a series of statistically-based corrections that move this dynamics/statistics trade-off toward statistics, what may further alters its dynamical balance.To check the representation of the dynamics in the smoothed states (compare Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | was not tested here.These intermediate smoother analyses are represented with red squares and blue crosses on Fig. 9.
Discussion Paper | Discussion Paper | Discussion Paper | Figure 12 illustrates this point in the north-Brazil current region.During the 40 first days, the Brazil rings are quite similar between FreeOI or FreeSmoothOI.But at day 80, the rings and the branch of the north-Brazil current's retroflection are clearly better simulated in FreeSmoothOI.
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

Fig. 1 .
Fig. 1.TATL4 configuration: sea surface temperature is colored and exhibits the development of TIWs in the center of the bassin; black contours indicate the 0.5m.s−1 iso-velocity at 30m depth, exhibiting the north Brazil current and the rings formation.The black square boxe illustrates the influence zone of an observation (see section3.5).

Fig. 1 .Fig. 2 .Fig. 3 .
Fig. 1.TATL4 configuration: sea surface temperature is colored and exhibits the development of TIWs in the center of the bassin; black contours indicate the 0.5 m s −1 iso-velocity at 30 m depth, exhibiting the north Brazil current and the rings formation.The black square boxe illustrates the influence zone of an observation (see Sect. 3.5).

Fig. 2 .
Fig. 2. Horizontal distribution of in situ observations network simulated for the experiment.Left: observations available for one assimilation step (here: 30 June 2005).Right: observations coverage after 18 days.

Fig. 4 .Fig. 5 .
Fig. 4. Influence of temperature observation error on the error statistics.Red curve represents the true RMS error on temperature and black curve the error estimated with the filter (y axis: day of the experiment, x-axis: RMS error, in • C).The black dotted curve shows the RMS error of FALSE.Each figure corresponds to a different parametrisation of temperature error in R.Values are specified in Table1.

Fig. 5 .
Fig. 5. RMS error for the SSH (m), the temperature (• C) and the zonal velocity (m s −1 ), for the filter (full black line) and the smoother (circles).The trajectory of the run FALSE is also represented with the dotted line.

Fig. 6 .
Fig.6.RMS error for the SSH (m), the temperature ( • C) and the zonal velocity (m.s −1 ), for the filtered run (full black line) and the free runs restarted with: the filtered state at day 2 (full red line), the smoothed state at day 2 (dotted red line).The RMS error for FALSE is also shown with the dotted black line.

Fig. 6 .
Fig.6.RMS error for the SSH (m), the temperature ( • C) and the zonal velocity (m s −1 ), for the filtered run (full black line) and the free runs restarted with: the filtered state at day 2 (full red line), the smoothed state at day 2 (dotted red line).The RMS error for FALSE is also shown with the dotted black line.

Fig. 7 .
Fig. 7. Evolution of absolute error on velocity (in m s−1 ), between REF and FreeFilt (left) and FreeSmooth (right).The error is calculated along a section localized at 5• N, and between 40 • W and 50 • W.

Fig. 9 .Fig. 10 .
Fig. 9. Schematic representation of the half-fixed basis smoother with a 10-days frequency observations network.Analysis covariance matrix (in green) are computed after each optimal interpolation analysis (at time k and k +10).The cross-covariance matrix used for the smoother analysis between date k and k + 8 (respectively k + 10 and k + 20) is indicated in red (respectively blue).Smoothed states are indicated with red or blue symbols (see Sect. 6 for more explanations).

Fig. 10 .Fig. 11 .
Fig. 10.RMS error for the SSH (m), the temperature ( • C) and the zonal velocity (m s −1 ), for the optimal interpolation (full black line) and the smoother (red circles).

Fig. 11 .
Fig. 11.RMS error for the SSH (m), the temperature (• C) and the zonal velocity (m s −1 ), for the optimal interpolation (full black line) and the free runs restarted with: the corrected state at day 10 with OI (full red line), the smoothed state based on OI at day 10 (dotted red line).The RMS error for FALSE is also indicate with the dotted black line.

Table 2 .
Nomenclature of the different runs.

Table 4 .
Observation errors of SSH, temperature and salinity for sensitivity experiment ErrObs01 to ErrObs04.