Wednesday, July 30, 2014

Uncertainty and Information Theory

We are all persuaded that uncertainty is a big topic, in life and also in hydrology. It is so important that many hydrologists dedicate their careers to estimating it in connection with hydrological processes. Uncertainty, being itself uncertain, also generates confusion, and some of its literature is confused and confusing (I don't want to cite anyone negatively, but I could).
Whatever the case, one of the best talks I attended at the last Fall Meeting of the American Geophysical Union was the invited lecture by Hoshin Gupta. Hoshin has an outstanding (really outstanding, I mean) career in calibration methods, parameter identifiability, and understanding uncertainty in models. Recently (see for instance Gong et al., 2013) he started to apply concepts derived from information theory to hydrology. BTW, you can find the pdfs of his AGU presentations here: one on the necessity of applying information theory concepts to evaluate models' structural hypotheses, and another on information theory and Bayesian inference in hydrology (both with a lot of citations).

I never really understood why hydrologists do not use information theory concepts. I-Theory is a well-developed mathematical theory with a lot of tools, and it could help to dispel the fuzziness around the determination of uncertainty in models. Besides, using the I-Theory concepts of information and uncertainty, one can gain knowledge about the complexity of process outputs and, possibly, infer something about the "complexity" of the models required to account for them mathematically in a proper way (remember: "Everything should be made as simple as possible, but not simpler").

Hoshin is not the only one attracted by information theory. In my occasional browsing of the topic, I also found some other interesting papers. The first one, by Majda and Gershgorin, is concerned with climate models. This is encouraging, because climate models are certainly at least as involved as hydrological models, if not more. A second is Weijs et al. (2013), which is concerned with time series: we compare time series all the time, so knowing how much information is hidden in a time series (at least with reference to some encoding key) is certainly useful. For Weijs and van de Giesen, this paper is a return to the topic (see also Weijs et al., 2010, and Weijs's CV).
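To make the idea of "information hidden in a time series" a bit more concrete, here is a minimal sketch of mine (not taken from Weijs et al.). It quantizes a series to at most 256 levels, estimates the Shannon entropy of the resulting symbols, and compares it with the bits per sample that a general-purpose compressor (zlib) actually needs. The function names, the 8-bit quantization and the choice of zlib are all my assumptions, made only for illustration.

```python
import math
import zlib
from collections import Counter


def quantize(series, n_levels=256):
    """Map real values to integer symbols 0..n_levels-1 (the 'encoding key')."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    scale = (n_levels - 1) / (hi - lo)
    return [int(round((x - lo) * scale)) for x in series]


def shannon_entropy_bits(symbols):
    """Zeroth-order entropy in bits per symbol; ignores temporal dependence."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compressed_bits_per_sample(symbols):
    """Bits per sample after zlib compression; symbols must lie in 0..255."""
    if not symbols:
        raise ValueError("empty series")
    raw = bytes(symbols)
    return 8.0 * len(zlib.compress(raw, 9)) / len(raw)


# Illustrative synthetic signal, not real hydrological data.
signal = [math.sin(0.1 * t) for t in range(2000)]
symbols = quantize(signal)
print(shannon_entropy_bits(symbols), compressed_bits_per_sample(symbols))
```

The entropy here looks only at the histogram of values, while the compressor can also exploit repetition in time, so the two numbers answer slightly different questions about the same series.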

Another paper came from Ruddell (GS) in Eos, remarkably highlighting that applications of I-Theory to hydrology attracted many more people last year than they used to.
To feel among the smarter ones, I bought a book by Mézard (see also, and GS) and Montanari (Andrea, not our colleague Alberto, who also has quite a production on uncertainty: please see his website), which can be a further source of ideas and thoughts.

So far, I have not read any of the papers (or the book) carefully, but I am excited at the idea of having the time to do so in depth.

References

Gong, W., H. V. Gupta, D. Yang, K. Sricharan, and A. O. Hero III (2013), Estimating epistemic and aleatory uncertainties during hydrologic modeling: An information theoretic approach, Water Resour. Res., 49, 2253–2273, doi:10.1002/wrcr.20161.

Mézard, M., and Montanari, A., Information, Physics, and Computation, Oxford University Press, 2009

Majda, A. J., and Gershgorin, B., Quantifying uncertainty in climate change science through empirical information theory, PNAS, August 24, 2010, vol. 107, no. 34, 14958–14963

Ruddell, B. L., N. A. Brunsell, and P. C. Stoy, Applying Information Theory in the Geosciences to Quantify Process Uncertainty, Feedback, Scale, Eos, Vol. 94, No. 5, 29 January 2013

Weijs, S. V., Schoups, G., and van de Giesen, N., Why hydrological predictions should be evaluated using information theory, Hydrol. Earth Syst. Sci., 14, 2545–2558, 2010, www.hydrol-earth-syst-sci.net/14/2545/2010/, doi:10.5194/hess-14-2545-2010

Weijs, S. V., van de Giesen, N., and Parlange, M. B., Data compression to define information content of hydrological time series, Hydrol. Earth Syst. Sci., 17, 3171–3187, 2013, www.hydrol-earth-syst-sci.net/17/3171/2013/, doi:10.5194/hess-17-3171-2013

Tuesday, July 22, 2014

Patterns for the application of modern informatics to the integration of PDEs: the case of the Boussinesq Equation

Today Francesco Serafin graduated, completing his master's degree in civil and environmental engineering. In brief, the aim of his thesis was to implement a series of classes, eventually ported to OMS, to solve the groundwater Boussinesq equation (under the link, please find also specific references to previous work), with the broader aim of envisioning an object-oriented structure that could work for any PDE. Let Francesco's introduction speak:

"Mathematical models play a fundamental role in many scientific and engineering fields in today’s world. They are used, for example, in geotechnics to evaluate hillslope stability, in weather science to predict weather trends and produce weather reports, in structural design to study the resistance to stress, and in fluid dynamics to compute fluid flows and air flows.

Consequently, mathematical models are evolving all the time: more and more new numerical methods are being invented to solve the Partial Differential Equations (PDEs) that describe physical problems with increasing precision, and more and more complex and efficient processor units are being created to reduce the computational time.
Therefore, the code into which the mathematical models are translated has to be “dynamic” in order to be easily updated on the basis of the continuous developments (Formetta et al. (2014) [16]).
On the other hand, completely different physical problems are often described using similar PDEs. For this reason, the numerical methods which provide solutions to different problems can be the same. This suggests the implementation of an IT infrastructure that hosts a standard structure for solving PDEs and that can serve various disciplines with the minimum of hassle.

This work is focused on the application of what is envisioned above, with the main purpose of the creation of an abstract code for implementing every type of mathematical model described by PDEs.

We work on hydrological topics but we hope to design a structure of general interest. Obviously the final goal of any work of this type is to find a proper numerical solver, and therefore, part of the thesis is devoted to the analysis of the problem under scrutiny, and the description of the solution found."
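To give a flavour of what such a "standard structure" could look like, here is a minimal sketch of mine, in Python rather than in the language of the thesis, and not Francesco's actual design. The names `PDEProblem`, `LinearDiffusion1D` and `integrate` are hypothetical, and the equation solved is plain linear diffusion with an explicit scheme, much simpler than the nonlinear Boussinesq equation. The point is only that the driver loop never needs to know which equation it is advancing.

```python
from abc import ABC, abstractmethod


class PDEProblem(ABC):
    """Hypothetical interface: what a generic driver needs from any PDE."""

    @abstractmethod
    def initial_state(self):
        """Return the initial list of nodal values."""

    @abstractmethod
    def step(self, state, dt):
        """Advance the state by one time step and return the new state."""


class LinearDiffusion1D(PDEProblem):
    """Explicit finite differences for u_t = D u_xx, boundary values held fixed."""

    def __init__(self, initial, diffusivity, dx):
        self.initial = list(initial)
        self.diffusivity = diffusivity
        self.dx = dx

    def initial_state(self):
        return list(self.initial)

    def step(self, state, dt):
        r = self.diffusivity * dt / self.dx ** 2
        if r > 0.5:
            # Classical stability limit of the explicit scheme.
            raise ValueError("unstable step: need D*dt/dx^2 <= 0.5")
        new = list(state)
        for i in range(1, len(state) - 1):
            new[i] = state[i] + r * (state[i - 1] - 2.0 * state[i] + state[i + 1])
        return new


def integrate(problem, dt, n_steps):
    """Generic driver: works with any PDEProblem implementation."""
    state = problem.initial_state()
    for _ in range(n_steps):
        state = problem.step(state, dt)
    return state


problem = LinearDiffusion1D([0.0, 0.0, 1.0, 0.0, 0.0], diffusivity=1.0, dx=1.0)
print(integrate(problem, dt=0.25, n_steps=1))
```

Swapping in another equation would then mean writing another subclass, leaving `integrate` untouched.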

Tuesday, July 8, 2014

Quickness and exactitude

I put here an internal review of one of our manuscripts because, I hope, it can be useful in general. The topic is evaluating the rainfall-runoff of a small catchment (but I had hoped it would be an estimation of the whole hydrological cycle, even if without evapotranspiration measurements).

"The paper is written in good English (finally). However, good English does not mean it is a good paper. It lacks focus and is not concise (lack of exactitude and quickness, see the end of the post). The objectives are not clear, and the novelties of the paper are not evident. However, I do not despair of obtaining something reasonable in the end, but only because I know the amount of work behind it and, in part, the raw material.

Making a rainfall-runoff model cannot usually be considered an exercise at the frontier of our science (citing conversations with Ignacio Rodriguez-Iturbe; however, it could be, as testified by Günter Blöschl's ERC). Rainfall-runoff modelling can certainly bring information about a given basin. However, in our case, the works of N* and M* have already filled this space. So what is the goal of this paper?
The initial idea was to assess the uncertainty in the prediction of discharges by using appropriate statistical techniques. In particular, the idea was to assess the uncertainty inherent in extrapolating rainfall from point measurements to spatial fields.
This task has been only partially fulfilled, for the following reasons: errors due to instrument precision were not included (the measurements were simply assumed to be perfect); the way rainfall was included in the model is not yet clear (whether average rainfall, one point for each hillslope, average rainfall volume, or other approximations were used), and no sensitivity analysis was performed with respect to the way distributed rainfall was squeezed into the model; the interplay between rainfall and discharge forecasting is not developed as well as it could be, i.e. how it works inside the whole procedure is not explained well.
Therefore the overall rainfall prediction analysis is incomplete, and I expect it to be completed for the thesis.
The technical novelty of this work is that we use a calibration tool (LUCA) to assess variograms, and we do it at an hourly time step, while others usually do it at a daily time step. A few questions here: how much does this approach improve rainfall estimates? I.e., how much difference do we get by taking uncalibrated variograms and/or constant variograms (not varying in time)? How much does this affect the forecasting of water volumes? What overall effect does this have on the forecasting of discharges?

It could be that all of these approximations have negligible effects on the forecasting of discharges. But this would indeed be good to know, and an achievement that has not been obtained so far.

A second topic of interest was the simulation of the whole hydrological cycle, and an attempt to close the hydrological budget with the Priestley-Taylor simulation of evapotranspiration. These simulations were done but not shown at all in the manuscript. Why not? Do the simulated discharges and the simulated ET sum to the total volume of rainfall? If not, how do we interpret the missing mass? Are we able to assess the uncertainty in the predictions of each single component of the hydrological cycle obtained with this method? Are we able to observe interannual variability (both in discharges and evapotranspiration and, if relevant, in storage)? Is this variability estimate reliable, at least as a gross budget?
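As a minimal sketch of the budget check I am asking for: for each year, the residual of rainfall minus discharge minus evapotranspiration should be interpretable as storage change plus errors. The function name and the numbers below are hypothetical, chosen only for illustration, and are not results from our simulations.

```python
def budget_residuals(rainfall, discharge, evapotranspiration):
    """Per-period residual P - Q - ET, all in the same units (e.g. mm/year).

    A residual near zero means the budget closes; otherwise the residual
    lumps together storage change and errors in any of the three terms.
    """
    if not (len(rainfall) == len(discharge) == len(evapotranspiration)):
        raise ValueError("all series must cover the same periods")
    return [p - q - et for p, q, et in zip(rainfall, discharge, evapotranspiration)]


# Hypothetical annual totals in mm, for illustration only.
P = [1000.0, 1200.0]
Q = [600.0, 700.0]
ET = [350.0, 450.0]

for year, (p, res) in enumerate(zip(P, budget_residuals(P, Q, ET)), start=1):
    print(f"year {year}: residual = {res:.1f} mm ({100.0 * res / p:.1f}% of rainfall)")
```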

Having failed to answer each of the questions above, the paper ends up wandering around in a way that breaks our karma (citation from Vijay K. Gupta). Please save us with more rigor."

Regarding quickness and exactitude, I suggest reading Italo Calvino's Six Memos for the Next Millennium.^1^2

^1 - Here is a video seminar on the Six Memos by Paolo Granata
^2 - Wandering around, perhaps in a digression, and unfortunately in Italian: the Discorso sulla Matematica (Talk on Mathematics), written by Gabriele Lolli, inspired and guided by Calvino's lectures

Monday, July 7, 2014

The History of Noise

Discovered on the page of Angelo Vulpiani, a preeminent Italian physicist, this paper on noise by L. Cohen is, I believe, an amusing read that can lead to more technical readings.
Noise, Brownian motion (a nice paper by Cecconi et al.), and stochastic equations were never touched on in my blog before, but they do have an important role in hydrology, and they are one of the warhorses of my friend and master Ignacio Rodriguez-Iturbe (see Random Functions and Hydrology). As Amilcare Porporato said, part of the recent ecohydrological way is based upon writing master equations and solving them under appropriate assumptions about noise.
The paper indeed treats noise as the product of atoms and molecules, or, from a deeper perspective, as produced by quantum effects.
Noise in hydrology probably has a different origin: chaos on one side and heterogeneity on the other (well, heterogeneity is, in a sense, randomness, so the phrase is quite tautological).

Friday, July 4, 2014

Cartoon guide to statistics

Since I support the idea that hydrologists should know statistics very well, it was with pleasure that I discovered on R-bloggers these two cartoon guides to statistics.

The first is the Cartoon Guide to Statistics by Gonick and Smith, from which the figure above is an excerpt.

"Witty, pedagogical and comprehensive, this is the best book of the bunch! It provides a historical perspective and covers quite advanced topics such as confidence intervals, regression analysis and probability theory. The book contains a fair deal of mathematical notation but still manages to be accessible." MarkR

The second one is the Manga Guide to Statistics by Shin Takahashi, illustrated by Iroha Inoue.

"As opposed to the Cartoon Guide to Statistics the Manga Guide reads more like a standard comic book with panels and a story line. The story centers around the schoolgirl Rui who wants to learn statistics to impress the handsome Mr. Igarashi. To her rescue comes Mr. Yamamoto, a stats nerd with thick glasses. The story and the artwork is archetypal manga (including very stereotype gender roles) but if you can live with that it is a pretty fun story." MarkR