The Information Quality Cluster has formally defined information quality as a combination of the following four aspects of quality, spanning the full life cycle of data products:
1. Scientific quality, defined in terms of accuracy, precision, uncertainty, validity, and suitability for use (fitness for purpose);
2. Product quality, which takes the following into account: the degree to which the scientific quality is assessed and documented; how accurate, complete, and up to date the metadata and documentation are; the manner in which the data and metadata are formatted; and the degree to which the associated information is published and traceable throughout the data lifecycle;
3. Stewardship quality, addressing questions such as how well data are being managed, preserved, and made accessible; and
4. Service quality, which deals with how easy it is for users to discover, obtain, understand, trust, and use a given data product along with its metadata, as well as ensuring that an archive has the requisite knowledge base and people functioning as subject matter experts available to help its data users.
The purpose of this session is to focus on scientific quality, and especially on uncertainty. A panel of invited speakers from a variety of Earth science disciplines will address questions such as: How is uncertainty determined and characterized in the products of their research or applications? What are the major side effects and limitations of common statistical techniques used to quantify and characterize uncertainty? What is the impact of uncertainty on the quality of their data products? How is data uncertainty accounted for when multiple sources of data are spliced and woven into a single product? How do they document and convey information about uncertainty to other scientific users? What is the best way of conveying uncertainty to the (possibly skeptical) public?
Ultimately, this session intends to provide expert knowledge from different perspectives on a relatively focused topic, Data Uncertainty, that is extremely challenging but critical to both establishing and elevating the user communities' confidence in Earth science data. We hope these presentations will lead to a community-wide discussion, during and after the meeting, of the problems, challenges, and solutions surrounding data uncertainty.
Discussion initiated during this plenary session, inspired by the invited panelists' presentations, will continue in the afternoon breakout session (http://sched.co/As6H), which offers a more collaborative interchange of technical information intended to help advance the scientific quality of Earth science data and to identify effective ways to capture and communicate uncertainty to a broader community.
AGENDA
A. Introduction - David Moroni
B. Panelists' Presentations:
- Title TBD - Carol Anne Clayson (Woods Hole Oceanographic Institution)
- Data uncertainty: what is it, where does it come from, and why should we care? - Amy Braverman (Jet Propulsion Laboratory, California Institute of Technology)
- Challenges in Evaluating Global Climate Models with the Limited Observational Data Record - Isla Simpson (National Center for Atmospheric Research)
C. Q&A - All
PRESENTATION ABSTRACTS & SPEAKER BIOGRAPHIES:
1. Title TBD - Carol Anne Clayson (Woods Hole Oceanographic Institution)
2. Data uncertainty: what is it, where does it come from, and why should we care? - Amy Braverman (Jet Propulsion Laboratory, California Institute of Technology)
Abstract: NASA, NOAA, and other space agencies are producing massive quantities of data that will ultimately be used for science and decision making. These data are typically collected by observing systems that capture indirect measurements (e.g., radiances) of the phenomena of interest, and complex algorithms are then applied to infer the underlying geophysical quantities. But the physical observing mechanisms and algorithms are imperfect, and this creates uncertainty about the results. Probability provides a formal mechanism for quantifying uncertainty, in the form of probability distributions, that is coherent, intuitive, and mathematically precise. In this talk I will discuss why I believe that probabilities should be used to define data uncertainty in the remote sensing context, and the consequences of not doing so.
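The distinction Dr. Braverman draws between a point estimate and a full probability distribution can be illustrated with a minimal synthetic sketch (not from the talk; all numbers and the Gaussian form are assumed for illustration): propagating a retrieved quantity through a nonlinear downstream calculation gives a different answer depending on whether the uncertainty is carried along.

```python
import numpy as np

# Hypothetical illustration: a retrieval reports geophysical variable x
# either as a point estimate (mu) or as a Gaussian distribution
# N(mu, sigma^2). All numbers here are assumed, not from the talk.
rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.5
samples = rng.normal(mu, sigma, size=100_000)

# A downstream analysis applies a nonlinear function f(x) = x**2.
# Using only the point estimate ignores the spread entirely:
point = mu**2

# Monte Carlo propagation of the full distribution. For a Gaussian,
# E[x^2] = mu^2 + sigma^2, so the distribution-aware answer exceeds
# the point estimate by sigma^2 = 0.25.
mc_mean = (samples**2).mean()
```

The gap between `point` and `mc_mean` is one concrete consequence of discarding the probability distribution: any nonlinear use of the data inherits a bias that the point estimate cannot reveal.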
Biography: Amy Braverman is Principal Statistician at the Jet Propulsion Laboratory. She holds a B.A. in Economics from Swarthmore College, and after working for eight years in litigation support consulting, she entered UCLA, where she earned an M.A. in Mathematics and a Ph.D. in Statistics. She joined JPL in 1999 as a postdoc and has been there ever since. Dr. Braverman's research interests span several areas, all related to the use of massive remote sensing data sets in Earth and climate science: information-theoretic approaches to data reduction, data fusion using spatial and spatio-temporal statistical methods, development of new statistical methodologies for the evaluation of climate models by comparison to observations, and uncertainty quantification for remote sensing applications. She is a Fellow of the American Statistical Association and serves on the editorial board of the Journal on Uncertainty Quantification. She is a Program Leader of the Statistical and Applied Mathematical Sciences Institute's (SAMSI's) upcoming program on Mathematical and Statistical Methods for Climate and the Earth System (2017-2018), and served in similar roles for previous SAMSI programs on uncertainty quantification and massive data sets. In December 2008 she led the National Research Council's (NRC) Workshop on Uncertainty Management in the Use of Remote Sensing Data for Climate Studies, and she was a member of the NRC's Committee on Applied and Theoretical Statistics from 2004 to 2010. In 2008 she and colleagues founded the American Statistical Association's Advisory Committee on Climate Change Policy.
3. Challenges in Evaluating Global Climate Models with the Limited Observational Data Record - Isla Simpson (National Center for Atmospheric Research)
Abstract: A continuing challenge in evaluating global climate models is accounting for the uncertainties associated with the limited length of observation-based records. For many aspects of both the climatology and shorter-term climate variability, the limited observational data record results in an uncertain ground truth against which to compare our models. This motivates combining the observational records that we do have with large model ensembles to assess both the uncertainty associated with short records and the fidelity of our model simulations. Here, the case of the Northern Hemisphere extra-tropical circulation response to the El Niño-Southern Oscillation (ENSO) will be presented as an example, and it will be shown that substantial uncertainties exist in the composite mean response to ENSO over the limited observational record due, in large part, to internal atmospheric variability (weather noise).
Biography: Isla Simpson is a Scientist in the Climate Analysis Section of the Climate and Global Dynamics Division, NCAR, studying large-scale atmospheric dynamics and its representation in global climate models. She has a Ph.D. in Atmospheric Physics from Imperial College London. She works to understand dynamical mechanisms involved in the variability and change of the large scale
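The ensemble-based assessment Dr. Simpson's abstract describes can be sketched with synthetic numbers (a hypothetical toy setup, not the talk's actual analysis): a short observed record yields a single noisy composite, while a large ensemble of equally plausible realizations reveals how far weather noise alone can push that composite away from the forced response.

```python
import numpy as np

# Hypothetical synthetic sketch: all values below are assumed for
# illustration and do not come from the presented analysis.
rng = np.random.default_rng(42)

forced_response = 1.0   # assumed "true" forced ENSO composite (arbitrary units)
noise_sd = 2.0          # assumed internal (weather) variability per event
n_events = 20           # roughly the number of ENSO events in a short record

# One "observed" composite: the forced response plus the sampling noise
# left over after averaging only n_events noisy events.
obs_composite = forced_response + rng.normal(0.0, noise_sd) / np.sqrt(n_events)

# A 500-member ensemble: every member shares the forced response but sees
# an independent realization of internal variability.
ensemble = forced_response + rng.normal(0.0, noise_sd, size=500) / np.sqrt(n_events)

# The ensemble mean recovers the forced response, and the ensemble spread
# estimates the sampling uncertainty attached to any single short record,
# including the observed one.
ensemble_mean = ensemble.mean()
sampling_uncertainty = ensemble.std()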