Climate Model Diagnostic Analyzer (CMDA) is a collaborative platform that supports the full life cycle of a data analysis process: data discovery, data customization, analysis, reanalysis, publication/sharing, and reproduction [1,2]. CMDA was initially developed to demonstrate a methodology for evaluating and diagnosing climate models through the comprehensive use of multiple observational datasets, reanalysis datasets, and model outputs. It has since evolved to support collaborative scientific activities and the full life cycle of data analysis.


CMDA has three highly integrated subsystems. The Data System manages the datasets used by CMDA analysis tools. The Analysis System manages the CMDA analysis tools, all of which are web services. The Provenance System manages the metadata of CMDA datasets and the provenance of CMDA analysis history.


These three subsystems are not only highly integrated but also easily expandable. New datasets can be added to the Data System and scanned so that they become visible to the other subsystems. New analysis tools become available in the Analysis System once they are registered. After a new analysis tool is registered, the metadata it generates is captured in the database of the Provenance System without any further configuration. The Provenance System supports searching the usage history as well as publishing, sharing, and reproducing results.
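The register-then-capture behavior described above can be illustrated with a small in-memory sketch. This is not CMDA code: the class names `AnalysisRegistry` and `ProvenanceRecord`, the `time_mean` tool, and the record fields are all hypothetical, chosen only to show how a registered tool's runs could be recorded without per-tool configuration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass
class ProvenanceRecord:
    """One run of an analysis tool (hypothetical record layout)."""
    tool_name: str
    parameters: Dict[str, Any]
    result: Any
    timestamp: str


@dataclass
class AnalysisRegistry:
    """Toy stand-in for an analysis registry with a provenance log."""
    tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    history: List[ProvenanceRecord] = field(default_factory=list)

    def register(self, tool_name: str, tool: Callable[..., Any]) -> None:
        # Registering is the only step a new tool needs.
        self.tools[tool_name] = tool

    def run(self, tool_name: str, **parameters: Any) -> Any:
        # Every run is logged the same way, whatever the tool is.
        result = self.tools[tool_name](**parameters)
        self.history.append(
            ProvenanceRecord(
                tool_name=tool_name,
                parameters=dict(parameters),
                result=result,
                timestamp=datetime.now(timezone.utc).isoformat(),
            )
        )
        return result

    def search(self, tool_name: str) -> List[ProvenanceRecord]:
        # Look up the usage history of one tool.
        return [r for r in self.history if r.tool_name == tool_name]


def time_mean(values: List[float]) -> float:
    """Hypothetical analysis tool: arithmetic mean of a series."""
    return sum(values) / len(values)


registry = AnalysisRegistry()
registry.register("time_mean", time_mean)
mean = registry.run("time_mean", values=[1.0, 2.0, 3.0])
runs = registry.search("time_mean")
```

Because the stored record holds the tool name and the exact parameters, a run could in principle be reproduced by calling `registry.run` again with the recorded values.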


CMDA is fully integrated into the OpenNEX environment, where a user can log in, keep track of their project progress, find associated analysis tools, register new analysis tools, find the provenance of their previous analysis runs, and search for new datasets and analysis tools. A user can run CMDA analysis web services from the OpenNEX environment, find the provenance of those runs, publish the results, and share them with colleagues. CMDA provides two user interfaces for its analysis web services: an HTML interface and a Jupyter Notebook interface.


CMDA Webservices

The HTML CMDA Webservices interface gives a user full control over the input parameters for a given service, such as dataset selection, dataset subsetting conditions, dataset analysis functionality, and output data visualization parameters.

Jupyter Notebook Server

For more advanced users with a Python programming background, we provide a Jupyter Notebook server. It lets users make direct API requests to the CMDA web services to retrieve the datasets of their choice and perform their own analysis. Input parameters can be selected interactively, and the outputs can be visualized interactively.
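As a rough illustration of what a direct API request from a notebook might look like, the sketch below only assembles a request URL from a dataset selection and a subsetting condition. The base URL, the service name `timeSeries`, and every parameter name and value are placeholders, not the actual CMDA API; consult the service documentation for the real endpoints and parameters.

```python
from urllib.parse import urlencode

# Placeholder address: NOT the real CMDA service URL.
BASE_URL = "https://cmda.example.org/svc/timeSeries"


def build_request_url(base_url, dataset, variable, start, end,
                      lat_range, lon_range):
    """Assemble a query URL from a dataset selection and a subset box.

    All parameter names below are hypothetical.
    """
    params = {
        "dataset": dataset,
        "variable": variable,
        "start": start,
        "end": end,
        "lat_min": lat_range[0],
        "lat_max": lat_range[1],
        "lon_min": lon_range[0],
        "lon_max": lon_range[1],
    }
    # urlencode percent-escapes values and joins them with '&'.
    return f"{base_url}?{urlencode(params)}"


url = build_request_url(
    BASE_URL,
    dataset="GPCP",          # illustrative dataset identifier
    variable="pr",           # illustrative variable name
    start="2004-01",
    end="2004-12",
    lat_range=(-30, 30),
    lon_range=(0, 360),
)
print(url)
```

In a notebook, a URL like this could then be passed to an HTTP client such as `urllib.request.urlopen`, with the response parsed according to whatever format the service actually returns.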


CMDA Datasets

CMDA hosts over 2000 datasets covering model, observational, and reanalysis data. The model datasets include CMIP5 historical runs, CMIP5 AMIP runs, CMIP5 RCP4.5 projection runs, and WRF model runs with various physical parameterizations. The observational datasets include many satellite data products (AIRS, AMSR-E, AVISO, CERES, GRACE, GPCP, GPM, ISCCP, MISR, MODIS, MLS, QuikSCAT, SMAP, TES, TRMM) and ship and float data (ARGO). The reanalysis datasets include ECMWF and GLDAP data. The datasets used for the NASA Summer School are listed in this document.


[1]     Educational and Scientific Applications of Climate Model Diagnostic Analyzer, Seungwon Lee et al., IEEE International Congress on Big Data, Honolulu, HI, June, 2017.

[2]     Climate model diagnostic analyzer, Seungwon Lee et al., IEEE International Conference on Big Data, Santa Clara, CA, October, 2015.