Mario Valle Web

Tutorial on Large Scientific Data Management, Format and Archiving Issues

[The speaker, that is, me]

Many users at CSCS still use “in-house” archiving formats, which makes it difficult to develop and maintain visual exploration tools, data browsers, and data viewers. Furthermore, these ad hoc formats might not support parallel I/O, distributed file I/O, different byte orders (endianness), random access, and so on, and as a result the data are more difficult to visualize. We would like to encourage users to think of data as a long-term resource, adding non-numerical information that makes the data self-describing, allows semantic search, and increases data longevity.
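As a rough illustration of the byte-order and self-description issues, here is a minimal Python sketch using only the standard library. It is not the HDF5 or netCDF API: the header layout and the function names (write_raw, write_described, read_described) are hypothetical, made up for this example. Formats such as HDF5 and netCDF record the data type and byte order in the file itself, so readers do not have to guess; the toy header below imitates that idea on a very small scale.

```python
import json
import struct

def write_raw(values):
    """Ad hoc format: bare doubles with no header (assumes a little-endian writer)."""
    return struct.pack("<%dd" % len(values), *values)

def write_described(values, name, units):
    """Hypothetical self-describing layout: length-prefixed JSON header, then data."""
    header = json.dumps({
        "name": name,
        "units": units,
        "dtype": "float64",
        "byte_order": "little",
        "count": len(values),
    }).encode("utf-8")
    payload = struct.pack("<%dd" % len(values), *values)
    # By convention of this toy format, the 4-byte header length is big-endian.
    return struct.pack(">I", len(header)) + header + payload

def read_described(blob):
    """Read the header first, then decode the data with the recorded byte order."""
    (header_len,) = struct.unpack(">I", blob[:4])
    meta = json.loads(blob[4:4 + header_len].decode("utf-8"))
    prefix = "<" if meta["byte_order"] == "little" else ">"
    values = struct.unpack(prefix + "%dd" % meta["count"], blob[4 + header_len:])
    return meta, list(values)

temperatures = [273.15, 300.0, 310.5]

# A reader that wrongly assumes big-endian silently gets garbage from the raw file.
raw = write_raw(temperatures)
misread = list(struct.unpack(">3d", raw))

# The described file carries its byte order, name and units along with the numbers.
meta, recovered = read_described(write_described(temperatures, "temperature", "K"))
```

With the raw blob, nothing tells the reader that the values were misinterpreted. With the described blob, the reader recovers both the numbers and what they mean (name and units), which is the kind of non-numerical information the tutorial encourages users to store.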

The tutorial will emphasize several well-known, open-source, license-free data archiving formats in use at many HPC centers around the world (HDF5, netCDF, CGNS, etc.). We will cover the issues described above and study examples of complex, multi-data, multi-variable, hierarchical data archiving problems. We will further demonstrate how the Visualization staff can help you develop an archiving protocol that fosters long-term use of your data, develop or adopt well-accepted standard data browsers, and develop interfaces for standard visualization packages such as AVS, EnSight, VTK, and Tecplot.