Understanding the global environment depends on the ability of people around the world to share observations and, more importantly, to understand observations made by many producers. If detailed documentation describing the data is readily accessible and understandable, the data will be trusted. To integrate this documentation into distributed data systems (such as the GEOSS federated catalogues), it has to be structured in a common way. The ISO 19115 metadata standards offer an internationally accepted way to describe data, so that data producers can populate metadata catalogues and users can discover, select and access the data.
When users look for a dataset, they search the metadata catalogues and select the data that fit their purpose. Several metadata entries can be used to assess fitness for purpose, such as the resolution, the spatial extent, etc., but among the most important are the quality, the lineage and the usage. In ISO, quality is expressed as a set of quality indicators that can be numerical measures or conformance test checks, the lineage is expressed as a list of sources and processes, and the usage as a list of commonly known applications.
Metadata is difficult to generate, and this sometimes results in poor metadata that jeopardizes the ability to find what is really available. To stimulate metadata creation and to assess metadata completeness, NOAA developed the rubric system (CORS, 2012), a tool that interprets standard metadata in XML format, generates a score that evaluates the completeness of the metadata, and presents a comprehensive visual report on blank fields that can later be improved. The scoring method is based on the presence or absence of information: one point is awarded when the metadata contain a given piece of information.
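The presence-or-absence scoring described above can be illustrated with a minimal Python sketch. It is not the NOAA rubric itself (which is an XSL style sheet); the three checks, their labels and the sample record are assumptions chosen only for illustration. The only fixed elements are the ISO 19139 `gmd` and `gco` XML namespaces.

```python
import xml.etree.ElementTree as ET

# ISO 19139 namespaces used to encode ISO 19115 metadata in XML.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical checks (label -> element path); the real rubric checks many more fields.
CHECKS = {
    "Title": ".//gmd:CI_Citation/gmd:title/gco:CharacterString",
    "Abstract": ".//gmd:abstract/gco:CharacterString",
    "Lineage statement": ".//gmd:LI_Lineage/gmd:statement/gco:CharacterString",
}


def score_record(xml_text):
    """Return (score, found): one point per check whose element exists and is not blank."""
    root = ET.fromstring(xml_text)
    found = {}
    for label, path in CHECKS.items():
        node = root.find(path, NS)
        found[label] = node is not None and bool((node.text or "").strip())
    return sum(found.values()), found


# Made-up record: title filled in, abstract left blank, no lineage at all.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title><gco:CharacterString>Example land cover map</gco:CharacterString></gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract><gco:CharacterString>  </gco:CharacterString></gmd:abstract>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""

score, found = score_record(SAMPLE)
print(score, found)
```

In this sketch a blank element counts as absent, so the sample record earns a single point, for its title.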
Unfortunately, the original rubric does not evaluate any of the quality elements previously mentioned. The GeoViQua extension of the rubric therefore has the following aims:
- Help providers to assess the completeness of their metadata records in general, and of the quality aspects in particular, by extending the rubric system.
- Help producers to improve their quality metadata for better visualization and comparison in the GEO Portal.
- Provide data users with a tool that helps them better understand the metadata content.
The original rubric analyzes eight main metadata blocks: Identification, Connection, Extent, Distribution, Description, Content, Lineage and Acquisition Information. The GeoViQua extension of the rubric adds two new information groups related to ISO quality (Quality and Usage) to increase the information assessed. The extension also improves the analysis of lineage information. With the new information groups, the maximum “Total Spiral Score” rises to 48. To focus on quality metadata, the extended tool generates summary tables that show clearly and concisely not only whether quality elements are present but also the actual metadata content.
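As a rough illustration of how per-group results could be rolled up into a total score, the following Python sketch sums presence flags over the ten information groups named above. The group names come from the text; the element labels and flags in the example are invented, and the sketch does not reproduce the actual distribution of the 48 points among groups, which is not given here.

```python
# Information groups: the eight original blocks plus the two GeoViQua additions.
GROUPS = [
    "Identification", "Connection", "Extent", "Distribution", "Description",
    "Content", "Lineage", "Acquisition Information", "Quality", "Usage",
]


def summarize(presence):
    """presence: {group: {element_label: bool}} -> (rows, total, maximum).

    Each row is (group, points obtained, points available in this sketch).
    """
    rows = []
    for group in GROUPS:
        checks = presence.get(group, {})
        rows.append((group, sum(checks.values()), len(checks)))
    total = sum(row[1] for row in rows)
    maximum = sum(row[2] for row in rows)
    return rows, total, maximum


# Hypothetical presence flags for one record (labels are illustrative only).
example = {
    "Identification": {"title": True, "abstract": True},
    "Quality": {"quality report": True, "conformance result": False},
    "Usage": {"specific usage": False},
}

rows, total, maximum = summarize(example)
for group, points, available in rows:
    print(f"{group:25s} {points}/{available}")
print(f"Total: {total}/{maximum}")
```

Keeping the group list separate from the checks means a new information group can be added without touching the scoring function.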
Both the original and the extended rubric tools are XSL style sheets that can be added to any ISO metadata record in XML without any further modification. Applying the rubric report to GEOSS metadata catalogues will help to determine the best way to select a handful of important metadata elements that are enough to characterize datasets. These few elements can be part of the GEO Label.
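One common way to associate an XSL style sheet with an XML document is the standard `xml-stylesheet` processing instruction. The Python sketch below adds such an instruction to a record; whether the rubric is attached this way or applied by a separate XSLT processor is an assumption here, and the file name `rubric.xsl` is hypothetical.

```python
from xml.dom import minidom


def attach_stylesheet(xml_text, href):
    """Return the XML with an xml-stylesheet instruction placed before the root element."""
    doc = minidom.parseString(xml_text)
    instruction = doc.createProcessingInstruction(
        "xml-stylesheet", f'type="text/xsl" href="{href}"'
    )
    doc.insertBefore(instruction, doc.documentElement)
    return doc.toxml()


# Minimal stand-in for an ISO metadata record; "rubric.xsl" is a hypothetical file name.
RECORD = '<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"/>'
styled = attach_stylesheet(RECORD, "rubric.xsl")
print(styled)
```

The metadata content itself is left untouched, which matches the idea that the record needs no further modification.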
One of the objectives of GeoViQua is to demonstrate that, with simple modifications to some pages, quality enhancements can be included in the GEO Portal to make filtering and visualization more productive and attractive. To that end, Sapienza Consulting, as an ESA subcontractor for the GeoViQua project, has provided a mirror copy of the GEO Portal that is being modified by the project members and that centralizes several visualization improvements.
The Search results page is the part of the portal with the greatest potential for improvement through the incorporation of visual components for quality information. It has been modified in several ways.
One important feature is the possibility of selecting more than one result with a check box and executing an operation based on the underlying metadata document(s). The current list of possible operations for one or multiple results, starting from a search results listing page, is shown in Figure 1.
Figure 1. Selectable visualization options for the GEO portal results.
Figure 2 shows a screenshot of the metadata comparison, with a map and a color-coded comparison of specific metadata elements for three example datasets. To complete the graphical search, the possibility of generating statistical plots of the selected records has been added to the GEO Portal Mirror GUI by integrating it into the metadata comparison tool. This means that it acts automatically on the same group of records selected for comparison, and that the plotted attributes are predefined. An interactive way of selecting attributes is foreseen for the future. The plot display is added to the metadata comparison interface, as shown in Figure 2.
Figure 2. Example of the metadata comparison and the statistical plots tools.
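The logic behind a color-coded comparison and a predefined statistical plot might look like the following Python sketch. It assumes each selected record has already been reduced to a flat dictionary of metadata elements; the function names, the three status values and the example records are hypothetical and do not describe the portal's actual implementation.

```python
from collections import Counter


def compare_records(records, elements):
    """Return {element: (status, values)}.

    status is 'equal', 'different' or 'incomplete' and could drive the cell color.
    """
    result = {}
    for element in elements:
        values = [record.get(element) for record in records]
        if any(value is None for value in values):
            status = "incomplete"   # at least one record lacks this element
        elif len(set(values)) == 1:
            status = "equal"        # every record carries the same value
        else:
            status = "different"
        result[element] = (status, values)
    return result


def value_counts(records, element):
    """Count how often each value of a predefined attribute occurs (input for a bar plot)."""
    return Counter(record[element] for record in records if element in record)


# Three made-up records standing in for the selected search results.
records = [
    {"title": "Dataset A", "spatial resolution": "30 m",
     "lineage statement": "Derived from satellite imagery"},
    {"title": "Dataset B", "spatial resolution": "30 m"},
    {"title": "Dataset C", "spatial resolution": "30 m",
     "lineage statement": "Field survey"},
]
elements = ["title", "spatial resolution", "lineage statement"]

comparison = compare_records(records, elements)
counts = value_counts(records, "spatial resolution")
for element, (status, values) in comparison.items():
    print(element, status, values)
```

Because both functions take the same list of selected records, the plot always reflects exactly the group being compared.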
The Provenance viewer provides a detailed visualization of the documented description of the processes used to bring the data to its current state. The processes and the source data used in each step are related through hypertext links, and buttons allow the presented information to be collapsed (see Figure 3).
Figure 3. Example of the provenance viewer.
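The structure that such a viewer displays can be sketched in Python as a list of process steps, each with its sources, rendered as a collapsible outline. The class names, the text markers and the example lineage below are assumptions made for illustration; the actual viewer works with hypertext links and buttons.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    description: str


@dataclass
class ProcessStep:
    description: str
    sources: list = field(default_factory=list)


def render(steps, collapsed=frozenset()):
    """Return outline lines; steps whose 1-based number is in `collapsed` hide their sources."""
    lines = []
    for number, step in enumerate(steps, start=1):
        marker = "+" if number in collapsed else "-"
        lines.append(f"{marker} Step {number}: {step.description}")
        if number not in collapsed:
            for source in step.sources:
                lines.append(f"    source: {source.description}")
    return lines


# Made-up lineage with two process steps.
steps = [
    ProcessStep("Atmospheric correction", [Source("Raw satellite scene")]),
    ProcessStep("Classification", [Source("Corrected scene"), Source("Training samples")]),
]

outline = render(steps, collapsed={2})
print("\n".join(outline))
```

Here the second step is collapsed, so only its description is listed while the first step still shows its source.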