Banner - NOAATECH




Metadata Tool

James Berger
National Oceanographic Data Center

We will build a web-form Metadata Tool that will allow scientists to thoroughly document and QC their environmental data, regardless of data type or format. Drop-down lists will dynamically narrow after each selection, to increase accuracy and ease of use, and will indicate completeness of documentation. New terms and citations can be added to the Tool's dictionary as needed to support any depth of documentation. The Tool's Metadata Data Base (MDB) will contain such a wealth of scientific information and access to data that it will be a valuable asset at every stage of a scientific project. When data and documentation arrive at NODC, the Tool will verify their accuracy and completeness. FGDC, NOAA Portal and customer metadata requirements will be satisfied by reports generated from the MDB.
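The narrowing drop-down lists and the completeness indicator described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary rows, the field names, and the functions `remaining_choices` and `completeness` are hypothetical, and Python is used here for brevity rather than the Perl/CGI and JavaScript stack the abstract proposes.

```python
# Hypothetical dictionary: each row is one valid combination of terms.
DICTIONARY = [
    {"discipline": "physical", "parameter": "temperature", "unit": "degC"},
    {"discipline": "physical", "parameter": "temperature", "unit": "K"},
    {"discipline": "physical", "parameter": "salinity", "unit": "PSU"},
    {"discipline": "chemical", "parameter": "nitrate", "unit": "umol/L"},
]
FIELDS = ["discipline", "parameter", "unit"]  # assumed documentation fields


def remaining_choices(selections):
    """For each unanswered field, list the values still consistent with the selections."""
    rows = [
        row for row in DICTIONARY
        if all(row[field] == value for field, value in selections.items())
    ]
    return {
        field: sorted({row[field] for row in rows})
        for field in FIELDS
        if field not in selections
    }


def completeness(selections):
    """Fraction of the documentation fields that have been filled in."""
    return len(selections) / len(FIELDS)


after_first = remaining_choices({"discipline": "physical"})
after_second = remaining_choices(
    {"discipline": "physical", "parameter": "temperature"}
)
```

Each selection filters the dictionary, so the next drop-down offers only terms that remain valid, and the completeness fraction can drive a progress indicator on the form.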

Problem - Oceanographic data spans every scientific discipline, so it consists of a large number of diverse data sets in a wide range of formats, usually from short-term projects. Historically, 'standard formats' covered few data types and imposed expensive, error-prone reformatting requirements, which discouraged data submission and delayed data access. Data types not covered by standard formats are stored in the originator's format with minimal inventory or access.

Solution - Start with an ASCII dump of the originator's data into a Working Archive. Build a web tool that reduces the documentation process to selections from drop-down lists, continually evaluates documentation completeness and accuracy, and accommodates all levels of documentation. Build a Dictionary of terms to be used in the Metadata Tool and allow all users to add terms. Provide QA routines to evaluate data and documentation. Use the documentation to automate the drudge jobs, and make data and documentation available to customers immediately, providing real-time peer review.

Reuse Existing Formats - After each selection, the Metadata Tool will display a list of existing formats that match your selections so far. You can use an existing format as is, or edit it.
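A minimal sketch of the format lookup described above, assuming each stored format statement carries a set of descriptive attributes. The format names, attribute keys, and the functions `matching_formats` and `derive_format` are hypothetical, and Python is used only for illustration.

```python
import copy

# Hypothetical stored format statements, keyed by descriptive attributes.
FORMATS = [
    {"name": "ctd_cast_v1",
     "attributes": {"data_type": "CTD", "parameter": "temperature", "unit": "degC"}},
    {"name": "ctd_cast_kelvin",
     "attributes": {"data_type": "CTD", "parameter": "temperature", "unit": "K"}},
    {"name": "bottle_nitrate",
     "attributes": {"data_type": "bottle", "parameter": "nitrate", "unit": "umol/L"}},
]


def matching_formats(selections, formats=FORMATS):
    """Names of formats whose attributes agree with every selection made so far."""
    return [
        fmt["name"] for fmt in formats
        if all(fmt["attributes"].get(key) == value for key, value in selections.items())
    ]


def derive_format(name, new_name, changes, formats=FORMATS):
    """Copy an existing format and apply edits, leaving the original untouched."""
    original = next(fmt for fmt in formats if fmt["name"] == name)
    edited = copy.deepcopy(original)
    edited["name"] = new_name
    edited["attributes"].update(changes)
    return edited


ctd_matches = matching_formats({"data_type": "CTD"})
celsius_matches = matching_formats({"data_type": "CTD", "unit": "degC"})
edited = derive_format("ctd_cast_v1", "ctd_cast_salinity",
                       {"parameter": "salinity", "unit": "PSU"})
```

Editing works on a copy, so an existing format stays available to other users while the edited version is saved under a new name.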

Dynamic Metadata - The metadata can be used to automate data reformatting. If a target format statement is entered into the Metadata Tool, its attributes can be used to find and retrieve all candidate data sets. Since each candidate data set has a format statement, each data column can be mapped to the target format. The program can compare fields such as unit and character format, and call conversion subroutines as needed. Inventories can be constructed on the fly. Data transfer to standard data bases can be automated. Any number of data sets can be merged into a COTS analysis-display utility. FGDC and NOAA Portal requirements can be satisfied by a standard report from the MDB.
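The column mapping and unit conversion described above might look like the following sketch. It assumes a format statement can be reduced to an ordered list of columns, each with a parameter name and a unit. The `Column` class, the `CONVERTERS` table, and the functions `build_mapping` and `reformat` are hypothetical names, and Python is used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column of a (hypothetical) format statement."""
    parameter: str
    unit: str


# Conversion subroutines keyed by (source unit, target unit).
CONVERTERS = {
    ("degF", "degC"): lambda v: (v - 32.0) * 5.0 / 9.0,
    ("K", "degC"): lambda v: v - 273.15,
    ("cm", "m"): lambda v: v / 100.0,
}


def build_mapping(source, target):
    """For each target column, find the source column index and any converter needed."""
    plan = []
    for column in target:
        index = next(
            (i for i, src in enumerate(source) if src.parameter == column.parameter),
            None,
        )
        if index is None:
            raise ValueError(f"no source column for {column.parameter!r}")
        source_unit = source[index].unit
        if source_unit == column.unit:
            converter = None  # units already agree
        else:
            converter = CONVERTERS[(source_unit, column.unit)]
        plan.append((index, converter))
    return plan


def reformat(rows, source, target):
    """Reorder and convert each data row from the source format to the target format."""
    plan = build_mapping(source, target)
    return [
        [row[i] if convert is None else convert(row[i]) for i, convert in plan]
        for row in rows
    ]


source = [Column("depth", "cm"), Column("temperature", "degF")]
target = [Column("temperature", "degC"), Column("depth", "m")]
result = reformat([[250.0, 50.0]], source, target)
```

Because the mapping is derived from the two format statements alone, the same routine could be applied to every candidate data set retrieved for a given target format.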

Open Source - We propose to use platform-independent, open-source software, namely HTML, Perl/CGI, MySQL, Java and JavaScript, and the same open-source policy that built Perl and Linux. This policy will encourage participation from the worldwide environmental community, tap free resources to design and build the system, avoid the costs and lack of control of proprietary systems, and build a constituency of interested users. We welcome expertise from everyone, and will recognize each contribution.

Auditorium - Paper
Wednesday - 10:30 - 10:50 A.M.

 

Publication of the NOAA Office of the CIO/High Performance Computing and Communications
Last Updated: 09/27/01
Designer/Webmaster: Jward