A One-Stop-Shop for Geological Sample Data Collected from the World's Oceans and Lakes
November 23, 2023
Most of the Earth is covered by water, with five large ocean basins covering more than 70% of its surface. These basins and the earth beneath are the realm of marine geologists, who play a critical role in a wide range of industries and research fields, including energy exploration, offshore engineering, environmental consulting, government agencies, and academia. Marine geologists use their expertise to identify and assess natural marine resources, locate and characterize potential marine geohazards, advise offshore development, and inform our understanding of Earth's history, evolution, and the effects of climate change.
The National Oceanic and Atmospheric Administration (NOAA) provides support to the marine geological community through its National Centers for Environmental Information (NCEI), which hosts and maintains the Index to Marine and Lacustrine Geological Samples (IMLGS). The IMLGS provides a central location where researchers can discover and access digital geological data derived from physical samples collected worldwide from the bottom of oceans and lakes, as well as the physical samples themselves, which are curated at partner institutions.
This data management resource consists of an Oracle relational database, a tailored data ingest pipeline, a web-accessible map viewer that enables users to visualize and query the database, and an application programming interface that facilitates communication between the database and the map viewer.
"There is no other resource like it in the world," said Clint Edrington, the Northern Gulf Institute's Marine Geology Data Manager for NCEI. "The cost to the scientist for obtaining a physical geological sample from the IMLGS community is the cost of shipping as opposed to sourcing new money to mobilize an expedition and collect new samples. And for some researchers, the IMLGS digital data meet their needs."
Even in the best of times, funding for sampling the world's seabeds and lakebeds is limited, especially for expensive, hard-to-reach areas like the deep ocean. The IMLGS gives scientists an alternative way to obtain hard-to-get marine and lacustrine geological samples, enabling new research that might otherwise not be possible.
HOW IT STARTED
In 1977, the National Science Foundation sponsored the inaugural meeting for curators of marine geological samples (the Curators Community) that included representatives from 18 sample repositories. One of their goals was to establish a data center that disseminates information about each repository's physical samples and data holdings, eliminating the need for users to make the same time-consuming inquiry to multiple repositories. Members of NOAA's National Geophysical Data Center, the predecessor to NCEI, were in attendance and agreed to serve as the Curators Community's central data center, and thus the IMLGS was conceived. Soon after that first meeting, the Curators Community began submitting data to the IMLGS and continues to do so to this day.
The original repositories were all in the United States, but in the years that followed, the Curators Community grew in number and became international in scope. Today, the IMLGS references samples associated with 30 entities from the U.S., Canada, France, Germany, and the United Kingdom. Active members of the Curators Community continue to gather biannually at partner institutions to share and discuss ideas, best practices, and the status and development of the IMLGS.
HOW IT WORKS
"The unsung hero that makes the IMLGS work is an agreed upon and consistent metadata, including a controlled set of vocabularies," said Edrington. The IMLGS database contains a combination of technical information – such as lithology, mineralogy, texture, and geologic age – as well as non-technical information – such as ship name, collection site latitude and longitude, water depth, and the device used to collect samples.
To facilitate consistent metadata, Curators submit sample data to NCEI by filling out a template, which is later ingested into the IMLGS database. "The Curator enters numerical or text information into some fields like 'Date Collected,' but for other fields such as 'Lithology,' free-hand description is not allowed. Instead, the Curator follows the controlled set of vocabularies to enter the appropriate code," explained Edrington. "This procedure ensures database consistency."
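The template check Edrington describes can be sketched in a few lines of Python. This is only an illustration of the idea, not NCEI's actual ingest code: the field names, the `LITHOLOGY_CODES` table, and the `validate_row` function are all hypothetical, and the codes shown are not the real IMLGS vocabulary.

```python
from datetime import date

# Hypothetical controlled vocabulary. These codes and terms are invented for
# illustration and are NOT the actual IMLGS lithology vocabulary.
LITHOLOGY_CODES = {
    "MUD": "mud or ooze",
    "SND": "sand",
    "GRV": "gravel",
    "BAS": "basalt",
}


def validate_row(row: dict) -> list[str]:
    """Return a list of problems found in one (hypothetical) template row."""
    errors = []

    # Free-entry field: any value is allowed as long as it parses as a date.
    raw_date = str(row.get("date_collected", ""))
    try:
        date.fromisoformat(raw_date)
    except ValueError:
        errors.append(f"date_collected: {raw_date!r} is not a YYYY-MM-DD date")

    # Controlled field: free-hand text is rejected; only known codes pass.
    code = str(row.get("lithology", ""))
    if code not in LITHOLOGY_CODES:
        allowed = ", ".join(sorted(LITHOLOGY_CODES))
        errors.append(f"lithology: {code!r} is not one of {allowed}")

    return errors


if __name__ == "__main__":
    good = {"date_collected": "1965-03-14", "lithology": "SND"}
    bad = {"date_collected": "1965-03-14", "lithology": "fine sand"}
    print(validate_row(good))
    print(validate_row(bad))
```

Rejecting free-hand text at submission time, rather than cleaning it up afterward, is what keeps a query such as "all sand samples" reliable across every contributing repository.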
The specifications for metadata and the contents of controlled vocabularies for the IMLGS have evolved over the years based on the needs of the Curators Community and will continue to evolve with changes made by community consensus.
To increase efficiency, NCEI recently updated the API to enable trusted Curator partners to submit data through it instead of the template. "This capability is technically advanced, and not all Curators will be able to take advantage of it at first, so we will still maintain the data submission template for the foreseeable future," said Edrington.
HOW IT'S USED
Data accessed from the IMLGS can be used by different user groups to accomplish a variety of tasks, such as assessing benthic habitats, evaluating offshore geohazards, and understanding past climate changes. For example, a Ph.D. student made use of the IMLGS when studying the role that wildfires played in the rapid transition of ecosystems from grasslands and savannahs to what is now the Sahara Desert.
To unravel the history of North African wildfires, the student used the IMLGS to find core samples from offshore West Africa and investigate fire biomarkers in the marine sediment. Most of these samples were collected from the 1950s through the 1970s and are now archived at an IMLGS partner sample repository. As another real-world example, an energy company conducted a desktop study for offshore wind development near Hawaii and used the IMLGS to locate existing digital data for areas of the seabed where it had little or no data.
There are currently 228,785 samples in the IMLGS database, with most digital sample data having corresponding physical samples in existence at partner repositories. Many of the physical samples are 'prized' samples, collected from the world's most remote and/or interesting environments, such as the mid-ocean ridge.
Edrington anticipates that the IMLGS portal will see continued use in the years ahead, especially as funding uncertainty limits the availability of other geological data resources, such as the National Science Foundation's decades-long scientific drilling program that is ending. "The IMLGS and its partner repositories will continue to provide the research community an alternative approach for obtaining marine and lacustrine geological sample material that furthers scientific research."
The National Centers for Environmental Information is the nation's leading authority for environmental data, managing one of the largest archives of atmospheric, coastal, geophysical, and oceanic research information and contributing to the mission of NOAA's National Environmental Satellite, Data, and Information Service with new products and services that enable better data discovery.
The Northern Gulf Institute is a NOAA Cooperative Institute with six academic institutions located across the US Gulf Coast states, conducting research and outreach on the interconnections among Gulf of Mexico ecosystems for informed decision making. One of NGI's four research themes is Effective and Efficient Data Management Systems Supporting a Data-Driven Economy.
By Nilde Maggie Dannreuther and Clint Edrington with the Northern Gulf Institute, Mississippi State University.