NGI News

Study Introduces Mitohelper - Tool that Improves the Usability of Fish eDNA Data for Research

May 10, 2022

A school of vermilion snapper swims over the reef in the Gulf of Mexico. Credit: G. P. Schmahl, Flower Garden Banks National Marine Sanctuary. Image available at the NOAA Photo Library.

Biodiversity supports the stability and health of ocean ecosystems and functions, which benefit humans. Research using fish environmental DNA (eDNA) data can detect aquatic biodiversity non-invasively, efficiently, and broadly, providing information that informs marine resource management and planning.

A publicly available repository for fish eDNA data is MitoFish, a comprehensive and standardized mitochondrial genome database. However, the DNA sequences in MitoFish can be difficult to use because they are not well annotated. To improve the usability of these data, researchers developed Mitohelper, a tool to annotate and align the fish mitochondrial sequences in MitoFish. By improving the accuracy of taxonomic classifications in MitoFish, Mitohelper helps ecologists plan experimental designs and select DNA sequencing and analysis strategies, with results that can inform monitoring and assessment of marine ecosystem health. This tool is described in the paper "Mitohelper: A mitochondrial reference sequence analysis tool for fish eDNA studies," published in Environmental DNA in February 2021.

Interactions among marine organisms and their environments form networks that support ecosystem functions such as climate regulation, primary productivity, biogeochemical cycling, and food sourcing that benefit humans. However, the loss of biodiversity in the ocean raises concerns about risks to these important ecosystem functions. A better understanding of the status of marine species diversity, which eDNA data can help determine, can lead to improved ecosystem-based management.

"We are surrounded by eDNA," said Luke Thompson, the study's co-author. "We are familiar with forensic DNA evidence used in criminal investigations or tests for viral RNA in COVID-19 PCR tests. Biologists use DNA sequencing of ocean water to find trace evidence of fish that swam through the water to learn about species diversity."

Thompson explained that to make the most of DNA sequencing in the ocean, for example to take stock of fish populations or monitor endangered species, we need complete and well-curated reference databases.

The quantity of DNA data in reference databases is rapidly increasing as high-throughput sequencing technology produces DNA and RNA sequences quickly and cost-effectively. However, most of the DNA data in the MitoFish reference database are not fully annotated, at best having only accession numbers and taxonomic information. That is where Mitohelper helps: it improves the annotations in MitoFish by adding gene names and additional taxonomic classifications.

After the MitoFish reference datasets are improved, researchers can use Mitohelper's getrecord and getalignment commands to make the data even more usable. Drawing from a user-provided list of fish taxonomic names, the getrecord command looks up mitochondrial gene information, which is useful for surveying the presence or absence of mitochondrial reference sequences for specific fish taxa. The getalignment command aligns mitochondrial gene sequences (which are often partial) to a user-specified full-length reference sequence, which enables the visualization and assessment of overlapping gene sequencing regions.
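The two ideas described above can be illustrated with a minimal Python sketch. It is not Mitohelper's code or interface: the record layout, the function names (survey_presence, locate_fragment, overlap), the taxon names, the accessions, and the sequences are all hypothetical placeholders. It also uses exact substring matching as a simplified stand-in for a real sequence alignment, which would tolerate mismatches and gaps.

```python
from collections import defaultdict

# Hypothetical reference records. Taxa, accessions, and genes are invented
# placeholders for illustration, not real MitoFish entries.
RECORDS = [
    {"accession": "ACC0001", "species": "Examplus alpha", "gene": "12S rRNA"},
    {"accession": "ACC0002", "species": "Examplus alpha", "gene": "COI"},
    {"accession": "ACC0003", "species": "Examplus beta", "gene": "12S rRNA"},
]


def survey_presence(taxa, records):
    """Map each requested taxon to the sorted genes that have a reference record.

    A taxon with no records maps to an empty list (absence).
    """
    genes_by_taxon = defaultdict(set)
    for rec in records:
        genes_by_taxon[rec["species"].lower()].add(rec["gene"])
    return {t: sorted(genes_by_taxon.get(t.lower(), set())) for t in taxa}


def locate_fragment(fragment, reference):
    """Return 1-based inclusive (start, end) of an exact match, or None.

    Simplification: exact substring search instead of a true alignment.
    """
    idx = reference.find(fragment)
    if idx == -1:
        return None
    return (idx + 1, idx + len(fragment))


def overlap(a, b):
    """Return the shared 1-based inclusive interval of a and b, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


# Usage with made-up sequences.
presence = survey_presence(["Examplus alpha", "Examplus gamma"], RECORDS)
reference = "ACGTACGGTTCAGCTAGCTA"
first = locate_fragment("ACGGTTCA", reference)
second = locate_fragment("TCAGCTAG", reference)
shared = overlap(first, second)
print(presence)
print(first, second, shared)
```

Reporting coordinates on a single full-length reference is what makes partial sequences comparable: once two fragments are placed on the same coordinate system, checking whether they cover the same region reduces to an interval comparison.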

'Omics and bioinformatics tools support marine systems studies and the sustainable use of ocean resources. Scientists with expertise in these areas are building a foundation that supports the use of these tools by addressing needs for infrastructure, competence, capacity, and application. The development of Mitohelper supports capacity with software that improves the use of reference databases. It also supports application through research that improves understanding of the information delivered by DNA sequencing efforts.

Mitohelper and its reference datasets are updated approximately monthly and are available at

Study author Shen Jean Lim is with the National Oceanic and Atmospheric Administration (NOAA) Cooperative Institute for Marine and Atmospheric Studies (CIMAS) and the NOAA Atlantic Oceanographic and Meteorological Laboratory (AOML). Study author Luke Thompson is with the Northern Gulf Institute (NGI) and the AOML.

This research was carried out in part under the auspices of the Cooperative Institute for Marine and Atmospheric Studies (CIMAS), a Cooperative Institute of the University of Miami, and the National Oceanic and Atmospheric Administration (NOAA), cooperative agreement # NA20OAR4320472. This work was also supported by award NA06OAR4320264 06111039 to the Northern Gulf Institute (NGI) at Mississippi State University from NOAA's Office of Oceanic and Atmospheric Research (OAR).

Summary by Nilde Maggie Dannreuther and Luke Thompson, Northern Gulf Institute, Mississippi State University.