SSAI Modernizes NASA's Earth Science Data and Information System (ESDIS) Metrics System by Optimizing Processing Time and Power Requirements
NASA's Earth Science Data and Information System (ESDIS) Metrics System (EMS) collects and organizes various data file metrics from Earth Observing System Data and Information System (EOSDIS) Data Providers. The data and analysis reports that EMS generates provide NASA managers with the information needed to determine how best to apply resources to support the science community. SSAI staff lead the EMS team and successfully upgraded this major, multi-component system to meet an increasing demand for processing and power.
An SSAI staff-led EMS team successfully upgraded a major, multi-component system to meet an increasing demand for processing and power.
Overview
The ESDIS Metrics System (EMS) is a collection of servers, databases, commercial off-the-shelf (COTS) software applications, and locally developed custom Oracle and Perl code designed to support ESDIS project management by collecting and organizing various metrics from EOSDIS Distributed Active Archive Centers (DAACs) and other Data Providers.
SSAI-led teams are responsible for all EMS software life-cycle processes, from development to production, including regular maintenance and upgrades.
The EMS processing system collects and processes raw logs from DAAC providers that include information from all ingest, archive, and distribution interfaces throughout EOSDIS. The logs are processed to become EMS metrics that depict the use of the products and services that are stored in databases or delivered via the internet.
The EMS went live nearly two decades ago, and since then the quantity of logs received that require processing has increased exponentially. To respond to this need, an SSAI-led EMS team developed and implemented a series of upgrades to meet current processing needs and drastically improve processing power. This effort involved optimizing algorithms as well as developing and implementing parallel processing and a dedicated prestaging system.
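As a rough illustration of how prestaging and parallel processing can fit together, the Python sketch below validates raw records before processing and then summarizes the valid records from several batches concurrently. It is not the EMS implementation, which the article describes as Oracle and Perl code. The pipe-delimited record layout, the function names, and the per-product byte totals are all assumptions made for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical record layout (an assumption, not the EMS log format):
#   "timestamp|provider|product|bytes"

def prestage(lines):
    """Validate raw records early; return (valid_records, errors)."""
    valid, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        fields = line.strip().split("|")
        if len(fields) != 4 or not fields[3].isdigit():
            # Keep the line number so the provider can be told what to fix.
            errors.append((lineno, line.strip()))
        else:
            valid.append(fields)
    return valid, errors

def summarize(records):
    """Total the bytes per product for one batch of valid records."""
    totals = Counter()
    for _timestamp, _provider, product, nbytes in records:
        totals[product] += int(nbytes)
    return totals

def process_batches(batches, workers=4):
    """Prestage every batch, then summarize the valid records in parallel."""
    staged = [prestage(batch) for batch in batches]
    # Threads keep the sketch simple; CPU-bound work would more likely
    # use a process pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize, [valid for valid, _ in staged]))
    totals = sum(partials, Counter())
    errors = [err for _, errs in staged for err in errs]
    return totals, errors
```

Calling `process_batches` with one list of lines per provider file returns the combined totals together with the rejected records, so malformed input can be reported back without blocking the rest of the run.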
Results
Since implementing parallel processing and prestaging in EMS, daily processing time has been reduced from more than 20 hours, which was nearly full capacity, to approximately 10 hours. This roughly 50% decrease in processing time has enabled massive reprocessing efforts that were previously impossible. Moreover, these improvements upgraded system performance while providing early validation of raw records and improved error notification to log providers.
Due to these efforts, EMS processing capacity effectively meets the demand for processing and power. EMS can also automatically handle large amounts of reprocessing without affecting routine daily processing, which saves time, effort, and resources.
Attribution: Gradient drift at English Wikipedia, Public domain, via Wikimedia Commons