At AGU’s Fall Meeting, the preeminent international Earth and space science meeting, researchers unveiled the world’s largest database of Extremely Low Frequency (ELF)/Very Low Frequency (VLF) data. The open-access database is named WALDO, or Worldwide Archive of Low-frequency Data and Observations. Researchers will be able to access nearly 1,000 terabytes (TB) of data to further scientific efforts in fields like ionospheric remote sensing, earthquake forecasting, subterranean prospecting, and space weather effects. Space weather, which can lead to beautiful auroras in the night sky or destructive effects on power grids and satellites, is especially important for scientists and engineers to understand and predict.
The work to preserve hundreds of terabytes of ELF/VLF electromagnetic wave measurements and open them to researchers worldwide is a joint project of Stanford University, Georgia Institute of Technology, and the University of Colorado Denver, with support from the National Science Foundation and the Department of Defense.
“It’s exciting that we saved this data all these years because right now is the time when it is becoming most valuable with advances in computing power, Big Data algorithms, and artificial intelligence,” said Mark Golkowski, PhD, professor of Electrical Engineering at CU Denver.
The culmination of a legacy
Golkowski and Morris Cohen, PhD, associate professor in the School of Electrical and Computer Engineering at Georgia Tech, initiated the WALDO project as the culmination of a legacy that began at Stanford following World War II. Professor Robert Helliwell pioneered the field and the use of large antennas to capture low-frequency radio waves in remote locations like Antarctica and Alaska to study the complex physics of near-Earth space. At Stanford, Helliwell eventually passed the torch to Professor Umran Inan, who served as advisor to Golkowski and Cohen when they were students in his research program.
“If there is one thing our advisor instilled in us, it was the sanctity of high quality science observations and the importance of preserving them,” said Golkowski. “Unfortunately, this kind of archival work is often put on the back burner and it’s only later that people say, ‘if only we had data from 10 years ago, we would know if this was an anomaly or not.’ Losing data is like the burning of the library at Alexandria. When it’s gone, it’s gone.”
For years, researchers have transferred data from magnetic tapes to CDs to DVDs as technology advanced and outdated storage methods threatened the data. The advent of massive cloud storage has now made the data accessible to researchers all over the world.
Through the efforts of Golkowski, Cohen, and their students, nearly 80,000 DVDs of data are being uploaded to the cloud. While most of the data is from the last 20 years, some recordings date back to the 1970s and ’80s.
A living repository of data
At the time of the meeting, 200 TB of data was available on the cloud, with another 800 TB to come. WALDO will be a living repository, and researchers will continue to add data from ongoing observations made by Georgia Tech and CU Denver, like those collected during the 2017 Great American Solar Eclipse.
The recordings capture a snapshot of the Earth’s quickly changing atmosphere and space environment, which is why the effort to maintain existing data is crucial for future research.
“It’s shown me the effort necessary as a civilization to keep from losing the past,” said Cohen.
While there is no question that the data on WALDO is a record of the planet’s past and can inform its present, anybody with experience in data analysis knows that researchers have to comb through a lot of noise and lackluster observations to find the gem that will advance knowledge.
“This was the inspiration for the name ‘WALDO,’” said Golkowski. “We based it on the children’s cartoon character who is always hiding among the masses in his characteristic sweater.”
Inspiring new discoveries
Golkowski and Cohen hope that opening up the database will inspire new discoveries and new uses for the datasets. Ever-improving computational power and data algorithms will no doubt play a role.
Since arriving at Georgia Tech, Cohen has found that signals at 60 Hz and its harmonics (the annoying noise that comes from power grids) can be used as a diagnostic for power grids and cybersecurity systems. At CU Denver, Golkowski has used ELF observations of lightning to diagnose the upper atmosphere, which could eventually improve communication systems.
Cohen said the data’s “out of left field” application to power grid cybersecurity is just one of many findings to come.
“We have a sense of the known unknowns, but who knows how many unknown unknowns are still out there,” said Cohen. “By making this data public, our hope is for other researchers to use these datasets in ways we haven’t imagined yet.”
“Finally, we have an answer to the question, ‘Where’s WALDO?’,” said Cohen.
Funding for these recordings over the years has come from organizations like the National Science Foundation, NASA, and the Department of Defense.