DiTing: A large-scale Chinese seismic benchmark dataset for artificial intelligence in seismology

Journal

EARTHQUAKE SCIENCE
Volume 36, Issue 2, Pages 84-94

Publisher

KEAI PUBLISHING LTD
DOI: 10.1016/j.eqs.2022.01.022

Keywords

artificial intelligence; benchmark dataset; earthquake detection; seismic phase identification; first-motion polarity

In recent years, the potential of artificial intelligence technology in seismic signal recognition has been recognized, leading to a new wave of research. The construction of a large-scale, high-quality labeled dataset is crucial for the development and application of artificial intelligence in seismology. In this study, a dataset called DiTing was constructed based on seismic cataloging reports from the China Earthquake Networks Center, which can serve as a benchmark for machine learning model development and data-driven seismological research.
In recent years, artificial intelligence technology has exhibited great potential in seismic signal recognition, setting off a new wave of research. Vast amounts of high-quality labeled data are required to develop and apply artificial intelligence in seismology research. In this study, based on the 2013-2020 seismic cataloging reports of the China Earthquake Networks Center, we constructed an artificial intelligence seismological training dataset (DiTing) with the largest known total time length. Data were recorded using broadband and short-period seismometers. The dataset includes 2,734,748 three-component waveform traces from 787,010 regional seismic events, the corresponding P- and S-phase arrival time labels, and 641,025 P-wave first-motion polarity labels. All waveforms were sampled at 50 Hz and cut to a length of 180 s, starting a random number of seconds before the occurrence of the earthquake. Each three-component waveform is accompanied by a considerable amount of descriptive information, such as the epicentral distance, back azimuth, and signal-to-noise ratios. The magnitudes of seismic events, epicentral distance, signal-to-noise ratio of P-wave data, and signal-to-noise ratio of S-wave data range from 0 to 7.7, 0 to 330 km, -0.05 to 5.31 dB, and -0.05 to 4.73 dB, respectively. The dataset compiled in this study can serve as a high-quality benchmark for machine learning model development and data-driven seismological research on earthquake detection, seismic phase picking, first-motion polarity determination, earthquake magnitude prediction, early warning systems, and strong ground-motion prediction. Such research will further promote the development and application of artificial intelligence in seismology.
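The sampling figures in the abstract imply a fixed trace size: 50 Hz over 180 s gives 9,000 samples per component. The minimal Python sketch below works through that arithmetic and converts a phase arrival time into a sample index. It assumes arrival labels are expressed in seconds from the start of the trace; the abstract does not specify the label format, and the function names are hypothetical, not part of any DiTing tooling.

```python
# Values stated in the abstract.
SAMPLING_RATE_HZ = 50
TRACE_LENGTH_S = 180


def samples_per_trace(rate_hz=SAMPLING_RATE_HZ, length_s=TRACE_LENGTH_S):
    """Number of samples in one component of a trace."""
    return int(rate_hz * length_s)


def arrival_to_sample(arrival_s, rate_hz=SAMPLING_RATE_HZ):
    """Convert an arrival time to the nearest sample index.

    Assumption (not from the source): arrival_s is measured in seconds
    from the first sample of the 180 s trace.
    """
    idx = round(arrival_s * rate_hz)
    if not 0 <= idx < samples_per_trace(rate_hz):
        raise ValueError("arrival time falls outside the trace window")
    return idx


if __name__ == "__main__":
    print(samples_per_trace())
    print(arrival_to_sample(12.34))
```

Because each trace starts a random number of seconds before the event, the index of a given phase differs from trace to trace, so a model cannot rely on a fixed pick position within the window.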
