EFDA-JET-CP(08)04/12
Overview of Intelligent Data Retrieval Methods for Waveforms and Images in Massive Fusion Databases
The JET database contains more than 42 Tbytes of data (waveforms and images) and doubles in size about every two years. The ITER database is expected to be orders of magnitude larger. Therefore, data access in such huge databases can no longer be efficiently based on shot number or temporal interval. Because diagnostics generate reproducible signal patterns (structural shapes) for similar physical behaviour, high-level data access systems can be developed. In these systems, the input parameter is a pattern and the outputs are the shot numbers and the temporal locations where similar patterns appear in the database. These pattern-oriented techniques can be used for a first screening of data for any type of morphological feature in waveforms and images. This article presents a new technique for finding similar images in huge databases quickly and efficiently. It also reviews previous techniques for searching for similar waveforms and for retrieving time-series data or images containing any kind of pattern.
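To make the pattern-in, locations-out interface concrete, the following is a minimal sketch of such a query, not the technique presented in the article. It assumes a brute-force sliding-window comparison with z-normalised Euclidean distance, which would not scale to a database of this size. The names `find_similar`, `database`, and `sample_period`, the in-memory dictionary layout, and the sample values are all hypothetical and chosen only for illustration.

```python
import math


def z_normalize(values):
    """Scale a sequence to zero mean and unit variance so only its shape matters."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return [0.0] * n  # flat segment: carries no shape information
    return [(v - mean) / std for v in values]


def find_similar(pattern, database, sample_period, top_k=3):
    """Return up to top_k (distance, shot, time) tuples, most similar first.

    database: hypothetical dict mapping shot number -> list of samples.
    sample_period: assumed uniform sampling interval, in seconds.
    """
    query = z_normalize(pattern)
    m = len(query)
    hits = []
    for shot, signal in database.items():
        # Slide a window of the pattern's length over the whole signal.
        for start in range(len(signal) - m + 1):
            window = z_normalize(signal[start:start + m])
            dist = math.sqrt(sum((q - w) ** 2 for q, w in zip(query, window)))
            hits.append((dist, shot, start * sample_period))
    hits.sort()
    return hits[:top_k]


# Illustrative data only: shot 1001 contains a scaled copy of the pattern.
pattern = [0.0, 1.0, 4.0, 1.0, 0.0]
database = {
    1001: [0.0] * 10 + [0.0, 2.0, 8.0, 2.0, 0.0] + [0.0] * 10,
    1002: [float(i % 3) for i in range(25)],
}
best = find_similar(pattern, database, sample_period=0.001, top_k=1)
```

In this sketch the normalisation makes the match insensitive to amplitude and offset, so the query returns the shot number and the start time of the window whose shape is closest to the pattern. A practical system would replace the exhaustive scan with an index or a reduced representation of the signals.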