SDO/HMI data coverage

The coverage of all SDO/HMI data series present at MEDOC (PSUD) is displayed on a specific web page, which is updated daily. Colors represent the percentage of files present at PSUD compared to the total number of existing files in the series in a given day (ordered by month and day of month). The color scale is highly non-linear, so that just one missing group of files (actually a SUNUM) is visible (we expect 120 per day for the 720s series), and more than 10% missing files appears already as quite bad.
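The arithmetic behind these numbers can be sketched as follows. This is only an illustration, not the code of the coverage page: the function names are hypothetical, and the expected count is assumed to be simply the number of cadence intervals in a day.

```python
SECONDS_PER_DAY = 86400


def expected_records_per_day(cadence_s):
    """Number of records expected in one day for a given cadence in seconds."""
    return SECONDS_PER_DAY // cadence_s


def coverage_percent(present, existing):
    """Percentage of existing records that are present locally.

    Returns None when nothing exists for that day (shown as white on the page).
    """
    if existing == 0:
        return None
    return 100.0 * present / existing


expected = expected_records_per_day(720)    # 120 for the 720s series
one_missing = coverage_percent(expected - 1, expected)
ten_percent_missing = coverage_percent(108, expected)
print(expected, round(one_missing, 2), ten_percent_missing)
```

With a single missing group out of 120, coverage is still above 99%, which is why a linear color scale would hide it.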

Example:

White means that no data exist. This can be because of:

  • SDO's "holidays" (around 2016-08-03) or any other interruption of the data flow. Then the gap would look the same in all series.
  • no HARP at that date (for hmi.sharp series). Then the gap would look the same in all hmi.sharp series.
  • any other issue when replicating the metadata.

When not all available data are present at PSUD, this can be because:

  • data retrieval failed once (there are automated retries, but not for old data). We can launch new retries manually.
  • data retrieval fails permanently, possibly because the data are no longer online.

Delay between observation time and download time

With the advent of FLARECAST predictions, the timeliness of near-real-time (nrt) data download becomes important.

The minimum possible delay is limited by processing time at JSOC (currently about 1.10 hours for the mharp_720s_nrt series, to which the sharp series are linked).

Plotting delays

The script ~ebuchlin/sdosql/plot-delays.py on the cluster plots the delay between observation time and download time, as a function of observation time, over one month, for the hmi.sharp_cea_720s_nrt series (it can of course be generalized to other series). It can be run with

python3 ~ebuchlin/sdosql/plot-delays.py 2017-12

for the month of 2017-12; this will write a file in the current directory with a plot. Alternatively, you can run

%run ~ebuchlin/sdosql/plot-delays.py 2017-12

from ipython3 to get an interactive plot (zooming/panning/...), and the plot file.

In the plot files, the delay axis spans 0 to 24hr; delays of more than 24hr are shown as 24hr but in orange, and missing data are shown as 0hr but in red.
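The display convention above can be sketched as a small function. This is not taken from plot-delays.py: the function name, the use of None for missing data, and the "default" color label for ordinary points are assumptions made for illustration.

```python
MAX_DELAY_HR = 24.0  # upper limit of the delay axis


def plot_point(delay_hr):
    """Return (plotted_delay_hr, color) for one record.

    delay_hr is the delay in hours, or None if the record is missing
    (hypothetical convention for this sketch).
    """
    if delay_hr is None:
        return 0.0, "red"               # missing data: drawn at 0hr in red
    if delay_hr > MAX_DELAY_HR:
        return MAX_DELAY_HR, "orange"   # clipped to 24hr, drawn in orange
    return delay_hr, "default"          # ordinary point, unchanged


for d in (1.5, 30.0, None):
    print(d, plot_point(d))
```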

Examples

2016-05

This was a period when data retrieval was done daily. For data observed just after a retrieval, the delay (until the next retrieval) was about 24hr (sometimes more); it then decreased progressively to the minimum value as the time to the next retrieval decreased to 0. There are also a few missing data.

2017-06

The data retrieval cadence is now hourly. With a few exceptions, the delay is between 1.1 and 2.1hr, as can be seen on this zoom:

where the combined effects of the 1hr-cadence retrieval and 12min-cadence data are clearly visible.
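The sawtooth pattern can be reproduced with a simple model. This is a minimal sketch under stated assumptions, not the actual retrieval logic: each record becomes available at JSOC 66 minutes (1.1hr) after observation, retrievals happen exactly on the hour, and a record is downloaded at the first retrieval at or after its availability. The function name and parameters are hypothetical.

```python
def download_delay_min(obs_min, retrieval_period_min=60, processing_min=66):
    """Delay in minutes between observation and download for one record.

    Assumes retrievals at multiples of retrieval_period_min, and that the
    record is fetched by the first retrieval at or after it becomes available.
    """
    available = obs_min + processing_min
    # Ceiling division: index of the first retrieval at or after availability
    n = -(-available // retrieval_period_min)
    return n * retrieval_period_min - obs_min


# One hour of observations at the 12min (720s) cadence
delays = [download_delay_min(t) for t in range(0, 60, 12)]
print(delays)  # [120, 108, 96, 84, 72]
```

In this model the delay always lies between 66 and 126 minutes (1.1 to 2.1hr) and drops by 12 minutes from one record to the next within each retrieval hour. Setting retrieval_period_min=1440 gives the daily sawtooth described for 2016-05, with delays up to about 25hr.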

2017-03

The gap is not due to missing data but to a lack of HARPs: it is empty, not filled with red points.

2017-11

Since 2017-09 and the NetDRMS upgrade, data retrieval has become less stable: without proper supervision, it can crash and leave us with no data for a few days (and with missing data afterwards). We are working with the JSOC team developing NetDRMS to identify the possible issues and improve the stability of data retrieval. We are also developing means of supervision (as demonstrated here). However, delays of at most 2hr (or less, if retrieval is launched more frequently), i.e. very close to the JSOC file production delays, still appear within reach.
