
File names for trained models:

To save the trained model under a name chosen by the user, write that name in the "description" field of the configuration JSON file (marked HERE below).

SHAUN: I think that we need to make this as human-understandable as possible, since there will be a lot of similarly trained models for each algorithm (e.g., "algorithm_abovec_24_0_00_Br_all" for a system trained for flares above C1.0 over 24-hr windows at 0-hr latency, issued from 00:00 UT SHARPs, using all available Br properties).

"algorithm":{ "phase": "training", <--- do not touch
"config_name": "HybridLasso", <--- do not touch
"description": "HybridLasso_test", <---- HERE
"HybridLasso": true, <--- do not touch

In the prediction step, the name used in the training phase must be given in the configuration JSON file, this time in the "config_name" field:

"algorithm": {
"phase": "execution", <--- do not touch
"config_name": "HybridLasso_test", <---- HERE
"description": "HybridLasso",

"type" : ["classification" | "clustering" | "regression"] <--- to fill this field check this page: https://dev.flarecast.eu/confluence/x/hQDV
}

Writing the trained model works locally; let's see whether other fixes are needed once we run on the cluster.
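The relationship between the two phases can be sketched in Python as follows. This is only an illustration of the naming convention described above, not the engine's actual code: the helper name `make_execution_config` is hypothetical, and carrying over the remaining keys of the training block unchanged is an assumption.

```python
import copy


def make_execution_config(training_config):
    """Derive an execution-phase "algorithm" block from a training-phase one.

    Hypothetical helper: it only mirrors the convention on this page, where
    the name stored in "description" at training time is the name given in
    "config_name" at prediction time.
    """
    algo = training_config["algorithm"]
    if algo["phase"] != "training":
        raise ValueError("expected a training-phase configuration")

    execution = copy.deepcopy(training_config)  # leave the input untouched
    execution["algorithm"]["phase"] = "execution"
    execution["algorithm"]["config_name"] = algo["description"]
    execution["algorithm"]["description"] = algo["config_name"]
    return execution


training = {
    "algorithm": {
        "phase": "training",
        "config_name": "HybridLasso",
        "description": "HybridLasso_test",
        "HybridLasso": True,
    }
}
execution = make_execution_config(training)
print(execution["algorithm"]["config_name"])
```

The "type" field of the execution block is not set here, since it has to be filled in by hand from the page linked above.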

Core Training Configuration Workflow

1. Integrate prediction algorithm X into the infrastructure (personally I don’t mind which, but Michele’s suggestion of 1 supervised and 1 unsupervised is good)
2. The training time range to be used will be from 00:00:00 UT on 14-Sep-2012 (start of SHARP NRT data availability) to 23:59:59 UT on 31-Dec-2015
  1. Concern was raised previously about the number of features for training, so a large training time range seems to be preferred
  2. This allows for the testing time range to be from 00:00:00 UT on 1-Jan-2016 to <most recently processed time stamp>, but this will be performed under WP5
3. Run the integrated prediction algorithm with the following training configuration settings:
  1. 24-hr forecast window
  2. 0-hr latency
  3. use only 00:00 UT time stamps (to avoid SDO 24-hr periodic orbital effects)
  4. create individual training configuration files separately using Blos and Br properties for flaring levels of:
    1. C-class only (i.e., >= C1.0 and < M1.0)
      algorithm_cclass_24_0_00_Blos_all
      algorithm_cclass_24_0_00_Br_all
    2. M-class only (i.e., >= M1.0 and < X1.0)
      algorithm_mclass_24_0_00_Blos_all
      algorithm_mclass_24_0_00_Br_all
    3. X-class only (i.e., >= X1.0)
      algorithm_xclass_24_0_00_Blos_all
      algorithm_xclass_24_0_00_Br_all
    4. Above M-class (i.e., >= M1.0)
      algorithm_abovem_24_0_00_Blos_all
      algorithm_abovem_24_0_00_Br_all
    5. Above C-class (i.e., >= C1.0)
      algorithm_abovec_24_0_00_Blos_all
      algorithm_abovec_24_0_00_Br_all
  5. There is no explicit reason why we should use all properties (but it is worth doing for completeness), so another set of 10x configuration files should be prepared for a reduced set of "optimized"/"feature selected" properties
    1. C-class only (i.e., >= C1.0 and < M1.0)
      algorithm_cclass_24_0_00_Blos_opt
      algorithm_cclass_24_0_00_Br_opt
    2. M-class only (i.e., >= M1.0 and < X1.0)
      algorithm_mclass_24_0_00_Blos_opt
      algorithm_mclass_24_0_00_Br_opt
    3. X-class only (i.e., >= X1.0)
      algorithm_xclass_24_0_00_Blos_opt
      algorithm_xclass_24_0_00_Br_opt
    4. Above M-class (i.e., >= M1.0)
      algorithm_abovem_24_0_00_Blos_opt
      algorithm_abovem_24_0_00_Br_opt
    5. Above C-class (i.e., >= C1.0)
      algorithm_abovec_24_0_00_Blos_opt
      algorithm_abovec_24_0_00_Br_opt
  6. NOTE: there may be no point doing X-class only for either the all or the "optimized"/"feature selected" property sets, given the rarity of X-class flares in the training time period
  7. NOTE: we can run configurations with a combination of Blos and Br properties (possible filename tag Bmix); these would need to be run for both the all and the "optimized"/"feature selected" property sets, each with their own flaring level scenarios – therefore creating another 10 configuration files (or 8 if "X-class only" cases are left out)
4. Run all 20 training configuration parameter files (or 16 if "X-class only" cases are left out)
5. Write the variables of all 20 trained prediction models into the Prediction Configuration DB (or 16 if "X-class only" cases are left out)
6. Integrate the next prediction algorithm and repeat steps 2–5
7. The Prediction DB can be filled for each integrated prediction algorithm by launching all 20x (or 16x) trained prediction models for that algorithm on the chosen testing time range
  1. NOTE: the SDO/HMI image alignment bug from 13-Apr-2016 onwards will limit the availability of properties to make predictions from, until the replacement HMI data are available (UPSud is monitoring and downloading when available)
8. Broader WP5 validation can be explored by choosing different durations of forecast window and repeating steps 2–5 and 7 for all integrated prediction algorithms
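The 20 (or 16) configuration names listed in the workflow follow a regular pattern, so they could be generated rather than typed by hand. The sketch below is an illustration only: the function name `config_names` and its parameters are hypothetical, and the pattern `<algorithm>_<level>_<window>_<latency>_<issuing>_<field>_<set>` is inferred from the names on this page.

```python
from itertools import product

# Tags taken from the configuration names listed in the workflow above.
LEVELS = ["cclass", "mclass", "xclass", "abovem", "abovec"]
FIELDS = ["Blos", "Br"]
PROPERTY_SETS = ["all", "opt"]


def config_names(algorithm, window=24, latency=0, issuing="00",
                 skip_xclass=False):
    """Build training configuration names for one algorithm.

    Hypothetical helper; set skip_xclass=True to leave out the
    "X-class only" cases (16 names instead of 20).
    """
    levels = [lv for lv in LEVELS if not (skip_xclass and lv == "xclass")]
    return [
        f"{algorithm}_{level}_{window}_{latency}_{issuing}_{field}_{pset}"
        for pset, level, field in product(PROPERTY_SETS, levels, FIELDS)
    ]


names = config_names("algorithm")
reduced = config_names("algorithm", skip_xclass=True)
print(len(names), len(reduced))
```

Adding a "Bmix" entry to `FIELDS` would cover the combined Blos and Br case mentioned in the notes, giving 10 (or 8) further names.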

Example Configuration JSON

This page summarizes the set of parameters for training the model.

Please update/change whatever is needed.

"flare_history_window": 24 <--- we add this field to set the time interval (in hours) in which we check the occurrence of a flare in the past.

It works if at least one item of the following dictionary is set to true

"flare_history_features": {"flare_past": true, "flare_index_past": true}

"dataset": { "cadence":"24h",

                      "name":"production_02",

            "type": {"point-in-time": true,

                  "longitudinal": false,

                  "time-series": false

},

            "time_interval": {"start_time": "2012-09-01T00:00:00Z",

                              "end_time": "2015-12-31T00:00:00Z"

},

"labels":{"flare_index":false,

              "imminence":false,

              "n_flare":false,

              "flaring":true,

              "flaring_ptime":false,

              "largest_flare":false,

              "duration_flare":false,

              "flaring_etime":false,

              "flaring_stime":false,

              "first_flare_class":false

},


SHAUN: Question to Marco/Dario - Does the following 'flare' structure need to be separate from the 'dataset' structure?
"flare":{"class":1,      <-- flare_class = {'A': 0.01, 'B': 0.1, 'C': 1, 'M': 10, 'X': 100} is this conversion table ok?

         "class_max":1,  <-- new field that defines the flare upper bound

         "window":24,

         "latency":0,   <-- SHAUN: Question to Marco/Dario - Do these need to be present to filter same-format predictions (e.g., for ensemble forecasting)?

         "issuing":"00" <-- SHAUN: Question to Marco/Dario - Do these need to be present to filter same-format predictions (e.g., for ensemble forecasting)?

},
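To make the proposed conversion table concrete, here is a minimal Python sketch of how "class" and "class_max" might be used to select a flaring level. The table is the one quoted above; everything else is an assumption for illustration: the function names are hypothetical, the GOES multiplier is assumed to scale the table value (so M2.5 maps to 25), and a missing upper bound is represented by `None`.

```python
# Conversion table as proposed in the "flare" structure (units of C1.0).
FLARE_CLASS = {"A": 0.01, "B": 0.1, "C": 1, "M": 10, "X": 100}


def goes_to_value(goes_class):
    """Convert a GOES class string such as "M2.5" to a number.

    Assumption: the numeric multiplier scales the table value,
    and a bare letter such as "M" is read as "M1.0".
    """
    letter, multiplier = goes_class[0].upper(), goes_class[1:]
    return FLARE_CLASS[letter] * (float(multiplier) if multiplier else 1.0)


def in_flaring_level(goes_class, class_min, class_max=None):
    """True if class_min <= flare < class_max (no upper bound if None)."""
    value = goes_to_value(goes_class)
    return value >= class_min and (class_max is None or value < class_max)


# "C-class only" (>= C1.0 and < M1.0) under these assumptions:
print(in_flaring_level("C5.0", 1, 10))   # inside the range
print(in_flaring_level("M1.0", 1, 10))   # at the excluded upper bound
# "Above M-class" (>= M1.0), no upper bound:
print(in_flaring_level("X2.0", 10))
```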

Cristina: D. Shaun Bloomfield we added the "latency" variable to the flarecast engine (the default value was set to "0"). In order to add also the "issuing" variable, could you please tell me what quantity is represented by it?

Shaun: Cristina Campi I was thinking that this would be a good way to capture the UT time of the SHARPs being used. In this sense, it corresponds to the current implementation of "cadence":"24h" in the "dataset" structure, but would be more human-interpretable for the description of the training configuration (and appearance in its filename).

Cristina: D. Shaun Bloomfield, OK, I am sorry but I am not sure I understood correctly: does it replace the cadence field we have used so far (i.e., it can take values like "24h", "12h" and so on), or is it a list of UT times to be used (for example "00,03,06,09,12,15,18,21" for a 3-hour cadence)?

Shaun: Cristina Campi, due to the SDO orbital periodic effects I don't think that we can ever combine properties across different UT times. I see it as replacing the "cadence" tag, with "issuing":"00" being implemented in the engine as a request for cadence=24h in the property DB reading.

Cristina: D. Shaun Bloomfield For now I am going to use "issuing" to replace the "cadence" field. Sorry to keep bothering you, but just to be sure I understood for future development/use: for now we set "issuing" to "00" and it corresponds to the "24h" cadence. If we want a "12h" cadence, which value do we need to set? "issuing":"12"? How does it interact with the starting time?

Shaun: Cristina Campi not a problem, these are important thoughts and questions to have! Just a random thought off the top of my head - I would think that a 12-hr cadence forecast would have to be a combination of, e.g., a "window":12, "issuing":"00" trained system and a second "window":12, "issuing":"12" trained system. Right now, if "issuing" is set to "XX" and we leave "window":24 then this means that the engine would need to filter the full property DB for the occurrence of timestamps of XX:00 UT and build a 24-hr forecast training based on just that different SHARP timestamp. For now, we do not need to worry about this because our focus is on 24-hr forecasts with 0-hr latency from 00:00 UT - I just want there to be the algorithmic structure available to parameterize these changes in case we have time to do it.
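The filtering that Shaun describes for "issuing" could look roughly like the following Python sketch. It is an illustration of the idea only, not the engine's implementation: the function name `filter_by_issuing` is hypothetical, and property timestamps are assumed to be available as timezone-aware UTC `datetime` objects.

```python
from datetime import datetime, timezone


def filter_by_issuing(timestamps, issuing="00"):
    """Keep only the timestamps that fall exactly on <issuing>:00:00 UT.

    Hypothetical helper: "issuing" is the two-digit UT hour string used
    in the "flare" structure and in the configuration file names.
    """
    hour = int(issuing)
    return [
        t for t in timestamps
        if t.hour == hour and t.minute == 0 and t.second == 0
    ]


# Example: one day of 3-hourly timestamps.
stamps = [datetime(2015, 1, 1, h, tzinfo=timezone.utc) for h in range(0, 24, 3)]
midnight = filter_by_issuing(stamps, "00")
noon = filter_by_issuing(stamps, "12")
print(midnight, noon)
```

Under this reading, a 12-hr cadence forecast would combine two separately trained systems, one fed by the "00" selection and one by the "12" selection, each with "window":12.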

Please select here all the properties you want to take into account

"properties":{"alpha_exp_cwt_blos": {

                                    "alpha":true, <- - TRUE

                                    "fit_r":false,   <- - FALSE

                                    "sigma":false},

               "alpha_exp_cwt_br":{

                                    "alpha":true,

                                    "fit_r":false,

                                    "sigma":false},

            "alpha_exp_cwt_btot":{

                                    "alpha":true,

                                    "fit_r":false,

                                    "sigma":false},

            "alpha_exp_fft_blos":{

                                    "alpha":true,

                                    "fit_r":false,

                                     "sigma":false},

            "alpha_exp_fft_br":{

                                    "alpha":true,

                                    "fit_r":false,

                                    "sigma":false},

            "alpha_exp_fft_btot":{

                                    "alpha":true,

                                    "fit_r":false,

                                    "sigma":false},

            "beff_blos":{

                                    "beff":true,

                                    "err_sep_length":false,

                                    "err_signed_flux":false,

                                    "sep_length":false,

                                    "signed_flux":false},

            "beff_br":{

                                    "beff":true,

                                    "err_sep_length":false,

                                    "err_signed_flux":false,

                                    "sep_length":false,

                                    "signed_flux":false},

            "decay_index_blos":{

                                    "l_over_min_hmin":true,

                                    "lmax_over_hmin":true,

                                    "max_l_over_hmin":true,

                                    "tot_l_over_hmin":true},

            "decay_index_br":{

                                    "l_over_min_hmin":true,

                                    "lmax_over_hmin":true,

                                    "max_l_over_hmin":true,

                                    "tot_l_over_hmin":true},

            "flare_association": true,

            "ising_energy_blos":{

                                  "ising_energy":true,

                                  "num_neg":false,

                                  "num_pos":false},

            "ising_energy_br":{

                                 "ising_energy":true,

                                 "num_neg":false,

                                 "num_pos":false},

            "ising_energy_part_blos":{

                                 "ising_energy_part":true,

                                 "num_neg":false,

                                 "num_pos":false},

            "ising_energy_part_br":{

                                 "ising_energy_part":true,

                                 "num_neg":false,

                                 "num_pos":false},

            "mpil_blos":{

                        "max_length":true,

                        "tot_length":true,

                        "tot_usflux":true},

            "mpil_br":{

                       "max_length":true,

                       "tot_length":true,

                       "tot_usflux":true},

            "nn_currents":{

                      "err_inet":false,

                      "err_ipn_nn":false,

                      "err_its":false,

                      "err_its_pot":false,

                      "err_tot_neg":false,

                      "err_tot_pos":false,

                      "err_tot_us_cur":false,

                      "flimb":false,

                      "iimb":false,

                      "ipn_nn":false,

                      "its":false,

                      "its_pot":false,

                      "net_curr":false,

                      "num_currents":false,

                      "tot_neg":false,

                      "tot_pos":false,

                      "tot_us_cur":true},

        "r_value_blos_logr":true,

        "r_value_br_logr":true,

        "srs":{

              "area":true,

              "dlong_hg":true,

              "mcint_com":false,

              "mcint_pen":false,

              "mcint_zur":false,

              "mtwil_class":false,

              "n_spots":true},

        "wlsg_blos":{

            "tot_len_pil":false,

            "value_int": true,

            "value_tot":false},

        "wlsg_br":{

            "tot_len_pil":false,

            "value_int": true,

            "value_tot":false},

        "helicity_energy_bvec":{

                                    "pos_dhdt_in": false,

                                    "abs_neg_dhdt_in": false,

                                    "abs_tot_dhdt_in": true,

                                    "tot_uns_dhdt_in": true,

                                    "pos_dhdt_sh": false,

                                    "abs_neg_dhdt_sh": false,

                                    "abs_tot_dhdt_sh": true,

                                    "tot_uns_dhdt_sh": true,

                                    "abs_tot_dhdt": true,

                                    "abs_tot_dhdt_in_plus_sh": false,

                                    "tot_uns_dhdt": true,

                                    "pos_dedt_in": false,

                                    "abs_neg_dedt_in": false,

                                    "abs_tot_dedt_in": true,

                                    "tot_uns_dedt_in": true,

                                    "pos_dedt_sh": false,

                                    "abs_neg_dedt_sh": false,

                                    "abs_tot_dedt_sh": true,

                                    "tot_uns_dedt_sh": true,

                                    "abs_tot_dedt": true,

                                    "tot_uns_dedt": true},

       "flow_field_bvec":{

                          "v_mean": true,

                          "v_median": true,

                          "vz_mean": true,

                          "vz_max": true,

                          "diver": true,

                          "cover": true,

                          "shear": true,

                          "diver_mean": true,

                          "cover_mean": true,

                          "shear_mean": true,

                          "diver_max": true,

                          "cover_max": true,

                          "shear_max": true,

                          "w_diver": true,

                          "w_cover": true,

                          "w_shear": true,

                          "w_diver_mean": true,

                          "w_cover_mean": true,

                          "w_shear_mean": true,

                          "w_diver_max": true,

                          "w_cover_max": true,

                          "w_shear_max": true},

        "gs_slf":{

                  "g_s": true,

                  "slf": true,

                  "d_l_f": false,

                  "weight_cent": false,

                  "lead_cent": false,

                  "foll_cent": false,

                  "fit_coeff": false},

        "frdim_Blos":{

            "frdim": true,

            "frdim_err": false},

        "frdim_Br":{

            "frdim": true,

            "frdim_err": false},

        "frdim_Btot":{

            "frdim": true,

            "frdim_err": false},

        "sfunction_Blos":{

                        "zq": true,

                        "zq_err" : false,

                        "q": false,

                        "sf":false,

                        "rd":false},

        "sfunction_Br":{

                        "zq": true,

                        "zq_err" : false,

                        "q": false,

                        "sf":false,

                        "rd":false},

        "sfunction_Btot":{

                        "zq": true,

                        "zq_err" : false,

                        "q": false,

                        "sf":false,

                        "rd":false},

       "mf_spectrum_Blos":{

                        "dq": true,

                        "dq_err": false,

                        "q": false,

                        "alpha":false,

                        "alpha_err":false,

                        "falpha": false,

                        "falpha_err":false},

                        "mf_spectrum_Br":{

                        "dq": true,

                        "dq_err": false,

                        "q": false,

                        "alpha":false,

                        "alpha_err":false,

                        "falpha": false,

                        "falpha_err":false},

                        "mf_spectrum_Btot":{

                        "dq": true,

                        "dq_err": false,

                        "q": false,

                        "alpha":false,

                        "alpha_err":false,

                        "falpha": false,

                        "falpha_err":false},

      "sharp_kw": {

                                    "gamma": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": false,

                                                "median": false,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "hgradbh": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": false

},

                                    "hgradbt": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": false

},

                                    "hgradbz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": false

},

                                    "hz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": false

},

                                    "jz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "sflux": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "snetjzpp": {

                                                "total": true

},

                                    "twistp": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "usflux": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "ushz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "usiz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true