
Core Training Configuration Workflow

1. Integrate prediction algorithm X into the infrastructure (personally I don't mind which, but Michele's suggestion of one supervised and one unsupervised algorithm is good).
2. The training time range will be from 00:00:00 UT on 14-Sep-2012 (start of SHARP NRT data availability) to 23:59:59 UT on 31-Dec-2015.
  1. Concern was raised previously about the number of features for training, so a large training time range seems to be preferred.
  2. This allows the testing time range to run from 00:00:00 UT on 1-Jan-2016 to <most recently processed time stamp>, but testing will be performed under WP5.
3. Run the integrated prediction algorithm with the following training configuration settings:
  1. Use only 00:00 UT time stamps (to avoid SDO 24-hr periodic orbital effects)
  2. 24-hr forecast window
  3. 0-hr latency
  4. Create separate training configuration files for flaring levels of:
    1. C-class only (i.e., >= C1.0 and < M1.0)
    2. M-class only (i.e., >= M1.0 and < X1.0)
    3. X-class only (i.e., >= X1.0)
    4. Above M-class (i.e., >= M1.0)
    5. Above C-class (i.e., >= C1.0)
4. Run all 5 training configuration parameter files.
5. Write the variables of all 5 trained prediction models into the Prediction Configuration DB.
  1. Personally, I'm not sure whether separate entries or a grouped JSON entry is better.
6. Integrate the next prediction algorithm and repeat steps 2–5.
7. The Prediction DB can be filled for each integrated prediction algorithm by launching all 5 trained prediction models for that algorithm on the chosen testing time range.
  1. NOTE: the SDO/HMI image alignment bug from 13-Apr-2016 onwards will limit the availability of properties to make predictions from until the replacement HMI data are available (UPSud is monitoring and downloading them as they become available).
8. Broader WP5 validation can be explored by choosing different durations of forecast window and repeating steps 2–5 and 7 for all integrated prediction algorithms.
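
The time-stamp settings above (00:00 UT issue times, 24-hr forecast window, 0-hr latency) can be sketched in Python as follows. This is only an illustration, not the project code: the function names are hypothetical, and it assumes that the forecast window for an issue time t covers the half-open interval [t + latency, t + latency + window).

```python
from datetime import datetime, timedelta, timezone

# Training time range quoted in the workflow (UT).
TRAIN_START = datetime(2012, 9, 14, 0, 0, 0, tzinfo=timezone.utc)
TRAIN_END = datetime(2015, 12, 31, 23, 59, 59, tzinfo=timezone.utc)


def issue_times(start, end, issuing_hour=0):
    """Yield one time stamp per day at issuing_hour UT within [start, end]."""
    t = start.replace(hour=issuing_hour, minute=0, second=0, microsecond=0)
    if t < start:
        # The first candidate falls before the range, so move to the next day.
        t += timedelta(days=1)
    while t <= end:
        yield t
        t += timedelta(days=1)


def forecast_window(issue_time, window_hr=24, latency_hr=0):
    """Return (begin, end) of the forecast window (assumed half-open)."""
    begin = issue_time + timedelta(hours=latency_hr)
    return begin, begin + timedelta(hours=window_hr)


training_times = list(issue_times(TRAIN_START, TRAIN_END))
first_window = forecast_window(training_times[0])
print(len(training_times), first_window)
```

With one issue time per day, the training range contains one sample time per calendar day, which is why a long range is needed to collect enough flaring examples.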

File name for trained model:

To save the trained model under a user-chosen name, write that name in the "description" field of the configuration JSON file.

SHAUN: I think that we need to make this as human-understandable as possible, since there will be a lot of similarly trained models for each algorithm (e.g., "algorithm_abovec_24_0_00_Br_all" for a system trained for flares above C1.0 over 24-hr windows at 0-hr latency, issued from 00:00 UT SHARPs, using all available Br properties). This would lead to the following list of configurations for a single algorithm using Br data:

algorithm_cclass_24_0_00_Br_all

algorithm_mclass_24_0_00_Br_all

algorithm_xclass_24_0_00_Br_all

algorithm_abovem_24_0_00_Br_all

algorithm_abovec_24_0_00_Br_all

and the same algorithm using Blos data:

algorithm_cclass_24_0_00_Blos_all

algorithm_mclass_24_0_00_Blos_all

algorithm_xclass_24_0_00_Blos_all

algorithm_abovem_24_0_00_Blos_all

algorithm_abovec_24_0_00_Blos_all
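
The two lists above follow one pattern, so they could be generated rather than typed by hand. The sketch below assumes the field order algorithm, flaring level, window, latency, issuing hour, magnetic field component, property set; the helper name `model_description` is hypothetical.

```python
from itertools import product

# Flaring-level tags in the order used in the lists above.
LEVELS = ["cclass", "mclass", "xclass", "abovem", "abovec"]


def model_description(algorithm, level, window_hr, latency_hr,
                      issuing_hr, field, prop_set):
    """Build a name such as 'algorithm_abovec_24_0_00_Br_all'."""
    return (f"{algorithm}_{level}_{window_hr}_{latency_hr}_"
            f"{issuing_hr:02d}_{field}_{prop_set}")


# Br names first, then Blos, each for all five flaring levels.
names = [
    model_description("algorithm", level, 24, 0, 0, field, "all")
    for field, level in product(["Br", "Blos"], LEVELS)
]

for name in names:
    print(name)
```

A reduced property set would only need a different last argument; the tag to use for it is not fixed on this page.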

There is no explicit reason why we should use all properties, so another set of configuration files should be prepared and run for a reduced "optimized"/"feature selected" property set.

NOTE: there may be no point doing X-class, given the rarity of their occurrence in the training time period.

"algorithm": {
"phase": "training", <--- do not touch
"config_name": "HybridLasso", <--- do not touch
"description": "HybridLasso_test", <---- HERE
"HybridLasso": true, <--- do not touch

In the prediction step, the name that was used in the training phase must be given in the configuration JSON file, this time in the "config_name" field:

"algorithm": {
"phase": "execution", <--- do not touch
"config_name": "HybridLasso_test", <---- HERE
"description": "HybridLasso"
}
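
The relation between the two "algorithm" blocks can be sketched as below. The helper `execution_block` is hypothetical. Copying the training "config_name" into the execution "description" simply mirrors the example above and is an assumption, not a documented requirement.

```python
import json


def execution_block(training_config):
    """Build the execution-phase "algorithm" block from a training config."""
    train = training_config["algorithm"]
    return {
        "algorithm": {
            "phase": "execution",
            # The name saved at training time goes in "config_name".
            "config_name": train["description"],
            # Assumption: reuse the algorithm name as the description.
            "description": train["config_name"],
        }
    }


training = {
    "algorithm": {
        "phase": "training",
        "config_name": "HybridLasso",
        "description": "HybridLasso_test",
        "HybridLasso": True,
    }
}

execution = execution_block(training)
print(json.dumps(execution, indent=2))
```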

Writing of the trained models works locally; let's see whether further fixes are needed once we run them on the cluster.

Example Configuration JSON

This page summarizes the set of parameters used for training the model.

Please update/change whatever is needed.

"dataset": { "cadence":"24h",

                      "name":"production_02",

            "type": {"point-in-time": true,

                  "longitudinal": false,

                  "time-series": false

},

            "time_interval": {"start_time": "2012-09-01T00:00:00Z",

                              "end_time": "2015-12-31T00:00:00Z"

},

"labels":{"flare_index":false,

              "imminence":false,

              "n_flare":false,

              "flaring":true,

              "flaring_ptime":false,

              "largest_flare":false,

              "duration_flare":false,

              "flaring_etime":false,

              "flaring_stime":false,

              "first_flare_class":false

},


SHAUN: Question to Marco/Dario - Does the following 'flare' structure need to be separate from the 'dataset' structure?
"flare":{"class":1,       <-- flare_class = {'A': 0.01, 'B': 0.1, 'C': 1, 'M': 10, 'X': 100} is this conversion table ok?

         "class_max":1,  <-- new field that defines the flare upper bound

         "window":24,

         "latency":0,  <-- SHAUN: Question to Marco/Dario - Do these need to be present to filter same-format predictions (e.g., for ensemble forecasting)?

         "issuing":00 <-- SHAUN: Question to Marco/Dario - Do these need to be present to filter same-format predictions (e.g., for ensemble forecasting)?

},
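
As an illustration of how the five flaring levels could map onto the "class" and "class_max" fields, here is a minimal Python sketch using the conversion table quoted above. Several points are assumptions rather than project decisions: the upper bound is treated as exclusive, `None` stands for "no upper bound" (the page does not say how that case is encoded), "issuing" is written as the integer 0, and all function names are hypothetical.

```python
# Conversion table quoted in the comment above (values in units of C1.0).
FLARE_CLASS = {"A": 0.01, "B": 0.1, "C": 1, "M": 10, "X": 100}

# (lower bound letter, upper bound letter or None) per flaring level.
LEVEL_BOUNDS = {
    "cclass": ("C", "M"),   # >= C1.0 and < M1.0
    "mclass": ("M", "X"),   # >= M1.0 and < X1.0
    "xclass": ("X", None),  # >= X1.0
    "abovem": ("M", None),  # >= M1.0
    "abovec": ("C", None),  # >= C1.0
}


def flare_block(level, window=24, latency=0, issuing=0):
    """Return a "flare" settings dict for one flaring level."""
    lower, upper = LEVEL_BOUNDS[level]
    return {
        "class": FLARE_CLASS[lower],
        "class_max": None if upper is None else FLARE_CLASS[upper],
        "window": window,
        "latency": latency,
        "issuing": issuing,
    }


def goes_to_value(goes_class):
    """Convert a GOES class string such as 'M2.5' to table units."""
    return FLARE_CLASS[goes_class[0].upper()] * float(goes_class[1:])


def in_level(goes_class, block):
    """True if the flare lies in [class, class_max) of the given block."""
    value = goes_to_value(goes_class)
    upper = block["class_max"]
    return value >= block["class"] and (upper is None or value < upper)


blocks = {level: flare_block(level) for level in LEVEL_BOUNDS}
print(blocks["cclass"], in_level("M1.0", blocks["cclass"]))
```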

Please select here all the properties you want to take into account

"properties":{"alpha_exp_cwt_blos": {

                                    "alpha":true,    <-- TRUE

                                    "fit_r":false,   <-- FALSE

                                    "sigma":false},

               "alpha_exp_cwt_br":{

                                    "alpha":true,

                                    "fit_r":false,

                                    "sigma":false},

            "alpha_exp_cwt_btot":{

                                    "alpha":true,

                                    "fit_r":false,

                                    "sigma":false},

            "alpha_exp_fft_blos":{

                                    "alpha":true,

                                    "fit_r":false,

                                     "sigma":false},

            "alpha_exp_fft_br":{

                                    "alpha":true,

                                    "fit_r":false,

                                    "sigma":false},

            "alpha_exp_fft_btot":{

                                    "alpha":true,

                                    "fit_r":false,

                                    "sigma":false},

            "beff_blos":{

                                    "beff":true,

                                    "err_sep_length":false,

                                    "err_signed_flux":false,

                                    "sep_length":false,

                                    "signed_flux":false},

            "beff_br":{

                                    "beff":true,

                                    "err_sep_length":false,

                                    "err_signed_flux":false,

                                    "sep_length":false,

                                    "signed_flux":false},

            "decay_index_blos":{

                                    "l_over_min_hmin":true,

                                    "lmax_over_hmin":true,

                                    "max_l_over_hmin":true,

                                    "tot_l_over_hmin":true},

            "decay_index_br":{

                                    "l_over_min_hmin":true,

                                    "lmax_over_hmin":true,

                                    "max_l_over_hmin":true,

                                    "tot_l_over_hmin":true},

            "flare_association": true,

            "ising_energy_blos":{

                                  "ising_energy":true,

                                  "num_neg":false,

                                  "num_pos":false},

            "ising_energy_br":{

                                 "ising_energy":true,

                                 "num_neg":false,

                                 "num_pos":false},

            "ising_energy_part_blos":{

                                 "ising_energy_part":true,

                                 "num_neg":false,

                                 "num_pos":false},

            "ising_energy_part_br":{

                                 "ising_energy_part":true,

                                 "num_neg":false,

                                 "num_pos":false},

            "mpil_blos":{

                        "max_length":true,

                        "tot_length":true,

                        "tot_usflux":true},

            "mpil_br":{

                       "max_length":true,

                       "tot_length":true,

                       "tot_usflux":true},

            "nn_currents":{

                      "err_inet":false,

                      "err_ipn_nn":false,

                      "err_its":false,

                      "err_its_pot":false,

                      "err_tot_neg":false,

                      "err_tot_pos":false,

                      "err_tot_us_cur":false,

                      "flimb":false,

                      "iimb":false,

                      "ipn_nn":false,

                      "its":false,

                      "its_pot":false,

                      "net_curr":false,

                      "num_currents":false,

                      "tot_neg":false,

                      "tot_pos":false,

                      "tot_us_cur":true},

        "r_value_blos_logr":true,

        "r_value_br_logr":true,

        "srs":{

              "area":true,

              "dlong_hg":true,

              "mcint_com":false,

              "mcint_pen":false,

              "mcint_zur":false,

              "mtwil_class":false,

              "n_spots":true},

        "wlsg_blos":{

            "tot_len_pil":false,

            "value_int": true,

            "value_tot":false},

        "wlsg_br":{

            "tot_len_pil":false,

            "value_int": true,

            "value_tot":false},

        "helicity_energy_bvec":{

                                    "pos_dhdt_in": false,

                                    "abs_neg_dhdt_in": false,

                                    "abs_tot_dhdt_in": true,

                                    "tot_uns_dhdt_in": true,

                                    "pos_dhdt_sh": false,

                                    "abs_neg_dhdt_sh": false,

                                    "abs_tot_dhdt_sh": true,

                                    "tot_uns_dhdt_sh": true,

                                    "abs_tot_dhdt": true,

                                    "abs_tot_dhdt_in_plus_sh": false,

                                    "tot_uns_dhdt": true,

                                    "pos_dedt_in": false,

                                    "abs_neg_dedt_in": false,

                                    "abs_tot_dedt_in": true,

                                    "tot_uns_dedt_in": true,

                                    "pos_dedt_sh": false,

                                    "abs_neg_dedt_sh": false,

                                    "abs_tot_dedt_sh": true,

                                    "tot_uns_dedt_sh": true,

                                    "abs_tot_dedt": true,

                                    "tot_uns_dedt": true},

       "flow_field_bvec":{

                          "v_mean": true,

                          "v_median": true,

                          "vz_mean": true,

                          "vz_max": true,

                          "diver": true,

                          "cover": true,

                          "shear": true,

                          "diver_mean": true,

                          "cover_mean": true,

                          "shear_mean": true,

                          "diver_max": true,

                          "cover_max": true,

                          "shear_max": true,

                          "w_diver": true,

                          "w_cover": true,

                          "w_shear": true,

                          "w_diver_mean": true,

                          "w_cover_mean": true,

                          "w_shear_mean": true,

                          "w_diver_max": true,

                          "w_cover_max": true,

                          "w_shear_max": true},

        "gs_slf":{

                  "g_s": true,

                  "slf": true,

                  "d_l_f": false,

                  "weight_cent": false,

                  "lead_cent": false,

                  "foll_cent": false,

                  "fit_coeff": false},

        "frdim_Blos":{

            "frdim": true,

            "frdim_err": false},

        "frdim_Br":{

            "frdim": true,

            "frdim_err": false},

        "frdim_Btot":{

            "frdim": true,

            "frdim_err": false},

        "sfunction_Blos":{

                        "zq": true,

                        "zq_err" : false,

                        "q": false,

                        "sf":false,

                        "rd":false},

        "sfunction_Br":{

                        "zq": true,

                        "zq_err" : false,

                        "q": false,

                        "sf":false,

                        "rd":false},

        "sfunction_Btot":{

                        "zq": true,

                        "zq_err" : false,

                        "q": false,

                        "sf":false,

                        "rd":false},

       "mf_spectrum_Blos":{

                        "dq": true,

                        "dq_err": false,

                        "q": false,

                        "alpha":false,

                        "alpha_err":false,

                        "falpha": false,

                        "falpha_err":false},

       "mf_spectrum_Br":{

                        "dq": true,

                        "dq_err": false,

                        "q": false,

                        "alpha":false,

                        "alpha_err":false,

                        "falpha": false,

                        "falpha_err":false},

       "mf_spectrum_Btot":{

                        "dq": true,

                        "dq_err": false,

                        "q": false,

                        "alpha":false,

                        "alpha_err":false,

                        "falpha": false,

                        "falpha_err":false},

      "sharp_kw": {

                                    "gamma": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": false,

                                                "median": false,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "hgradbh": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": false

},

                                    "hgradbt": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": false

},

                                    "hgradbz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": false

},

                                    "hz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": false

},

                                    "jz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "sflux": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "snetjzpp": {

                                                "total": true

},

                                    "twistp": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "usflux": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "ushz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true

},

                                    "usiz": {

                                                "ave": true,

                                                "kurtosis": false,

                                                "max": true,

                                                "median": true,

                                                "skewness": false,

                                                "stddev": false,

                                                "total": true