Inputs and Outputs of FLARECAST Prediction Algorithms | | |
| | |
Linear Regression | | |
Phase | Input | Output |
Training | (A) predictors matrix, X | (C) coefficients vector |
| (B) response matrix, y | (D) fitted values |
| | (E) residuals, optional |
| | (F) standard errors of coefficients vector, SE |
| | (G) t-statistics of coefficient vector, t-stat |
| | (H) p-value of coefficient vector, p-value |
Prediction/Testing | (A) object, an object of a class inheriting from "lm", as returned by the linear regression training command (R specific; to be converted to JSON) | (C) predict.lm, a vector of predicted probabilities, one per row of the newdata data frame |
| (B) newdata, a data frame in which to look for the variables used for prediction (R specific; to be converted to JSON) | |
| | |
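The training outputs (C) to (G) of the linear regression can be sketched in plain Python as below. This is only an illustration under stated assumptions, not the FLARECAST implementation: the function names `invert` and `fit_linear` and the toy data are hypothetical, X is assumed to already contain a column of ones for the intercept, and the p-values (H) are left out because the Student t distribution is not available in the Python standard library.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_linear(X, y):
    """Ordinary least squares; X is n by K (intercept column included), y has length n."""
    n, k = len(X), len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    xtx_inv = invert(xtx)
    coef = [sum(xtx_inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in X]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    sigma2 = sum(e * e for e in resid) / (n - k)   # residual variance
    se = [math.sqrt(sigma2 * xtx_inv[i][i]) for i in range(k)]
    t_stat = [c / s for c, s in zip(coef, se)]     # undefined for a perfect fit (se = 0)
    return {
        "coefficients": coef,        # (C)
        "fitted_values": fitted,     # (D)
        "residuals": resid,          # (E)
        "standard_errors": se,       # (F)
        "t_statistics": t_stat,      # (G)
    }

# Hypothetical toy data: intercept column plus one predictor, 0/1 response.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0.0, 0.0, 1.0, 1.0]
result = fit_linear(X, y)
```

The returned dictionary holds only lists of numbers, so it can be passed to `json.dumps` directly.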
Probit Regression | | |
Phase | Input | Output |
Training | (A) predictors matrix, X | (D) coefficients vector |
| (B) response matrix, y | (E) fitted values |
| (C) family, binomial(probit) | (F) residuals, optional |
| | (G) deviance, -2*logLik (a positive number), optional |
| | (H) aic, Akaike's Information Criterion value, optional |
| | (I) standard errors of coefficients vector, SE |
| | (J) z-statistics of coefficient vector, z-stat |
| | (K) p-value of coefficient vector, p-value |
Prediction/Testing | (A) object, an object of a class inheriting from "glm", as returned by the probit regression training command (R specific; to be converted to python) | (C) predict.probit, a vector of predicted probabilities, one per row of the newdata data frame |
| (B) newdata, a data frame in which to look for the variables used for prediction (R specific; to be converted to python) | |
| | |
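Once the coefficients vector is known, the probit prediction step reduces to applying the standard normal CDF to the linear predictor. A minimal sketch, assuming the coefficients have been exported from the R object as a plain list and that each row of newdata starts with a 1 for the intercept; `predict_probit`, `beta` and the numbers are hypothetical. In R the same quantity is returned by `predict` on a glm object with `type = "response"`.

```python
from statistics import NormalDist

def predict_probit(coefficients, newdata):
    """Return P(y = 1) for each row of newdata as Phi(x . beta)."""
    phi = NormalDist().cdf  # standard normal CDF
    return [phi(sum(b * x for b, x in zip(coefficients, row)))
            for row in newdata]

# Hypothetical coefficients: intercept and one slope.
beta = [-1.0, 0.5]
newdata = [[1.0, 2.0], [1.0, 4.0]]  # first column is the intercept term
probs = predict_probit(beta, newdata)
```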
Logit Regression | | |
Phase | Input | Output |
Training | (A) predictors matrix, X | (D) coefficients vector |
| (B) response matrix, y | (E) fitted values |
| (C) family, binomial(logit) | (F) residuals, optional |
| | (G) deviance, -2*logLik (a positive number), optional |
| | (H) aic, Akaike's Information Criterion value, optional |
| | (I) standard errors of coefficients vector, SE |
| | (J) z-statistics of coefficient vector, z-stat |
| | (K) p-value of coefficient vector, p-value |
Prediction/Testing | (A) object, an object of a class inheriting from "glm", as returned by the logit regression training command (R specific; to be converted to python) | (C) predict.logit, a vector of predicted probabilities, one per row of the newdata data frame |
| (B) newdata, a data frame in which to look for the variables used for prediction (R specific; to be converted to python) | |
| | |
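The logit prediction output and the optional deviance and aic outputs can be sketched as follows. This is an illustration under assumptions, not the FLARECAST code: the function names and numbers are hypothetical, the response is assumed to contain only 0s and 1s (in which case the deviance equals -2*logLik), and each row of newdata starts with a 1 for the intercept.

```python
import math

def sigmoid(eta):
    """Numerically stable logistic function 1 / (1 + exp(-eta))."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

def predict_logit(coefficients, newdata):
    """Return P(y = 1) for each row of newdata."""
    return [sigmoid(sum(b * x for b, x in zip(coefficients, row)))
            for row in newdata]

def binomial_deviance(y, p):
    """-2 * log-likelihood for a 0/1 response y and probabilities p."""
    return -2.0 * sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
                      for yi, pi in zip(y, p))

def aic(deviance, n_coefficients):
    """Akaike's Information Criterion: -2*logLik + 2 * number of coefficients."""
    return deviance + 2 * n_coefficients

# Hypothetical coefficients and data.
beta = [0.0, 1.0]
newdata = [[1.0, 0.0], [1.0, 2.0]]
y = [0, 1]
probs = predict_logit(beta, newdata)
dev = binomial_deviance(y, probs)
criterion = aic(dev, len(beta))
```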
Multilayer Perceptron | | |
Phase | Input | Output |
Training | Parameters | Parameters |
| n, integer, no. of observations | p.nnet, R model object, type "nnet" |
| K, integer, no. of features | p.nnet$wts, double vector, weights of the connections in the mlp |
| X, double precision matrix, n by K | p.nnet$fitted.values, double vector, MLP-fitted probabilities for every element of the training set |
| y, double precision vector, n by 1 (actually containing 0s and 1s) | |
| HyperParameters | |
| size, integer, no. of hidden neurons, default 4 | |
| decay, double, weight decay parameter, default 0.1 | |
| entropy, logical, flag for log-likelihood (or entropy) training, default TRUE | |
| maxit, integer, maximum no. of iterations in the optimization, default 2000 | |
| MaxNWts, integer, maximum no. of weights in the neural network, default 2000 | |
Prediction/Testing | Parameters | Parameters |
| n2, integer, no. of observations in the testing set | p.nnet.predict, double vector, n2 by 1, MLP-predicted probabilities for every element of the testing set |
| K, integer, no. of features, same as in training step | |
| p.nnet, R model object, type "nnet" | |
| newdata, double matrix n2 by K, matrix of predictor values for the testing set, structured like the X matrix in the training step | |
| | |
| | |
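The MLP training inputs listed above can be bundled and checked before they are handed to the training step. The sketch below is an assumption about how such a request might look, not the FLARECAST interface: `MlpHyperParameters`, `count_weights`, `training_request` and the JSON layout are hypothetical. The weight count assumes one hidden layer, one output unit and no skip-layer connections, which gives (K + 1) * size + (size + 1) weights.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class MlpHyperParameters:
    # Defaults taken from the table above.
    size: int = 4
    decay: float = 0.1
    entropy: bool = True
    maxit: int = 2000
    MaxNWts: int = 2000

def count_weights(K, size, n_outputs=1):
    """Weights (including biases) of a K-size-n_outputs network without skip connections."""
    return (K + 1) * size + (size + 1) * n_outputs

def training_request(X, y, hp):
    """Validate the training inputs and serialise them to JSON."""
    n, K = len(X), len(X[0])
    if any(len(row) != K for row in X):
        raise ValueError("X must be a rectangular n by K matrix")
    if len(y) != n:
        raise ValueError("y must have one entry per row of X")
    if any(v not in (0, 1) for v in y):
        raise ValueError("y must contain only 0s and 1s")
    if count_weights(K, hp.size) > hp.MaxNWts:
        raise ValueError("network exceeds MaxNWts")
    return json.dumps({"n": n, "K": K, "X": X, "y": y,
                       "hyperparameters": asdict(hp)})

# Hypothetical data with K = 8 features.
X = [[0.1] * 8, [0.2] * 8]
y = [0, 1]
request = training_request(X, y, MlpHyperParameters())
```

With K = 8 and the default size of 4, the network has 41 weights, far below the default MaxNWts of 2000.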
Support Vector Machines | | |
Phase | Input | Output |
Training | Parameters | Parameters |
| n, integer, no. of observations | p.svm, R model object, type "svm" |
| K, integer, no. of features | p.svm$fitted, double vector, SVM-fitted probabilities for every element of the training set |
| X, double precision matrix, n by K | |
| y, double precision vector, n by 1 (actually containing 0s and 1s) | |
| HyperParameters | |
| gamma, double, parameter of the RBF (or Gaussian) kernel of the eps-regression SVM used, default 0.5 | |
| cost, double, parameter of the penalty term in the Lagrangian formulation of the eps-regression SVM, default 8.0 | |
| probability, logical, flag to return probabilities in sample, default TRUE | |
Prediction/Testing | Parameters | Parameters |
| n2, integer, no. of observations in the testing set | p.svm.predict, double vector, n2 by 1, SVM-predicted probabilities for every element of the testing set |
| K, integer, no. of features, same as in training step | |
| p.svm, R model object, type "svm" | |
| newdata, double matrix n2 by K, matrix of predictor values for the testing set, structured like the X matrix in the training step | |
| | |
| | |
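The role of the gamma hyperparameter is easiest to see in the kernel itself. A minimal sketch of the RBF (Gaussian) kernel, exp(-gamma * ||u - v||^2), with the default gamma from the table; the function name `rbf_kernel` and the sample points are hypothetical, and the sketch covers only the kernel, not the SVM optimisation.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """RBF kernel between two feature vectors of equal length."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

# Hypothetical feature vectors.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical points
near = rbf_kernel([0.0, 0.0], [1.0, 1.0])   # squared distance 2
far = rbf_kernel([0.0, 0.0], [3.0, 4.0])    # squared distance 25
```

Larger gamma makes the kernel decay faster with distance, so each training point influences a smaller neighbourhood.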
Random Forests | | |
Phase | Input | Output |
Training | Parameters | Parameters |
| n, integer, no. of observations | p.randomForest, R model object, type "randomForest" |
| K, integer, no. of features | p.randomForest$predicted, double vector, RF-fitted probabilities for every element of the *training set* |
| X, double precision matrix, n by K | |
| y, double precision vector, n by 1 (actually containing 0s and 1s) | |
| HyperParameters | |
| importance, logical, flag to compute importance of every predictor, default TRUE | |
| ntree, integer, no. of trees to grow, default 500 | |
| mtry, integer, number of variables randomly sampled as candidates at each split, default max(floor(K/3), 1); when in doubt use 2 (typically K = 8) | |
| na.action, string, how to handle missing values (NaNs), default "na.omit" | |
Prediction/Testing | Parameters | Parameters |
| n2, integer, no. of observations in the testing set | p.randomForest.predict, double vector, n2 by 1, RF-predicted probabilities for every element of the testing set |
| K, integer, no. of features, same as in training step | |
| p.randomForest, R model object, type "randomForest" | |
| newdata, double matrix n2 by K, matrix of predictor values for the testing set, structured like the X matrix in the training step | |
| | |
| | |
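Two of the random forest hyperparameters can be illustrated directly: the default value of mtry and the effect of the "na.omit" action. The sketch below is an assumption about how a Python port might reproduce them; `default_mtry` and `omit_missing` are hypothetical names, and missing values are assumed to be encoded as NaN.

```python
import math

def default_mtry(K):
    """Default number of candidate variables per split: max(floor(K/3), 1)."""
    return max(K // 3, 1)

def omit_missing(X, y):
    """Drop every observation whose predictors or response contain NaN."""
    kept = [(row, label) for row, label in zip(X, y)
            if not math.isnan(label) and not any(math.isnan(v) for v in row)]
    return [row for row, _ in kept], [label for _, label in kept]

# Hypothetical data: the second observation has a missing predictor.
X = [[1.0, 2.0], [float("nan"), 3.0], [4.0, 5.0]]
y = [0.0, 1.0, 1.0]
X_clean, y_clean = omit_missing(X, y)
mtry = default_mtry(8)
```

For the typical K = 8 the default evaluates to 2, which matches the fallback value given in the table.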