Inputs and Outputs of FLARECAST Prediction Algorithms
Linear Regression
Phase: Training
Input:
(A) predictors matrix, X
(B) response matrix, y
Output:
(C) coefficients vector
(D) fitted values
(E) residuals, optional
(F) standard errors of coefficients vector, SE
(G) t-statistics of coefficients vector, t-stat
(H) p-values of coefficients vector, p-value
Phase: Prediction/Testing
Input:
(A) object, an object of class inheriting from the linear regression training command "lm" (R specific - to convert to JSON)
(B) newdata, a data frame in which to look for variables with which to predict (R specific - to convert to JSON)
Output:
(C) predict.lm, a vector of predicted probabilities for the newdata data frame
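Since both prediction inputs are R specific and marked for conversion to JSON, the idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the JSON field names ("coefficients", "(Intercept)"), the feature names and all numbers are hypothetical and are not taken from the FLARECAST specification.

```python
import json

# Hypothetical JSON form of a fitted linear model: only the coefficients
# are needed for prediction. Feature names and values are made up.
model_json = json.dumps({
    "coefficients": {"(Intercept)": 0.1, "feature_1": 0.5, "feature_2": -0.25}
})

# Hypothetical JSON form of newdata: one record per observation.
newdata_json = json.dumps([
    {"feature_1": 1.0, "feature_2": 2.0},
    {"feature_1": 0.0, "feature_2": 0.0},
])


def predict_linear(model_text, newdata_text):
    """Return intercept + sum(coefficient * value) for every row of newdata."""
    coefs = json.loads(model_text)["coefficients"]
    intercept = coefs.get("(Intercept)", 0.0)
    rows = json.loads(newdata_text)
    return [
        intercept + sum(c * row[name] for name, c in coefs.items()
                        if name != "(Intercept)")
        for row in rows
    ]


predictions = predict_linear(model_json, newdata_json)
```

A linear model does not bound its output, so values produced this way can fall outside [0, 1] unless they are clipped afterwards.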
Probit Regression
Phase: Training
Input:
(A) predictors matrix, X
(B) response matrix, y
(C) family, binomial(probit)
Output:
(D) coefficients vector
(E) fitted values
(F) residuals, optional
(G) deviance, -2*logLik (a positive number), optional
(H) aic, Akaike's Information Criterion value, optional
(I) standard errors of coefficients vector, SE
(J) z-statistics of coefficients vector, z-stat
(K) p-values of coefficients vector, p-value
Phase: Prediction/Testing
Input:
(A) object, an object of class inheriting from the probit regression training command "glm" (R specific - to convert to python)
(B) newdata, a data frame in which to look for variables with which to predict (R specific - to convert to python)
Output:
(C) predict.probit, a vector of predicted probabilities for the newdata data frame
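As a rough Python counterpart of the prediction step, the sketch below applies the probit link (the standard normal CDF) to the linear predictor. The function name predict_probit, the coefficient values and the observations are all hypothetical; the coefficients are assumed to be stored with the intercept first.

```python
from statistics import NormalDist


def predict_probit(coefficients, newdata):
    """Probit link: P(y = 1 | x) = Phi(b0 + b1*x1 + ... + bK*xK)."""
    intercept, *slopes = coefficients
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        std_normal.cdf(intercept + sum(b * x for b, x in zip(slopes, row)))
        for row in newdata
    ]


# Made-up coefficients (intercept first) and two made-up observations.
coefficients = [0.0, 1.0, -1.0]
newdata = [[0.5, 0.5], [2.0, 0.0]]
probabilities = predict_probit(coefficients, newdata)
```

The first observation has a linear predictor of zero, so its predicted probability is exactly 0.5.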
Logit Regression
Phase: Training
Input:
(A) predictors matrix, X
(B) response matrix, y
(C) family, binomial(logit)
Output:
(D) coefficients vector
(E) fitted values
(F) residuals, optional
(G) deviance, -2*logLik (a positive number), optional
(H) aic, Akaike's Information Criterion value, optional
(I) standard errors of coefficients vector, SE
(J) z-statistics of coefficients vector, z-stat
(K) p-values of coefficients vector, p-value
Phase: Prediction/Testing
Input:
(A) object, an object of class inheriting from the logit regression training command "glm" (R specific - to convert to python)
(B) newdata, a data frame in which to look for variables with which to predict (R specific - to convert to python)
Output:
(C) predict.logit, a vector of predicted probabilities for the newdata data frame
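The logit outputs can be illustrated with a small Python sketch: the logistic function gives the predicted probabilities, and for 0/1 responses the deviance (-2*logLik) and AIC follow from the fitted probabilities. The function names and the toy data are hypothetical, and the coefficients are assumed to be stored with the intercept first.

```python
import math


def predict_logit(coefficients, newdata):
    """Logit link: P(y = 1 | x) = 1 / (1 + exp(-(b0 + b1*x1 + ...)))."""
    intercept, *slopes = coefficients
    probabilities = []
    for row in newdata:
        eta = intercept + sum(b * x for b, x in zip(slopes, row))
        probabilities.append(1.0 / (1.0 + math.exp(-eta)))
    return probabilities


def deviance_and_aic(y, fitted, n_coefficients):
    """For 0/1 responses: deviance = -2*logLik, AIC = deviance + 2*k."""
    log_lik = sum(
        math.log(p) if label == 1 else math.log(1.0 - p)
        for label, p in zip(y, fitted)
    )
    deviance = -2.0 * log_lik
    return deviance, deviance + 2.0 * n_coefficients


# Made-up example: two observations, intercept plus one slope.
coefficients = [0.0, 1.0]
X = [[0.0], [0.0]]
y = [1, 0]
fitted = predict_logit(coefficients, X)
deviance, aic = deviance_and_aic(y, fitted, len(coefficients))
```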
Multilayer Perceptron
Phase: Training
Input parameters:
n, integer, no. of observations
K, integer, no. of features
X, double precision matrix, n by K
y, double precision vector, n by 1 (actually containing 0s and 1s)
Input hyperparameters:
size, integer, no. of hidden neurons, default 4
decay, double, weight decay parameter, default 0.1
entropy, logical, flag for log-likelihood (or entropy) training, default TRUE
maxit, integer, maximum no. of iterations in the optimization, default 2000
MaxNWts, integer, maximum no. of weights in the neural network, default 2000
Output parameters:
p.nnet, R model object, type "nnet"
p.nnet$wts, double vector, weights of the connections in the MLP
p.nnet$fitted.values, double vector, fitted probabilities for every element of the training set with the MLP
Phase: Prediction/Testing
Input parameters:
n2, integer, no. of observations in the testing set
K, integer, no. of features, same as in training step
p.nnet, R model object, type "nnet"
newdata, double matrix, n2 by K, matrix of predictor values for the testing set, structured in the same way as the X matrix in the training step
Output parameters:
p.nnet.predict, double vector, n2 by 1, vector of probabilities for every element of the testing set with the MLP
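To show how size and MaxNWts relate, the following Python sketch counts the weights of a single-hidden-layer network and checks the count against the limit. It assumes one output unit and no skip-layer connections, and it uses K = 8 purely as an example value; the function name count_mlp_weights is hypothetical.

```python
def count_mlp_weights(n_features, size, n_outputs=1):
    """Weights of a one-hidden-layer MLP, bias terms included."""
    hidden_layer = (n_features + 1) * size   # inputs + bias -> hidden units
    output_layer = (size + 1) * n_outputs    # hidden units + bias -> outputs
    return hidden_layer + output_layer


# Default hyperparameters as listed in the table above.
hyperparameters = {
    "size": 4,
    "decay": 0.1,
    "entropy": True,
    "maxit": 2000,
    "MaxNWts": 2000,
}

K = 8  # example number of features (assumption)
n_weights = count_mlp_weights(K, hyperparameters["size"])
within_limit = n_weights <= hyperparameters["MaxNWts"]
```

With these values the network has 41 weights, well below the default MaxNWts of 2000.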
Support Vector Machines
Phase: Training
Input parameters:
n, integer, no. of observations
K, integer, no. of features
X, double precision matrix, n by K
y, double precision vector, n by 1 (actually containing 0s and 1s)
Input hyperparameters:
gamma, double, parameter for the RBF (or Gaussian) kernel of the eps-regression SVM used, default 0.5
cost, double, parameter in the penalty term in the Lagrangian formulation of the eps-regression SVM, default 8.0
probability, logical, flag to return probabilities in sample, default TRUE
Output parameters:
p.svm, R model object, type "svm"
p.svm$fitted, double vector, fitted probabilities for every element of the training set with the SVM
Phase: Prediction/Testing
Input parameters:
n2, integer, no. of observations in the testing set
K, integer, no. of features, same as in training step
p.svm, R model object, type "svm"
newdata, double matrix, n2 by K, matrix of predictor values for the testing set, structured in the same way as the X matrix in the training step
Output parameters:
p.svm.predict, double vector, n2 by 1, vector of probabilities for every element of the testing set with the SVM
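The role of gamma can be illustrated with the RBF kernel itself, k(u, v) = exp(-gamma * ||u - v||^2). The Python sketch below evaluates it with the default gamma of 0.5; the function name rbf_kernel and the sample vectors are hypothetical, and the sketch covers only the kernel, not the SVM training.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


# Identical vectors give the maximum similarity of 1.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])

# Squared distance 2 with gamma 0.5 gives exp(-1).
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0])
```

A larger gamma makes the kernel value fall off faster with distance, so each training point influences a smaller neighbourhood.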
Random Forests
Phase: Training
Input parameters:
n, integer, no. of observations
K, integer, no. of features
X, double precision matrix, n by K
y, double precision vector, n by 1 (actually containing 0s and 1s)
Input hyperparameters:
importance, logical, flag to compute the importance of every predictor, default TRUE
ntree, integer, no. of trees to grow, default 500
mtry, integer, number of variables randomly sampled as candidates at each split, default max(floor(K/3), 1); when in doubt use 2 (typically K=8)
na.action, string, what to do with NaNs, default "na.omit"
Output parameters:
p.randomForest, R model object, type "randomForest"
p.randomForest$predicted, double vector, fitted probabilities for every element of the *training set* with the RF
Phase: Prediction/Testing
Input parameters:
n2, integer, no. of observations in the testing set
K, integer, no. of features, same as in training step
p.randomForest, R model object, type "randomForest"
newdata, double matrix, n2 by K, matrix of predictor values for the testing set, structured in the same way as the X matrix in the training step
Output parameters:
p.randomForest.predict, double vector, n2 by 1, vector of probabilities for every element of the testing set with the RF
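The mtry default and the shape requirements on X and y can be sketched in Python as below. Both helper names (default_mtry, check_training_inputs) are hypothetical, and the checks are only one possible reading of the table: X as a list of n rows with K values each, y as a list of n labels equal to 0 or 1.

```python
def default_mtry(n_features):
    """Default from the table: max(floor(K/3), 1)."""
    return max(n_features // 3, 1)


def check_training_inputs(X, y):
    """Return (n, K) if X is n by K and y holds n labels in {0, 1}."""
    n = len(X)
    if n == 0 or len(y) != n:
        raise ValueError("X and y must have the same, non-zero number of rows")
    K = len(X[0])
    if any(len(row) != K for row in X):
        raise ValueError("every row of X must have K values")
    if any(label not in (0, 1) for label in y):
        raise ValueError("y must contain only 0s and 1s")
    return n, K


# Made-up training set with n = 3 observations and K = 8 features.
X = [[0.0] * 8, [1.0] * 8, [2.0] * 8]
y = [0, 1, 1]
n, K = check_training_inputs(X, y)
mtry = default_mtry(K)
```

For K = 8 the default evaluates to 2, which matches the fallback value suggested in the table.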