prediction_input_output.xlsx

Input and Outputs of FLARECAST Prediction Algorithms

Linear Regression
Phase	Input	Output
Training	(A) predictors matrix, X	(C) coefficients vector
	(B) response matrix, y	(D) fitted values
		(E) residuals, optional
		(F) standard errors of coefficients vector, SE
		(G) t-statistics of coefficient vector, t-stat
		(H) p-value of coefficient vector, p-value
Prediction/Testing	(A) object, Object of class inheriting from linear regression training command "lm" (R specific - to convert to JSON)	(C) predict.lm, a vector of predicted probabilities for the newdata dataframe
	(B) newdata, A data frame in which to look for variables with which to predict. (R specific - to convert to JSON)
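
The training outputs (C) to (G) and the prediction output can be illustrated with a minimal ordinary least squares sketch in plain Python. It assumes an intercept is prepended to X, as R's "lm" does by default. The function names fit_linear and predict_linear are hypothetical, and p-values (H) are omitted because they need a Student t distribution that the standard library does not provide.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_linear(X, y):
    """Least squares fit; returns the table's training outputs as a dict."""
    Z = [[1.0] + list(row) for row in X]  # assumed intercept column
    n, p = len(Z), len(Z[0])
    ZtZ = [[sum(Z[i][j] * Z[i][k] for i in range(n)) for k in range(p)]
           for j in range(p)]
    Zty = [sum(Z[i][j] * y[i] for i in range(n)) for j in range(p)]
    coef = solve(ZtZ, Zty)
    fitted = [sum(c * z for c, z in zip(coef, row)) for row in Z]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    sigma2 = sum(r * r for r in resid) / (n - p)  # residual variance
    se = []
    for j in range(p):
        unit = [1.0 if k == j else 0.0 for k in range(p)]
        # j-th diagonal element of (Z'Z)^-1, scaled by the residual variance
        se.append(math.sqrt(sigma2 * solve(ZtZ, unit)[j]))
    t_stat = [c / s for c, s in zip(coef, se)]
    return {"coefficients": coef, "fitted": fitted, "residuals": resid,
            "se": se, "t_stat": t_stat}

def predict_linear(coef, newdata):
    """Apply the coefficient vector (intercept first) to new rows."""
    return [coef[0] + sum(c * x for c, x in zip(coef[1:], row))
            for row in newdata]
```

The returned dict holds only lists of numbers, so it could be serialized with json.dumps as one possible JSON form of the model object.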

Probit Regression
Phase	Input	Output
Training	(A) predictors matrix, X	(D) coefficients vector
	(B) response matrix, y	(E) fitted values
	(C) family, binomial(probit)	(F) residuals, optional
		(G) deviance, -2*logLik (a positive number), optional
		(H) aic, Akaike's Information Criterion value, optional
		(I) standard errors of coefficients vector, SE
		(J) z-statistics of coefficient vector, z-stat
		(K) p-value of coefficient vector, p-value
Prediction/Testing	(A) object, Object of class inheriting from probit regression training command "glm" (R specific - to convert to python)	(C) predict.probit, a vector of predicted probabilities for the newdata dataframe
	(B) newdata, A data frame in which to look for variables with which to predict. (R specific - to convert to python)
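
A possible Python counterpart of the probit prediction step is sketched below. It assumes the trained model has been reduced to its coefficients vector with the intercept first, and that newdata is a list of rows with K predictor values each. The name probit_predict is hypothetical.

```python
import math

def probit_predict(coef, newdata):
    """P(y = 1 | x) = Phi(b0 + b1*x1 + ... + bK*xK), Phi = standard normal CDF."""
    probs = []
    for row in newdata:
        eta = coef[0] + sum(b * x for b, x in zip(coef[1:], row))
        # standard normal CDF expressed through the error function
        probs.append(0.5 * (1.0 + math.erf(eta / math.sqrt(2.0))))
    return probs
```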

Logit Regression
Phase	Input	Output
Training	(A) predictors matrix, X	(D) coefficients vector
	(B) response matrix, y	(E) fitted values
	(C) family, binomial(logit)	(F) residuals, optional
		(G) deviance, -2*logLik (a positive number), optional
		(H) aic, Akaike's Information Criterion value, optional
		(I) standard errors of coefficients vector, SE
		(J) z-statistics of coefficient vector, z-stat
		(K) p-value of coefficient vector, p-value
Prediction/Testing	(A) object, Object of class inheriting from logit regression training command "glm" (R specific - to convert to python)	(C) predict.logit, a vector of predicted probabilities for the newdata dataframe
	(B) newdata, A data frame in which to look for variables with which to predict. (R specific - to convert to python)
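
The logit prediction and the optional deviance and aic outputs can be sketched as follows. The sketch assumes a 0/1 response, for which deviance = -2*logLik and aic = deviance + 2 * (number of coefficients), and a coefficients vector with the intercept first. The function names are hypothetical.

```python
import math

def logit_predict(coef, newdata):
    """P(y = 1 | x) = 1 / (1 + exp(-(b0 + b1*x1 + ... + bK*xK)))."""
    probs = []
    for row in newdata:
        eta = coef[0] + sum(b * x for b, x in zip(coef[1:], row))
        probs.append(1.0 / (1.0 + math.exp(-eta)))
    return probs

def deviance_and_aic(y, probs, n_coef):
    """Deviance and AIC for a 0/1 response and fitted probabilities."""
    loglik = sum(math.log(p) if yi == 1 else math.log(1.0 - p)
                 for yi, p in zip(y, probs))
    deviance = -2.0 * loglik
    return deviance, deviance + 2.0 * n_coef
```

With all fitted probabilities equal to 0.5, each observation contributes 2*log(2) to the deviance, which gives a quick sanity check.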

Multilayer Perceptron
Phase	Input	Output
Training	Parameters	Parameters
	n, integer, no. of observations	p.nnet, R model object, type "nnet"
	K, integer, no. of features	p.nnet$wts, double vector, weights of the connections in the mlp
	X, double precision matrix, n by K	p.nnet$fitted.values, double vector, fitted probabilities for every element of the training set with the MLP
	y, double precision vector, n by 1 (actually containing 0s and 1s)
	HyperParameters
	size, integer, no. of hidden neurons, default 4
	decay, double, weight decay parameter, default 0.1
	entropy, logical, flag for log-likelihood (or entropy) training, default TRUE
	maxit, integer, maximum no. of iterations in the optimization, default 2000
	MaxNWts, integer, maximum no. of weights in the neural network, default 2000
Prediction/Testing	Parameters	Parameters
	n2, integer, no. of observations in the testing set	p.nnet.predict, double vector, n2 by 1, vector of probabilities for every element of the testing set with the MLP
	K, integer, no. of features, same as in training step
	p.nnet, R model object, type "nnet"
	newdata, double matrix n2 by K, matrix of predictors values for the testing set, structured in the same logic as the X matrix in training step
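
The size and MaxNWts hyperparameters are linked through the number of weights in the network. A small sketch of that count, assuming a single hidden layer, one output unit, bias terms on both layers and no skip-layer connections (the helper name n_weights is hypothetical):

```python
def n_weights(K, size, n_out=1):
    """Weights of a K-size-n_out network with bias units and no skip layer."""
    hidden = (K + 1) * size        # K inputs plus a bias into each hidden neuron
    output = (size + 1) * n_out    # hidden neurons plus a bias into each output
    return hidden + output

# Default size 4 with the typical K = 8 stays far below MaxNWts = 2000.
assert n_weights(8, 4) <= 2000
```

Under these assumptions the count is the length of the p.nnet$wts vector, and training is expected to be refused when it exceeds MaxNWts.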


Support Vector Machines
Phase	Input	Output
Training	Parameters	Parameters
	n, integer, no. of observations	p.svm, R model object, type "svm"
	K, integer, no. of features	p.svm$fitted, double vector, fitted probabilities for every element of the training set with the SVM
	X, double precision matrix, n by K
	y, double precision vector, n by 1 (actually containing 0s and 1s)
	HyperParameters
	gamma, double, parameter for the RBF (or Gaussian) kernel of the eps-regression SVM used, default 0.5
	cost, double, parameter in the penalty term in the Lagrangian formulation of eps-regression SVM, default 8.0
	probability, logical, flag to return probabilities in sample, default TRUE
Prediction/Testing	Parameters	Parameters
	n2, integer, no. of observations in the testing set	p.svm.predict, double vector, n2 by 1, vector of probabilities for every element of the testing set with the SVM
	K, integer, no. of features, same as in training step
	p.svm, R model object, type "svm"
	newdata, double matrix n2 by K, matrix of predictors values for the testing set, structured in the same logic as the X matrix in training step
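
The role of gamma can be illustrated with the RBF kernel itself, k(u, v) = exp(-gamma * ||u - v||^2). The sketch below only evaluates the kernel for two feature vectors; it is not an SVM solver, and the function name rbf_kernel is hypothetical.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors of equal length."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)
```

A larger gamma makes the kernel decay faster with distance, so each training observation influences a smaller neighbourhood.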


Random Forests
Phase	Input	Output
Training	Parameters	Parameters
	n, integer, no. of observations	p.randomForest, R model object, type "randomForest"
	K, integer, no. of features	p.randomForest$predicted, double vector, fitted probabilities for every element of the training set with the RF
	X, double precision matrix, n by K
	y, double precision vector, n by 1 (actually containing 0s and 1s)
	HyperParameters
	importance, logical, flag to compute importance of every predictor, default TRUE
	ntree, integer, no. of trees to grow, default 500
	mtry, integer, number of variables randomly sampled as candidates at each split, default max(floor(K/3), 1); when in doubt use 2 (typically K=8)
	na.action, string, what to do with NaNs, default "na.omit"
Prediction/Testing	Parameters	Parameters
	n2, integer, no. of observations in the testing set	p.randomForest.predict, double vector, n2 by 1, vector of probabilities for every element of the testing set with the RF
	K, integer, no. of features, same as in training step
	p.randomForest, R model object, type "randomForest"
	newdata, double matrix n2 by K, matrix of predictors values for the testing set, structured in the same logic as the X matrix in training step
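
The mtry default quoted above, max(floor(K/3), 1), and the random choice of candidate variables at a split can be sketched as follows. The helper names are hypothetical, and the sampling only illustrates the idea; it is not the randomForest implementation.

```python
import random

def default_mtry(K):
    """Default number of candidate variables per split: max(floor(K/3), 1)."""
    return max(K // 3, 1)

def split_candidates(K, mtry=None, rng=random):
    """Draw mtry distinct feature indices out of K for one split."""
    if mtry is None:
        mtry = default_mtry(K)
    return rng.sample(range(K), mtry)
```

For the typical K = 8 the default evaluates to 2, which matches the fallback value suggested in the table.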
