Functions in DataMinerXL Software

DataMinerXL software includes the most useful predictive modeling functions. The functions are listed below by category. Detailed descriptions of all functions, including function arguments and return values, can be found in the DataMinerXL manual on the Downloads page. The theories and algorithms behind these functions are covered in our book "Foundations of Predictive Analytics."

Basic Statistical Functions

Function Name Description
freq Creates frequency tables given a data table
freq_from_file Creates frequency tables given a data file
freq_2d Creates a frequency cross-table for two variables given a data table
freq_2d_from_file Creates a frequency cross-table for two variables given a data file
means Generates basic statistics: sum, average, standard deviation, minimum, and maximum given a data table
means_from_file Generates basic statistics: sum, average, standard deviation, minimum, and maximum given a data file
univariate Generates univariate statistics given a data table
univariate_from_file Generates univariate statistics given a data file
percentiles Calculates p-th percentiles of values in each subgroup
summary Generates descriptive statistics in classes given a data table
summary_from_file Generates descriptive statistics in classes given a data file
ranks Creates 1-based ranks of data points given a column of data
ranks_from_file Creates 1-based ranks of data points given a data file
binning Creates equal-interval bins given a column of a data table
QQ_plot Tests normality of a univariate sample
variable_corr_select Selects variables by removing highly correlated variables
poly_roots Finds all roots given real coefficients of a polynomial
Lagrange_interpolation Performs Lagrange polynomial interpolation given data points
three_moment_match_to_SLN Performs three moment match to a shifted lognormal distribution
set Creates a set given a string/number matrix
set_union Creates a set from union of two sets
set_intersection Creates a set from intersection of two sets
set_difference Creates a set from difference of two sets
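
As an illustration of what Lagrange polynomial interpolation computes, here is a minimal Python sketch. It is not DataMinerXL code and does not reflect the arguments of Lagrange_interpolation (see the manual for those); the name lagrange_interpolate is hypothetical. It evaluates the unique polynomial of degree n - 1 that passes through n data points with distinct x values.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x.

    Hypothetical helper for illustration; assumes all xs are distinct.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial L_i(x): equals 1 at xi and 0 at every other node.
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


# Three points taken from y = x^2 + x + 1, so the interpolant reproduces it.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 7.0]
value_at_3 = lagrange_interpolate(xs, ys, 3.0)
```

With three points sampled from a quadratic, the interpolant is that quadratic, so value_at_3 matches 3^2 + 3 + 1 up to rounding.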

Modeling Functions for All Models

Function Name Description
model_bin_eval Evaluates a binary target model given a column of actual values and a column of predicted values
model_bin_eval_from_file Evaluates a binary target model given a data file, a name of actual values, and a name of predicted values
model_cont_eval Evaluates a continuous target model given a column of actual values and a column of predicted values
model_cont_eval_from_file Evaluates a continuous target model given a data file, a name of actual values, and a name of predicted values
model_eval Evaluates model performance given a model and a data table
model_eval_from_file Evaluates model performance given a model and a data file
model_score Scores a population given a model and a data table
model_score_from_file Scores a population given a model and a data file
model_save_scoring_code Saves the scoring code of a given model to a file

Weight of Evidence Transformation Functions

Function Name Description
woe_xcont_ybin Generates weight of evidence (WOE) of continuous independent variables and a binary dependent variable given a data table
woe_xcont_ybin_from_file Generates weight of evidence (WOE) of continuous independent variables and a binary dependent variable given a data file
woe_xcont_ycont Generates weight of evidence (WOE) of continuous independent variables and a continuous dependent variable given a data table
woe_xcont_ycont_from_file Generates weight of evidence (WOE) of continuous independent variables and a continuous dependent variable given a data file
woe_xcat_ybin Generates weight of evidence (WOE) of categorical independent variables and a binary dependent variable given a data table
woe_xcat_ybin_from_file Generates weight of evidence (WOE) of categorical independent variables and a binary dependent variable given a data file
woe_xcat_ycont Generates weight of evidence (WOE) of categorical independent variables and a continuous dependent variable given a data table
woe_xcat_ycont_from_file Generates weight of evidence (WOE) of categorical independent variables and a continuous dependent variable given a data file
woe_transform Performs weight of evidence (WOE) transformation given a WOE model and a data table
woe_transform_from_file Performs weight of evidence (WOE) transformation given a WOE model and a data file
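
For readers new to weight of evidence, the sketch below shows one common convention for a categorical independent variable and a binary dependent variable: WOE = ln(P(category | y = 1) / P(category | y = 0)). This is an assumption for illustration only. The sign convention, smoothing, and output layout used by the woe_* functions may differ, and the Python name woe_categorical_binary is hypothetical.

```python
import math
from collections import Counter


def woe_categorical_binary(x, y):
    """Return {category: WOE} using WOE = ln(P(cat | y=1) / P(cat | y=0)).

    Hypothetical helper for illustration. Categories that have no records in
    one of the two classes are skipped; production code usually smooths them.
    """
    ones = Counter(xi for xi, yi in zip(x, y) if yi == 1)
    zeros = Counter(xi for xi, yi in zip(x, y) if yi != 1)
    total_ones = sum(ones.values())
    total_zeros = sum(zeros.values())
    woe = {}
    for category in set(ones) | set(zeros):
        if ones[category] == 0 or zeros[category] == 0:
            continue  # the log ratio is undefined without smoothing
        share_ones = ones[category] / total_ones
        share_zeros = zeros[category] / total_zeros
        woe[category] = math.log(share_ones / share_zeros)
    return woe


x = ["a", "a", "a", "b", "b", "b"]
y = [1, 1, 0, 1, 0, 0]
woe = woe_categorical_binary(x, y)
```

Here category "a" holds two thirds of the y = 1 records and one third of the y = 0 records, so its WOE is ln 2, and category "b" gets the opposite sign.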

Principal Component Analysis and Factor Analysis Functions

Function Name Description
PCA Performs principal component analysis
factor_analysis Performs factor analysis

Linear Regression Functions

Function Name Description
linear_reg Builds a linear regression model given a data table
linear_reg_from_file Builds a linear regression model given a data file
linear_reg_forward_select Builds a linear regression model by forward selection given a data table
linear_reg_forward_select_from_file Builds a linear regression model by forward selection given a data file
linear_reg_score_from_coefs Scores a population from the coefficients of a linear regression model given a data table
linear_reg_piecewise Builds a two-segment piecewise linear regression model for each variable given a data table
linear_reg_piecewise_from_file Builds a two-segment piecewise linear regression model for each variable given a data file
poly_reg Builds a polynomial regression model given a data table

Partial Least Square Regression Functions

Function Name Description
pls_reg Builds a partial least square regression model given a data table
pls_reg_from_file Builds a partial least square regression model given a data file

Logistic Regression Functions

Function Name Description
logistic_reg Builds a logistic regression model given a data table
logistic_reg_from_file Builds a logistic regression model given a data file
logistic_reg_forward_select Builds a logistic regression model by forward selection given a data table
logistic_reg_forward_select_from_file Builds a logistic regression model by forward selection given a data file
logistic_reg_score_from_coefs Scores a population from the coefficients of a logistic regression model given a data table

Time Series Analysis Functions

Function Name Description
ts_acf Calculates the autocorrelation functions (ACF) given a data table
ts_pacf Calculates the partial autocorrelation functions (PACF) given a data table
ts_ccf Calculates the cross correlation functions (CCF) given two data tables
Box_white_noise_test Tests if a time series is white noise by the Box-Ljung or Box-Pierce test
Mann_Kendall_trend_test Tests if a time series has a trend
ADF_test Tests whether a time series has a unit root using the Augmented Dickey-Fuller (ADF) test
ts_diff Calculates the differences given lag and order
ts_sma Calculates the simple moving average (SMA) of a time series
lowess Performs locally weighted scatterplot smoothing (lowess)
natural_cubic_spline Performs natural cubic spline
garch Estimates the parameters of GARCH(1, 1) model
stochastic_process Estimates the parameters of a stochastic process: normal, lognormal, or shifted lognormal
stochastic_process_simulate Simulates a stochastic process: normal, lognormal, or shifted lognormal
Holt_Winters Performs Holt-Winters exponential smoothing
Holt_Winters_forecast Performs forecast given Holt-Winters exponential smoothing
HP_filter Applies the Hodrick-Prescott filter to time-series data
arima Builds an ARIMA model
sarima Builds a seasonal ARIMA (SARIMA) model
arima_forecast Performs forecast given an ARIMA model
sarima_forecast Performs forecast given a seasonal ARIMA (SARIMA) model
arima_simulate Simulates an ARIMA process
sarima_simulate Simulates a seasonal ARIMA (SARIMA) process
arma_to_ma Converts an ARMA process to a pure MA process
arma_to_ar Converts an ARMA process to a pure AR process
acf_of_arma Calculates the autocorrelation functions (ACF) of an ARMA process
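
To make the autocorrelation function concrete, the following Python sketch computes the usual sample ACF, r_k = sum_t (x_t - m)(x_{t+k} - m) / sum_t (x_t - m)^2, where m is the sample mean. It is an illustration under that standard definition, not the implementation or argument list of ts_acf; the name sample_acf is hypothetical.

```python
def sample_acf(series, max_lag):
    """Return the sample autocorrelations [r_0, r_1, ..., r_max_lag].

    Hypothetical helper for illustration; assumes the series is not constant
    and that max_lag is smaller than the series length.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares, the normalizer
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k))
        acf.append(ck / c0)
    return acf


acf = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```

The lag-0 value is always 1 by construction, so it serves as a quick sanity check on any ACF output.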

Linear and Quadratic Discriminant Analysis Functions

Function Name Description
LDA Performs the linear discriminant analysis
QDA Performs the quadratic discriminant analysis

Survival Analysis Functions

Function Name Description
Kaplan_Meier Performs Kaplan-Meier survival analysis
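
The Kaplan-Meier (product-limit) estimator multiplies, at each distinct event time t, the running survival probability by 1 - d_t / n_t, where d_t is the number of events at t and n_t is the number of subjects still at risk just before t. The Python sketch below illustrates that calculation only; the name kaplan_meier and its inputs are hypothetical and are not the interface of the Kaplan_Meier function.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival)] at each distinct time with at least one event.

    Hypothetical helper for illustration. observed[i] is truthy when subject i
    had the event at durations[i] and falsy when the subject was censored.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all subjects that share the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored subjects never lower the curve directly; they only shrink the risk set used at later event times.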

Correspondence Analysis Functions

Function Name Description
corresp_analysis Performs simple correspondence analysis for a two-way cross table

Naive Bayes Classifier Functions

Function Name Description
naive_bayes_classifier Builds a naive Bayes classification model given a data table
naive_bayes_classifier_from_file Builds a naive Bayes classification model given a data file

Decision Tree-Based Modeling Functions

Function Name Description
tree Builds a regression or classification tree model given a data table
tree_from_file Builds a regression or classification tree model given a data file
tree_logistic_reg_boosting Builds a logistic regression boosting tree model given a data table
tree_logistic_reg_boosting_from_file Builds a logistic regression boosting tree model given a data file
tree_ls_reg_boosting Builds a least square boosting tree model given a data table
tree_ls_reg_boosting_from_file Builds a least square boosting tree model given a data file

Clustering and Segmentation Functions

Function Name Description
k_means Performs K-means clustering analysis given a data table
k_means_from_file Performs K-means clustering analysis given a data file
cmds Performs classical multi-dimensional scaling
mds Performs multi-dimensional scaling by Sammon's nonlinear mapping

Neural Network Functions

Function Name Description
neural_net Builds a neural network model given a data table
neural_net_from_file Builds a neural network model given a data file

Support Vector Machine Functions

Function Name Description
svm Builds a support vector machine (SVM) model given a data table
svm_from_file Builds a support vector machine (SVM) model given a data file

Optimization Functions

Function Name Description
linear_prog Solves a linear programming problem
quadratic_prog Solves a quadratic programming problem
lcp Solves a linear complementarity programming problem
nls_solver Solves a nonlinear least-square problem using the Levenberg-Marquardt algorithm
diff_evol_solver Solves a minimization problem given a function and lower/upper bounds of variables using differential evolution solver
transportation_solver Solves a transportation problem
assignment_solver Solves an assignment problem
netflow_solver Solves a minimum or maximum cost network flow problem: to find optimal flows that minimize or maximize the total cost
maxflow_solver Solves a maximum flow problem: to find optimal flows that maximize the total flows from the start node to the end node
shortest_path_solver Solves the shortest path problem: to find the shortest path from the start node to the end node

Portfolio Optimization Functions

Function Name Description
efficient_frontier Finds the efficient frontier for portfolios
Black_Litterman Finds posterior expected returns and covariance matrix using the Black-Litterman Model

Control Theory Functions

Function Name Description
pole_placement Calculates the gains K for the pole placement

Matrix Operation Functions

Function Name Description
matrix_random Generates a random matrix from a uniform distribution U(0, 1) or a standard normal distribution N(0, 1)
matrix_cov Computes the covariance matrix given a data table
matrix_cov_from_file Computes the covariance matrix given a data file
matrix_corr Computes the correlation matrix given a data table
matrix_corr_from_file Computes the correlation matrix given a data file
matrix_corr_from_cov Computes the correlation matrix from a covariance matrix
matrix_prod Computes the product of two matrices; one of the matrices may be a number
matrix_directprod Computes the direct product of two matrices
matrix_elementprod Computes the elementwise product of two matrices; one of the matrices may be a number
matrix_plus Computes the addition of two matrices with the same dimension
matrix_minus Computes the subtraction of two matrices with the same dimension
matrix_I Creates an identity matrix
matrix_t Returns the transpose matrix of a matrix
matrix_diag Creates a diagonal matrix from a matrix or a vector
matrix_tr Returns the trace of a matrix
matrix_inv Computes the inverse of a square matrix
matrix_pinv Computes the pseudoinverse of a real matrix
matrix_complex_pinv Computes the pseudoinverse of a complex matrix
matrix_solver Solves a system of linear equations Ax = B
matrix_tridiagonal_solver Solves a system of tridiagonal linear equations Ax = B
matrix_pentadiagonal_solver Solves a system of pentadiagonal linear equations Ax = B
matrix_chol Computes the Cholesky decomposition of a symmetric positive-definite matrix
matrix_sym_eigen Computes the eigenvalue-eigenvector pairs of a symmetric matrix
matrix_eigen Computes the eigenvalue-eigenvector pairs of a square real matrix
matrix_complex_eigen Computes the eigenvalue-eigenvector pairs of a square complex matrix
matrix_svd Computes the singular value decomposition (SVD) of a matrix
matrix_LU Computes the LU decomposition of a square matrix
matrix_QR Computes the QR decomposition of a square real matrix
matrix_complex_QR Computes the QR decomposition of a square complex matrix
matrix_Schur Computes the Schur decomposition of a square real matrix
matrix_complex_Schur Computes the Schur decomposition of a square complex matrix
matrix_sweep Sweeps a matrix given indexes
matrix_det Computes the determinant of a square matrix
matrix_distance Computes the distance matrix given a data table
matrix_freq Creates a frequency table given a string matrix
matrix_from_vector Converts a vector into a matrix
matrix_to_vector Converts a matrix into a column vector
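
As a small worked example of one of these operations, the conversion from a covariance matrix to a correlation matrix divides each entry by the product of the two corresponding standard deviations: corr_ij = cov_ij / sqrt(cov_ii * cov_jj). The Python sketch below shows that formula on nested lists; corr_from_cov is a hypothetical name and not the signature of matrix_corr_from_cov.

```python
import math


def corr_from_cov(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix.

    Hypothetical helper for illustration; assumes a square matrix with
    strictly positive diagonal entries.
    """
    n = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]


# Variances 4 and 9 with covariance 2 give a correlation of 2 / (2 * 3).
corr = corr_from_cov([[4.0, 2.0], [2.0, 9.0]])
```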

Fast Fourier Transform Functions

Function Name Description
FFT Performs fast Fourier transform
IFFT Performs inverse fast Fourier transform

Numerical Integration Functions

Function Name Description
gauss_legendre Generates the abscissas and weights of the Gauss-Legendre n-point quadrature formula
gauss_laguerre Generates the abscissas and weights of the Gauss-Laguerre n-point quadrature formula
gauss_hermite Generates the abscissas and weights of the Gauss-Hermite n-point quadrature formula
integral Evaluates a 1-D integral of a function given lower and upper bounds
function_eval Evaluates a function given arguments
prime_numbers Gets prime numbers
Halton_numbers Gets Halton numbers
Sobol_numbers Gets Sobol numbers
Latin_hypercube Gets Latin hypercube sampling
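
Halton numbers are a low-discrepancy sequence often used for quasi-Monte Carlo integration. The i-th Halton number in base b is the radical inverse of i: write i in base b and mirror its digits around the radix point. The Python sketch below shows that construction; halton is a hypothetical name, and the indexing and arguments of Halton_numbers may differ.

```python
def halton(index, base):
    """Return the radical inverse of a positive integer index in the given base.

    Hypothetical helper for illustration; index starts at 1 here.
    """
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        # Take the lowest digit and place it at the next position after the point.
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


# First few points of the 2-D Halton sequence using the prime bases 2 and 3.
points = [(halton(i, 2), halton(i, 3)) for i in range(1, 5)]
```

Using a different prime base for each dimension, as in the last line, is the usual way to build multi-dimensional Halton points.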

Probability Functions

Function Name Description
prob_normal Computes the cumulative probability given z for the standard normal distribution: N(z) = Prob(Z < z)
prob_normal_inv Computes the percentile of a standard normal distribution: Prob(Z < z) = p
prob_normal_table Generates a table of the cumulative probabilities for the standard normal distribution: N(z) = Prob(Z < z)
prob_t Computes the cumulative probability given t and the degrees of freedom for the Student's t distribution: Prob(t_n < t)
prob_t_inv Computes the percentile for the Student's t distribution: Prob(t_n < t) = p
prob_t_table Generates a table of the percentiles given a set of degrees of freedom and a set of probabilities for the Student's t distribution: Prob(t_n < t) = p
prob_chi Computes the cumulative probability given c and the degrees of freedom for the Chi-Squared distribution: Prob(X^2 < c)
prob_chi_inv Computes the percentile for the Chi-Squared distribution: Prob(X^2 < c) = p
prob_chi_table Generates a table of the percentiles given a set of degrees of freedom and a set of probabilities for the Chi-Squared distribution: Prob(X^2 < c) = p
prob_f Computes the cumulative probability given f and the degrees of freedom for the F-distribution: Prob(F(df1, df2) < f)
prob_f_inv Computes the percentile for the F-distribution: Prob(F(df1, df2) < f) = p
prob_f_table Generates a table of the percentiles given a set of degrees of freedom and a probability for the F-distribution: Prob(F(df1, df2) < f) = p
Cornish_Fisher_expansion Computes the percentile of a distribution with a skewness and an excess kurtosis by Cornish-Fisher expansion
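
The Cornish-Fisher expansion adjusts a standard normal percentile z for skewness S and excess kurtosis K. A commonly used fourth-moment form is z_cf = z + (z^2 - 1) S / 6 + (z^3 - 3z) K / 24 - (2z^3 - 5z) S^2 / 36. The Python sketch below evaluates that form for a standardized variable (mean 0, variance 1). Whether Cornish_Fisher_expansion uses exactly these terms is an assumption, and cornish_fisher_quantile is a hypothetical name.

```python
from statistics import NormalDist


def cornish_fisher_quantile(p, skew, excess_kurt):
    """Approximate the p-th quantile of a standardized distribution.

    Hypothetical helper for illustration, using the fourth-moment
    Cornish-Fisher form; requires 0 < p < 1.
    """
    z = NormalDist().inv_cdf(p)  # standard normal percentile
    return (
        z
        + (z ** 2 - 1) * skew / 6
        + (z ** 3 - 3 * z) * excess_kurt / 24
        - (2 * z ** 3 - 5 * z) * skew ** 2 / 36
    )


# With zero skewness and zero excess kurtosis the normal percentile is recovered.
q_normal = cornish_fisher_quantile(0.95, 0.0, 0.0)
q_skewed = cornish_fisher_quantile(0.95, 0.6, 1.0)
```

To apply the result to a variable with mean m and standard deviation s, rescale it as m + s * z_cf.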

Data Manipulation Functions

Function Name Description
variable_list Lists the variable names in an input data file
subset Gets a subset of a data table
data_lookup Looks up data by matching multiple keys
data_save Saves a data table into a file
data_save_tex Saves a data table into a file in TEX format
data_load Loads a data table from a file
data_partition Gets random data partition
data_sort Sorts a data table given keys and orders
sort_file Sorts a data file given keys and orders
merge_tables Merges two data tables by a single numerical key
rank_items Selects the items from the ranks by keys

Utility Functions

Function Name Description
version Displays the version number and build date/time of DataMinerXL software
function_list Lists all functions in DataMinerXL software