This function performs regularized regression using LASSO (Least Absolute Shrinkage and Selection Operator) and/or Elastic Net. Both methods reduce overfitting by adding a penalty term to the loss function. LASSO applies an L1 penalty, which performs feature selection by shrinking some coefficients exactly to zero. Elastic Net combines L1 and L2 penalties, which lets it handle multicollinearity while still selecting features. The function supports binary and multinomial classification and reports evaluation metrics for each fitted model.
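For reference, the two penalties can be written as a single objective. This is the standard elastic-net form used by glmnet, shown on the assumption that perform_Regression follows glmnet's parameterization (the returned Model is a cv.glmnet object). Here \ell is the binomial or multinomial log-likelihood and N is the number of training samples.

```latex
\min_{\beta_0,\,\beta}\; -\frac{1}{N}\,\ell(\beta_0, \beta)
  \;+\; \lambda \left[ \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2
  + \alpha\,\lVert \beta \rVert_1 \right]
```

Setting \alpha = 1 leaves only the L1 term (LASSO), while \alpha = 0.5 mixes the L1 and L2 terms (the Elastic Net setting listed for this function).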
perform_Regression(
  data,
  method = "enet",
  specify_response = NULL,
  train_percent = 80,
  ref = NULL,
  lambda = "1se",
  remember = NULL,
  verbose = TRUE,
  cv_folds = 10,
  parallel = FALSE
)
data: A list object containing preprocessed data. Must be the output of the
perform_PreprocessingPeakData function and contain the following elements:
FunctionOrigin: Character string indicating data source
Metadata: Data frame with sample metadata including Group column
data_scaledPCA_rsdFiltered_varFiltered: Matrix of preprocessed features
method: Character vector specifying regression method(s) to perform. Options:
"lasso": LASSO regression only (L1 penalty, alpha = 1)
"enet": Elastic Net regression only (L1+L2 penalty, alpha = 0.5)
c("lasso", "enet"): Both methods (recommended for comparison)
Default: "enet"
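The sketch below illustrates how the two method labels could map onto glmnet's alpha mixing parameter. It uses simulated data and calls cv.glmnet directly. The variable names (x, y, alphas, fits) are hypothetical, and the internals of perform_Regression may differ.

```r
library(glmnet)

set.seed(1)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
# Simulated binary response driven by the first two features only
y <- factor(ifelse(x[, 1] - x[, 2] + rnorm(n, sd = 0.5) > 0, "case", "control"))

# Assumed mapping from method label to glmnet's alpha
alphas <- c(lasso = 1, enet = 0.5)

# One cross-validated fit per method
fits <- lapply(alphas, function(a) {
  cv.glmnet(x, y, family = "binomial", alpha = a, nfolds = 10)
})

# Number of non-zero coefficients (intercept excluded) at lambda.1se
n_selected <- sapply(fits, function(f) {
  beta <- as.matrix(coef(f, s = "lambda.1se"))
  sum(beta[-1, 1] != 0)
})
print(n_selected)
```

Comparing the counts gives a quick view of how sparse each fit is on the same data.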
specify_response: Character string specifying the response variable column name.
If NULL, uses the Group column from metadata. Default: NULL
train_percent: Numeric value between 1 and 99 specifying the percentage of samples
used for training. The remaining samples are used for testing. Default: 80
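As an illustration of what a percentage-based split looks like, here is a minimal base R sketch of a stratified split. It assumes the split is stratified by class, which the documentation does not state, and all names (group, train_idx, test_idx) are hypothetical.

```r
set.seed(123)

# Hypothetical class labels: 30 samples in group A, 20 in group B
group <- factor(rep(c("A", "B"), times = c(30, 20)))
train_percent <- 80

# Draw train_percent of the row indices within each class
train_idx <- unlist(lapply(split(seq_along(group), group), function(idx) {
  n_train <- floor(length(idx) * train_percent / 100)
  idx[sample.int(length(idx), size = n_train)]
}), use.names = FALSE)

# Everything not drawn for training is held out for testing
test_idx <- setdiff(seq_along(group), train_idx)

print(table(group[train_idx]))
print(table(group[test_idx]))
```

Stratifying keeps the class proportions similar in both partitions, which matters when groups are unbalanced.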
ref: Character string specifying the reference level for the response variable.
If NULL, uses the first factor level alphabetically. Default: NULL
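The default and the effect of setting a reference level can be seen with base R factors. This is a small sketch of the general mechanism using relevel(). Whether perform_Regression uses relevel() internally is an assumption.

```r
# Hypothetical group labels
group <- factor(c("Treated", "Control", "Treated", "Control"))

# Factor levels are sorted alphabetically, so "Control" comes first
print(levels(group))

# Choosing a different reference level moves it to the first position
group_ref <- relevel(group, ref = "Treated")
print(levels(group_ref))
```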
lambda: Character string specifying the lambda selection criterion:
"1se": Largest lambda whose cross-validation error is within one standard error of the minimum (more conservative, typically fewer features)
"min": Lambda that minimizes the cross-validation error (less conservative, typically more features)
Default: "1se"
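The two criteria correspond to the lambda.1se and lambda.min values stored on a cv.glmnet object. The sketch below, on simulated data with hypothetical variable names, shows how they can be compared. How perform_Regression reads them internally is an assumption.

```r
library(glmnet)

set.seed(2)
n <- 100
p <- 15
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- factor(ifelse(x[, 1] + x[, 2] + rnorm(n) > 0, "case", "control"))

cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 1, nfolds = 10)

# Helper: count non-zero coefficients (intercept excluded) at a given lambda
count_nonzero <- function(fit, s) {
  beta <- as.matrix(coef(fit, s = s))
  sum(beta[-1, 1] != 0)
}

lambda_summary <- data.frame(
  criterion = c("min", "1se"),
  lambda    = c(cvfit$lambda.min, cvfit$lambda.1se),
  n_nonzero = c(count_nonzero(cvfit, "lambda.min"),
                count_nonzero(cvfit, "lambda.1se"))
)
print(lambda_summary)
```

Because lambda.1se is never smaller than lambda.min, it applies at least as much shrinkage.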
remember: Numeric seed for reproducible results. The seed is set with
set.seed(remember). If NULL, no seed is set. Default: NULL
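The seed matters because both the train/test split and the cross-validation fold assignment are random. The following base R sketch shows the optional-seed pattern with a hypothetical helper, draw_folds, that is not part of this package.

```r
# Hypothetical helper: assign n samples to k folds, optionally seeded
draw_folds <- function(n, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # only fix the RNG state when a seed is given
  sample(rep(seq_len(k), length.out = n))
}

folds_a <- draw_folds(20, 5, seed = 123)
folds_b <- draw_folds(20, 5, seed = 123)

# Same seed, same fold assignment
print(identical(folds_a, folds_b))
```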
verbose: Logical indicating whether to print progress messages and results
to console. Default: TRUE
cv_folds: Integer specifying the number of cross-validation folds for model
selection. Must be between 3 and 20. Default: 10
parallel: Logical indicating whether to use parallel processing for
cross-validation. Default: FALSE
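In glmnet, parallel cross-validation requires a registered foreach backend before cv.glmnet(parallel = TRUE) is called. The sketch below shows one way to set this up with the doParallel package. It is an assumption that perform_Regression expects the user to register a backend in this way, and the two-worker cluster is an arbitrary choice.

```r
library(glmnet)
library(doParallel)

# Register a small local cluster as the foreach backend
cl <- makeCluster(2)
registerDoParallel(cl)

set.seed(3)
n <- 100
p <- 10
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- factor(ifelse(x[, 1] + rnorm(n) > 0, "case", "control"))

# Each fold is fitted on a worker when parallel = TRUE
cvfit_par <- cv.glmnet(x, y, family = "binomial", alpha = 0.5,
                       nfolds = 10, parallel = TRUE)

# Always release the workers when done
stopCluster(cl)

print(cvfit_par$lambda.1se)
```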
A list containing regression results with the following structure:
FunctionOrigin: Character string identifying the source function
ModelSummary: Data frame summarizing model performance metrics
DataSplit: List containing training/testing data split information
LASSO_Results: List of LASSO results (if method includes "lasso")
ElasticNet_Results: List of Elastic Net results (if method includes "enet")
ComparisonSummary: Data frame comparing methods (if both performed)
Each method-specific results list contains:
Model: Fitted cv.glmnet object
Predictions: Data frame with actual vs predicted values
ConfusionMatrix: Complete confusion matrix object
Performance: Data frame with accuracy, sensitivity, specificity, etc.
Coefficients: Data frame with non-zero coefficients and odds ratios
Lambda: Selected lambda value
Alpha: Alpha parameter used
ReferenceLevel: Reference level for classification
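To show how non-zero coefficients and odds ratios can be derived from a cv.glmnet fit, here is a minimal sketch on simulated data. The column names (Feature, Coefficient, OddsRatio) and feature names are hypothetical and may not match the Coefficients data frame returned by perform_Regression.

```r
library(glmnet)

set.seed(42)
n <- 120
p <- 10
x <- matrix(rnorm(n * p), nrow = n, ncol = p,
            dimnames = list(NULL, paste0("feature_", seq_len(p))))
# "control" is the first (reference) level; glmnet models the odds of "case"
y <- factor(ifelse(2 * x[, 1] - 2 * x[, 2] + rnorm(n) > 0, "case", "control"),
            levels = c("control", "case"))

cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 0.5, nfolds = 10)

# Coefficients at the selected lambda, converted from a sparse matrix
beta <- as.matrix(coef(cvfit, s = "lambda.1se"))
coef_df <- data.frame(Feature = rownames(beta),
                      Coefficient = beta[, 1],
                      row.names = NULL)

# Keep only selected features, dropping the intercept
keep <- coef_df$Coefficient != 0 & coef_df$Feature != "(Intercept)"
coef_df <- coef_df[keep, ]

# For logistic models, exp(coefficient) is the odds ratio per unit increase
coef_df$OddsRatio <- exp(coef_df$Coefficient)
print(coef_df)
```

An odds ratio above 1 indicates that higher feature values raise the odds of the non-reference class, and a value below 1 indicates the opposite.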
Perform Regularized Regression Analysis
if (FALSE) { # \dontrun{
# Load required libraries
library(glmnet)
library(caret)
library(dplyr)
# Perform both LASSO and Elastic Net regression
regression_results <- perform_Regression(
  data = preprocessed_data,
  method = c("lasso", "enet"),
  train_percent = 75,
  lambda = "1se",
  remember = 123,
  cv_folds = 10
)
# View model comparison
print(regression_results$ModelSummary)
print(regression_results$ComparisonSummary)
# Access LASSO results
lasso_coef <- regression_results$LASSO_Results$Coefficients
lasso_perf <- regression_results$LASSO_Results$Performance
# Access Elastic Net results
enet_coef <- regression_results$ElasticNet_Results$Coefficients
enet_perf <- regression_results$ElasticNet_Results$Performance
# View confusion matrices
print(regression_results$LASSO_Results$ConfusionMatrix$table)
print(regression_results$ElasticNet_Results$ConfusionMatrix$table)
} # }