Performs principal component analysis (PCA) and generates one scores plot for each principal component combination in a list, plus an optional scree plot.

perform_PCA(
  data,
  scorePC = list(c(1, 2), c(1, 3), c(2, 3)),
  includeQC = FALSE,
  arrangeLevels = NULL,
  scoresEllipse = TRUE,
  scoresTitle = NULL,
  scoresLegend = NULL,
  includeScree = TRUE,
  screeWhat = c("variance", "eigenvalue")[1],
  screeType = c("bar", "line", "both")[3],
  screeTitle = "Scree Plot",
  screeMaxPC = NULL,
  screeShowValues = TRUE,
  screeShowCumulative = FALSE
)

Arguments

data

List. Must be the result returned by the perform_PreprocessingPeakData function.

scorePC

List. A list of 2-element numeric vectors, where each vector specifies the pair of principal components to plot against each other. For example, list(c(1, 2), c(1, 3), c(2, 3)) generates three plots: PC1 vs PC2, PC1 vs PC3, and PC2 vs PC3. Defaults to list(c(1, 2), c(1, 3), c(2, 3)).
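
When many components are of interest, writing each pair by hand gets tedious. The following is a minimal sketch that builds such a list with base R's combn(); the names pc_pairs and pair_labels are illustrative and not part of the package.

```r
# Build every pairwise combination of the first three PCs.
# simplify = FALSE returns a list of 2-element vectors, the shape scorePC expects.
pc_pairs <- combn(3, 2, simplify = FALSE)

# Readable labels for each pair, to check what would be plotted
pair_labels <- vapply(
  pc_pairs,
  function(p) paste0("PC", p[1], " vs PC", p[2]),
  character(1)
)
print(pair_labels)
```

The resulting list could then be supplied as scorePC = pc_pairs.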

includeQC

Boolean. If TRUE, includes QC (Quality Control) samples in the analysis and plots. If FALSE, uses only biological samples (BS). Defaults to FALSE.

arrangeLevels

Vector. The order in which the groups are arranged, for example c("group1", "group2", ...). Defaults to NULL, which sorts the groups alphabetically.
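
The difference between the alphabetical default and an explicit ordering can be illustrated with base R factors. This is only a sketch of the concept: the group labels are hypothetical, and whether the function uses factor() internally is an assumption.

```r
# Hypothetical group labels, for illustration only
groups <- c("treated", "control", "treated", "blank", "control")

# Default factor ordering is alphabetical, matching the documented
# behaviour of arrangeLevels = NULL
alphabetical <- levels(factor(groups))

# An explicit ordering, written in the form arrangeLevels accepts
custom_order <- c("control", "treated", "blank")
arranged <- levels(factor(groups, levels = custom_order))

print(alphabetical)
print(arranged)
```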

scoresEllipse

Boolean. If TRUE (default), adds an ellipse to each scores plot.

scoresTitle

String or Vector. The scores plot title(s). Can be a single string (applied to all plots) or a vector of strings (one for each plot). If NULL, defaults to "PCA Scores Plot (PC{i} vs PC{j})".
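
As a rough illustration of the two accepted forms, the sketch below builds titles that follow the documented default pattern and a vector of custom titles with one entry per plot. The variable names and custom strings are hypothetical, and the function's actual title-building code may differ.

```r
# PC pairs as they would be given to scorePC
score_pcs <- list(c(1, 2), c(1, 3), c(2, 3))

# Titles following the documented default pattern
default_titles <- vapply(
  score_pcs,
  function(p) sprintf("PCA Scores Plot (PC%d vs PC%d)", p[1], p[2]),
  character(1)
)

# Hypothetical custom titles: one string per PC pair
custom_titles <- c("Components 1 and 2", "Components 1 and 3", "Components 2 and 3")

# A vector of titles should have the same length as scorePC
stopifnot(length(custom_titles) == length(score_pcs))
print(default_titles)
```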

scoresLegend

String. The title in the legend section of the scores plot. Defaults to NULL which means no legend title.

includeScree

Boolean. If TRUE, generates a scree plot in addition to scores plots. Defaults to TRUE.

screeWhat

Character. What to plot in the scree plot: "variance" or "eigenvalue". Defaults to "variance".

screeType

Character. Type of scree plot: "bar", "line", or "both". Defaults to "both".

screeTitle

Character. Title for the scree plot. Defaults to "Scree Plot".

screeMaxPC

Integer. Maximum number of PCs to show in the scree plot. If NULL (default), shows all available PCs, up to a maximum of 20.
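
The rule above can be sketched as a small helper. The function n_scree_pcs is hypothetical and not part of the package, and the handling of a user-supplied value larger than the number of available PCs is an assumption.

```r
# Hypothetical helper: how many PCs a scree plot would show
n_scree_pcs <- function(n_available, screeMaxPC = NULL, cap = 20) {
  if (is.null(screeMaxPC)) {
    # Documented default: all available PCs, capped at 20
    min(n_available, cap)
  } else {
    # Assumption: a user-supplied limit cannot exceed the PCs that exist
    min(n_available, screeMaxPC)
  }
}

n_scree_pcs(8)                    # fewer than 20 PCs available
n_scree_pcs(50)                   # more than 20 PCs available
n_scree_pcs(50, screeMaxPC = 5)   # explicit limit
```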

screeShowValues

Boolean. If TRUE, displays the variance or eigenvalue above each bar and removes the y-axis labels. Defaults to TRUE.

screeShowCumulative

Boolean. If TRUE, overlays the cumulative variance explained on the scree plot. Only applies when screeWhat = "variance". Defaults to FALSE.

Value

Returns a list containing:

  • plots: A list of ggplot objects, one for each PC combination

  • scree_plot: A ggplot object for the scree plot (if includeScree is TRUE)

  • plot_info: A data frame with information about each plot (PC combinations, variance explained)

  • pca_results: The PCA results object

  • data_used: The data matrix used for PCA

  • variance_explained: Vector of variance explained by each PC

  • eigenvalues: Vector of eigenvalues for each PC
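
To make the relationship between eigenvalues, variance explained, and cumulative variance concrete, here is a minimal sketch using base R's prcomp() on a built-in dataset. It assumes centred and scaled data and expresses variance as a percentage; the function's own PCA backend and units are not stated here and may differ.

```r
# Built-in dataset, used purely for illustration
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the PCs
eigenvalues <- pca$sdev^2

# Percentage of total variance explained by each PC
variance_explained <- eigenvalues / sum(eigenvalues) * 100

# Running total, the quantity screeShowCumulative refers to
cumulative_variance <- cumsum(variance_explained)

print(round(variance_explained, 1))
print(round(cumulative_variance, 1))
```

With scaled data the eigenvalues sum to the number of variables, which is why "variance" and "eigenvalue" scree plots have the same shape and differ only in their y-axis scale.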

Author

John Lennon L. Calorio

Examples

if (FALSE) { # \dontrun{
# Generate 3 different scores plots plus scree plot
multi_plots <- perform_PCA(
  data = data_from_perform_PreprocessingPeakData_function,
  scorePC = list(c(1,2), c(1,3), c(2,3)),
  includeQC = FALSE,
  scoresEllipse = TRUE,
  includeScree = TRUE,
  screeWhat = "variance",
  screeType = "both"
)

# Access individual plots
plot1 <- multi_plots$plots[[1]]  # PC1 vs PC2
plot2 <- multi_plots$plots[[2]]  # PC1 vs PC3
scree <- multi_plots$scree_plot  # Scree plot

# Display all plots
display_AllPlots(multi_plots, include_scree = TRUE)
} # }