Plot Correlation and Hierarchical Clustering Heatmaps

Generates correlation or hierarchical clustering heatmaps with advanced filtering capabilities. The function selects the most variable features/samples by interquartile range (IQR) and, for correlation plots, masks correlations that are not statistically significant. The result is a publication-ready heatmap.

plot_CorrelationHeatmap(
  data,
  method = "pearson",
  plot_top_n = 1000,
  plot_what = "Features",
  plot_type = "correlation",
  show_rownames = TRUE,
  show_colnames = TRUE,
  clustering_distance_rows = "euclidean",
  clustering_distance_cols = "euclidean",
  clustering_method = "ward.D",
  significance_threshold = 0.05,
  color_palette = c("blue", "white", "red"),
  fontsize_main = 12,
  fontsize_labels = 8
)

Arguments

data

A list containing preprocessed data. It must include a component named data_scaledPCA_rsdFiltered_varFiltered (typically the output of perform_PreprocessingPeakData).
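The expected shape of data can be illustrated with a mock object. This is only a sketch: the matrix orientation (samples in rows, features in columns) and the names mock_matrix and preprocessed_data are assumptions made for illustration; a real object would come from perform_PreprocessingPeakData.

```r
# Mock input: a list carrying the required component.
# Assumption: samples in rows, features in columns.
set.seed(1)
mock_matrix <- matrix(
  rnorm(6 * 4), nrow = 6, ncol = 4,
  dimnames = list(paste0("Sample", 1:6), paste0("Feature", 1:4))
)
preprocessed_data <- list(
  data_scaledPCA_rsdFiltered_varFiltered = mock_matrix
)

# Check that the required component is present before plotting
required <- "data_scaledPCA_rsdFiltered_varFiltered"
has_component <- is.list(preprocessed_data) &&
  required %in% names(preprocessed_data)
```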

method

Character string specifying the correlation method. One of:

"pearson": Pearson product-moment correlation (default)
"spearman": Spearman's rank correlation
"kendall": Kendall's tau correlation
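As a rough illustration of how the three methods differ, the sketch below applies stats::cor to a monotonic but non-linear relationship. The vectors x and y are made up for the example; the function's own internals are not shown in this page.

```r
# Monotonic but non-linear relationship (illustrative data)
x <- 1:10
y <- x^3

# Pearson measures linear association, so it falls below 1 here
r_pearson <- cor(x, y, method = "pearson")

# Rank-based methods only depend on the ordering, which is perfectly preserved
r_spearman <- cor(x, y, method = "spearman")
r_kendall <- cor(x, y, method = "kendall")
```

Rank-based methods are usually the safer choice when relationships are expected to be monotonic rather than linear, or when outliers are a concern.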

plot_top_n

Positive integer giving the number of most variable features/samples (ranked by IQR) to include in the plot. Default is 1000.

plot_what

Character string specifying what to plot. One of:

"Features": Plot correlations between features (default)
"Samples": Plot correlations between samples

plot_type

Character string specifying the plot type. One of:

"correlation": Correlation heatmap with significance masking (default)
"hierarchical": Hierarchical clustering heatmap of raw values

show_rownames

Logical. Whether to display row names. Default is TRUE.

show_colnames

Logical. Whether to display column names. Default is TRUE.

clustering_distance_rows

Character string specifying the distance metric for row clustering. See dist for details. One of: "euclidean" (default), "maximum", "manhattan", "canberra", "binary", "minkowski".

clustering_distance_cols

Character string specifying the distance metric for column clustering. Same options as clustering_distance_rows. Default is "euclidean".
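To make the distance options concrete, the following sketch compares three of the metrics with stats::dist on two points. The matrix m is invented for the example.

```r
# Two points in the plane (illustrative data)
m <- rbind(a = c(0, 0), b = c(3, 4))

d_euclidean <- dist(m, method = "euclidean")  # sqrt(3^2 + 4^2)
d_manhattan <- dist(m, method = "manhattan")  # |3| + |4|
d_maximum <- dist(m, method = "maximum")      # max(|3|, |4|)
```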

clustering_method

Character string specifying the clustering algorithm. See hclust for details. One of:

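"ward.D": Ward-type linkage applied to the dissimilarities as given (default). Per the hclust documentation, this does not implement Ward's (1963) criterion unless the dissimilarities are squared beforehand
"ward.D2": Implements Ward's (1963) criterion; dissimilarities are squared before cluster updating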
"single": Single linkage clustering
"complete": Complete linkage clustering
"average": UPGMA clustering
"mcquitty": WPGMA clustering
"median": WPGMC clustering
"centroid": UPGMC clustering
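The difference between the two Ward options can be checked directly with stats::hclust. The sketch below uses made-up points (pts) and is independent of this function; it only illustrates the behavior of hclust itself.

```r
set.seed(42)
# Two well-separated groups of five points each (mock data)
pts <- rbind(
  matrix(rnorm(10, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(10, mean = 5, sd = 0.3), ncol = 2)
)
d <- dist(pts)

hc_d <- hclust(d, method = "ward.D")
hc_d2 <- hclust(d, method = "ward.D2")

# Both methods separate the two groups here; merge heights generally differ
groups_d <- cutree(hc_d, k = 2)
groups_d2 <- cutree(hc_d2, k = 2)

# "ward.D" on squared distances matches "ward.D2" once heights are square-rooted
hc_sq <- hclust(d^2, method = "ward.D")
same_heights <- isTRUE(all.equal(sqrt(hc_sq$height), hc_d2$height))
```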

significance_threshold

Numeric value between 0 and 1 specifying the p-value threshold for significance masking in correlation plots. Default is 0.05.

color_palette

Character vector of length 3 specifying colors for negative correlations, neutral/non-significant, and positive correlations. Default is c("blue", "white", "red").
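How the three colors are expanded into a continuous scale is not described here. One common approach, shown below as an assumption rather than the function's actual implementation, is to interpolate with grDevices::colorRampPalette and map the result onto the correlation range from -1 to 1. The names n_colors, ramp and breaks are illustrative.

```r
color_palette <- c("blue", "white", "red")

# An odd number of colors puts the neutral color exactly in the middle
n_colors <- 101
ramp <- colorRampPalette(color_palette)(n_colors)

# One more break than colors, spanning the full correlation range
breaks <- seq(-1, 1, length.out = n_colors + 1)
```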

fontsize_main

Numeric value for main title font size. Default is 12.

fontsize_labels

Numeric value for axis label font size. Default is 8.

Value

A list containing:

plot: The generated heatmap plot object
filtered_data: Data frame of filtered data used for plotting
correlation_matrix: Correlation matrix (correlation plots only)
p_values: Matrix of p-values (correlation plots only)
top_features: Names of selected top variable features/samples
iqr_values: Data frame of IQR values for all features/samples
parameters: List of all function parameters used
summary_stats: Summary statistics of the analysis

Details

The function performs the following steps:

1. Validates input parameters and data structure
2. Calculates interquartile ranges (IQR) to identify most variable features/samples
3. Selects top N most variable features/samples based on IQR
4. For correlation plots: computes correlation matrix with p-values and applies significance masking
5. For hierarchical plots: uses raw filtered data with clustering
6. Generates publication-ready heatmap with customizable aesthetics
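The IQR-based selection listed above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the function's source: the mock matrix x, its orientation (samples in rows, features in columns) and the variable names are all hypothetical.

```r
set.seed(123)
# Mock matrix: samples in rows, features in columns (assumed orientation)
x <- cbind(
  low = rnorm(50, sd = 0.1),
  medium = rnorm(50, sd = 1),
  high = rnorm(50, sd = 10)
)

# IQR per feature (column); apply over rows instead to rank samples
iqr_values <- apply(x, 2, IQR)

# Keep the top N most variable features, capped at the number available
plot_top_n <- 2
top_n <- min(plot_top_n, length(iqr_values))
top_features <- names(sort(iqr_values, decreasing = TRUE))[seq_len(top_n)]
filtered <- x[, top_features, drop = FALSE]
```

IQR is less sensitive to a few extreme values than variance, which makes it a reasonable ranking criterion when outliers may be present.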

For correlation plots, non-significant correlations (p >= significance_threshold) are drawn in the neutral color (white by default), so that only statistically significant relationships stand out.
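One way to implement this kind of masking is sketched below with stats::cor and stats::cor.test. It assumes masked cells are set to 0 so that they fall on the neutral midpoint of a diverging color scale; whether the function does exactly this is not stated here, and the data and variable names are made up.

```r
set.seed(7)
n <- 40
a <- rnorm(n)
# b is strongly related to a; c is independent noise (mock data)
x <- cbind(a = a, b = a + rnorm(n, sd = 0.1), c = rnorm(n))
vars <- colnames(x)

cor_mat <- cor(x, method = "pearson")

# Pairwise p-values; the diagonal is left at 0 so it is never masked
p_mat <- matrix(0, ncol(x), ncol(x), dimnames = list(vars, vars))
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    if (i != j) {
      p_mat[i, j] <- cor.test(x[, i], x[, j], method = "pearson")$p.value
    }
  }
}

# Replace non-significant correlations with 0 (the neutral color)
significance_threshold <- 0.05
masked <- cor_mat
masked[p_mat >= significance_threshold] <- 0
```

Note that this sketch applies no multiple-testing correction; with many features, a fixed threshold will let some spurious correlations through.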

Author

John Lennon L. Calorio

Examples

if (FALSE) { # \dontrun{
# Basic correlation heatmap
result <- plot_CorrelationHeatmap(
  data = preprocessed_data,
  method = "pearson",
  plot_top_n = 500
)

# Hierarchical clustering of samples
result <- plot_CorrelationHeatmap(
  data = preprocessed_data,
  plot_what = "Samples",
  plot_type = "hierarchical",
  clustering_method = "ward.D2"
)

# Custom correlation plot with Spearman correlation
result <- plot_CorrelationHeatmap(
  data = preprocessed_data,
  method = "spearman",
  significance_threshold = 0.01,
  color_palette = c("darkblue", "grey90", "darkred")
)
} # }
