Generates a correlation heatmap or a hierarchical clustering heatmap from preprocessed data. The function selects the most variable features or samples by interquartile range (IQR) and, for correlation plots, masks correlations that are not statistically significant.

plot_CorrelationHeatmap(
  data,
  method = "pearson",
  plot_top_n = 1000,
  plot_what = "Features",
  plot_type = "correlation",
  show_rownames = TRUE,
  show_colnames = TRUE,
  clustering_distance_rows = "euclidean",
  clustering_distance_cols = "euclidean",
  clustering_method = "ward.D",
  significance_threshold = 0.05,
  color_palette = c("blue", "white", "red"),
  fontsize_main = 12,
  fontsize_labels = 8
)

Arguments

data

A list containing preprocessed data. It must include a component named data_scaledPCA_rsdFiltered_varFiltered (typically the output of perform_PreprocessingPeakData).

method

Character string specifying the correlation method. One of:

  • "pearson": Pearson product-moment correlation (default)

  • "spearman": Spearman's rank correlation

  • "kendall": Kendall's tau correlation

plot_top_n

Positive integer giving the number of most variable features or samples (ranked by IQR) to include in the plot. Default is 1000.

plot_what

Character string specifying what to plot. One of:

  • "Features": Plot correlations between features (default)

  • "Samples": Plot correlations between samples

plot_type

Character string specifying the plot type. One of:

  • "correlation": Correlation heatmap with significance masking (default)

  • "hierarchical": Hierarchical clustering heatmap of the filtered data values themselves, rather than of correlations

show_rownames

Logical. Whether to display row names. Default is TRUE.

show_colnames

Logical. Whether to display column names. Default is TRUE.

clustering_distance_rows

Character string specifying the distance metric for row clustering. See dist for details. One of: "euclidean" (default), "maximum", "manhattan", "canberra", "binary", "minkowski".

clustering_distance_cols

Character string specifying the distance metric for column clustering. Same options as clustering_distance_rows. Default is "euclidean".

clustering_method

Character string specifying the clustering algorithm. See hclust for details. One of:

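  • "ward.D": Ward-type agglomeration; matches Ward's (1963) criterion only when applied to squared distances (default)

  • "ward.D2": Implements Ward's (1963) criterion by squaring the dissimilarities before cluster updating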

  • "single": Single linkage clustering

  • "complete": Complete linkage clustering

  • "average": UPGMA clustering

  • "mcquitty": WPGMA clustering

  • "median": WPGMC clustering

  • "centroid": UPGMC clustering
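
As a minimal sketch of what the clustering arguments mean, the base R snippet below clusters the rows and columns of a small matrix with dist and hclust. It assumes that clustering_distance_rows, clustering_distance_cols and clustering_method are passed through to those functions; the toy matrix and all variable names are illustrative and are not part of the package.

```r
# Toy matrix: 6 samples (rows) x 4 features (columns); values are made up.
set.seed(1)
x <- matrix(rnorm(24), nrow = 6,
            dimnames = list(paste0("S", 1:6), paste0("F", 1:4)))

# Row clustering: distances between rows, then agglomerative clustering.
d_rows  <- dist(x, method = "euclidean")
hc_rows <- hclust(d_rows, method = "ward.D2")

# Column clustering: transpose so that columns become the clustered objects.
d_cols  <- dist(t(x), method = "manhattan")
hc_cols <- hclust(d_cols, method = "average")

# Leaf order that a clustered heatmap would use for rows and columns.
row_order <- hc_rows$labels[hc_rows$order]
col_order <- hc_cols$labels[hc_cols$order]
```

Changing the distance metric or the linkage method only changes the tree, and therefore the row and column order of the heatmap; the plotted values stay the same.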

significance_threshold

Numeric value between 0 and 1 specifying the p-value threshold for significance masking in correlation plots. Default is 0.05.

color_palette

Character vector of length 3 giving, in order, the colors for negative correlations, neutral or non-significant values, and positive correlations. Default is c("blue", "white", "red").
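
One way a three-color vector like this can be turned into a continuous scale is shown below, using grDevices::colorRampPalette and breaks that are symmetric around zero. Whether the function builds its scale exactly this way is an assumption; n_colors and the other variable names are illustrative.

```r
palette3 <- c("blue", "white", "red")

# An odd number of colors keeps the neutral color in the middle of the scale.
n_colors <- 101
colors <- grDevices::colorRampPalette(palette3)(n_colors)

# Breaks spanning [-1, 1]; one more break than colors, as heatmap
# functions that take separate color and breaks vectors usually expect.
breaks <- seq(-1, 1, length.out = n_colors + 1)
```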

fontsize_main

Numeric value for main title font size. Default is 12.

fontsize_labels

Numeric value for axis label font size. Default is 8.

Value

A list containing:

plot

The generated heatmap plot object

filtered_data

Data frame of filtered data used for plotting

correlation_matrix

Correlation matrix (correlation plots only)

p_values

Matrix of p-values (correlation plots only)

top_features

Names of selected top variable features/samples

iqr_values

Data frame of IQR values for all features/samples

parameters

List of all function parameters used

summary_stats

Summary statistics of the analysis

Details

The function performs the following steps:

  1. Validates input parameters and data structure

  2. Calculates interquartile ranges (IQR) to identify most variable features/samples

  3. Selects top N most variable features/samples based on IQR

  4. For correlation plots: computes correlation matrix with p-values and applies significance masking

  5. For hierarchical plots: uses raw filtered data with clustering

  6. Generates publication-ready heatmap with customizable aesthetics
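
Steps 2 and 3 can be sketched in base R as follows. This is an illustration of IQR-based selection on a toy matrix, not the package's internal code; the orientation (samples in rows, features in columns) and the variable names are assumptions.

```r
set.seed(42)
# Toy data: 10 samples (rows) x 6 features (columns) with increasing spread.
x <- sapply(1:6, function(j) rnorm(10, sd = j))
colnames(x) <- paste0("F", 1:6)

top_n <- 3

# IQR per feature (column); use MARGIN = 1 to rank samples instead.
iqr_values <- apply(x, 2, IQR)

# Names of the top_n features with the largest IQR, capped at what is available.
n_keep <- min(top_n, length(iqr_values))
top_features <- names(sort(iqr_values, decreasing = TRUE))[seq_len(n_keep)]

# Subset the data to the selected features.
filtered <- x[, top_features, drop = FALSE]
```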

For correlation plots, non-significant correlations (p >= significance_threshold) are drawn in the neutral color (white by default), so that only statistically significant relationships stand out.
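
The masking idea can be sketched with cor and cor.test from base R. Setting masked entries to 0 so that they map to the neutral color is an assumption about the mechanism, and the toy data and variable names are illustrative only.

```r
set.seed(7)
x <- matrix(rnorm(30 * 4), nrow = 30,
            dimnames = list(NULL, paste0("F", 1:4)))
x[, "F2"] <- x[, "F1"] + rnorm(30, sd = 0.1)  # force one strong correlation

method <- "pearson"
significance_threshold <- 0.05

cor_mat <- cor(x, method = method)

# Pairwise p-values from cor.test(); the diagonal is left at 0.
p <- ncol(x)
p_mat <- matrix(0, p, p, dimnames = dimnames(cor_mat))
for (i in seq_len(p - 1)) {
  for (j in (i + 1):p) {
    p_val <- cor.test(x[, i], x[, j], method = method)$p.value
    p_mat[i, j] <- p_val
    p_mat[j, i] <- p_val
  }
}

# Non-significant correlations are set to 0, the midpoint of the color scale.
masked <- cor_mat
masked[p_mat >= significance_threshold] <- 0
```

No correction for multiple testing is applied in this sketch; with many features, a fixed threshold such as 0.05 will let some spurious correlations through.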

Author

John Lennon L. Calorio

Examples

if (FALSE) { # \dontrun{
# Basic correlation heatmap
result <- plot_CorrelationHeatmap(
  data = preprocessed_data,
  method = "pearson",
  plot_top_n = 500
)

# Hierarchical clustering of samples
result <- plot_CorrelationHeatmap(
  data = preprocessed_data,
  plot_what = "Samples",
  plot_type = "hierarchical",
  clustering_method = "ward.D2"
)

# Custom correlation plot with Spearman correlation
result <- plot_CorrelationHeatmap(
  data = preprocessed_data,
  method = "spearman",
  significance_threshold = 0.01,
  color_palette = c("darkblue", "grey90", "darkred")
)
} # }