Subset Limma Interaction Effect with Stratified Cross-Validation — subset_limma_interaction

Repeatedly splits samples into train and test sets, stratified by ancestry, and estimates interaction effects with limma on each split. Iterations run in parallel. Returns aggregated statistics, iteration-level statistics, and sample logs.
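
The stratified splitting described above can be sketched in base R. This is a minimal illustration, not the function's actual implementation: the helper `stratified_split()`, its `test_frac` argument, and the toy `ancestry` vector are all hypothetical names introduced here.

```r
# Hypothetical helper (not part of the package): draw a test set within each
# stratum so that every ancestry group keeps roughly the same test fraction.
stratified_split <- function(strata, test_frac = 0.2, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Sample indices grouped by stratum label
  idx_by_stratum <- split(seq_along(strata), strata)
  test_idx <- unlist(lapply(idx_by_stratum, function(idx) {
    # At least one test sample per stratum
    n_test <- max(1, round(length(idx) * test_frac))
    # Index via sample.int() to avoid sample()'s behaviour on length-1 input
    idx[sample.int(length(idx), n_test)]
  }), use.names = FALSE)
  list(
    train = setdiff(seq_along(strata), test_idx),
    test  = sort(test_idx)
  )
}

# Toy ancestry labels (illustrative only): 40 samples of "A", 10 of "B"
ancestry <- rep(c("A", "B"), times = c(40, 10))
split_1  <- stratified_split(ancestry, test_frac = 0.2, seed = 1)

# Test-set counts per ancestry group
table(ancestry[split_1$test])
```

Sampling within each stratum, rather than across all samples at once, keeps a small ancestry group from being left out of the test set by chance.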

Usage

subset_limma_interaction_effect(
  X,
  Y,
  MX,
  MY,
  g_col,
  a_col,
  covariates = NULL,
  use_voom = TRUE,
  n_iter = 1000,
  workers = future::availableCores() - 1,
  seed = NULL
)

Arguments

X: A data frame or matrix of features (genotypes or other predictors).
Y: A response vector or matrix.
MX: Additional metadata or covariates for `X`.
MY: Additional metadata or covariates for `Y`.
g_col: Name of the genotype column in `X` used for interaction.
a_col: Name of the ancestry column used for stratified splitting.
covariates: Optional covariates to include in the model. Default is `NULL`.
use_voom: Logical. Whether to apply the voom transformation before model fitting. Default is `TRUE`.
n_iter: Integer. Number of iterations to run. Default is 1000.
workers: Number of parallel workers to use. Defaults to `future::availableCores() - 1` (reserving one core).
seed: Optional integer random seed for reproducibility.
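
To make the idea of an interaction effect concrete, the sketch below builds a design matrix with an interaction term in base R. It assumes the interaction is between the `g_col` variable and the `a_col` variable, which this page does not state, and the `meta` data frame and its column names are hypothetical.

```r
# Hypothetical sample metadata; column names and levels are illustrative only.
meta <- data.frame(
  group    = factor(rep(c("ctrl", "case"), each = 4), levels = c("ctrl", "case")),
  ancestry = factor(rep(c("A", "B"), times = 4), levels = c("A", "B"))
)

# `group * ancestry` expands to both main effects plus their interaction
design <- model.matrix(~ group * ancestry, data = meta)

# The interaction coefficient is the column whose name contains ":"
interaction_col <- grep(":", colnames(design), value = TRUE)
print(colnames(design))

# With limma installed and an expression matrix `expr` (features x samples),
# a design like this is typically used as:
#   fit <- limma::eBayes(limma::lmFit(expr, design))
#   limma::topTable(fit, coef = interaction_col)
```

Whether the function builds its design this way internally is an assumption. The sketch only shows the standard limma pattern for testing an interaction coefficient.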

Value

A list with the following elements:

aggregated_stats: Data frame with aggregated summary statistics across iterations.
iteration_stats: Data frame with iteration-level statistics.
sample_log: Data frame logging sample IDs used in each split.
meta: List with metadata about the run (e.g., `n_iter` and `seed`).