Repeatedly draws ancestry-stratified train/test splits and, for each split, estimates interaction effects with limma. Iterations run in parallel. Returns aggregated statistics, iteration-level statistics, a sample log, and run metadata.

Usage

subset_limma_interaction_effect(
  X,
  Y,
  MX,
  MY,
  g_col,
  a_col,
  covariates = NULL,
  use_voom = TRUE,
  n_iter = 1000,
  workers = future::availableCores() - 1,
  seed = NULL
)

Arguments

X

A data frame or matrix of features (genotypes or other predictors).

Y

A response vector or matrix.

MX

Additional metadata or covariates for `X`.

MY

Additional metadata or covariates for `Y`.

g_col

Name of the genotype column in `X` used for interaction.

a_col

Name of the ancestry column used for stratified splitting.

covariates

Optional covariates to include in the model. Default is `NULL`.

use_voom

Logical. Whether to apply limma's voom transformation before fitting. Default is `TRUE`.

n_iter

Integer. Number of iterations to run. Default is 1000.

workers

Number of parallel workers to use. Defaults to `future::availableCores() - 1` (reserving one core).

seed

Optional integer random seed for reproducibility. Default is `NULL`.
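
To illustrate what a stratified split on `a_col` with a fixed `seed` can look like, here is a minimal base R sketch. It is not the package's implementation: the helper `stratified_split()`, its `train_frac` argument, and the toy `meta` data frame are all hypothetical and exist only for this example.

```r
# Hypothetical helper (not part of the package): draw one train/test split
# that keeps the same training fraction within each ancestry group.
stratified_split <- function(meta, a_col, train_frac = 0.8, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Row indices grouped by the values of the ancestry column
  idx_by_group <- split(seq_len(nrow(meta)), meta[[a_col]])
  train_idx <- unlist(lapply(idx_by_group, function(idx) {
    n_train <- floor(length(idx) * train_frac)
    # sample.int() on the group size avoids the sample() pitfall
    # when a group contains a single index
    idx[sample.int(length(idx), n_train)]
  }), use.names = FALSE)
  list(
    train = sort(train_idx),
    test = setdiff(seq_len(nrow(meta)), train_idx)
  )
}

# Toy metadata (assumed layout): 15 samples of ancestry "A", 5 of ancestry "B"
meta <- data.frame(
  sample_id = paste0("S", 1:20),
  ancestry = rep(c("A", "B"), times = c(15, 5))
)

split_1 <- stratified_split(meta, a_col = "ancestry", train_frac = 0.8, seed = 42)
table(meta$ancestry[split_1$train])
```

Calling the helper again with the same `seed` reproduces the same split, which is the behaviour the `seed` argument is meant to provide across iterations.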

Value

A list with the following elements:

aggregated_stats

Data frame with aggregated summary statistics across iterations.

iteration_stats

Data frame with iteration-level statistics.

sample_log

Data frame logging sample IDs used in each split.

meta

List with metadata about the run (e.g., `n_iter` and `seed`).
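
The relationship between `iteration_stats` and `aggregated_stats` can be pictured as a per-feature summary over iterations. The sketch below uses a made-up table; the column names (`iteration`, `feature`, `logFC`) and the choice of the mean as the summary are assumptions for illustration, not the documented output of this function.

```r
# Hypothetical iteration-level table; column names are assumptions,
# not the package's documented output.
iteration_stats <- data.frame(
  iteration = rep(1:3, each = 2),
  feature = rep(c("gene1", "gene2"), times = 3),
  logFC = c(0.5, -1.0, 0.7, -0.8, 0.6, -1.2)
)

# Summarise each feature across iterations (here: the mean effect size)
aggregated <- aggregate(logFC ~ feature, data = iteration_stats, FUN = mean)
aggregated
```

Inspecting the actual column names of the returned data frames (for example with `str()`) is the safest way to see which statistics are reported.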