This function compares model prediction performance between two datasets (e.g., different ancestries) using a reference training set. It trains a classification model with tidymodels on the reference set, evaluates it on the test and inference datasets, and assesses the statistical significance of the performance difference via permutation testing.

Usage

perm_prediction_difference(
  X,
  Y,
  R,
  MX,
  MY,
  MR,
  g_col,
  method = c("glmnet", "rf"),
  metric = c("roc_auc"),
  cv_folds = 5,
  tune_len = 10,
  max_iter = 1000,
  B = 1000,
  seed = NULL
)
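
The sketch below shows one way the function might be called on small simulated data. The object names, the outcome column name "condition", and the reduced number of permutations are illustrative only; the simulated outcome is random noise, so the example is purely structural.

set.seed(1)
n <- 40   # samples per group
p <- 25   # shared features

feat <- paste0("f", seq_len(p))
R_train <- matrix(rnorm(n * p), nrow = n, dimnames = list(NULL, feat))  # reference training group
X_test  <- matrix(rnorm(n * p), nrow = n, dimnames = list(NULL, feat))  # test group (ancestry 1)
Y_infer <- matrix(rnorm(n * p), nrow = n, dimnames = list(NULL, feat))  # inference group (ancestry 2)

# Metadata with a binary outcome column (exactly 2 levels), as required by g_col
meta_train <- data.frame(condition = factor(rep(c("case", "control"), each = n / 2)))
meta_test  <- data.frame(condition = factor(rep(c("case", "control"), each = n / 2)))
meta_infer <- data.frame(condition = factor(rep(c("case", "control"), each = n / 2)))

res <- perm_prediction_difference(
  X = X_test, Y = Y_infer, R = R_train,
  MX = meta_test, MY = meta_infer, MR = meta_train,
  g_col = "condition",
  method = "glmnet",
  metric = "roc_auc",
  cv_folds = 5,
  B = 100,   # fewer permutations than the default of 1000, to keep the sketch quick
  seed = 42
)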

Arguments

X

Matrix of predictors for the test group (ancestry 1); samples x features

Y

Matrix of predictors for the inference group (ancestry 2)

R

Matrix of predictors for the reference training group

MX

Data frame of metadata for `X`; must include the outcome column named in `g_col`

MY

Data frame of metadata for `Y`

MR

Data frame of metadata for `R`

g_col

Character. Name of the column in `MX`, `MY`, `MR` containing the binary outcome (must have exactly 2 levels).

method

Character. Which model to use: `"glmnet"` (default) or `"rf"` (random forest).

metric

Character. Performance metric to optimize and test; currently only `"roc_auc"` is supported.

cv_folds

Integer. Number of cross-validation folds (default = 5).

tune_len

Integer. Number of levels per hyperparameter in grid search (default = 10).

max_iter

Integer. Currently unused; reserved for future functionality (default = 1000).

B

Integer. Number of permutations to run (default = 1000).

seed

Optional integer seed for reproducibility.

Value

A list containing:

summary_stats

A data frame with the observed test statistic, per-group performance metrics, and the permutation p-value

T_null

Null distribution of the test statistic from permutations

B_used

Number of successful permutations
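
A rough sketch of working with the returned list, assuming `res` holds the result of the call sketched under Usage and that `T_null` is a numeric vector. The p-value formula in the final comment is the standard permutation construction and may differ in detail from the internal computation.

res$summary_stats   # observed statistic, per-group metrics, and p-value
res$B_used          # how many of the B permutations completed successfully

# Visualise the permutation null distribution of the test statistic
hist(res$T_null,
     main = "Permutation null distribution",
     xlab = "Test statistic under permuted group labels")

# A two-sided permutation p-value of this kind is typically computed as
#   p = (1 + sum(abs(T_null) >= abs(T_observed))) / (1 + B_used)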