Aggregate gene statistics using Rank Product — RP_aggregation • CrossAncestryGenPhen

This function aggregates results from multiple iterations or studies by calculating, for each feature, the Rank Product (RP) of raw p-values, the mean of the test statistics (`T_obs`), and the proportion of iterations that are significant based on FDR-adjusted p-values. It can optionally apply jitter to p-values to break ties in low-resolution data. If jitter is not applied, the function reports the fraction of tied p-values in each iteration and the mean fraction across iterations, which helps users assess whether jittering is appropriate.
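The aggregation described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name `rp_sketch`, the within-iteration ranking, and the use of the geometric mean of ranks as the Rank Product are all assumptions, and the toy data is made up.

```r
# Illustrative sketch only; NOT the CrossAncestryGenPhen implementation.
# Assumptions: p-values are ranked within each iteration, and RP is the
# geometric mean of a feature's ranks across iterations.
rp_sketch <- function(x, fdr_threshold = 0.05) {
  # Rank raw p-values within each iteration (ties get average ranks).
  x$rank <- ave(x$p_value, x$iteration, FUN = rank)

  # Per-feature summaries across iterations.
  mean_T <- tapply(x$T_obs, x$feature, mean)
  rp     <- tapply(x$rank, x$feature, function(r) exp(mean(log(r))))
  prop   <- tapply(x$p_adj < fdr_threshold, x$feature, mean)

  data.frame(
    feature    = names(mean_T),
    mean_T_obs = as.numeric(mean_T),
    RP         = as.numeric(rp),
    prop_sig   = as.numeric(prop),
    row.names  = NULL
  )
}

# Toy input: three features observed in two iterations (made-up values).
toy <- data.frame(
  feature   = rep(c("g1", "g2", "g3"), times = 2),
  T_obs     = c(3.0, 1.0, 0.2, 2.0, 1.5, 0.1),
  p_value   = c(0.001, 0.20, 0.80, 0.01, 0.04, 0.90),
  p_adj     = c(0.003, 0.30, 0.80, 0.03, 0.06, 0.90),
  iteration = rep(1:2, each = 3)
)

out <- rp_sketch(toy)
print(out)
```

In this toy input, `g1` has the smallest p-value in both iterations, so it receives the lowest RP under the assumed definition.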

Usage

RP_aggregation(
  x,
  fdr_threshold = 0.05,
  jitter_p = FALSE,
  jitter_amount = 1e-06
)

Arguments

x

A data.frame with at least the following columns:

feature: Gene or feature identifier.
T_obs: Observed test statistic per iteration.
p_value: Raw p-value.
p_adj: FDR-adjusted p-value.
iteration: Replicate or study identifier.

fdr_threshold

Numeric. FDR-adjusted p-value threshold used to determine significance (default is 0.05).

jitter_p

Logical. Whether to apply small uniform random jitter to p-values to break ties and improve rank resolution (default is FALSE).

jitter_amount

Numeric. Maximum amount of uniform noise to add to p-values during jittering (default is 1e-6).
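To see why jittering changes ranks, the following base R sketch measures ties and then breaks them with small uniform noise. It is an illustration under stated assumptions, not the package's code: the noise is assumed to be drawn from `[0, jitter_amount]` and capped at 1, and the tie fraction is assumed to be the share of p-values that have at least one duplicate.

```r
# Illustrative sketch only; the noise range and tie definition are assumptions.
set.seed(1)

# Low-resolution p-values with ties (made-up values).
p <- c(0.01, 0.01, 0.05, 0.05, 0.05, 0.20)
jitter_amount <- 1e-6

# Fraction of p-values that share their value with at least one other.
tie_fraction <- mean(duplicated(p) | duplicated(p, fromLast = TRUE))

# Add uniform noise in [0, jitter_amount]; cap at 1 to stay a valid p-value.
p_jit <- pmin(p + runif(length(p), min = 0, max = jitter_amount), 1)

print(tie_fraction)
print(rank(p))      # tied values share average ranks
print(rank(p_jit))  # ties are broken after jittering
```

Because the noise is far smaller than the gaps between distinct p-values here, jittering only reorders values that were tied to begin with.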

Value

A data.frame with the following columns:

feature: Gene or feature identifier.
mean_T_obs: Mean of observed test statistics across iterations.
RP: Rank Product score (lower = stronger consistent signal).
prop_sig: Proportion of iterations in which the FDR-adjusted p-value (`p_adj`) is below `fdr_threshold`.
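One way to work with a result of this shape is to order features by RP and then filter on `prop_sig`. The sketch below uses a hand-made data.frame with the documented columns; the values and the 0.5 cutoff are hypothetical and chosen only for illustration.

```r
# Hypothetical result with the documented columns (values are made up).
result <- data.frame(
  feature    = c("g1", "g2", "g3"),
  mean_T_obs = c(0.4, 2.8, 1.1),
  RP         = c(2.9, 1.0, 2.1),
  prop_sig   = c(0.0, 1.0, 0.5)
)

# Lower RP indicates a stronger consistent signal, so sort ascending.
ranked <- result[order(result$RP), ]

# Keep features significant in at least half of the iterations
# (the 0.5 cutoff is an arbitrary choice for this example).
consistent <- ranked[ranked$prop_sig >= 0.5, ]
print(consistent)
```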

Examples

if (FALSE) { # \dontrun{
result <- RP_aggregation(
  combined_results,
  jitter_p = TRUE,
  jitter_amount = 1e-6
)
head(result)
} # }
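The example above assumes that `combined_results` already exists. A minimal sketch of building an input with the required columns is shown below; the simulated statistics, the two-sided normal p-values, and the Benjamini-Hochberg adjustment within each iteration are assumptions made for illustration and may differ from how the package's upstream functions produce these columns.

```r
# Illustrative sketch only: simulate an input with the documented columns.
set.seed(42)
n_features <- 5
n_iter     <- 3

combined_results <- do.call(rbind, lapply(seq_len(n_iter), function(i) {
  T_obs   <- rnorm(n_features)           # simulated test statistics
  p_value <- 2 * pnorm(-abs(T_obs))      # two-sided normal p-values (assumption)
  data.frame(
    feature   = paste0("gene", seq_len(n_features)),
    T_obs     = T_obs,
    p_value   = p_value,
    p_adj     = p.adjust(p_value, method = "BH"),  # FDR within iteration
    iteration = i
  )
}))

# Check that every column required by the documentation is present.
required <- c("feature", "T_obs", "p_value", "p_adj", "iteration")
stopifnot(all(required %in% names(combined_results)))

# With CrossAncestryGenPhen installed and loaded, the call would then be:
# result <- RP_aggregation(combined_results)
```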