Simulates RNA-seq count data from a Negative Binomial model whose parameters are estimated from real count data. Useful for creating a realistic baseline dataset for method development or for simulating differentially expressed genes (DEGs).

Usage

simulate_NB_counts(X, MX, g_col, a_col, seed = NULL)

Arguments

X

A matrix or data frame of real (integer) RNA-seq counts, with samples in rows and genes in columns (the same samples × genes shape as the returned counts).

MX

A data frame of sample metadata whose row names match the sample names of X, with one column for ancestry and one for experimental group.

g_col

The name of the column in MX indicating experimental group (e.g., "group").

a_col

The name of the column in MX indicating ancestry (e.g., "ancestry").

seed

Optional random seed for reproducibility.
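
The arguments above can be illustrated with a small toy input. This is a sketch only: the counts, sample names, and the group and ancestry labels ("case", "control", "A", "B") are made up, and the layout assumes samples in rows, as the samples × genes shape of the returned counts suggests. The call is guarded so the snippet also runs when the package providing simulate_NB_counts is not loaded.

```r
# Toy inputs; all values and labels here are hypothetical.
set.seed(1)
n_samples <- 8
n_genes <- 20

# Assumption: samples in rows, genes in columns.
X <- matrix(
  rnbinom(n_samples * n_genes, mu = 50, size = 2),
  nrow = n_samples,
  dimnames = list(
    paste0("sample", seq_len(n_samples)),
    paste0("gene", seq_len(n_genes))
  )
)

# Sample metadata with row names matching the samples of X.
# The column names are the values passed as g_col and a_col.
MX <- data.frame(
  group = rep(c("case", "control"), times = 4),
  ancestry = rep(c("A", "B"), each = 4),
  row.names = rownames(X)
)

# Run the simulation only if the function is available.
if (exists("simulate_NB_counts")) {
  sim <- simulate_NB_counts(X, MX, g_col = "group", a_col = "ancestry", seed = 42)
  str(sim, max.level = 1)
}
```

Passing a fixed seed should make repeated calls return the same simulated counts.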

Value

A list containing:

counts

A simulated count matrix of the same shape as X (samples × genes).

sample_info

Metadata with ancestry, group, and interaction group.

params

A data frame of per-gene estimated means and dispersions.
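
To make the per-gene means and dispersions concrete, the following base R sketch shows one way such parameters could be estimated and then used to draw Negative Binomial counts. It is not the implementation of simulate_NB_counts: the method-of-moments estimator, the dispersion floor of 1e-4, and the helper names estimate_nb_params and simulate_from_params are assumptions chosen for illustration.

```r
# Hypothetical helper: per-gene method-of-moments estimates.
# Assumes X has samples in rows and genes in columns.
estimate_nb_params <- function(X) {
  mu <- colMeans(X)
  v <- apply(X, 2, var)
  # NB variance: v = mu + dispersion * mu^2, so dispersion = (v - mu) / mu^2
  disp <- (v - mu) / mu^2
  # Floor invalid or non-positive estimates so rnbinom() gets a finite size.
  disp[!is.finite(disp) | disp <= 0] <- 1e-4
  data.frame(gene = colnames(X), mean = mu, dispersion = disp)
}

# Hypothetical helper: draw a samples x genes count matrix from the estimates.
simulate_from_params <- function(params, n_samples, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n_genes <- nrow(params)
  # rep(..., each = n_samples) lines up each gene's parameters with one
  # column of the column-major matrix built below.
  draws <- rnbinom(
    n_samples * n_genes,
    mu = rep(params$mean, each = n_samples),
    size = rep(1 / params$dispersion, each = n_samples)
  )
  matrix(draws, nrow = n_samples, dimnames = list(NULL, params$gene))
}

# Toy data standing in for real counts.
set.seed(1)
X <- matrix(
  rnbinom(8 * 20, mu = 50, size = 2),
  nrow = 8,
  dimnames = list(paste0("sample", 1:8), paste0("gene", 1:20))
)

params <- estimate_nb_params(X)
sim <- simulate_from_params(params, n_samples = nrow(X), seed = 42)
```

Here the dispersion is the reciprocal of the size argument of rnbinom(), so larger dispersions give more overdispersed counts relative to a Poisson model with the same mean.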