generate synthetic fastq buffer
Mojo function
generate_synthetic_fastq_buffer
def generate_synthetic_fastq_buffer(num_reads: Int, min_length: Int, max_length: Int, min_phred: Int, max_phred: Int, quality_schema: String, gc_bias: Float32 = 0.5) -> List[Byte]

Generate a contiguous in-memory FASTQ buffer with configurable read length and quality distribution.
Read lengths are chosen deterministically in [min_length, max_length] (inclusive). Per-base Phred scores follow a positional decay model (high quality at 5’ end, degrading toward 3’ end), mimicking real Illumina quality profiles. Base composition follows a configurable GC content model with pseudorandom distribution.
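To make the quality and composition models concrete, here is a minimal Python sketch of one way such a generator could behave. It is not the library's implementation. The linear decay curve, the Phred+33 encoding, the seeded `random.Random`, and the helper names `decayed_phred`, `biased_base`, and `sketch_read` are all assumptions made for illustration.

```python
import random


def decayed_phred(pos: int, length: int, min_phred: int, max_phred: int) -> int:
    """Assumed model: linear decay from max_phred (5' end) to min_phred (3' end)."""
    if length <= 1:
        return max_phred
    frac = pos / (length - 1)  # 0.0 at the first base, 1.0 at the last base
    return round(max_phred - frac * (max_phred - min_phred))


def biased_base(rng: random.Random, gc_bias: float) -> str:
    """Draw G or C with probability gc_bias, otherwise A or T."""
    if rng.random() < gc_bias:
        return rng.choice("GC")
    return rng.choice("AT")


def sketch_read(length: int, min_phred: int, max_phred: int,
                gc_bias: float = 0.5, seed: int = 0) -> tuple[str, str]:
    """Return (sequence, quality string); quality uses Phred+33 (assumed)."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    seq = "".join(biased_base(rng, gc_bias) for _ in range(length))
    qual = "".join(
        chr(33 + decayed_phred(i, length, min_phred, max_phred))
        for i in range(length)
    )
    return seq, qual
```

Under these assumptions, `sketch_read(10, 2, 40)` returns a quality string that starts at Phred 40 and ends at Phred 2, and a `gc_bias` of 1.0 produces only G and C bases.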
Args:
- num_reads (Int): Number of FASTQ records to generate.
- min_length (Int): Minimum sequence length per read (inclusive).
- max_length (Int): Maximum sequence length per read (inclusive).
- min_phred (Int): Minimum Phred score per base (inclusive); used as the floor for 3’ end quality.
- max_phred (Int): Maximum Phred score per base (inclusive); used as the ceiling for 5’ end quality.
- quality_schema (String): Schema name (e.g. “sanger”, “solexa”, “illumina_1.8”, “generic”).
- gc_bias (Float32): Target GC fraction in [0.0, 1.0]. Default 0.5 (50% GC). Values above 0.5 increase G/C frequency; values below 0.5 increase A/T frequency.
Returns:
List[Byte]: A single contiguous buffer of valid 4-line FASTQ records; pass it to MemoryReader for parsing.
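For reference, "4-line FASTQ" means each record occupies exactly four lines: an `@` header, the sequence, a `+` separator, and a quality string of the same length as the sequence. The Python sketch below checks that layout on a raw byte buffer. The helper `iter_fastq_records` is hypothetical and is not part of this library or of MemoryReader.

```python
from typing import Iterator


def iter_fastq_records(buf: bytes) -> Iterator[tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a 4-line FASTQ byte buffer."""
    lines = buf.splitlines()  # on bytes, this splits only on \n, \r, and \r\n
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ buffer must contain a multiple of 4 lines")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = (ln.decode("ascii") for ln in lines[i:i + 4])
        if not header.startswith("@"):
            raise ValueError(f"record at line {i + 1}: header must start with '@'")
        if not plus.startswith("+"):
            raise ValueError(f"record at line {i + 3}: separator must start with '+'")
        if len(seq) != len(qual):
            raise ValueError(f"record at line {i + 1}: sequence/quality length mismatch")
        yield header[1:], seq, qual
```

A check like this can serve as a quick sanity test on a generated buffer before it reaches a real parser.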
Raises:
Error: If num_reads < 0, min_length > max_length, or min_phred > max_phred.