generate_synthetic_fasta_buffer

Mojo function

def generate_synthetic_fasta_buffer(num_reads: Int, min_length: Int, max_length: Int, line_width: Int = 60, gc_bias: Float32 = 0.5) -> List[Byte]

Generate a contiguous in-memory FASTA buffer with configurable sequence length and GC content.

Sequence lengths are chosen deterministically in [min_length, max_length]. Sequences are wrapped at line_width bases per line (multiline FASTA). Base composition follows the same LCG + GC-bias model as generate_synthetic_fastq_buffer.
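The model described above can be illustrated with a minimal Python sketch. This is not the library's implementation: the LCG constants, the `seed` parameter, the `read_<i>` header format, the way lengths and bases are drawn from the generator state, and the specific argument checks are all assumptions made for illustration. Only the overall shape (deterministic lengths in [min_length, max_length], GC-biased base choice, wrapping at line_width) comes from the description.

```python
def lcg_next(state: int) -> int:
    # 32-bit linear congruential generator step.
    # Constants are the well-known Numerical Recipes values (an assumption,
    # not necessarily what the Mojo library uses).
    return (1664525 * state + 1013904223) % 2**32


def synthetic_fasta(num_reads, min_length, max_length,
                    line_width=60, gc_bias=0.5, seed=42) -> bytes:
    # Hypothetical validation rules; the real function only documents
    # that it raises when arguments are invalid.
    if num_reads < 0 or min_length < 1 or max_length < min_length:
        raise ValueError("invalid read count or length range")
    if line_width < 1 or not 0.0 <= gc_bias <= 1.0:
        raise ValueError("invalid line_width or gc_bias")

    out = bytearray()
    state = seed
    span = max_length - min_length + 1
    for i in range(num_reads):
        # Deterministic length in [min_length, max_length].
        state = lcg_next(state)
        length = min_length + (state >> 16) % span

        seq = bytearray()
        for _ in range(length):
            # Top 24 bits mapped to [0, 1) decide GC versus AT.
            state = lcg_next(state)
            u = (state >> 8) / float(1 << 24)
            # Top bit of the next state picks one base within the pair.
            state = lcg_next(state)
            pick = (state >> 31) & 1
            seq.append(b"GC"[pick] if u < gc_bias else b"AT"[pick])

        # Header line, then the sequence wrapped at line_width bases.
        out += b">read_%d\n" % i
        for start in range(0, length, line_width):
            out += seq[start:start + line_width] + b"\n"
    return bytes(out)
```

Because the generator is seeded with a fixed value, calling `synthetic_fasta` twice with the same arguments yields byte-identical buffers, which is the property that makes this kind of generator useful for reproducible parser benchmarks.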

Args:

  • num_reads (Int): Number of FASTA records to generate.
  • min_length (Int): Minimum sequence length per record (inclusive).
  • max_length (Int): Maximum sequence length per record (inclusive).
  • line_width (Int): Number of bases per sequence line. Default 60 (standard FASTA).
  • gc_bias (Float32): Target GC fraction in [0.0, 1.0]. Default 0.5.

Returns:

List[Byte]: A buffer containing valid multiline FASTA data; pass it to MemoryReader for parsing.

Raises:

Error: If arguments are invalid.