
compute_num_fasta_reads_for_size

Mojo function

def compute_num_fasta_reads_for_size(target_size_bytes: Int, min_length: Int, max_length: Int, line_width: Int = 60) -> Int

Compute the number of FASTA records needed to approximate a target size.

Estimates bytes per record based on:

  • Header: >read_<padded_i>\n (fixed per record: 6 bytes for >read_, num_digits bytes for the padded index, 1 byte for the newline)
  • Sequence: seq_len bytes + ceil(seq_len / line_width) newlines
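The two terms above can be checked with a little arithmetic. The following is a minimal Python sketch, not the library's implementation. The helper name fasta_record_bytes is hypothetical, and num_digits is passed in explicitly because the source does not say how the padding width is chosen.

```python
import math

def fasta_record_bytes(seq_len: int, num_digits: int, line_width: int = 60) -> int:
    """Bytes for one record, following the header and sequence terms above."""
    header = 6 + num_digits + 1                  # ">read_" + padded index + "\n"
    newlines = math.ceil(seq_len / line_width)   # one "\n" per sequence line
    return header + seq_len + newlines

# Cross-check the formula against a record actually built as a string.
seq = "A" * 150
lines = [seq[i:i + 60] for i in range(0, len(seq), 60)]
record = ">read_" + "0042" + "\n" + "".join(line + "\n" for line in lines)
assert len(record) == fasta_record_bytes(150, num_digits=4)
```

For a 150-base read with a 4-digit index, the header contributes 11 bytes and the sequence contributes 150 bases plus 3 newlines.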

Args:

  • target_size_bytes (Int): Target total size in bytes.
  • min_length (Int): Minimum sequence length per record (inclusive).
  • max_length (Int): Maximum sequence length per record (inclusive).
  • line_width (Int): Number of bases per sequence line (default 60).

Returns:

Int: Estimated number of records needed to reach target_size_bytes.
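One way such an estimate could be computed is sketched below in Python. This is an illustration under stated assumptions, not the function's actual body: it assumes sequence lengths average to the midpoint of min_length and max_length, takes the index padding width as a hypothetical num_digits parameter, and rounds down with a floor of one record. The name estimate_num_fasta_reads is also hypothetical.

```python
import math

def estimate_num_fasta_reads(
    target_size_bytes: int,
    min_length: int,
    max_length: int,
    line_width: int = 60,
    num_digits: int = 6,  # ASSUMPTION: padding width is not specified in the source
) -> int:
    # ASSUMPTION: lengths are spread evenly, so the midpoint stands in for the mean.
    avg_len = (min_length + max_length) // 2
    header = 6 + num_digits + 1                           # ">read_" + index + "\n"
    seq_bytes = avg_len + math.ceil(avg_len / line_width)  # bases + line breaks
    bytes_per_record = header + seq_bytes
    # ASSUMPTION: round down, but never return fewer than one record.
    return max(1, target_size_bytes // bytes_per_record)

# Records needed for roughly 1 MB of reads between 100 and 200 bases long.
n = estimate_num_fasta_reads(1_000_000, min_length=100, max_length=200)
```

Because the result is based on an average record size, the generated file will only approximate target_size_bytes when actual lengths vary.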