sequencing data (fastq)

These are tools for manipulating FASTA and FASTQ files. For tools that can programmatically download FASTA and FASTQ files from repositories, see the section on data download.

general purpose (swiss army knife) tools

seqkit provides subcommands for most operations you are likely to need on FASTQ or FASTA files. It is fast and well documented.
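To give a feel for the subcommand style, here is a minimal sketch of three common operations. The file names (demo.fastq and friends) are made up for illustration, and the block assumes seqkit is already installed and on your PATH; it prints a message and does nothing else if it is not.

```shell
# Create a tiny two-record FASTQ so the commands have something to run on.
# (demo.fastq is a hypothetical file name used only for this example.)
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCCA\n+\nIIIII\n' > demo.fastq

if command -v seqkit >/dev/null 2>&1; then
    # Summary statistics: number of records, total and average length
    seqkit stats demo.fastq

    # Convert FASTQ to FASTA
    seqkit fq2fa demo.fastq -o demo.fasta

    # Keep only reads that are at least 5 bases long
    seqkit seq -m 5 demo.fastq -o demo.min5.fastq
else
    echo "seqkit not found on PATH; install it first (e.g. from bioconda)" >&2
fi
```

Run `seqkit --help` for the full list of subcommands; each one has its own `--help` as well.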

Splitting FASTQ data

fastqsplitter

fastqsplitter splits a FASTQ file evenly across a chosen number of output files.

Installation of fastqsplitter from bioconda (shown here with mamba; conda accepts the same arguments):

mamba create -n fastqsplitter -c bioconda fastqsplitter

Usage of fastqsplitter:

fastqsplitter -i input.fastq.gz -o output_1.fastq.gz -o output_2.fastq.gz -o output_3.fastq.gz

In the example above, fastqsplitter distributes the reads in the input FASTQ evenly across 3 output files. You do not pass the number of parts explicitly; it is inferred from how many -o outputs you specify.
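If it helps to picture what an even split means, the sketch below distributes records round-robin across 3 files using plain awk. This is only an illustration of the idea and is not how fastqsplitter is implemented; the file names (rr_input.fastq, rr_part_N.fastq) are hypothetical, and it assumes strict 4-line FASTQ records with no multi-line sequences.

```shell
# Build a small six-record FASTQ (4 lines per record) as demo input.
for i in 1 2 3 4 5 6; do
    printf '@read%s\nACGT\n+\nIIII\n' "$i"
done > rr_input.fastq

# Round-robin split: record k goes to part ((k-1) mod n) + 1.
# Assumption: every record is exactly 4 lines.
awk -v n=3 '{
    rec = int((NR - 1) / 4)                    # zero-based record index
    out = "rr_part_" (rec % n + 1) ".fastq"    # target file for this record
    print > out
}' rr_input.fastq

# Each part should now hold 2 records (8 lines).
wc -l rr_part_1.fastq rr_part_2.fastq rr_part_3.fastq
```

For real data, prefer fastqsplitter itself: it reads and writes gzip-compressed files directly, which this sketch does not.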