Supported Applications
AppCiter will help you create a bibliography of the programs you wish to cite.
How to use AppCiter?
1. Choose Your Programs
Use the button in the Name column to choose your programs. Proceed to Step 2 to view available citations.
2. Select Citations
Select citations from a custom list. Proceed to Step 3 to download.
3. Export Citations
Export citations or send them to your email.
Results:
Name | Description | Links |
---|---|---|
10xbamtofastq
|
tool for converting 10x BAMs produced by Cell Ranger, Space Ranger, Cell Ranger ATAC, Cell Ranger DNA, and Long Ranger back to FASTQ files that can be used as inputs to re-run analysis.
Keywords:
High-Throughput Sequencing
|
|
A5
|
A5-miseq is a pipeline for assembling DNA sequence data generated on the Illumina sequencing platform. A5-miseq can produce high-quality microbial genome assemblies on a laptop computer without any parameter tuning by automating the process of adapter trimming, quality filtering, error correction, contig and scaffold generation and detection of misassemblies.
|
|
abeona
|
a simple transcriptome assembler based on kallisto and Cortex graphs.
Keywords:
Transcriptomics
High-Throughput Sequencing
|
|
abismal
|
abismal is a fast and memory-efficient mapper for short bisulfite sequencing reads
Keywords:
bisulfite-Seq
High-Throughput Sequencing
|
|
abPOA
|
an extended version of Partial Order Alignment (POA) that performs adaptive banded dynamic programming (DP) with an SIMD implementation.
|
|
ABRicate
|
mass screening of contigs for antibiotic resistance genes.
|
|
AbundanceBin
|
an abundance-based tool for binning metagenomic sequences.
|
|
AGAT
|
(Another Gff Analysis Toolkit) a suite of tools to handle gene annotations in any GTF/GFF format.
Keywords:
High-throughput sequencing
|
|
Assembled Genomes Compressor
|
Assembled Genomes Compressor (AGC) is a tool designed to compress collections of de-novo assembled genomes. It can be used for various types of datasets: short genomes (viruses) as well as long (humans).
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
AGFusion
|
a python package for annotating gene fusions from the human or mouse genomes.
Keywords:
Genome Annotation
High-throughput sequencing
|
|
AKT
|
(Ancestry and Kinship Toolkit) a statistical genetics tool for analysing large cohorts of whole-genome sequenced samples. It provides a handful of useful statistical genetics routines using the htslib API for input/output. This means it can seamlessly read BCF/VCF files and play nicely with bcftools.
|
|
alevin‑fry
|
is a tool for the efficient processing of single-cell data based on RAD files produced by alevin.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
Alfred
|
an efficient and versatile command-line application that computes multi-sample quality control metrics in a read-group aware manner.
|
|
AlignStats
|
AlignStats produces various alignment, whole genome coverage, and capture coverage metrics for sequence alignment files in SAM, BAM, and CRAM format.
Keywords:
High-throughput sequencing
|
|
allo
|
Multi-mapped read rescue strategy for gene regulatory analyses
Keywords:
High-throughput sequencing
|
|
AMPtk
|
AMPtk: Amplicon tool kit for processing high throughput amplicon sequencing data.
Keywords:
High-throughput sequencing
|
|
AnchorWave
|
AnchorWave (Anchored Wavefront Alignment) identifies collinear regions via conserved anchors (full-length CDS and full-length exon have been implemented currently) and breaks collinear regions into shorter fragments, i.e., anchor and inter-anchor intervals.
|
|
anvi'o
|
an open-source, community-driven analysis and visualization platform for ‘omics data. Its interactive interface facilitates the management of metagenomic contigs and associated data for automatic or human-guided identification of genome bins and their curation.
|
|
ARAGORN
|
ARAGORN identifies tRNA and tmRNA genes. The program employs heuristic algorithms to predict tRNA secondary structure, based on homology with recognized tRNA consensus sequences and ability to form a base‐paired cloverleaf.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
arcasHLA
|
high-resolution HLA typing from RNA seq.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
arcs
|
Scaffolding genome sequence assemblies using linked or long reads.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
ARIBA
|
(Antibiotic Resistance Identification By Assembly) a tool that identifies antibiotic resistance genes by running local assemblies. It can also be used for MLST calling.
|
|
assembly‑stats
|
Get assembly statistics from FASTA and FASTQ files.
Keywords:
Genome Assembly
High-throughput sequencing
|
|
atropos
|
trim adapters from high-throughput sequencing reads.
Keywords:
High-throughput sequencing
|
|
AWS CLI
|
(Amazon Web Services Command Line Interface) a command line interface tool to manage multiple Amazon Web Services and automate them through scripts.
Keywords:
Other
High-Throughput Sequencing
|
|
Bakta
|
rapid and standardized annotation of bacterial genomes & plasmids.
|
|
Balrog
|
A universal protein model for prokaryotic gene prediction
|
|
bam‑readcount
|
bam-readcount generates metrics at single nucleotide positions.
Keywords:
High-throughput sequencing
|
|
BAMscale
|
BAMscale is a one-step tool for either 1) quantifying and normalizing the coverage of peaks or 2) generating scaled BigWig files for easy visualization of commonly used DNA-seq capture-based methods.
Keywords:
High-throughput sequencing
|
|
BamToCov
|
Extract coverage information from BAM files, supporting stranded and physical coverage and streams.
Keywords:
High-throughput sequencing
|
|
bamtofastq
|
Tool for converting 10x BAMs produced by Cell Ranger
Keywords:
High-throughput sequencing
|
|
bamtools
|
a fast, flexible C++ API & toolkit for reading, writing, and manipulating BAM files.
|
|
bamUtil
|
a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, bam.
Keywords:
High-throughput sequencing
|
|
bazam
|
is a tool to extract paired reads in FASTQ format from coordinate sorted BAM files.
Bazam is a smarter way to realign reads from one genome to another. If you've tried to use Picard SAMtoFASTQ or samtools bam2fq before and ended up unsatisfied with complicated, long running inefficient pipelines, bazam might be what you wanted. Bazam will output FASTQ in a form that can …
Keywords:
High-throughput sequencing
|
|
BBTools
|
a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file formats such as fastq, fasta, sam, scarf, fasta+qual, compressed or raw, with autodetection of quality encoding and interleaving.
Keywords:
Genomics
High-throughput sequencing
|
|
bcalm
|
is a bioinformatics tool for constructing the compacted de Bruijn graph from sequencing data.
Keywords:
High-throughput sequencing
|
|
bcbio‑nextgen
|
provides best-practice pipelines for automated analysis of high throughput sequencing data with the goal of being quantifiable, analyzable, scalable and reproducible. The development process is fully open and sustained by contributors from multiple institutions. Bioinformaticians, biologists and the general public should be able to run these tools on inputs ranging from research materials to clinical samples to personal genomes.
|
|
BCFtools
|
a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
Keywords:
High-throughput sequencing
WGS Analysis
|
|
BEDOPS
|
BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale. Tasks can be easily split by chromosome for distributing whole-genome analyses across a computational cluster.
|
|
bedtools
|
a swiss-army knife of tools for a wide-range of genomics analysis tasks. The most widely-used tools enable genome arithmetic. Bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), sophisticated analyses …
Keywords:
High-Throughput Sequencing
Genomics
|
|
BETA
|
(Binding and Expression Target Analysis) a software package that integrates ChIP-seq of transcription factors or chromatin regulators with differential gene expression data to infer direct target genes.
|
|
bfc
|
a standalone high-performance tool for correcting sequencing errors from Illumina sequencing data.
Keywords:
High-throughput sequencing
|
|
BGT
|
is a compact file format for efficiently storing and querying whole-genome genotypes of tens to hundreds of thousands of samples. It can be considered as an alternative to genotype-only BCFv2. BGT is more compact in size, more efficient to process, and more flexible on query.
|
|
BIGpre
|
A quality assessment package for next-generation sequencing data. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs to detect and remove duplicate reads after taking sequencing errors into account and to trim low-quality reads from raw data.
|
|
bioawk
|
an extension to Brian Kernighan's awk, with added support for several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q, and TAB-delimited formats with column names along with new built-in functions and a command line option to use TAB as the input/output delimiter. When the new functionality is not used, bioawk should behave exactly like the original BWK awk.
Keywords:
High-Throughput Sequencing
Genomics
|
|
biobambam2
|
tools for early stage NGS alignment file processing including fast sorting and duplicate marking.
Keywords:
High-throughput sequencing
|
|
BioHansel
|
subtype microbial whole-genome sequencing (WGS) data using SNV targeting k-mer subtyping schemes.
|
|
bioinfokit
|
The bioinfokit toolkit aims to provide various easy-to-use functionalities to analyze, visualize, and interpret the biological data generated from genome-scale omics experiments.
Keywords:
High-Throughput Sequencing
|
|
BLASR
|
(Basic Local Alignment with Successive Refinement) maps Single Molecule Sequencing (SMS) reads that are thousands of bases long, with divergence between the read and genome dominated by insertion and deletion error.
Keywords:
High-throughput sequencing
PacBio Sequencing
|
|
BLAST
|
(Basic Local Alignment Search Tool) finds regions of similarity between biological sequences.
|
|
BLAST+
|
a suite of BLAST (Basic Local Alignment Search Tool) tools that utilizes the NCBI C++ Toolkit with a number of performance and feature improvements over the legacy BLAST applications.
|
|
Bloocoo
|
is a k-mer spectrum-based read error corrector, designed to correct large datasets with a very low memory footprint. It uses the disk streaming k-mer counting algorithm contained in the GATB library, and inserts solid k-mers in a bloom-filter. The correction procedure is similar to the Musket multistage approach. Bloocoo yields similar results while requiring far less memory: as an example, it can correct whole …
|
|
bmtagger
|
(Best Match Tagger) removes human reads from metagenomics datasets.
|
|
bmtool
|
bmtool is part of BMTagger aka Best Match Tagger, for removing human reads from metagenomics datasets.
|
|
Bowtie
|
an ultrafast, memory-efficient aligner for short DNA sequences (reads) from next-gen sequencers.
|
|
Bowtie 2
|
an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.
|
|
bpp
|
implements a versatile high-performance version of the BPP software
|
|
Bracken
|
(Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
|
|
breseq
|
a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data for microbial sized genomes. It reports single-nucleotide mutations, point insertions and deletions, large deletions, and new junctions supported by mosaic reads.
|
|
BUStools
|
bustools is a program for manipulating BUS files for single cell RNA-Seq datasets. It can be used to error correct barcodes, collapse UMIs, produce gene count or transcript compatibility count matrices, and is useful for many other tasks. See the kallisto | bustools website for examples and instructions on how to use bustools as part of a single-cell RNA-seq workflow.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
BWA
|
(Burrows-Wheeler Aligner) a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM.
|
|
C3POa
|
(Concatemeric Consensus Caller with Partial Order alignments) is a computational pipeline for calling consensi on R2C2 nanopore data.
Keywords:
Nanopore
High-Throughput Sequencing
|
|
Cactus
|
a reference-free whole-genome multiple alignment program based upon the notion of Cactus graphs.
|
|
calib
|
clusters paired-end reads using their barcodes and sequences.
Keywords:
High-throughput sequencing
|
|
Canu
|
Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing. Canu specializes in assembling PacBio or Oxford Nanopore sequences. Canu operates in three phases: correction, trimming and assembly. The correction phase will improve the accuracy of bases in reads.
|
|
Captus
|
Assembly of Phylogenomic Datasets from High-Throughput Sequencing data
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
cas‑offinder
|
Cas-OFFinder is an OpenCL-based, ultrafast, and versatile program that searches for potential off-target sites of CRISPR/Cas-derived RNA-guided endonucleases (RGEN).
|
|
cd‑hit
|
clusters and compares protein or nucleotide sequences.
|
|
cell2location
|
Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics (cell2location model)
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
CellBender
|
a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
cellranger‑arc
|
The analysis pipelines in this suite perform sample demultiplexing, barcode processing, identification of open chromatin regions, and simultaneous counting of transcripts and peak accessibility in single cells.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
Cell Ranger ATAC
|
a set of analysis pipelines that perform identification of open chromatin regions, motif annotation, and differential accessibility analysis for Single Cell ATAC data.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
Centrifuge
|
is a very rapid and memory-efficient system for the classification of DNA sequences from microbial samples, with better sensitivity than and comparable accuracy to other leading systems. The system uses a novel indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (e.g., 4.3 GB for ~4,100 bacterial …
|
|
ChIPs
|
ChIPs is a tool for simulating ChIP-sequencing experiments.
Keywords:
ChIP-Sequencing
High-Throughput Sequencing
|
|
chopper
|
Rust implementation of NanoFilt+NanoLyse, both originally written in Python. This tool, intended for long read sequencing such as PacBio or ONT, filters and trims a fastq file.
|
|
Circlator
|
Circlator is a tool to circularize genome assemblies. The input is a genome assembly in FASTA format and corrected PacBio or nanopore reads in FASTA or FASTQ format. Circlator will attempt to identify each circular sequence and output a linearised version of it. It does this by assembling all reads that map to contig ends and comparing the resulting contigs with the input assembly.
|
|
CITE‑seq‑Count
|
count antibody TAGS from a CITE-seq and/or cell hashing experiment.
Keywords:
High-throughput sequencing
|
|
Clair3
|
a tool for symphonizing pileup and full-alignment for high-performance long-read variant calling
|
|
CLARK
|
fast, accurate and versatile k-mer based classification system.
Keywords:
High-throughput sequencing
|
|
clustalo
|
is the latest version of Clustal: a multiple sequence alignment program for DNA or proteins.
|
|
cooler
|
is a support library for a sparse, compressed, binary persistent storage format, also called cooler, used to store genomic interaction data, such as Hi-C contact matrices.
Keywords:
Hi-C
High-Throughput Sequencing
|
|
corset
|
Software for clustering de novo assembled transcripts and counting overlapping reads.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
covtobed
|
a tool to generate BED coverage tracks from BAM files.
It reads one (or more) alignment files (sorted BAM) and prints a BED with the coverage. It will join consecutive bases with the same coverage, and can be used to only print a BED file with the regions having a specific coverage range.
Keywords:
High-throughput sequencing
|
|
crass
|
Crass is designed to identify and reconstruct CRISPR loci from raw metagenomic data without the need for assembly or prior knowledge of CRISPR in the data set.
|
|
crimson
|
Converts the outputs of bioinformatics tools to JSON or YAML.
|
|
Crumble
|
controllable lossy compression of BAM/CRAM files.
Keywords:
High-throughput sequencing
|
|
Cufflinks
|
a reference-guided assembler that assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.
|
|
Cutadapt
|
finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
|
|
Cuttlefish
|
a fast, parallel, and very memory-efficient tool to construct the compacted de Bruijn graph from genome reference(s).
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
cyvcf2
|
a cython wrapper around htslib built for fast parsing of Variant Call Format (VCF) files.
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
daligner
|
finds all significant local alignments between reads.
Keywords:
Read Alignment
High-Throughput Sequencing
|
|
dammit
|
simple de novo transcriptome annotator
|
|
dDocent
|
dDocent is a simple bash wrapper to QC, assemble, map, and call SNPs from almost any kind of RAD sequencing. If you have a reference already, dDocent can be used to call SNPs from almost any type of NGS data set.
Keywords:
RADSeq
High-Throughput Sequencing
|
|
deblur
|
Deblur is a greedy deconvolution algorithm for amplicon sequencing based on Illumina Miseq/Hiseq error profiles.
Keywords:
High-throughput sequencing
|
|
deepTools
|
a suite of python tools particularly developed for the efficient analysis of high-throughput sequencing data, such as ChIP-seq, RNA-seq or MNase-seq.
Keywords:
High-throughput sequencing
Other
|
|
delly
|
an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read and long-read massively parallel sequencing data.
|
|
demuxlet
|
Genetic multiplexing of barcoded single cell RNA-seq.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
deSALT
|
De Bruijn graph-based Spliced Aligner for Long Transcriptome reads
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
DESeq2
|
a Bioconductor software package installed in R 3.2.2 that estimates variance-mean dependence in count data from high-throughput sequencing assays and tests for differential expression based on a model using the negative binomial distribution.
|
|
Dextractor
|
bax file decoder and data compressor.
Keywords:
PacBio Sequencing
High-Throughput Sequencing
|
|
DIAMOND
|
a high-throughput program for aligning a file of short DNA sequencing reads against a protein reference database such as NR, at 20,000 times the speed of BLASTX, with high sensitivity.
|
|
dicey
|
In-silico PCR and variant primer design
|
|
dnaio
|
is a Python 3.7+ library for very efficient parsing and writing of FASTQ and also FASTA files.
Keywords:
High-throughput sequencing
|
|
DNAscent
|
DNAscent is software designed to detect the base analogues BrdU and EdU in single molecules of DNA sequenced on the Oxford Nanopore platform
Keywords:
Nanopore
High-Throughput Sequencing
|
|
dnmtools
|
a set of tools for analyzing DNA methylation data from bisulfite sequencing
Keywords:
DNA-Sequencing
High-Throughput Sequencing
|
|
downpore
|
a suite of tools for use in genome assembly and consensus.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
dRep
|
a Python program for rapidly comparing large numbers of genomes. dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set.
|
|
DROP
|
(Detection of RNA Outlier Pipeline) a pipeline to find aberrant gene expression events in RNA sequencing data.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
dsh‑bio
|
Tools for BED, FASTA, FASTQ, GAF, GFA1/2, GFF3, PAF, SAM, and VCF files
Keywords:
High-throughput sequencing
|
|
DWGSIM
|
a whole genome simulator for next-generation sequencing based off of wgsim found in SAMtools, which was written by Heng Li, and forked from DNAA. It was modified to handle ABI SOLiD and Ion Torrent data, as well as various assumptions about aligners and positions of indels. Many new features have been subsequently added.
|
|
EBSeq
|
a Bioconductor software package installed in R 3.2.2 for gene and isoform differential expression analysis of RNA-seq data.
|
|
edgeR
|
a Bioconductor software package installed in R 3.2.2 for examining differential expression of replicated count data.
|
|
EggNOG‑mapper
|
Fast genome-wide functional annotation through orthology assignment.
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
elPrep
|
a high-performance tool for analyzing .sam/.bam files (up to and including variant calling) in sequencing pipelines.
Keywords:
High-throughput sequencing
Variant Analysis
|
|
EMA
|
Fast & accurate alignment of barcoded short-reads
Keywords:
High-throughput sequencing
|
|
EMu
|
EMu is a relative abundance estimator for 16S genomic sequences
|
|
ENANO
|
a FASTQ lossless compression algorithm especially designed for nanopore sequencing FASTQ files.
Keywords:
High-throughput sequencing
|
|
EPA‑ng
|
a complete rewrite of the Evolutionary Placement Algorithm (EPA), previously implemented in RAxML. It uses libpll and pll-modules to perform maximum likelihood-based phylogenetic placement of genetic sequences on a user-supplied reference tree and alignment.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
epic2
|
Ultraperformant ChIP-Seq broad domain finder based on SICER.
Keywords:
ChIP-Sequencing
High-Throughput Sequencing
|
|
ExaBayes
|
is a software package for Bayesian tree inference.
|
|
eXpress
|
a streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences.
|
|
falco
|
is a drop-in C++ implementation of FastQC to assess the quality of sequence reads.
|
|
FASTA
|
a DNA and protein sequence alignment software package that searches for matching sequence patterns or words, called k-tuples.
|
|
Fasten
|
Perform random operations on fastq files, using unix streaming. Secure your analysis with Fasten!
Keywords:
High-throughput sequencing
|
|
FastME
|
FastME provides distance algorithms to infer phylogenies.
|
|
Fastool
|
Fastool is a simple and quick tool to read huge FastQ and FastA files (both normal and gzipped) and manipulate them.
It makes use of the KSeq library (http://lh3lh3.users.sourceforge.net/kseq.shtml) for fast access to FastQ/A files.
Keywords:
High-throughput sequencing
|
|
fastp
|
is a tool designed to provide fast all-in-one preprocessing for FastQ files. It is developed in C++ with multithreading support for high performance.
Keywords:
High-throughput sequencing
|
|
FastQC
|
a quality control tool for high throughput sequence data.
|
|
fastq‑dl
|
A tool to download FASTQs associated with Study, Experiment, or Run accessions.
Keywords:
High-throughput sequencing
|
|
fastq‑scan
|
fastq-scan reads a FASTQ from STDIN and outputs summary statistics (read lengths, per-read qualities, per-base qualities) in JSON format.
Keywords:
High-Throughput Sequencing
|
|
FastQ Screen
|
allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
|
|
FastQTL
|
a fast, flexible, user-friendly, cluster-friendly QTL mapper.
|
|
FastTree
|
infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.
|
|
fastv
|
an ultra-fast tool for identification of SARS-CoV-2 and other microbes from sequencing data.
|
|
FASTX_Toolkit
|
a collection of command line tools for preprocessing short-read FASTA/FASTQ files.
Keywords:
High-throughput sequencing
|
|
fgbio
|
a set of tools to analyze genomic data with a focus on Next Generation Sequencing.
Keywords:
Genomics
High-Throughput Sequencing
|
|
fibertools‑rs
|
a CLI tool for interacting with fiberseq bam files.
Keywords:
High-throughput sequencing
|
|
Filtlong
|
a tool for filtering long reads by quality.
|
|
FLASH
|
(Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments. FLASH is designed to merge pairs of reads when the original DNA fragments are shorter than twice the length of reads. The resulting longer reads can significantly improve genome assemblies. They can also improve transcriptome assembly when FLASH is used to merge …
Keywords:
High-throughput sequencing
|
|
flexbar
|
preprocesses high-throughput sequencing data efficiently
Keywords:
High-throughput sequencing
|
|
Flye
|
fast and accurate de novo assembler for single molecule sequencing reads.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
fpa
|
(Filter Pairwise Alignment) filters long-read mapping information to save disk space.
Keywords:
High-throughput sequencing
|
|
fqgrep
|
is an approximate sequence pattern matcher for FASTQ/FASTA files.
Keywords:
High-Throughput Sequencing
|
|
fqtools
|
an efficient FASTQ manipulation suite.
Keywords:
High-Throughput Sequencing
|
|
freebayes
|
Bayesian haplotype-based polymorphism discovery and genotyping.
|
|
FsnViz
|
Tool for plotting gene fusion events detected by various tools using Circos.
Keywords:
Visualization
High-Throughput Sequencing
|
|
GATK
|
(Genome Analysis Toolkit) a software package developed to analyze high-throughput sequencing data capable of taking on projects of any size with a primary focus on variant discovery, genotyping, and data quality assurance.
|
|
GCEN
|
a command-line toolkit that allows biologists to easily build gene co-expression networks and predict gene function, especially in RNA-Seq research or lncRNA annotation.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
Genepop
|
a population genetics package that computes exact tests for Hardy-Weinberg equilibrium, for population differentiation and for genotypic disequilibrium among pairs of loci; computes estimates of F-statistics, null allele frequencies, allele size-based statistics for microsatellites, etc.; and performs analyses of isolation by distance from pairwise comparisons of individuals or population samples, including confidence intervals for “neighborhood size”.
|
|
Genion
|
Characterizing gene fusions using long transcriptomics reads
Keywords:
High-throughput sequencing
|
|
geofetch
|
Downloads data and metadata from GEO and SRA and creates standard PEPs.
Keywords:
High-Throughput Sequencing
|
|
gfastats
|
gfastats is a single fast and exhaustive tool for summary statistics and simultaneous *fa* (fasta, fastq, gfa [.gz]) genome assembly file manipulation. gfastats also allows seamless fasta<>fastq<>gfa[.gz] conversion. It has been tested on genomes larger than 100 Gbp.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
gffcompare
|
compares and evaluates the accuracy of RNA-Seq transcript assemblers (Cufflinks, Stringtie), collapses (merges) duplicate transcripts from multiple GTF/GFF3 files (e.g. resulting from assembly of different samples), and classifies transcripts from one or multiple GTF/GFF3 files as they relate to reference transcripts provided in an annotation file (also in GTF/GFF3 format).
|
|
gffread
|
validates, filters, converts and performs various other operations on GFF files (use gffread -h to see the various usage options). Because the program shares the same GFF parser code with Cufflinks, Stringtie, and gffcompare, it could be used to verify that a GFF file from a certain annotation source is correctly "understood" by these programs. Thus the gffread utility can be used to simply …
|
|
ghostz
|
is a highly efficient remote homologue detection tool.
|
|
GimmeMotifs
|
a suite of motif tools, including a motif prediction pipeline for ChIP-seq experiments.
Keywords:
ChIP-Sequencing
High-Throughput Sequencing
|
|
glimpse‑bio
|
GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.
Keywords:
High-throughput sequencing
|
|
GMAP
|
Genomic mapping and alignment program for mRNA and EST sequences.
|
|
GNUVID
|
(Gene Novelty Unit-based Virus IDentification) a Python3 program for SARS-CoV-2 virus identification. It ranks CDS nucleotide sequences in a genome fna file based on the number of observed exact CDS nucleotide matches in a public or private database. It was created to type SARS-CoV-2 genomes using a whole genome multilocus sequence typing (wgMLST) approach.
Keywords:
DNA-Sequencing
High-Throughput Sequencing
|
|
Goalign
|
a set of command line tools to manipulate multiple alignments. Implemented in the Go language, Goalign aims to handle multiple alignments in Phylip, Fasta, Nexus, and Clustal formats through several basic commands. Each command may print its result (an alignment, for example) to standard output, so it can be piped to the standard input of the next goalign command.
|
|
gofasta
|
provides functions for working on alignments in fasta format.
|
|
GoldRush
|
memory-efficient de novo assembly of long reads
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
goleft
|
goleft is a collection of bioinformatics tools written in Go and distributed together as a single binary.
Keywords:
High-throughput sequencing
|
|
GraphAligner
|
Sequence to graph aligner for long reads
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
GraphMap
|
GraphMap is a novel mapper targeted at aligning long, error-prone third-generation sequencing data.
It is designed to handle Oxford Nanopore MinION 1d and 2d reads with very high sensitivity and accuracy, and also presents a significant improvement over the state-of-the-art for PacBio read mappers.
Keywords:
DNA-Sequencing
High-throughput sequencing
|
|
GraphMap2
|
The GraphMap2 update contains alignment tuning specific to long RNA reads.
GraphMap2 is a novel mapper targeted at aligning long, error-prone third-generation sequencing data.
It is designed to handle Oxford Nanopore MinION 1d and 2d reads with very high sensitivity and accuracy, and also presents a significant improvement over the state-of-the-art for PacBio read mappers.
Keywords:
DNA-Sequencing
High-throughput sequencing
|
|
GROOT
|
GROOT is a tool to type Antibiotic Resistance Genes (ARGs) in metagenomic samples (a.k.a. Resistome Profiling). It combines variation graph representation of gene sets with an LSH indexing scheme to allow for fast classification of metagenomic reads. Subsequent hierarchical local alignment of classified reads against graph traversals facilitates accurate reconstruction of full-length gene sequences using a simple scoring scheme.
|
|
gsMap
|
gsMap (genetically informed spatial mapping of cells for complex traits) integrates spatial transcriptomics (ST) data with genome-wide association study (GWAS) summary statistics to map cells to human complex traits, including diseases, in a spatially resolved manner.
|
|
gw
|
a fast browser for genomic sequencing data (.bam/.cram format) used directly from the terminal. GW also allows you to view and annotate variants from vcf/bcf files.
Keywords:
High-throughput sequencing
|
|
hera
|
a bioinformatics tool that helps analyze RNA-seq data, providing base-to-base alignment BAM files, transcript abundance estimation, and fusion gene detection.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
HHsuite
|
an open-source software package for sensitive protein sequence searching based on the pairwise alignment of hidden Markov models (HMMs).
|
|
HiCExplorer
|
is a set of programs to process, normalize, analyze and visualize Hi-C and cHi-C data.
Keywords:
Hi-C
High-Throughput Sequencing
|
|
hichip‑peaks
|
A package that can be used to find enriched peak regions from HiChIP datasets that can then be used as an input to available loop calling tools or to do differential peak analysis.
Keywords:
ChIP-Sequencing
High-Throughput Sequencing
|
|
HiCPro
|
An optimized and flexible pipeline for Hi-C data processing
Keywords:
Hi-C
High-Throughput Sequencing
|
|
HiCUP
|
A tool for mapping and performing quality control on Hi-C data
Keywords:
Hi-C
High-Throughput Sequencing
|
|
Hifiasm
|
Haplotype-resolved assembler for accurate HiFi reads
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
hifiasm_meta
|
Metagenome assembler for HiFi reads, based on hifiasm.
|
|
HiLine
|
HiC alignment and classification pipeline.
Keywords:
Hi-C
High-Throughput Sequencing
|
|
HISAT2
|
(Hierarchical Indexing for Spliced Alignment of Transcripts) a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) against the general human population (as well as against a single reference genome). HISAT2 is a successor to both HISAT and TopHat2.
|
|
HOMER
|
(Hypergeometric Optimization of Motif EnRichment) a suite of sequencing analysis and sequence motif discovery tools.
|
|
HTSeq
|
a Python package that provides infrastructure to process data from high-throughput sequencing assays.
|
|
HTSlib
|
a C library for reading/writing high-throughput sequencing data.
Keywords:
High-throughput sequencing
|
|
htstream
|
is a quality control and processing pipeline for High Throughput Sequencing data.
|
|
HULK
|
(Histosketching Using Little Kmers) a tool that creates small, fixed-size sketches from streaming microbiome sequencing data, enabling rapid metagenomic dissimilarity analysis.
|
|
HUMAnN2
|
is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads).
|
|
humann3
|
is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads).
|
|
igvtools
|
command line tools for IGV
Keywords:
High-throughput sequencing
|
|
IMSEQ
|
(IMmunogenetic SEQuence Analysis) is a fast, PCR and sequencing error aware tool to analyze high throughput data from recombined T-cell receptor or immunoglobulin gene sequencing experiments. It derives immune repertoires from sequencing data in FASTA / FASTQ format.
Keywords:
Genomics
High-throughput sequencing
|
|
InSilicoSeq
|
A sequencing simulator.
Keywords:
High-throughput sequencing
|
|
IntaRNA
|
efficient RNA-RNA interaction prediction incorporating seeding and accessibility of interacting sites.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
IQ‑TREE
|
efficient and versatile phylogenomic software by maximum likelihood.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
isoseq3
|
Scalable De Novo Isoform Discovery
Keywords:
PacBio Sequencing
High-Throughput Sequencing
|
|
IsoTree
|
an efficient de novo transcriptome assembler for RNA-Seq data. It can assemble transcripts from RNA-Seq reads (in fasta format). Unlike most de novo assembly methods, which build a de Bruijn graph or splicing graph by connecting k-mers (sets of overlapping substrings generated from reads), IsoTree constructs the splicing graph by connecting reads directly. For each splicing graph, IsoTree applies an iterative scheme of …
|
|
ivar
|
is a computational package that contains functions broadly useful for viral amplicon-based sequencing.
|
|
Juicer
|
a one-click pipeline for processing terabase scale Hi-C datasets. Using Juicer, you can:
Go from raw fastq files to Hi-C maps binned at many resolutions
Automatically annotate loops and contact domains with the Juicer tools
Run the pipeline in the cloud, on LSF, Univa, or SLURM, or on a single CPU
Juicer creates hic files from raw (unaligned) reads derived from a Hi-C experiment.
Keywords:
Hi-C
High-throughput sequencing
|
|
Kaiju
|
fast and sensitive taxonomic classification for metagenomics.
|
|
kalign2
|
a fast and accurate multiple sequence alignment algorithm designed to align large numbers of protein sequences.
|
|
kallisto
|
a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
Kleborate
|
Kleborate: a tool for typing and screening pathogen genome assemblies
|
|
km
|
software for RNA-seq investigation using k-mer decomposition
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
kma
|
implements a method designed to map raw reads directly against redundant databases, in an ultra-fast manner using seed and extend. KMA is particularly good at aligning high quality reads against highly redundant databases, where unique matches often do not exist. It works for long low quality reads as well, such as those from Nanopore. Non-unique matches are resolved using the "ConClave" sorting scheme, and …
Keywords:
High-throughput sequencing
Read Alignment
|
|
KMC
|
KMC (K-mer Counter) is a utility designed for counting k-mers (sequences of k consecutive symbols) in a set of reads from genome sequencing projects. K-mer counting is important for many bioinformatics applications, e.g., developing de Bruijn graph assemblers. Building de Bruijn graphs is a commonly used approach for genome assembly with data from second-generation sequencers. Unfortunately, sequencing errors (frequent in practice) result in huge memory …
Keywords:
High-throughput sequencing
|
|
KMCP
|
accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping
|
|
KneadData
|
is a tool designed to perform quality control on metagenomic sequencing data, especially data from microbiome experiments.
|
|
Kraken 2
|
a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.
|
|
krakenuniq
|
Metagenomics classifier with unique k-mer counting for more specific results
|
|
LAST
|
finds & aligns related regions of sequences. LAST is designed for moderately large data (e.g. genomes, DNA reads, proteomes).
|
|
lastz
|
LASTZ is a program for aligning DNA sequences, a pairwise aligner.
Keywords:
DNA-Sequencing
High-Throughput Sequencing
|
|
LCA
|
Lowest Common Ancestor calculation tool
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
leafcutter
|
Leafcutter quantifies RNA splicing variation using short-read RNA-seq data.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
LevioSAM2
|
Fast and accurate coordinate conversion between assemblies
Keywords:
High-throughput sequencing
|
|
Lighter
|
a kmer-based error correction method for whole genome sequencing data.
|
|
lima
|
is the standard tool to identify barcode and primer sequences in PacBio single-molecule sequencing data.
Keywords:
PacBio Sequencing
High-Throughput Sequencing
|
|
locarna
|
Tools for the structural analysis of RNA
|
|
LongGF
|
a computational algorithm and software tool for fast and accurate detection of gene fusion by long-read transcriptome sequencing
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
LongReadSum
|
LongReadSum supports FASTA, FASTQ, BAM, FAST5, and sequencing_summary.txt file formats for quick generation of QC data in HTML and text format.
|
|
Longshot
|
a variant calling tool for diploid genomes using long error prone reads such as Pacific Biosciences (PacBio) SMRT and Oxford Nanopore Technologies (ONT).
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
lordec
|
A hybrid error correction program for long, PacBio reads
Keywords:
PacBio Sequencing
High-Throughput Sequencing
|
|
lorikeet
|
is a tool for digital spoligotyping of MTB strains from Illumina read data.
Keywords:
High-throughput sequencing
|
|
LRez
|
Standalone tool and library for working with barcoded linked-reads.
Keywords:
High-throughput sequencing
|
|
MACS2
|
(Model Based Analysis of ChIP-Seq data) a novel algorithm for identifying transcription factor binding sites.
|
|
MACS3
|
Model Based Analysis for ChIP-Seq data.
Keywords:
ChIP-Sequencing
High-Throughput Sequencing
|
|
MapCaller
|
An efficient and versatile approach for short-read alignment and variant detection in high-throughput sequenced genomes.
Keywords:
High-throughput sequencing
|
|
MAPS
|
The MAPS (Model-based Analysis of PLAC-Seq data) pipeline is a set of multiple scripts used to analyze PLAC-Seq and HiChIP data.
|
|
MAPseq
|
a set of fast and accurate sequence read classification tools designed to assign taxonomy and OTU classifications to ribosomal RNA sequences. This is done by using a reference set of full-length ribosomal RNA sequences for which taxonomies are known, and for which a set of high quality OTU clusters has been previously generated. For each read, the best guess and corresponding confidence in …
|
|
maq
|
(Mapping and Assembly with Qualities) builds mapping assemblies from short reads generated by the next-generation sequencing machines.
|
|
Mash
|
is a fast sequence distance estimator that uses the MinHash algorithm and is designed to work with genomes and metagenomes in the form of assemblies or reads.
|
|
MashMap
|
A fast approximate aligner for long DNA sequences.
Keywords:
High-throughput sequencing
|
|
mbg
|
Minimizer based sparse de Bruijn graph constructor.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
mbgc
|
(Multiple Bacteria Genome Compressor) is a tool for compressing genomes in FASTA (or gzipped FASTA) input format.
|
|
medaka
|
a tool to create consensus sequences and variant calls from nanopore sequencing data.
|
|
mega2
|
(Manipulation Environment for Genetic Analyses) - data-handling program for facilitating genetic linkage and association analyses.
Keywords:
GWAS Analysis
High-Throughput Sequencing
|
|
megadepth
|
Megadepth is an efficient tool for extracting coverage related information from RNA and DNA-seq BAM and BigWig files.
Keywords:
High-throughput sequencing
|
|
MEGAHIT
|
an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph.
|
|
mentalist
|
MLST (multi-locus sequence typing) is a classic technique for genotyping bacteria, widely applied for pathogen outbreak surveillance.
|
|
MetaEuk
|
a modular toolkit designed for large-scale gene discovery and annotation in eukaryotic metagenomic contigs.
|
|
metagenome‑atlas
|
ATLAS - Three commands to start analysing your metagenome data
|
|
MetaGraph
|
The MetaGraph framework allows for indexing and analysis of very large biological sequence collections, producing compressed indexes that can represent several petabases of input data. The indexes can be efficiently queried with any query sequence of interest.
Keywords:
Genome Assembly
High-throughput sequencing
|
|
MetaPhlAn
|
Metagenomic Phylogenetic Analysis
|
|
MetaPhlAn2
|
(Metagenomic Phylogenetic Analysis) is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.
|
|
Metaphor
|
Metagenomic Pipeline for Short Reads
|
|
MethylDackel
|
MethylDackel will process a coordinate-sorted and indexed BAM or CRAM file containing some form of BS-seq alignments and extract per-base methylation metrics from them. MethylDackel requires an indexed fasta file containing the reference genome as well.
Keywords:
bisulfite-Seq
High-throughput sequencing
|
|
MICA
|
(Metagenomic Inquiry Compressive Acceleration) a family of programs for performing compressively-accelerated metagenomic sequence searches based on BLASTX and DIAMOND.
|
|
minialign
|
Minialign is a little bit fast and moderately accurate nucleotide sequence alignment tool designed for PacBio and Nanopore long reads. It is built on three key algorithms, minimizer-based index of the minimap overlapper, array-based seed chaining, and SIMD-parallel Smith-Waterman-Gotoh extension.
Keywords:
High-throughput sequencing
|
|
miniasm
|
Miniasm is a very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by minimap) as input and outputs an assembly graph in the GFA format. Different from mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final unitig sequences. Thus the per-base error rate is similar to …
Keywords:
Genome Assembly
High-throughput sequencing
|
|
Minimap2
|
is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥100 bp in length, ≥1 kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap …
Keywords:
High-throughput sequencing
Read Alignment
|
|
minorseq
|
PacBio Minor Variant Calling and Phasing Tools
Keywords:
PacBio Sequencing
High-Throughput Sequencing
|
|
MIRA
|
whole genome shotgun and EST sequence assembler for Sanger, 454, Solexa (Illumina), IonTorrent and PacBio data (the latter at the moment only CCS and error-corrected CLR reads).
Keywords:
Genome Assembly
High-throughput sequencing
|
|
MISO
|
a probabilistic framework that quantitates the expression level of alternatively spliced genes from RNA-Seq data, and identifies differentially regulated isoforms or exons across samples.
MISO is installed as a standalone program and as a module within python.
|
|
mlst
|
scan contig files against PubMLST typing schemes.
Keywords:
High-throughput sequencing
|
|
mmquant
|
RNA-Seq quantification tool, with special handling on multi-mapping reads.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
MMseqs2
|
an ultra fast and sensitive sequence search and clustering suite
|
|
MOB‑suite
|
MOB-suite: software tools for clustering, reconstruction and typing of plasmids from draft assemblies. The MOB-suite is designed to be a modular set of tools for the typing and reconstruction of plasmid sequences from WGS assemblies.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
Monocle3
|
An analysis toolkit for single-cell RNA-seq.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
MOODS
|
MOODS is a collection of algorithms used to match position weight matrices (PWM) with DNA sequences.
Keywords:
DNA-Sequencing
High-Throughput Sequencing
|
|
mosdepth
|
fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.
Keywords:
High-throughput sequencing
|
|
mothur
|
a project to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community. Includes accelerated versions of DOTUR and SONS and the functionality of a number of other popular tools.
|
|
mOTUs
|
marker gene-based OTU (mOTU) profiling.
|
|
msamtools
|
microbiome-related extension to samtools
|
|
mudskipper
|
is a tool for converting genomic BAM/SAM files to transcriptomic BAM/RAD files.
Keywords:
High-Throughput Sequencing
|
|
MultiQC
|
aggregates results from bioinformatics analyses across many samples into a single report.
Keywords:
High-throughput sequencing
|
|
MUMmer
|
a versatile alignment tool for DNA and protein sequences.
|
|
mwga‑utils
|
collection of utilities for processing Multispecies Whole Genome Alignments
|
|
Mykrobe
|
antibiotic resistance prediction in minutes.
|
|
NanoComp
|
Comparing runs of Oxford Nanopore sequencing data and alignments
Keywords:
Nanopore
High-Throughput Sequencing
|
|
nanoDoc
|
RNA modification detection using Nanopore raw reads with Deep One Class classification.
Keywords:
Nanopore
High-Throughput Sequencing
|
|
NanoFilt
|
Filtering and trimming of long read sequencing data.
Keywords:
High-throughput sequencing
|
|
NanoPack
|
a set of tools developed for visualization and processing of long-read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences.
Keywords:
High-throughput sequencing
PacBio Sequencing
|
|
nanoplexer
|
a standard tool to demultiplex Nanopore long read sequencing data.
Keywords:
High-throughput sequencing
Nanopore
|
|
NanoPlot
|
Plotting tool for long read sequencing data and alignments.
Keywords:
Visualization
High-Throughput Sequencing
|
|
Nanopolish
|
software package for signal-level analysis of Oxford Nanopore sequencing data.
Keywords:
Nanopore
High-Throughput Sequencing
|
|
nanoq
|
Ultra-fast quality control and summary reports for nanopore reads
Keywords:
Nanopore
High-Throughput Sequencing
|
|
NanoQC
|
Create fastQC-like plots for Oxford Nanopore sequencing data
Keywords:
Nanopore
High-Throughput Sequencing
|
|
NanoSim
|
NanoSim is a fast and scalable read simulator for Nanopore sequencing data.
Keywords:
Nanopore
High-Throughput Sequencing
|
|
NanoStat
|
calculates various statistics from a long read sequencing dataset in fastq, bam or albacore sequencing summary format.
Keywords:
High-throughput sequencing
Nanopore
|
|
NanoVar
|
a genomic structural variant (SV) caller that utilizes low-depth long-read sequencing such as Oxford Nanopore Technologies (ONT).
|
|
Nextalign
|
Viral genome sequence alignment tool
|
|
nextclade
|
SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks
|
|
Nextstrain
|
real-time tracking of pathogen evolution.
|
|
NGLess
|
(NGS Processing with Less Work) enables creation of a pipeline covering the entire first phase of NGS analysis, up to and including annotation.
|
|
NGMLR
|
(coNvex Gap-cost alignMents for Long Reads) a long-read mapper designed to sensitively align PacBio or Oxford Nanopore reads to (large) reference genomes.
Keywords:
Nanopore
High-Throughput Sequencing
|
|
ngsplot
|
Quick mining and visualization of NGS data by integrating genomic databases
Keywords:
Visualization
High-Throughput Sequencing
|
|
ninja‑nj
|
Nearly Infinite Neighbor Joining Application
|
|
OCOCO
|
the first program capable of inferring variants in real time, as read alignments are fed in. Ococo inputs unsorted alignments from a stream and infers single-nucleotide variants, together with a genomic consensus, using statistics stored in compact several-bit counters.
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
OLego
|
OLego is a program specifically designed for de novo spliced mapping of mRNA-seq reads.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
Oncofuse
|
Oncofuse is a framework designed to estimate the oncogenic potential of de-novo discovered gene fusions. It uses several hallmark features and employs a bayesian classifier to provide the probability of a given gene fusion being a driver mutation.
Keywords:
Genomics
High-throughput sequencing
|
|
ont_fast5_api
|
is a simple interface to HDF5 files of the Oxford Nanopore .fast5 file format.
Keywords:
High-Throughput Sequencing
|
|
ORF‑RATER
|
(Open Reading Frame - Regression Algorithm for Translational Evaluation of Ribosome-protected footprints) comprises a series of scripts for coding sequence annotation based on ribosome profiling data.
Keywords:
High-throughput sequencing
|
|
origami
|
a pipeline for processing and calling high-confidence chromatin loops associated with the ChIPped factor.
Keywords:
ChIP-Sequencing
High-Throughput Sequencing
|
|
OrthoFinder
|
a fast, accurate and comprehensive platform for comparative genomics. OrthoFinder makes accurate inference of orthogroups, orthologues, gene trees and the rooted species tree easy.
Keywords:
High-throughput sequencing
|
|
pairix
|
2D indexing on bgzipped text files of paired genomic coordinates
Keywords:
High-throughput sequencing
|
|
pairtools
|
CLI tools to process mapped Hi-C data
Keywords:
Hi-C
High-Throughput Sequencing
|
|
PAML
|
A package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
panacus
|
a counting tool for pangenome graphs. It supports GFA files with P and W lines, but requires that the graph is blunt, i.e., nodes do not overlap and consequently, each link (L) points from the end of one segment (S) to the start of another.
|
|
pangolin
|
(Phylogenetic Assignment of Named Global Outbreak LINeages) software package for assigning SARS-CoV-2 genome sequences to global lineages.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
Pangolin‑DL
|
a deep-learning based method for predicting splice site strengths.
Keywords:
High-throughput sequencing
|
|
PASTA
|
is an implementation of the PASTA (Practical Alignment using Saté and TrAnsitivity) algorithm.
|
|
pbalign
|
pbalign aligns PacBio reads to reference sequences, filters aligned reads according to user-specific filtering criteria, and converts the output to either the SAM format or PacBio Compare HDF5 (e.g., .cmp.h5) format. The output Compare HDF5 file will be compatible with Quiver if --forQuiver option is specified.
Keywords:
High-throughput sequencing
PacBio Sequencing
|
|
pbbam
|
a package that provides components to create, query, & edit PacBio BAM files and associated indices. These components include a core C++ library, bindings for additional languages, and command-line utilities.
Keywords:
High-throughput sequencing
PacBio Sequencing
|
|
pbipa
|
IPA HiFi Genome Assembler
Keywords:
PacBio Sequencing
High-Throughput Sequencing
|
|
pbmm2
|
pbmm2 is a SMRT C++ wrapper for minimap2's C API. Its purpose is to support native PacBio in- and output, provide sets of recommended parameters, generate sorted output on-the-fly, and postprocess alignments. Sorted output can be used directly for polishing using GenomicConsensus, if BAM has been used as input to pbmm2. Benchmarks show that pbmm2 outperforms BLASR in mapped concordance, number of mapped bases, …
|
|
PBSIM2
|
PBSIM2: a simulator for long read sequencers with a novel generative model of quality scores
Keywords:
High-throughput sequencing
|
|
pbsv
|
PacBio structural variant (SV) calling and analysis tools
|
|
Peakhood
|
a tool that takes a set of CLIP-seq peak regions and for each region, individually extracts the most likely site context (transcript or genomic).
Keywords:
CLIP-Seq Analysis
High-Throughput Sequencing
|
|
PEER
|
a collection of Bayesian approaches to infer hidden determinants and their effects from gene expression profiles using factor analysis methods.
|
|
perbase
|
Per-base metrics on BAM/CRAM files.
Keywords:
High-throughput sequencing
|
|
PFP
|
Tool to build the parse and the dictionary for VCF files using the approach described in Prefix-Free Parsing for Building Big BWTs
Keywords:
High-throughput sequencing
|
|
phantompeakqualtools
|
Phantompeakqualtools computes informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data. It can also be used to obtain robust estimates of the predominant fragment length or characteristic tag shift values in these assays.
Keywords:
ChIP-Sequencing
High-throughput sequencing
|
|
phASER
|
(phasing and Allele Specific Expression from RNA-seq) performs haplotype phasing using read alignments in BAM format from both DNA and RNA based assays, and provides measures of haplotypic expression for RNA based assays.
|
|
PhiSpy
|
Prophage finder using multiple metrics
Keywords:
High-throughput sequencing
|
|
PhyloPhlAn
|
PhyloPhlAn is an integrated pipeline for large-scale phylogenetic profiling of genomes and metagenomes.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
phyluce
|
(phy-loo-chee) is a software package for analyzing data collected from UCE loci, as well as data collected from other types of loci, for phylogenomic studies at the species, population, and individual levels.
|
|
Picard
|
a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats.
|
|
picard‑slim
|
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Keywords:
High-throughput sequencing
|
|
Pilon
|
Pilon is a software tool which can be used to automatically improve draft assemblies and find variation among strains, including large event detection.
|
|
Piranha
|
is a peak-caller for CLIP- and RIP-Seq data. It takes input in BED or BAM format and identifies regions of statistically significant read enrichment.
Keywords:
CLIP-Seq Analysis
High-Throughput Sequencing
|
|
pixelator
|
A commandline tool and library to process and analyze sequencing data from Molecular Pixelation (MPX) assays.
Keywords:
High-throughput sequencing
|
|
pLannotate
|
is a web server for automatically annotating engineered plasmids.
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
PLASS
|
(Protein-Level ASSembler) a software to assemble short read sequencing data on a protein level.
|
|
plassembler
|
Quickly and accurately assemble plasmids in hybrid sequenced bacterial isolates
Keywords:
High-throughput sequencing
|
|
plastid
|
Plastid is a Python library designed specifically for nucleotide-resolution analysis of genomics and NGS data.
Keywords:
High-throughput sequencing
|
|
platon
|
Plasmid contig classification and characterization for short read draft assemblies.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
pomoxis
|
Assembly, consensus, and analysis tools by ONT research
Keywords:
Nanopore
High-Throughput Sequencing
|
|
popscle
|
is a suite of population scale analysis tools for single-cell genomics data.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
poretools
|
a toolkit for working with nanopore sequencing data from Oxford Nanopore
Keywords:
Nanopore
High-Throughput Sequencing
|
|
PRANK
|
a probabilistic multiple alignment program for DNA, codon and amino-acid sequences.
Keywords:
Read Alignment
High-Throughput Sequencing
|
|
preseq
|
a tool aimed at predicting the yield of distinct reads from a genomic library from an initial sequencing experiment. The estimates can then be used to examine the utility of further sequencing, optimize the sequencing depth, or to screen multiple libraries to avoid low complexity samples.
Keywords:
High-throughput sequencing
|
|
Presto
|
A bioinformatics toolkit for processing high-throughput lymphocyte receptor sequencing data.
Keywords:
High-throughput sequencing
|
|
Prokka
|
a software tool to annotate bacterial, archaeal and viral genomes quickly and produce standards-compliant output files.
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
ProPhyle
|
ProPhyle is an accurate, resource-frugal and deterministic phylogeny-based metagenomic classifier.
|
|
Proteinortho
|
a tool to detect orthologous genes within different species.
|
|
pybedtools
|
pybedtools wraps and extends BEDTools and offers feature-level manipulations from within Python.
Keywords:
High-throughput sequencing
|
|
PyEnsembl
|
a Python interface to Ensembl reference genome metadata such as exons and transcripts. PyEnsembl downloads GTF and FASTA files from the Ensembl FTP server and loads them into a local database. PyEnsembl can also work with custom reference data specified using user-supplied GTF and FASTA files.
Keywords:
High-throughput sequencing
|
|
pygtftk
|
(Python GTF toolkit) a suite providing facilities to manipulate genomic annotations in gtf format.
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
Pysam
|
a python module that makes it easy to read and manipulate genomic data sets. It is a lightweight wrapper of the htslib C-API; it provides facilities to read and write SAM/BAM/VCF/BCF/BED/GFF/GTF/FASTA/FASTQ files as well as access to the command line functionality of the SAMtools and BCFtools packages.
Pysam is installed as a module within python.
|
|
pySCENIC
|
is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
QIIME 2
|
a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.
|
|
Qualimap
|
a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.
Keywords:
High-throughput sequencing
Read Alignment
|
|
Quartz
|
(QUAlity score Reduction at Terabyte scale) an efficient de novo quality score compression tool based on traversing the k-mer landscape of NGS read datasets.
|
|
QUAST
|
(QUality ASsessment Tool) evaluates genome assemblies by computing various metrics, including N50, length for which the collection of all contigs of that length or longer covers at least 50% of assembly length; NG50, where length of the reference genome is being covered; NA50 and NGA50, where aligned blocks instead of contigs are taken; misassemblies, misassembled and unaligned contigs or contigs bases; and genes and …
|
|
QuickTree
|
an efficient implementation of the Neighbor-Joining algorithm.
Keywords:
High-throughput sequencing
|
|
Quip
|
compresses next-generation sequencing data with extreme prejudice.
Keywords:
High-Throughput Sequencing
|
|
Racon
|
ultrafast consensus module for raw de novo genome assembly of long uncorrected reads.
Keywords:
High-throughput sequencing
|
|
RapMap
|
rapid sensitive and accurate read mapping via quasi-mapping.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
rasusa
|
Randomly subsample sequencing reads to a specified coverage.
Keywords:
High-throughput sequencing
|
|
Raven
|
a de novo genome assembler for long uncorrected reads.
|
|
RAxML
|
(Randomized Axelerated Maximum Likelihood) a tool for phylogenetic analysis and post-analysis of large phylogenies.
|
|
RAxML‑NG
|
a phylogenetic tree inference tool which uses the maximum-likelihood (ML) optimality criterion.
|
|
razers3
|
faster, fully sensitive read mapping.
Keywords:
Read Alignment
High-Throughput Sequencing
|
|
RBPBench
|
RBPBench is a multi-function tool to evaluate CLIP-seq and other genomic region data using a comprehensive collection of known RNA-binding protein (RBP) binding motifs.
Keywords:
CLIP-Seq Analysis
High-Throughput Sequencing
|
|
Recentrifuge
|
Robust comparative analysis and contamination removal for metagenomics
|
|
refgenie
|
(reference genome manager) manages storage, access, and transfer of reference genome resources.
Keywords:
Genomics
High-Throughput Sequencing
|
|
regtools
|
is a set of tools that integrate DNA-seq and RNA-seq data to help interpret mutations in a regulatory and splicing context.
|
|
RingMapper
|
a code for performing RING-MaP and PAIR-MaP analysis.
|
|
RNAblueprint
|
The RNAblueprint library solves the problem of stochastically sampling RNA/DNA sequences compatible with multiple structural constraints.
|
|
rna‑map
|
An open-source tool for rapid analysis of RNA mutational profiling (MaP) experiments.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
RNA‑SeQC
|
fast, efficient RNA-Seq metrics for quality control and process optimization.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
rnashapes
|
RNAshape abstraction maps structures to a tree-like domain of shapes.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
Roary
|
Takes annotated assemblies in GFF3 format and calculates the pan genome.
|
|
RSEM
|
(RNA-Seq by Expectation-Maximization) a software package for estimating gene and isoform expression levels from RNA-Seq data.
|
|
RSeQC
|
(RNA-seq Quality Control Package) provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data.
|
|
RTG Tools
|
(RealTimeGenomics Tools) utilities for accurate VCF comparison and manipulation.
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
rustybam
|
is a bioinformatics toolkit written in the Rust programming language focused around manipulation of alignment (BAM and PAF), annotation (BED), and sequence (FASTA and FASTQ) files.
Keywords:
High-throughput sequencing
|
|
Rustyread
|
Rustyread, a long-read simulator
Keywords:
High-throughput sequencing
|
|
Ryuto
|
Network-Flow based Transcriptome Reconstruction
Keywords:
Transcriptomics
High-Throughput Sequencing
|
|
Sailfish
|
enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
Salmon
|
a tool for quantifying the expression of transcripts using RNA-seq data. Salmon uses algorithms to provide very quick, accurate expression estimates using little memory and performs inference using an expressive and realistic model of RNA-seq data that takes into account experimental attributes and biases commonly observed in real RNA-seq data.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
Sambamba
|
a high performance, highly parallel, robust and fast tool (and library), written in the D programming language, for working with SAM and BAM files. Because of its efficiency, it is an important work horse running in many sequencing centres around the world today.
Keywords:
High-throughput sequencing
|
|
samblaster
|
a fast, flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file.
Keywords:
High-throughput sequencing
|
|
SAMtools
|
(Sequence Alignment/Map tools) a suite of utilities for manipulating alignments in SAM, a generic format for storing large nucleotide sequence alignments. The utilities include sorting, merging, indexing and generating alignments in a per-position format.
|
|
scAllele
|
scAllele is a versatile tool to detect and analyze nucleotide variants in scRNA-seq.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
Scallop
|
Scallop is an accurate reference-based transcript assembler.
|
|
Scallop‑LR
|
reference-based transcriptome assembler for long-read RNA-seq data
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
Scalpel
|
a software package for detecting INDELs.
|
|
scCODA
|
scCODA is a toolbox for statistical models to analyze changes in compositional data, especially from single-cell RNA-seq experiments.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
scDRS
|
(single-cell disease-relevance score) is a method for associating individual cells in scRNA-seq data with disease GWASs, built on top of AnnData and Scanpy.
|
|
scMatch
|
is a single-cell gene expression profile annotation tool using reference datasets.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
scrm
|
a coalescent simulator for biological sequences.
Keywords:
High-Throughput Sequencing
|
|
scVelo
|
is a scalable toolkit for RNA velocity analysis in single cells.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
sdm
|
simple demultiplex tool for FASTQ demultiplexing and dereplication.
Keywords:
High-throughput sequencing
|
|
SEACR
|
Sparse Enrichment Analysis for CUT&RUN
Keywords:
ChIP-Sequencing
High-Throughput Sequencing
|
|
SECAPR
|
Process sequence-capture FASTQ files into alignments for phylogenetic analyses. Integrates allele phasing.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
segemehl
|
a software to map short sequencer reads to reference genomes.
Keywords:
Read Alignment
High-Throughput Sequencing
|
|
SEPP
|
(SATe-enabled Phylogenetic Placement) addresses the problem of phylogenetic placement of short reads into reference alignments and trees.
|
|
seqcomplexity
|
calculates Per-Read and Total Sequence Complexity from FastQ file.
|
|
SeqFu
|
(Sequence Fastx Utilities) a general-purpose program to manipulate and parse information from FASTA/FASTQ files.
Keywords:
High-throughput sequencing
|
|
SeqKit
|
a cross-platform ultrafast comprehensive toolkit for FASTA/Q processing.
Keywords:
High-throughput sequencing
|
|
seqMINER
|
an integrated ChIP-seq data interpretation platform.
Keywords:
ChIP-Sequencing
High-Throughput Sequencing
|
|
SeqPrep
|
SeqPrep is a program to merge paired end Illumina reads that are overlapping into a single longer read. It may also just be used for its adapter trimming feature without doing any paired end overlap.
|
|
seqtk
|
a fast and lightweight tool for processing sequences in the FASTA or FASTQ format. It seamlessly parses both FASTA and FASTQ files, which can also be optionally compressed by gzip.
Keywords:
High-throughput sequencing
|
|
Severus
|
Severus is a somatic structural variation (SV) caller for long reads (both PacBio and ONT).
|
|
shark
|
Mapping-free filtering of useless RNA-Seq reads
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
shorah
|
Short Reads Assembly into Haplotypes (ShoRAH) program for inferring viral haplotypes from NGS data
|
|
Sickle
|
a windowed adaptive trimming tool for FASTQ files using quality.
|
|
simpleaf
|
simpleaf is a Rust framework to make using alevin-fry even simpler.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
SINA
|
SINA aligns nucleotide sequences to match a pre-existing MSA using a graph-based alignment algorithm similar to PoA. The graph approach allows SINA to incorporate information from many reference sequences without blurring highly variable regions. While pure NAST implementations depend highly on finding a good match in the reference database, SINA is able to align sequences relatively distant to references with good quality …
Keywords:
High-throughput sequencing
RNA-Sequencing
|
|
SKA2
|
SKA2 - Split k-mer analysis (version 2) uses exact matching of split k-mer sequences to align closely related sequences, typically small haploid genomes such as bacteria and viruses.
Keywords:
High-throughput sequencing
|
|
skDER
|
efficient & high-resolution dereplication of microbial genomes
|
|
SKESA
|
(Strategic Kmer Extension for Scrupulous Assemblies) a de-novo sequence read assembler for microbial genomes.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
skewer
|
implements the bit-masked k-difference matching algorithm dedicated to the task of adapter trimming. It is specially designed for processing next-generation sequencing (NGS) paired-end sequences.
|
|
slow5tools
|
a simple toolkit for converting (FAST5 <-> SLOW5), compressing, viewing, indexing and manipulating data in SLOW5 format.
Keywords:
High-throughput sequencing
|
|
smallgenomeutilities
|
a collection of scripts useful for handling and manipulating NGS data of small viral genomes.
|
|
smoothxg
|
Local reconstruction of variation graphs using partial order alignment.
Pangenome graphs built from raw sets of alignments may have complex local structures generated by common patterns of genome variation. smoothxg can be used to extract the consensus pangenome graph by applying the heaviest bundle algorithm to each chain.
|
|
Snippy
|
Rapid bacterial SNP calling and core genome alignments
|
|
snp‑dists
|
converts a FASTA alignment to SNP distance matrix.
Keywords:
Read Alignment
High-Throughput Sequencing
|
|
SnpEff
|
genomic variant annotation and functional effect prediction toolbox.
|
|
SNP‑sites
|
rapidly extracts SNPs from a multi-FASTA alignment.
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
somalier
|
extracts informative sites, evaluates relatedness, and performs quality-control on BAM/CRAM/BCF/VCF/GVCF.
Keywords:
High-throughput sequencing
Variant Analysis
|
|
SortMeRNA
|
SortMeRNA is a tool for filtering, mapping and OTU-picking NGS reads in metatranscriptomic and metagenomic data. The core algorithm is based on approximate seeds and allows for fast and sensitive analyses of nucleotide sequences. The main application of SortMeRNA is filtering ribosomal RNA from metatranscriptomic data. Additional applications include OTU-picking and taxonomy assignment available through QIIME v1.9+ (http://qiime.org - v1.9.0-rc1).
|
|
sourmash
|
quickly searches, compares, and analyzes genomic and metagenomic data sets.
|
|
spaceranger
|
Visium Spatial Software Suite for analyzing and visualizing spatial gene and protein expression data
Keywords:
High-throughput sequencing
|
|
SPAdes
|
(St. Petersburg genome assembler) a genome assembly algorithm designed for single-cell and multi-cell bacterial data sets.
|
|
spaln
|
Map and align a set of cDNA/EST or protein sequences onto a genome
|
|
spaTyper
|
computational method for finding spa types.
Keywords:
DNA-Sequencing
High-Throughput Sequencing
|
|
SpliceAI
|
A deep learning-based tool to identify splice variants.
Restriction: SpliceAI models require a license for commercial use. See technical notes for details.
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
SpliceMap
|
is a de novo splice junction discovery and alignment tool. It offers high sensitivity and support for arbitrary RNA-seq read lengths.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
spoa
|
SIMD partial order alignment tool/library.
Keywords:
Read Alignment
High-Throughput Sequencing
|
|
SRA Toolkit
|
(Sequence Read Archive Toolkit) a collection of tools and libraries for using data in the INSDC Sequence Read Archives.
Keywords:
High-Throughput Sequencing
Genomics
|
|
srnaMapper
|
Mapping small RNA data to a genome.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
srprism
|
Short Read Alignment Tool
|
|
sscocaller
|
Haplotyping single-cell DNA sequenced gamete cells.
Keywords:
scDNA-Seq Analysis
High-Throughput Sequencing
|
|
Stacks
|
a software pipeline for building loci from RAD-seq.
Keywords:
RADSeq
High-Throughput Sequencing
|
|
STAR
|
(Spliced Transcripts Alignment to a Reference) is an ultrafast universal RNA-seq aligner.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
staramr
|
Scan genome contigs against the ResFinder and PointFinder databases
|
|
Starcode
|
Starcode is a DNA sequence clustering software. Starcode clustering is based on all-pairs search within a specified Levenshtein distance (allowing insertions and deletions), followed by a clustering algorithm: Message Passing, Spheres or Connected Components. Typically, a file containing a set of DNA sequences is passed as input, jointly with the desired clustering distance and algorithm. Starcode returns the canonical sequence of the cluster, …
Keywords:
DNA-Sequencing
High-throughput sequencing
|
|
STAR‑Fusion
|
a component of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT), STAR-Fusion uses the STAR aligner to identify candidate fusion transcripts supported by Illumina reads. STAR-Fusion further processes the output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set.
Keywords:
High-throughput sequencing
RNA-Seq Analysis
|
|
stark
|
A tool for bluntifying a bidirected de Bruijn graph by removing overlaps.
Keywords:
High-throughput sequencing
|
|
STREAM
|
(Single-cell Trajectories Reconstruction, Exploration And Mapping) is an interactive computational pipeline for reconstructing complex cellular developmental trajectories from sc-qPCR, scRNA-seq or scATAC-seq data.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
strike
|
A program to evaluate protein multiple sequence alignments using a single protein structure.
|
|
StringTie
|
a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus. Its input can include not only the alignments of raw reads used by other transcript assemblers, but also alignments of longer sequences that …
|
|
strobealign
|
a read mapper that is typically significantly faster than other read mappers while achieving comparable or better accuracy.
Keywords:
Read Alignment
High-Throughput Sequencing
|
|
Subread
|
comprises a suite of software programs for processing next-gen sequencing read data, including:
- Subread: a general-purpose read aligner which can align both genomic DNA-seq and RNA-seq reads. It can also be used to discover genomic mutations including short indels and structural variants.
- Subjunc: a read aligner developed for aligning RNA-seq reads and for the detection of exon-exon junctions. Gene fusion events can …
Keywords:
High-throughput sequencing
|
|
SUPPA2
|
Fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions.
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
SvABA
|
Structural variation and indel analysis by assembly.
Keywords:
Genome Assembly
High-throughput sequencing
|
|
swarm
|
a robust and fast clustering method for amplicon-based studies.
|
|
SWORD
|
(Smith Waterman On Reduced Database) is a fast and sensitive software for protein sequence alignment.
|
|
TakeABreak
|
tool that can detect inversion breakpoints directly from raw NGS reads, without the need of any reference genome and without de novo assembling the genomes
Keywords:
High-throughput sequencing
|
|
tantan
|
tantan masks simple regions (low complexity & short-period tandem repeats) in biological sequences.
|
|
TBProfiler
|
profiling tool for Mycobacterium tuberculosis to detect drug resistance and lineage from WGS data.
|
|
TensorFlow
|
an open source software library for high performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. Originally developed by researchers and engineers from the Google Brain team within Google’s AI organization, it comes with strong support for machine learning and deep learning, …
|
|
terminus
|
enables the discovery of data-driven, robust transcript groups from RNA-seq data.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
TGS‑GapCloser
|
A gap-closing software tool that uses error-prone long reads generated by third-generation sequencing techniques (PacBio, Oxford Nanopore, etc.) or preassembled contigs to fill N-gaps in a genome assembly.
Keywords:
Genome Assembly
High-Throughput Sequencing
|
|
TideHunter
|
efficient and sensitive tandem repeat detection from noisy long reads using seed-and-chain.
|
|
tidk
|
Identify and find telomeres, or telomeric repeats in a genome.
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
tigmint
|
Correct misassemblies using linked or long reads
Keywords:
High-throughput sequencing
|
|
TN93
|
a fast distance calculator that computes pairwise distances between aligned nucleotide sequences in sequential FASTA format using the Tamura Nei 93 distance.
|
|
TOBIAS
|
(Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal) a collection of command-line bioinformatics tools for performing footprinting analysis on ATAC-seq data.
Keywords:
ATAC-Seq
High-Throughput Sequencing
|
|
Tombo
|
a suite of tools primarily for the identification of modified nucleotides from nanopore sequencing data.
Keywords:
Nanopore
High-Throughput Sequencing
|
|
TopHat
|
a fast splice junction mapper for RNA-Seq reads that aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.
|
|
ToulligQC
|
A post sequencing QC tool for Oxford Nanopore sequencers
Keywords:
Nanopore
High-Throughput Sequencing
|
|
TPMCalculator
|
quantifies mRNA abundance directly from the alignments by parsing BAM files. The input parameters are the same GTF files used to generate the alignments, and one or multiple input BAM file(s) containing either single-end or paired-end sequencing reads. The TPMCalculator output is comprised of four files per sample reporting the TPM values and raw read counts for genes, transcripts, exons and introns respectively.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
TransDecoder
|
identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using TopHat and Cufflinks.
|
|
TreeBeST
|
(Tree Building guided by Species Tree) is a versatile program that builds, manipulates and displays phylogenetic trees.
|
|
treePL
|
is a phylogenetic penalized likelihood program.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
TreeSAPP
|
a functional and taxonomic annotation tool for microbial genomes and proteins.
|
|
TreeTime
|
provides routines for ancestral sequence reconstruction and inference of molecular-clock phylogenies.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
TRF
|
(Tandem Repeats Finder) a program to locate and display tandem repeats in DNA sequences.
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
trimAl
|
is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment.
Keywords:
High-throughput sequencing
|
|
Trimmomatic
|
a flexible read trimming tool for Illumina NGS data.
Keywords:
High-Throughput Sequencing
|
|
Trinity
|
a software package comprised of three independent software modules (Inchworm, Chrysalis, and Butterfly) for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.
|
|
Trinotate
|
a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms.
Keywords:
High-throughput sequencing
|
|
triodenovo
|
implements a Bayesian framework for calling de novo mutations in trios for next-generation sequencing data.
Keywords:
Genomics
High-Throughput Sequencing
|
|
tRNAscan‑SE
|
tRNA detection in large-scale genome sequence.
Keywords:
High-throughput sequencing
|
|
Truvari
|
Structural variant comparison tool for VCFs
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
Ultraplex
|
an all-in-one software package for processing and demultiplexing fastq files.
Keywords:
High-Throughput Sequencing
|
|
umis
|
tools for processing UMI RNA-tag data.
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
UMI‑tools
|
tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
UniFrac
|
Fast phylogenetic diversity calculations
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
unikmer
|
toolkit for k-mer with taxonomic information
|
|
UPIMAPI
|
(UniProt Id Mapping through API) a command line interface for using UniProt's API, which allows access to UniProt's ID mapping programmatically.
Keywords:
High-Throughput Sequencing
|
|
UShER
|
a program that rapidly places new samples onto an existing phylogeny using maximum parsimony. It is particularly helpful in understanding the relationships of newly sequenced SARS-CoV-2 genomes with each other and with previously sequenced genomes in a global phylogeny.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
vapor
|
is a tool for classification of Influenza samples from raw short read sequence data for downstream bioinformatics analysis.
Keywords:
Genome Annotation
High-Throughput Sequencing
|
|
varlociraptor
|
flexible, arbitrary-scenario, uncertainty-aware variant calling with parameter free filtration via FDR control.
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
vcf2parquet
|
Converts a VCF file to Parquet format.
Keywords:
High-throughput sequencing
|
|
vcflib
|
command-line tools for manipulating VCF files.
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
VCFtools
|
a program package designed to provide easily accessible methods for working with complex genetic variation data in the form of VCF files, such as those generated by the 1000 Genomes Project.
|
|
velocyto
|
a library for the analysis of RNA velocity.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
Velvet
|
a sequence assembler for very short reads.
|
|
VerifyBamID2
|
A robust tool for DNA contamination estimation from sequence reads using an ancestry-agnostic method.
Keywords:
Genomics
High-Throughput Sequencing
|
|
VERSE
|
a versatile and efficient RNA-Seq read counting tool
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
VeryFastTree
|
a new tool designed for efficient phylogenetic tree inference, specifically tailored to handle massive taxonomic datasets. It is a highly-tuned implementation based on the FastTree-2 tool that takes advantage of parallelization and vectorization strategies to speed up the inference of phylogenies for huge alignments.
Keywords:
Phylogenomics
High-Throughput Sequencing
|
|
ViennaRNA
|
Vienna RNA package -- RNA secondary structure prediction and comparison
Keywords:
RNA-Seq Analysis
High-Throughput Sequencing
|
|
VIRULIGN
|
a tool for codon-correct pairwise alignments, with an augmented functionality to annotate the alignment according to the positions of the proteins.
|
|
VSEARCH
|
an alternative to the USEARCH tool developed by Robert C. Edgar (2010) for which the source code is not publicly available, VSEARCH is an open source, multithreaded 64-bit tool for processing and preparing metagenomics, genomics, and population genomics nucleotide sequence data. It supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact …
|
|
WASPQTL
|
WASP is a suite of tools for unbiased allele-specific read mapping and discovery of molecular QTLs.
|
|
WiggleTools
|
The WiggleTools package allows genomewide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (sum, product, product by a scalar, comparators), and derived statistics (mean, median, variance, stddev, t-test, Wilcoxon's rank sum test, etc).
Keywords:
Genomics
High-throughput sequencing
|
|
Winnowmap
|
Winnowmap is a long-read mapping algorithm optimized for mapping ONT and PacBio reads to repetitive reference sequences.
Keywords:
PacBio Sequencing
High-Throughput Sequencing
|
|
wot
|
a software package for analyzing snapshots of developmental processes in scRNA-seq data.
Keywords:
scRNA-Seq Analysis
High-Throughput Sequencing
|
|
xAtlas
|
xAtlas is a fast and retrainable small variant caller that has been developed at the Baylor College of Medicine Human Genome Sequencing Center.
Keywords:
Variant Analysis
High-Throughput Sequencing
|
|
xPore
|
is a Python package for identification and quantification of differential RNA modifications from direct RNA sequencing
Keywords:
RNA-Sequencing
High-Throughput Sequencing
|
|
yacrd
|
is a simple and easy to use long read error-correction tool which can detect and remove chimeras.
|
|
YaHS
|
YaHS, yet another Hi-C scaffolding tool
Keywords:
Hi-C
High-Throughput Sequencing
|
|
Zerone
|
discretizes several ChIP-seq replicates simultaneously and resolves conflicts between them. After the job is done, Zerone checks the results and tells you whether it passes the quality control.
|
|