Supported Applications

Name Description Links
tool for converting 10x BAMs produced by Cell Ranger, Space Ranger, Cell Ranger ATAC, Cell Ranger DNA, and Long Ranger back to FASTQ files that can be used as inputs to re-run analysis.
A5-miseq is a pipeline for assembling DNA sequence data generated on the Illumina sequencing platform. A5-miseq can produce high-quality microbial genome assemblies on a laptop computer without any parameter tuning by automating the process of adapter trimming, quality filtering, error correction, contig and scaffold generation and detection of misassemblies.
a simple transcriptome assembler based on kallisto and Cortex graphs.
abismal is a fast and memory-efficient mapper for short bisulfite sequencing reads
an extended version of Partial Order Alignment (POA) that performs adaptive banded dynamic programming (DP) with an SIMD implementation.
mass screening of contigs for antibiotic resistance genes.
an abundance-based tool for binning metagenomic sequences.
(Analysis of Functional NeuroImages) is a set of C programs for processing, analyzing, and displaying functional MRI (FMRI) data - a technique for mapping human brain activity.
(Another Gff Analysis Toolkit) a suite of tools to handle gene annotations in any GTF/GFF format.
Assembled Genomes Compressor (AGC) is a tool designed to compress collections of de-novo assembled genomes. It can be used for various types of datasets: short genomes (viruses) as well as long (humans).
AGFusion (pronounced 'A G Fusion') is a python package for annotating gene fusions from the human or mouse genomes.
(Ancestry and Kinship Toolkit) a statistical genetics tool for analysing large cohorts of whole-genome sequenced samples. It provides a handful of useful statistical genetics routines using the htslib API for input/output. This means it can seamlessly read BCF/VCF files and play nicely with bcftools.
is a tool for the efficient processing of single-cell data based on RAD files produced by alevin.
an efficient and versatile command-line application that computes multi-sample quality control metrics in a read-group aware manner.
AlignStats produces various alignment, whole genome coverage, and capture coverage metrics for sequence alignment files in SAM, BAM, and CRAM format.
Multi-mapped read rescue strategy for gene regulatory analyses
an implementation of the inference pipeline of AlphaFold v2.0 using a completely new model that was entered in CASP14.
a modern and open framework for MS-based proteomics.
a suite of programs that allows users to carry out molecular dynamics simulations, particularly on biomolecules. The suite can be used to carry out complete (non-periodic) molecular dynamics simulations (using NAB) with either explicit water or generalized Born solvent models. The independently developed packages work well by themselves, and with Amber itself.
(Alignment of Multiple Protein Sequences) a suite of programs for protein multiple sequence alignment, pairwise alignment, statistical analysis and flexible pattern matching.
AMPtk: Amplicon tool kit for processing high throughput amplicon sequencing data.
AnchorWave (Anchored Wavefront Alignment) identifies collinear regions via conserved anchors (full-length CDS and full-length exon have been implemented currently) and breaks collinear regions into shorter fragments, i.e., anchor and inter-anchor intervals.
estimates the evolutionary distance between closely related genomes. These distances can be used to rapidly infer phylogenies for big sets of genomes. Because andi does not compute full alignments, it is so efficient that it scales even up to thousands of bacterial genomes.
(Advanced Normalization Tools) extracts information from complex datasets that include imaging (Word Cloud). Paired with ANTsR (answer), ANTs is useful for managing, interpreting and visualizing multidimensional data. ANTs is popularly considered a state-of-the-art medical image registration and segmentation toolkit. ANTsR is an emerging tool supporting standardized multimodality image analysis. ANTs depends on the Insight ToolKit (ITK), a widely used medical image processing library to …
an open-source, community-driven analysis and visualization platform for ‘omics data. Its interactive interface facilitates the management of metagenomic contigs and associated data for automatic or human-guided identification of genome bins and their curation.
ARAGORN identifies tRNA and tmRNA genes. The program employs heuristic algorithms to predict tRNA secondary structure, based on homology with recognized tRNA consensus sequences and ability to form a base‐paired cloverleaf.
high-resolution HLA typing from RNA seq.
Scaffolding genome sequence assemblies using linked or long reads.
(Antibiotic Resistance Identification By Assembly) a tool that identifies antibiotic resistance genes by running local assemblies. It can also be used for MLST calling.
a command-line genome browser running from terminal window and solely based on ASCII characters.
Get assembly statistics from FASTA and FASTQ files.
trim adapters from high-throughput sequencing reads.
a gene prediction program for eukaryotes that can be used as an ab initio program, which means it bases its prediction purely on the sequence.
(Amazon Web Services Command Line Interface) a command line interface tool to manage multiple Amazon Web Services and automate them through scripts.
rapid and standardized annotation of bacterial genomes & plasmids.
A universal protein model for prokaryotic gene prediction
bam2fastx provides conversion of PacBio BAM files into gzipped fasta and fastq files, including splitting of barcoded data.
bam-readcount generates metrics at single nucleotide positions.
BAMscale is a one-step tool for either 1) quantifying and normalizing the coverage of peaks or 2) generating scaled BigWig files for easy visualization of commonly used DNA-seq capture based methods.
Extract coverage information from BAM files, supporting stranded and physical coverage and streams.
Tool for converting 10x BAMs produced by Cell Ranger
a fast, flexible C++ API & toolkit for reading, writing, and manipulating BAM files.
a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, bam.
(BAsic Rapid Ribosomal RNA Predictor) predicts the location of ribosomal RNA genes in genomes (bacteria, archaea, metazoan mitochondria and eukaryotes).
is a tool to extract paired reads in FASTQ format from coordinate sorted BAM files. Bazam is a smarter way to realign reads from one genome to another. If you've tried to use Picard SAMtoFASTQ or samtools bam2fq before and ended up unsatisfied with complicated, long running inefficient pipelines, bazam might be what you wanted. Bazam will output FASTQ in a form that can …
a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file formats such as fastq, fasta, sam, scarf, fasta+qual, compressed or raw, with autodetection of quality encoding and interleaving.
is a bioinformatics tool for constructing the compacted de Bruijn graph from sequencing data.
provides best-practice pipelines for automated analysis of high throughput sequencing data with the goal of being quantifiable, analyzable, scalable and reproducible. The development process is fully open and sustained by contributors from multiple institutions. Bioinformaticians, biologists and the general public should be able to run these tools on inputs ranging from research materials to clinical samples to personal genomes.
Prioritize small variants, structural variants and coverage based on biological inputs. The goal is to use pre-existing knowledge of relevant genes, domains and pathways involved with a disease to extract the most interesting signal from a set of high quality small or structural variant calls. Given information on coverage, it will be able to identify poorly covered regions in potential genes of interest.
bcbio-variation is a toolkit to analyze genome variation data, built on top of the Genome Analysis Toolkit (GATK) with Clojure. It supports scoring for the Archon Genomics X PRIZE competition and is also a general framework for variant file comparison. It enables validation of variants and exploration of algorithm differences between calling methods by automating the process involved with comparing two sets of variants. …
Parallel merging, squaring off and ensemble calling for genomic variants. Provide a general framework meant to combine multiple variant calls, either from single individuals, batched family calls, or multiple approaches on the same sample. Splits inputs based on shared genomic regions without variants, allowing independent processing of smaller regions with variant calls.
a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
is a software package for phasing genotypes and imputing ungenotyped markers.
is a cross-platform program for Bayesian analysis of molecular sequences using MCMC.
BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale. Tasks can be easily split by chromosome for distributing whole-genome analyses across a computational cluster.
a swiss-army knife of tools for a wide-range of genomics analysis tasks. The most widely-used tools enable genome arithmetic. Bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), sophisticated analyses …
(Binding and Expression Target Analysis) a software package that integrates ChIP-seq of transcription factors or chromatin regulators with differential gene expression data to infer direct target genes.
a standalone high-performance tool for correcting sequencing errors from Illumina sequencing data.
is a compact file format for efficiently storing and querying whole-genome genotypes of tens to hundreds of thousands of samples. It can be considered as an alternative to genotype-only BCFv2. BGT is more compact in size, more efficient to process, and more flexible on query.
Brain Imaging Data Structure (BIDS) validator.
A quality assessment package for next-generation sequencing data. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs to detect and remove duplicate reads after taking sequencing errors into account and to trim low-quality reads from raw data.
an extension to Brian Kernighan's awk, with added support for several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q, and TAB-delimited formats with column names along with new built-in functions and a command line option to use TAB as the input/output delimiter. When the new functionality is not used, bioawk should behave exactly like the original BWK awk.
tools for early stage NGS alignment file processing including fast sorting and duplicate marking.
tools to analyze and comprehend high-throughput genomic data.
Installation Client for the BioGrids software collection.
Keywords:
Other
subtype microbial whole-genome sequencing (WGS) data using SNV targeting k-mer subtyping schemes.
The bioinfokit toolkit aims to provide various easy-to-use functionalities to analyze, visualize, and interpret the biological data generated from genome-scale omics experiments.
a set of tools for the time-efficient analysis of Bisulfite-Seq (BS-Seq) data. Bismark performs alignments of bisulfite-treated reads to a reference genome and cytosine methylation calls at the same time.
(Basic Local Alignment with Successive Refinement) maps Single Molecule Sequencing (SMS) reads that are thousands of bases long, with divergence between the read and genome dominated by insertion and deletion error.
(Basic Local Alignment Search Tool) finds regions of similarity between biological sequences.
a suite of BLAST (Basic Local Alignment Search Tool) tools that utilizes the NCBI C++ Toolkit with a number of performance and feature improvements over the legacy BLAST applications.
a 3D creation suite that supports the entirety of the 3D pipeline—modeling, rigging, animation, simulation, rendering, compositing and motion tracking.
is a k-mer spectrum-based read error corrector, designed to correct large datasets with a very low memory footprint. It uses the disk streaming k-mer counting algorithm contained in the GATB library, and inserts solid k-mers in a bloom-filter. The correction procedure is similar to the Musket multistage approach. Bloocoo yields similar results while requiring far less memory: as an example, it can correct whole …
aka Best Match Tagger is for removing human reads from metagenomics datasets
bmtool is part of BMTagger aka Best Match Tagger, for removing human reads from metagenomics datasets.
the Amazon Web Services (AWS) SDK for Python, which allows Python developers to write software that makes use of Amazon services like S3 and EC2. Boto provides an easy to use, object-oriented API as well as low-level direct service access.
an ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers.
an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.
implements a versatile high-performance version of the BPP software
(Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
a Perl/Cpp package that provides genome-wide detection of structural variants from next generation paired-end sequencing reads. It includes two complementary programs.
a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data for microbial sized genomes. It reports single-nucleotide mutations, point insertions and deletions, large deletions, and new junctions supported by mosaic reads.
bustools is a program for manipulating BUS files for single cell RNA-Seq datasets. It can be used to error correct barcodes, collapse UMIs, produce gene count or transcript compatibility count matrices, and is useful for many other tasks. See the kallisto | bustools website for examples and instructions on how to use bustools as part of a single-cell RNA-seq workflow.
Bam and Variant Analysis Tools
(Burrows-Wheeler Aligner) a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM.
is a command-line tool for converting 3D images between common file formats.
(Concatemeric Consensus Caller with Partial Order alignments) is a computational pipeline for calling consensi on R2C2 nanopore data.
a reference-free whole-genome multiple alignment program based upon the notion of Cactus graphs.
clusters paired-end reads using their barcodes and sequences.
Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing. Canu specializes in assembling PacBio or Oxford Nanopore sequences. Canu operates in three phases: correction, trimming and assembly. The correction phase will improve the accuracy of bases in reads.
a tool for calling copy number variants (CNVs) from human DNA sequencing data.
Assembly of Phylogenomic Datasets from High-Throughput Sequencing data
is a Python package designed to make drawing maps for data analysis and visualisation easy.
Keywords:
Other
Cas-OFFinder is an OpenCL-based, ultrafast and versatile program that searches for potential off-target sites of CRISPR/Cas-derived RNA-guided endonucleases (RGEN).
clusters and compares protein or nucleotide sequences.
CEFCIG (Computational Epigenetic Framework for Cell Identity Gene Discovery)
Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics (cell2location model)
a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
is a publicly available repository of curated receptors, ligands and their interactions.
Keywords:
Other
a cell image analysis software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.
a set of analysis pipelines that process Chromium single-cell RNA-seq output to align reads, generate feature-barcode matrices and perform clustering and gene expression analysis.
The set of analysis pipelines in this suite perform sample demultiplexing, barcode processing, identification of open chromatin regions, and simultaneous counting of transcripts and peak accessibility in single cells.
a set of analysis pipelines that perform identification of open chromatin regions, motif annotation, and differential accessibility analysis for Single Cell ATAC data.
Efficient genotyping of bi-allelic SNPs in single cells
an interactive explorer for single-cell transcriptomics data
is a very rapid and memory-efficient system for the classification of DNA sequences from microbial samples, with better sensitivity than and comparable accuracy to other leading systems. The system uses a novel indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (e.g., 4.3 GB for ~4,100 bacterial …
A complete suite for gene-by-gene schema creation and strain identification.
ChIPs is a tool for simulating ChIP-sequencing experiments.
Rust implementation of NanoFilt+NanoLyse, both originally written in Python. This tool, intended for long read sequencing such as PacBio or ONT, filters and trims a fastq file.
is an ultrafast method for aligning and preprocessing high throughput chromatin profiles.
a comprehensive and integrative circular RNA analysis toolset.
Circlator is a tool to circularize genome assemblies. The input is a genome assembly in FASTA format and corrected PacBio or nanopore reads in FASTA or FASTQ format. Circlator will attempt to identify each circular sequence and output a linearised version of it. It does this by assembling all reads that map to contig ends and comparing the resulting contigs with the input assembly.
a software package for visualizing data and information. It visualizes data in a circular layout.
Keywords:
Visualization
count antibody TAGS from a CITE-seq and/or cell hashing experiment.
a tool for symphonizing pileup and full-alignment for high-performance long-read variant calling
fast, accurate and versatile k-mer based classification system.
a general purpose multiple sequence alignment program for DNA or proteins.
is the latest version of Clustal: a multiple sequence alignment program for DNA or proteins.
a multiple sequence alignment program that uses seeded guide trees and HMM profile-profile techniques to generate alignments between three or more sequences.
Keywords:
Other
a command-line toolkit and Python library for detecting copy number variants and alterations genome-wide from high-throughput sequencing.
a tool that lets you run VS Code on any machine anywhere and access it in the browser.
Comet MS/MS searches uninterpreted tandem mass spectra of peptides against sequence databases.
uses exome sequencing data to find copy number variants (CNVs) and genotype the copy-number of duplicated genes.
Copy number and genotype annotation from whole genome and whole exome sequencing data.
is a support library for a sparse, compressed, binary persistent storage format, also called cooler, used to store genomic interaction data, such as Hi-C contact matrices.
Software for clustering de novo assembled transcripts and counting overlapping reads.
covtobed is a tool to generate BED coverage tracks from BAM files. It reads one (or more) alignment files (sorted BAM) and prints a BED with the coverage. It will join consecutive bases with the same coverage, and can be used to only print a BED file with the regions having a specific coverage range.
Crass is designed to identify and reconstruct CRISPR loci from raw metagenomic data without the need for assembly or prior knowledge of CRISPR in the data set.
Bioinformatics tool outputs converter to JSON or YAML.
a tool that enables the easy detection of CRISPRs and cas genes in user-submitted sequence data (allows sequences up to 50 MB; for larger inputs, download the standalone program). This is an update of the CRISPRFinder program with improved specificity and indication of the CRISPR orientation. MacSyFinder is used to identify cas genes, the CRISPR-Cas type and subtype.
a Workflow Management System geared towards scientific workflows.
a program for genome coordinates conversion between different genome assemblies.
controllable lossy compression of BAM/CRAM files.
a set of tools for manipulation of CSV/TSV files. It is convenient for rapid data investigation and integration into analysis pipelines.
Keywords:
Other
a set of redistributable software libraries that support CUDA applications on Linux.
Keywords:
Other
a reference-guided assembler that assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.
finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
a fast, parallel, and very memory-light tool to construct the compacted de Bruijn graph from genome reference(s).
a libre server and cloud storage browser for Mac and Windows with support for FTP, SFTP, WebDAV, Amazon S3, OpenStack Swift, Backblaze B2, Microsoft Azure & OneDrive, Google Drive and Dropbox.
Keywords:
Other
a software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data.
a cython wrapper around htslib built for fast parsing of Variant Call Format (VCF) files.
finds all significant local alignments between reads.
simple de novo transcriptome annotator
a toolkit for Dynamic Analysis of Nucleosome and Protein Occupancy by Sequencing.
a flexible library for parallel computing in Python.
provides joint management of analysis code and data. This enables you to comprehensively track the exact state of any analysis inputs that produced your results — across the entire lifetime of a project, and across multiple datasets.
Keywords:
Other
GNU datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.
Keywords:
Other
is designed to convert neuroimaging data from the DICOM format to the NIfTI format.
DCMTK is a collection of libraries and applications implementing large parts of the DICOM standard.
dDocent is a simple bash wrapper to QC, assemble, map, and call SNPs from almost any kind of RAD sequencing. If you have a reference already, dDocent can be used to call SNPs from almost any type of NGS data set.
Deblur is a greedy deconvolution algorithm for amplicon sequencing based on Illumina Miseq/Hiseq error profiles.
Retention time prediction for (modified) peptides using Deep Learning.
a suite of python tools particularly developed for the efficient analysis of high-throughput sequencing data, such as ChIP-seq, RNA-seq or MNase-seq.
an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read and long-read massively parallel sequencing data.
Genetic multiplexing of barcoded single cell RNA-seq.
deconvolutes mixed genomes with unknown proportions.
RNA sequence design for a target protein sequence
Keywords:
Other
De Bruijn graph-based Spliced Aligner for Long Transcriptome reads
a Bioconductor software package installed in R 3.2.2 that estimates variance-mean dependence in count data from high-throughput sequencing assays and tests for differential expression based on a model using the negative binomial distribution.
bax file decoder and data compressor.
is a flexible and customizable pipeline for prokaryotic genome annotation as well as data submission to the INSDC.
a high-throughput program for aligning a file of short DNA sequencing reads against a protein reference database such as NR, at 20,000 times the speed of BLASTX, with high sensitivity.
In-silico PCR and variant primer design
is a Python 3.7+ library for very efficient parsing and writing of FASTQ and also FASTA files.
a set of tools for analyzing DNA methylation data from bisulfite sequencing
a suite of tools for use in genome assembly and consensus.
a python program for rapidly comparing large numbers of genomes, dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set.
(Detection of RNA Outlier Pipeline) pipeline to find aberrant gene expression events in RNA sequencing data.
Tools for BED, FASTA, FASTQ, GAF, GFA1/2, GFF3, PAF, SAM, and VCF files
a whole genome simulator for next-generation sequencing based off of wgsim found in SAMtools, which was written by Heng Li, and forked from DNAA. It was modified to handle ABI SOLiD and Ion Torrent data, as well as various assumptions about aligners and positions of indels. Many new features have been subsequently added.
estimates haplotype phase either within a genotyped cohort or using a phased reference panel. Eagle2 is now the default phasing method used by the Sanger and Michigan imputation servers and uses a new, very fast HMM-based algorithm that improves speed and accuracy over existing methods via two key ideas: a new data structure based on the positional Burrows-Wheeler transform and a rapid search algorithm …
a Bioconductor software package installed in R 3.2.2 for gene and isoform differential expression analysis of RNA-seq data.
a Bioconductor software package installed in R 3.2.2 for examining differential expression of replicated count data.
provides access to the NCBI's suite of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window. Functions take search terms from command-line arguments. Individual operations are combined to build multi-step queries. Record retrieval and formatting normally complete the process.
Keywords:
Other
Fast genome-wide functional annotation through orthology assignment.
The EIGENSOFT package combines functionality from our population genetics methods (Patterson et al. 2006) and our EIGENSTRAT stratification correction method (Price et al. 2006).
a high-performance tool for analyzing .sam/.bam files (up to and including variant calling) in sequencing pipelines.
Fast & accurate alignment of barcoded short-reads
an extensible, customizable, free/libre text editor.
Keywords:
Other
a program that integrates a range of currently available packages and tools for sequence analysis into a seamless whole.
EMu is a relative abundance estimator for 16S genomic sequences
a FASTQ lossless compression algorithm especially designed for nanopore sequencing FASTQ files.
predicts the functional effects of genomic variants
EPA-ng is a complete rewrite of the Evolutionary Placement Algorithm (EPA), previously implemented in RAxML. It uses libpll and pll-modules to perform maximum likelihood-based phylogenetic placement of genetic sequences on a user-supplied reference tree and alignment.
Ultraperformant ChIP-Seq broad domain finder based on SICER.
a tool to predict protein structure, function, and mutations using evolutionary sequence covariation.
is a software package for Bayesian tree inference.
a Java program that finds potential disease-causing variants from whole-exome or whole-genome sequencing data. Starting from a VCF file and a set of phenotypes encoded using the Human Phenotype Ontology (HPO), it will annotate, filter and prioritize likely causative variants based on user-defined criteria such as a variant's predicted pathogenicity, frequency of occurrence in a population and also how closely the given phenotype matches …
Exonerate is a generic tool for pairwise sequence comparison. It allows you to align sequences using many alignment models, with either exhaustive dynamic programming or a variety of heuristics.
a streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences.
is a drop-in C++ implementation of FastQC to assess the quality of sequence reads.
(Fast and Accurate Multiple Sequence Aligner) implements an algorithm for large-scale multiple sequence alignments (400k proteins in 2 hours and 8 GB of RAM).
a DNA and protein sequence alignment software package that searches for matching sequence patterns or words, called k-tuples.
developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI).
Perform random operations on fastq files, using unix streaming. Secure your analysis with Fasten!
FastME provides distance algorithms to infer phylogenies.
Fastool is a simple and quick tool to read huge FastQ and FastA files (both normal and gzipped) and manipulate them. It makes use of the KSeq library (http://lh3lh3.users.sourceforge.net/kseq.shtml) for fast access to FastQ/A files.
is a tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.
a quality control tool for high throughput sequence data.
A tool to download FASTQs associated with Study, Experiment, or Run accessions.
fastq-scan reads a FASTQ from STDIN and outputs summary statistics (read lengths, per-read qualities, per-base qualities) in JSON format.
allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
a fast, flexible, user-friendly, cluster-friendly QTL mapper.
infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.
an ultra-fast tool for identification of SARS-CoV-2 and other microbes from sequencing data.
a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
an application for Efficient Data Transfers which is capable of reading and writing at disk speed over wide area networks (with standard TCP). It is written in Java, runs on all major platforms, and is easy to use.
Keywords:
Other
a set of tools to analyze genomic data with a focus on Next Generation Sequencing.
a CLI tool for interacting with fiberseq bam files.
an image processing package. It can be described as a distribution of ImageJ (and ImageJ2) together with Java, Java 3D and a lot of plugins organized into a coherent menu structure. Fiji compares to ImageJ as Ubuntu compares to Linux.
Keywords:
Other
a tool for filtering long reads by quality.
(Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments. FLASH is designed to merge pairs of reads when the original DNA fragments are shorter than twice the length of reads. The resulting longer reads can significantly improve genome assemblies. They can also improve transcriptome assembly when FLASH is used to merge …
performs fast principal component analysis (PCA) of single nucleotide polymorphism (SNP) data, similar to smartpca from EIGENSOFT (http://www.hsph.harvard.edu/alkes-price/software/) and shellfish (https://github.com/dandavison/shellfish). FlashPCA is based on the https://github.com/yixuan/spectra/ library.