a suite of programs that allows users to carry out molecular dynamics simulations, particularly on biomolecules. The suite can be used to carry out complete (non-periodic) molecular dynamics simulations (using NAB) with either explicit water or generalized Born solvent models. The independently developed packages work well by themselves, and with Amber itself.
a Python distribution that includes more than 400 of the most popular Python packages for science, math, engineering, and data analysis.
(Advanced Normalization Tools) extracts information from complex datasets that include imaging. Paired with ANTsR (pronounced "answer"), ANTs is useful for managing, interpreting and visualizing multidimensional data. ANTs is popularly considered a state-of-the-art medical image registration and segmentation toolkit. ANTsR is an emerging tool supporting standardized multimodality image analysis. ANTs depends on the Insight ToolKit (ITK), a widely used medical image processing library to …
(Amazon Web Services Command Line Interface) a command line interface tool to manage multiple Amazon Web Services and automate them through scripts.
bam2fastx provides conversion of PacBio BAM files into gzipped fasta and fastq files, including splitting of barcoded data.
a cross-platform program for Bayesian phylogenetic analysis of molecular sequences. It estimates rooted, time-measured phylogenies using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST 2 uses Markov chain Monte Carlo (MCMC) to average over tree space, so that each …
tools to analyze and comprehend high-throughput genomic data.
Installation Client for the BioGrids software collection.
a set of tools for biological computation written in Python by an international team of developers.
(Basic Local Alignment Search Tool) finds regions of similarity between biological sequences.
a suite of BLAST (Basic Local Alignment Search Tool) tools that utilizes the NCBI C++ Toolkit with a number of performance and feature improvements over the legacy BLAST applications.
the Amazon Web Services (AWS) SDK for Python, which allows Python developers to write software that makes use of Amazon services like S3 and EC2. Boto provides an easy-to-use, object-oriented API as well as low-level direct service access.
a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data for microbial-sized genomes. It reports single-nucleotide mutations, point insertions and deletions, large deletions, and new junctions supported by mosaic reads.
is a Python package designed to make drawing maps for data analysis and visualisation easy.
is a publicly available repository of curated receptors, ligands and their interactions.
is a cell image analysis software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.
a multiple sequence alignment program that uses seeded guide trees and HMM profile-profile techniques to generate alignments between three or more sequences.
enables the easy detection of CRISPRs and cas genes in user-submitted sequence data (accepts sequences up to 50 MB; for larger inputs, download the standalone program). This is an update of the CRISPRFinder program with improved specificity and an indication of the CRISPR orientation. MacSyFinder is used to identify cas genes and the CRISPR-Cas type and subtype.
a Workflow Management System geared towards scientific workflows.
csvtk is a set of tools for manipulation of CSV/TSV files. It is convenient for rapid data investigation and integration into analysis pipelines.
redistributable software libraries to support CUDA applications for Linux.
a libre server and cloud storage browser for Mac and Windows with support for FTP, SFTP, WebDAV, Amazon S3, OpenStack Swift, Backblaze B2, Microsoft Azure & OneDrive, Google Drive and Dropbox.
an optimising static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex). It makes writing C extensions for Python as easy as Python itself. Cython is installed as a module within Python.
a flexible library for parallel computing in Python.
provides joint management of analysis code and data. This enables you to comprehensively track the exact state of any analysis inputs that produced your results — across the entire lifetime of a project, and across multiple datasets.
GNU datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.
a suite of Python tools developed particularly for the efficient analysis of high-throughput sequencing data, such as ChIP-seq, RNA-seq or MNase-seq. deepTools contains useful modules to process mapped-read data for multiple quality checks and to create normalized coverage files in the standard bedGraph and bigWig file formats, which allow comparison between different files (for example, treatment and control). Finally, using such normalized and standardized files, …
DosageConvertor is a C++ tool to convert dosage files (in VCF format) from Minimac3/4 to other formats such as MaCH or PLINK.
provides access to the NCBI's suite of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window. Functions take search terms from command-line arguments. Individual operations are combined to build multi-step queries. Record retrieval and formatting normally complete the process.
An extensible, customizable, free/libre text editor.
integrates a range of currently available packages and tools for sequence analysis into a seamless whole.
Exonerate is a generic tool for pairwise sequence comparison. It allows you to align sequences using many alignment models, with either exhaustive dynamic programming or a variety of heuristics.
a DNA and protein sequence alignment software package that searches for matching sequence patterns or words, called k-tuples.
an image processing package. It can be described as a distribution of ImageJ (and ImageJ2) together with Java, Java 3D and a lot of plugins organized into a coherent menu structure. Fiji compares to ImageJ as Ubuntu compares to Linux.
is a software package for the analysis and visualization of structural and functional neuroimaging data from cross-sectional or longitudinal studies.
tools and libraries for interacting with Google Cloud products and services.
an interpreter for the PostScript (TM) language. It can display and convert PostScript files. The software is invoked with the gs command.
a set of command line tools to manipulate phylogenetic trees. It is implemented in the Go language. The goal is to handle phylogenetic trees in Newick, Nexus and PhyloXML formats through several basic commands. Each command may print its result (a tree, for example) to standard output, which can then be piped to the standard input of the next gotree command.
grabix leverages the fantastic BGZF library in samtools to provide random access into text files that have been compressed with bgzip. grabix creates its own index (.gbi) of the bgzipped file. Once indexed, one can extract arbitrary lines from the file with the grab command. Or choose random lines with the, well, random command.
a scalable machine learning and predictive analytics platform.
(Hypergeometric Optimization of Motif EnRichment) a suite of sequencing analysis and sequence motif discovery tools.
is an interactive process viewer.
a software suite to create, edit, compose, or convert bitmap images.
is a software application used to segment structures in 3D medical images.
a small utility to create JSON objects.
is a lightweight and flexible command-line JSON processor.
The Julia programming language is a flexible dynamic language appropriate for scientific and numerical computing with performance comparable to traditional statically-typed languages.
a language-agnostic HTML notebook application for Project Jupyter.
the next-generation web-based user interface for Project Jupyter. JupyterLab enables you to work with documents and activities such as Jupyter notebooks, text editors, terminals, and custom components in a flexible, integrated, and extensible manner.
a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. Developed with a focus on enabling fast experimentation, Keras is a deep learning library that allows for easy and fast prototyping (through user friendliness, modularity, and extensibility); supports both convolutional networks and recurrent networks, as well as combinations of the two; and runs seamlessly on …
a system designed for automated collection of images from a transmission electron microscope; it includes the Python-side programs written in Python and C, the MySQL database and server, and the mainly PHP-based image and data viewers on a web server.
is a file transfer program that allows sophisticated FTP, HTTP and other connections to other hosts. If a site is specified, LFTP will connect to that site; otherwise, a connection has to be established with the open command.
a multiple sequence alignment program for Unix-like operating systems. It offers a range of multiple alignment methods, such as L-INS-i (accurate; for alignment of <200 sequences) and FFT-NS-2 (fast; for alignment of <30,000 sequences).
an object-oriented Python library to analyze trajectories from molecular dynamics (MD) simulations in many popular formats. It can write most of these formats, too, together with atom selections suitable for visualization or native analysis tools.
(Mixture-of-Isoforms) a probabilistic framework for isoform quantitation using RNA-Seq. It quantitates the expression level of alternatively spliced genes from RNA-Seq data and identifies differentially regulated isoforms or exons across samples. MISO is installed as a standalone program and as a module within Python.
an open source implementation of Microsoft's .NET Framework based on the ECMA standards for C# and the Common Language Runtime.
a high performance and widely portable implementation of the Message Passing Interface (MPI) standard.
extracts no-reference IQMs (image quality metrics) from structural (T1w and T2w) and functional MRI (magnetic resonance imaging) data.
(multiple sequence comparison by log-expectation) a public domain multiple alignment software for protein and nucleotide sequences.
a reactive workflow framework and programming DSL that eases the writing of computational pipelines with complex data. It is designed around the idea that the Linux platform is the lingua franca of data science. Linux provides many simple but powerful command-line and scripting tools that, when chained together, facilitate complex data manipulations. Nextflow extends this approach, adding the ability to define complex program interactions and a …
a Python module for fast and easy statistical learning on NeuroImaging data. It leverages the scikit-learn Python toolbox for multivariate statistics with applications such as predictive modelling, classification, decoding, or connectivity analysis.
contains among other things: a powerful N-dimensional array object, sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code, useful linear algebra, Fourier transform, and random number capabilities. NumPy is installed as a module within Python.
(Open Source Computer Vision Library) an open source computer vision and machine learning software library.
(Open Java Development Kit) is a free and open source implementation of the Java Platform, Standard Edition (Java SE).
an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available.
a library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. pandas is installed as a module within Python.
Given a file containing a list of Unix commands, uses multithreading to run the commands in parallel on a single server. Success or failure of each command is captured, and failed commands are retained and reported.
is a highly capable, feature-rich programming language with over 30 years of development. Perl runs on over 100 platforms from portables to mainframes and is suitable for both rapid prototyping and large scale development projects.
stands for parallel implementation of gzip, and is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data.
is the friendly PIL (Python Imaging Library) fork by Alex Clark and Contributors. The Python Imaging Library adds image processing capabilities to your Python interpreter. This library provides extensive file format support, an efficient internal representation, and fairly powerful image processing capabilities. The core image library is designed for fast access to data stored in a few basic pixel formats. It should provide a …
predicts the regulatory role of CRP transcription factor in Escherichia coli. PredCRP provides an accurate method for deriving an optimised model (named PredCRP-model) and a set of four interpretable rules (named PredCRP-ruleset) for predicting and analysing the regulatory roles of CRP from sequences of CRP-binding sites.
an interface to the REDCap Application Programming Interface (API), PyCap is designed to be a minimal interface exposing all required and optional API parameters.
a dedicated Python Integrated Development Environment (IDE) providing a wide range of essential tools for Python developers, tightly integrated together to create a convenient environment for productive Python, web, and data science development.
provides a Python wrapper and convenience functions for cudaDeconv, which is a CUDA/C++ implementation of an accelerated Richardson-Lucy deconvolution algorithm, suitable for general applications but designed particularly for stage-scanning light sheet applications such as Lattice Light Sheet.
PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms. Its flexibility and extensibility make it applicable to a large suite of problems.
a tool to conduct recurrence analysis in a massively parallel manner using the OpenCL framework.
a general-purpose, interpreted, object-oriented, high-level dynamic programming language that emphasizes code readability. Its syntax allows programmers to express concepts in fewer lines of code than in C++ or Java, thus allowing programmers to work more quickly and integrate their systems more effectively.
an open source deep learning platform that provides a seamless path from research prototyping to production deployment.
(Unified Complex Network and RecurreNce analysis toolbox) a fully object-oriented Python package for the advanced analysis and modeling of complex networks.
a cross-platform application framework that is used for developing application software that can be run on various software and hardware platforms with little or no change in the underlying codebase, while still being a native application with native capabilities and speed.
a free software environment for statistical computing and graphics.
a command line program to manage files on cloud storage.
is the time-proven, ultra-robust open-source engine for creating complex, data-driven PDF documents and custom vector graphics. It's free, open-source, and written in Python. The package sees 50,000+ downloads per month, is part of standard Linux distributions, is embedded in many products, and was selected to power the print/export feature for Wikipedia.
an integrated development environment (IDE) for R that includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management.
rust-bio-tools is a set of ultra fast and robust command line utilities for bioinformatics tasks based on Rust-Bio.
free software that analyzes spatial, temporal and space-time data using the spatial, temporal, or space-time scan statistics. It is designed for any of the following interrelated purposes: Perform geographical surveillance of disease, to detect spatial or space-time disease clusters, and to see if they are statistically significant. Test whether a disease is randomly distributed over space, over time, or over space and time. …
(single-cell variational inference tools) is a package for end-to-end analysis of single-cell omics data.
a simplified layer built on top of ITK, intended to facilitate its use in rapid prototyping, education, and interpreted languages.
A method and tool to control single-cell RNA-seq data quality.
slclust is a utility that performs single-linkage clustering with the option of applying a Jaccard similarity coefficient to break weakly bound clusters into distinct clusters.
3D Slicer is a free and open-source platform for analyzing and understanding medical image data.
an R package that provides functions for inferring continuous, branching lineage structures in low-dimensional data. Designed to model developmental trajectories in single-cell RNA sequencing data and serve as a component in an analysis pipeline after dimensionality reduction and clustering, Slingshot is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.
The Snakemake workflow management system is a tool to create reproducible and scalable data analyses. Workflows are described via a human readable, Python based language. They can be seamlessly scaled to server, cluster, grid and cloud environments, without the need to modify the workflow definition. Finally, Snakemake workflows can entail a description of required software, which will be automatically deployed to any execution environment.
is a command-line toolkit for rapid manipulation of NCBI taxonomy data.
(Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more. Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches.
cleans up raw data files and converts them to PDF format with LaTeX. TeX Live offers an easy way to get up and running with the TeX document production system.
is a terminal multiplexer: it enables a number of terminals to be created, accessed, and controlled from a single screen. tmux may be detached from a screen and continue running in the background, then later reattached.
(Visualization Toolkit) a C++ class library and several interpreted interface layers including Tcl/Tk, Java, and Python. VTK supports a wide variety of visualization algorithms including scalar, vector, tensor, texture, and volumetric methods, as well as advanced modeling techniques such as implicit modeling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation.
command line tools for sequence logo generation.
is an open source, freely available visualization and discovery tool used to map neuroimaging data, especially data generated by the Human Connectome Project. The distribution includes wb_view, a GUI-based visualization platform, and wb_command, a command-line program for performing a variety of algorithmic tasks using volume, surface, and grayordinate data.
a blending of the wxWidgets C++ class library with the Python programming language.
XNAT client tools comprise a number of command line tools to store and retrieve data from XNAT archives: ArcGet retrieves image data; ArcRead retrieves summary text documents describing imaging data; ArcSim retrieves a list of imaging sessions with similar IDs; StoreXML writes XML documents to the archive.
an extensible parallel framework, written in Python using OpenMPI libraries, that allows researchers to quickly build high-throughput big data pipelines without extensive knowledge of parallel programming.
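One entry in this list describes a utility that takes a file of Unix commands, runs them in parallel on a single server, and reports the commands that failed. The general idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not that tool's actual implementation: the function names `run_command` and `run_all` and the `max_workers` parameter are hypothetical, and a thread pool from the standard library stands in for whatever threading model the tool uses.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_command(command):
    """Run one shell command and return (command, exit_code)."""
    # shell=True lets each entry be an ordinary shell command line.
    completed = subprocess.run(command, shell=True, capture_output=True)
    return command, completed.returncode


def run_all(commands, max_workers=4):
    """Run commands on a thread pool; return (succeeded, failed) lists."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input commands.
        for command, exit_code in pool.map(run_command, commands):
            (succeeded if exit_code == 0 else failed).append(command)
    return succeeded, failed
```

To use the sketch, read one command per line from a file into a list, pass the list to `run_all`, and write the returned `failed` list to a separate file so those commands can be inspected or rerun.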