|   March 2021 Newsletter
Our March newsletter includes thirty-five new software titles and an additional fifteen updates.  Spring workshop and training registrations are open for both the Harvard Chan Bioinformatics Core and HMS Research Computing.
 
 macOS 11 Big Sur support
 We've been testing new application installations on macOS 11 Big Sur. The majority of applications appear to work normally on both x86_64 and Apple Silicon hardware, with a few exceptions. If you have a new Big Sur M1 Mac, you should be able to install software with the BioGrids installation manager. Be sure to install XQuartz, and let us know if you encounter any problems.

 Installation Manager Beta Testing
 Our new graphical Installation Manager for macOS and Linux is nearly ready to go and we are looking for some beta testers! Internal testing is going well and we would now like feedback from the community. Email help@biogrids.org and we'll get you set up.
 As always, please let us know if you have any questions or problems upgrading - help@biogrids.org
 
 Remote Working Help
 The BioGrids Wiki provides step by step instructions for installing BioGrids software on a local laptop or desktop machine. If you prefer a live demonstration, or run into trouble, please contact help@biogrids.org. We can set up a Zoom meeting to assist you.
 
 macOS 10.15 Catalina
 While we recommend not upgrading to 10.15 on any Mac with BioGrids already installed, we have implemented a workaround to install BioGrids and SBGrid on new machines. Two approaches are available.
 
 
 Cite BioGrids
 If your use of BioGrids-supplied software was an important element in your publication, please include the following statement in your work:
 "Software used in the project was installed and configured by BioGrids
 (cite: eLife 2013;2:e01456, Collaboration gets the most out of software.)"
 
 See our Grant Support page for additional details.
 
 Register here to try out our software installer, which allows users to choose from over 290 bioinformatics and life sciences tools that can be installed as ready-to-run applications on Mac or Linux machines with the click of a button or a short command from the CLI. No need to worry about dependencies or compilation.
 
 BioGrids is supported by a team of scientists and engineers at HMS. We provide direct support to BioGrids members. This includes all aspects of software installation and management. If you need assistance of any kind please send a note to:   help@biogrids.org.
 
 BioGrids Installer
The BioGrids Installer is an easy-to-use application that makes installing and managing life sciences software simple and quick. 
 A command line version is also available for macOS and Linux.  Download using the link button above and register here for activation.
 
 The BioGrids team provides support, infrastructure and testing for scientific software packages. We currently provide 335 titles in five categories and over 1,500 R, Python and Perl packages and modules. The collection grows weekly. Learn more here: About BioGrids
   
 
 BioGrids QuickStart
If you are new to BioGrids and would like to quickly get started with the command line version, follow the instructions below:
 
 1: Download the BioGrids Installer command line version
 
 Linux CLI
 curl -kLO https://biogrids.org/wiki/downloads/biogrids-1.0.695-Linux.tgz
 tar zxf biogrids-1.0.695-Linux.tgz
 cd biogrids-1.0.695-Linux

 OSX CLI
 curl -kLO https://biogrids.org/wiki/downloads/biogrids-1.0.695-Darwin.tgz
 tar zxf biogrids-1.0.695-Darwin.tgz
 cd biogrids-1.0.695-Darwin

 2: Activate biogrids
 ./biogrids activate biogrid-production jvinent1  70rYFTDnmCr93VUklfbf1s3M4jdyC9bFVYHew==
 Replace the site name, user name and activation key with your own credentials.

 3: Install software with BioGrids
 ./biogrids install fastqc trimmomatic samtools star subread igv

 When finished, verify applications are installed:
 ./biogrids installed
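
 The download step differs between platforms only in the archive name. The following is a minimal sketch, assuming the archive naming pattern seen in the download URLs above (biogrids-VERSION-PLATFORM.tgz) and that `uname -s` reports `Linux` or `Darwin`. The helper `biogrids_archive` is hypothetical and not part of the BioGrids CLI, and the version number is simply the one in the URLs above, which may be out of date. The script only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: VERSION is copied from the newsletter's download URLs
# and may no longer be current.
VERSION="1.0.695"
BASE_URL="https://biogrids.org/wiki/downloads"

# Hypothetical helper (not part of the BioGrids CLI).
# $1 = version, $2 = kernel name as printed by `uname -s`.
biogrids_archive() {
    case "$2" in
        Linux|Darwin) printf 'biogrids-%s-%s.tgz\n' "$1" "$2" ;;
        *) echo "unsupported platform: $2" >&2; return 1 ;;
    esac
}

# Print the download and unpack commands for the current platform.
if archive=$(biogrids_archive "$VERSION" "$(uname -s)"); then
    echo "curl -kLO $BASE_URL/$archive"
    echo "tar zxf $archive"
    echo "cd ${archive%.tgz}"
fi
```

 Deriving the archive name from a single version variable keeps the curl, tar and cd steps in agreement with one another.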
 
 
 Software Updates
 BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
 Updated versions: 1.12 | Linux 64 | OS X INTEL
 
 biohansel subtypes microbial whole-genome sequencing (WGS) data using SNV-targeting k-mer subtyping schemes.
 Updated versions: 2.6.1 | OS X INTEL; 2.6.1 | Linux 64
 
 bioinfokit is a toolkit that provides various easy-to-use functions to analyze, visualize, and interpret biological data generated from genome-scale omics experiments.
 Updated versions: 2.0.1 | OS X INTEL; 2.0.1 | Linux 64
 
 BLAST+ is a suite of BLAST (Basic Local Alignment Search Tool) tools that utilizes the NCBI C++ Toolkit with a number of performance and feature improvements over the legacy BLAST applications.
 Updated versions: 2.11.0 | Linux 64; 2.11.0 | OS X INTEL
 
 DANPOS2 is a toolkit for Dynamic Analysis of Nucleosome and Protein Occupancy by Sequencing, version 2.
 Updated versions: 2.2.2 | Linux 64
 
 DIAMOND is a high-throughput program for aligning a file of short DNA sequencing reads against a protein reference database such as NR, at up to 20,000 times the speed of BLASTX, with high sensitivity.
 Updated versions: 2.0.4 | OS X INTEL; 2.0.4 | Linux 64
 
 dRep is a Python program for rapidly comparing large numbers of genomes. dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set.
 Updated versions: 3.1.1 | OS X INTEL; 3.1.1 | Linux 64
 
 emu is a relative abundance estimator for 16S genomic sequences.
 Updated versions: 1.0.1 | OS X INTEL; 1.0.1 | Linux 64
 
 fastv is an ultra-fast tool for identification of SARS-CoV-2 and other microbes from sequencing data.
 Updated versions: 0.8.1 | OS X INTEL; 0.8.1 | Linux 64
 
 Genrich is a peak-caller for genomic enrichment assays (e.g. ChIP-seq, ATAC-seq).
 Updated versions: 0.6.1 | OS X INTEL; 0.6.1 | Linux 64
 
 GNUVID (GNU-based Virus IDentification) is a Python3 program. It ranks CDS nucleotide sequences in a genome fna file based on the number of observed exact CDS nucleotide matches in a public or private database. It was created to type SARS-CoV-2 genomes using a whole genome multilocus sequence typing (wgMLST) approach.
 Updated versions: 2.2 | OS X INTEL; 2.2 | Linux 64
 
 HiLine is a HiC alignment and classification pipeline.
 Updated versions: 0.2.2 | Linux 64; 0.2.2 | OS X INTEL
 
 IGV (Integrative Genomics Viewer) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.
 Updated versions: 2.9.4 | Linux 64 | OS X INTEL
 
 LongGF is a computational algorithm and software tool for fast and accurate detection of gene fusion by long-read transcriptome sequencing.
 Updated versions: 0.1.2 | OS X INTEL; 0.1.2 | Linux 64
 
 Mapula is a command line tool that is able to parse alignments in SAM format and produce a range of useful stats.
 Updated versions: 2.1.1 | OS X INTEL; 2.1.1 | Linux 64
 
 MetaEuk is a modular toolkit designed for large-scale gene discovery and annotation in eukaryotic metagenomic contigs.
 Updated versions: 4.a0f584d | Linux 64; 4.a0f584d | OS X INTEL
 
 The metagraph framework allows indexing and analysis of very large biological sequence collections, producing compressed indexes that can represent several petabases of input data. The indexes can be efficiently queried with any query sequence of interest.
 Updated versions: 0.1.0 | OS X INTEL; 0.1.0 | Linux 64
 
 mokapot implements fast and flexible semi-supervised learning for peptide detection.
 Updated versions: 0.6.0 | Linux 64; 0.6.0 | OS X INTEL
 
 msstitch is a tool to integrate a number of shotgun proteomics tools, generating ready-to-use result files.
 Updated versions: 3.6 | OS X INTEL; 3.6 | Linux 64
 
 Nanopolish is a software package for signal-level analysis of Oxford Nanopore sequencing data.
 Updated versions: 0.13.2 | Linux 64; 0.13.2 | OS X INTEL
 
 NanoStat calculates various statistics from a long read sequencing dataset in fastq, bam or albacore sequencing summary format.
 Updated versions: 1.5.0 | OS X INTEL; 1.5.0 | Linux 64
 
 ngmlr (CoNvex Gap-cost alignMents for Long Reads) is a long-read mapper designed to sensitively align PacBio or Oxford Nanopore reads to (large) reference genomes.
 Updated versions: 0.2.7 | Linux 64
 
 ont_fast5_api is a simple interface to HDF5 files of the Oxford Nanopore .fast5 file format.
 Updated versions: 3.3.0 | OS X INTEL; 3.3.0 | Linux 64
 
 pangolin (Phylogenetic Assignment of Named Global Outbreak LINeages) assigns SARS-CoV-2 genome sequences to global lineages.
 Updated versions: 2.3.2 | OS X INTEL; 2.3.2 | Linux 64
 
 PLASS (Protein-Level ASSembler) is software for assembling short-read sequencing data at the protein level.
 Updated versions: 4.687d7 | OS X INTEL; 4.687d7 | Linux 64
 
 PLINK is a comprehensive update to Shaun Purcell's PLINK command-line program -- a whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses.
 Updated versions: 1.90 | OS X INTEL; 1.90 | Linux 64; 2.00a3 | OS X INTEL; 2.00a3 | Linux 64
 
 Python is a general-purpose, interpreted, object-oriented, high-level dynamic programming language that emphasizes code readability.
 Updated versions: 3.7.0 | OS X INTEL; 3.7.0 | Linux 64
 
 Raven is a de novo genome assembler for long uncorrected reads.
 Updated versions: 1.5.0 | Linux 64; 1.5.0 | OS X INTEL
 
 refgenie manages storage, access, and transfer of reference genome resources.
 Updated versions: 0.9.3 | OS X INTEL; 0.9.3 | Linux 64
 
 The RNAblueprint library solves the problem of stochastically sampling RNA/DNA sequences compatible with multiple structural constraints.
 Updated versions: 1.3.2 | OS X INTEL; 1.3.2 | Linux 64
 
 RODEO evaluates one or many genes, characterizing a gene neighborhood based on the presence of profile hidden Markov models (pHMMs).
 Updated versions: 2.3.3 | OS X INTEL; 2.3.3 | Linux 64
 
 Rust-Bio-Tools is a set of ultra-fast and robust command line utilities for bioinformatics tasks based on Rust-Bio.
 Updated versions: 0.19.6 | OS X INTEL; 0.19.6 | Linux 64
 
 TOBIAS (Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal) is a collection of command-line bioinformatics tools for performing footprinting analysis on ATAC-seq data.
 Updated versions: 0.12.10 | Linux 64; 0.12.10 | OS X INTEL
 
 TreeSAPP is a functional and taxonomic annotation tool for microbial genomes and proteins.
 Updated versions: 0.10.2 | OS X INTEL; 0.10.2 | Linux 64
 
 SAMtools provides various utilities for manipulating alignments in the SAM (Sequence Alignment/Map) format, a generic format for storing large nucleotide sequence alignments. These include sorting, merging, indexing and generating alignments in a per-position format.
 Updated versions: 1.12 | OS X INTEL; 1.12 | Linux 64
 
 SECIMTools is a suite of tools for processing of metabolomics data.
 Updated versions: 21.3.4 | OS X INTEL; 21.3.4 | Linux 64
 
 SeqFu is a general-purpose program to manipulate and parse information from FASTA/FASTQ files.
 Updated versions: 0.8.11 | OS X INTEL; 0.8.11 | Linux 64; 0.8.10 | OS X INTEL; 0.8.10 | Linux 64
 
 smallgenomeutilities is a collection of scripts useful for handling and manipulating NGS data of small viral genomes.
 Updated versions: 0.3.2 | Linux 64; 0.3.2 | OS X INTEL
 
 SpacePHARER is a modular toolkit for sensitive phage-host interaction identification using CRISPR spacers.
 Updated versions: 4.228b9e5 | OS X INTEL; 4.228b9e5 | Linux 64
 
 SPAdes (St. Petersburg genome assembler) is a genome assembly algorithm designed for single-cell and multi-cell bacterial data sets.
 Updated versions: 3.15.2 | Linux 64; 3.15.2 | OS X INTEL
 
 spaTyper is a computational method for finding spa types.
 Updated versions: 0.3.3 | OS X INTEL; 0.3.3 | Linux 64
 
 StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts.
 Updated versions: 2.1.5 | Linux 64; 2.1.5 | OS X INTEL
 
 TensorFlow is an open source software library for high performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices.
 Updated versions: 2.4.1 | Linux 64; 1.14.0 | OS X INTEL; 1.14.0 | Linux 64; 2.0.0 | OS X INTEL
 
 Ultraplex is an all-in-one software package for processing and demultiplexing fastq files.
 Updated versions: 1.1.4 | OS X INTEL; 1.1.4 | Linux 64
 
 UShER is a program that rapidly places new samples onto an existing phylogeny using maximum parsimony. It is particularly helpful in understanding the relationships of newly sequenced SARS-CoV-2 genomes with each other and with previously sequenced genomes in a global phylogeny.
 Updated versions: 0.2.0 | Linux 64; 0.2.0 | OS X INTEL
 
 varlociraptor provides flexible, arbitrary-scenario, uncertainty-aware variant calling with parameter-free filtration via FDR control.
 Updated versions: 2.3.0 | OS X INTEL; 2.6.5 | Linux 64
 
 VIRULIGN is a tool for codon-correct pairwise alignments, with an augmented functionality to annotate the alignment according to the positions of the proteins.
 Updated versions: 1.0.1 | OS X INTEL; 1.0.1 | Linux 64
 
 The WiggleTools package allows genome-wide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (sum, product, product by a scalar, comparators), and derived statistics (mean, median, variance, stddev, t-test, Wilcoxon's rank sum test, etc.).
 Updated versions: 1.2.8 | Linux 64; 1.2.8 | OS X INTEL
 
 
 Software Training
 Training sessions available to HMS trainees:
 HMS Research Computing
 
 New courses and registrations for Spring 2021 are now open.
 See the HMS Research Computing Training Portal for the most current updates.
 
 
			| Date | Topic |   |  
			| April 7th | Systems Modeling and Controls with Simulink & Simscape | Register |  
			| April 21st | What’s New in MATLAB for Research | Register |  
			| May 5th | Distance Learning and Virtual Labs | Register |  
 
 The Harvard Chan Bioinformatics Core
 
 Courses for Spring  2021 are now open. See the Workshop Updates page for updates.
 
 
 
			| Topic | Category | Date | Duration | Prerequisites |  
			| Command-line interface and the O2 cluster (shell/Unix/Linux) | Basic | March 5th, 9th, 12th | Three 2.5h sessions | None |  
			| Bulk RNA-seq (Part I - FASTQ to counts) | Advanced | March 23rd, 26th, 30th & April 2nd | Four 2.5h sessions | Command-line interface |  
			| R | Basic | April 13th, 16th, 20th, 23rd | Four 2h sessions | None |  
			| Bulk RNA-seq (Part II - Differential gene expression) | Advanced | May 4th, 7th, 11th, 14th | Four 2h sessions | R |  
			| scRNA-seq | Advanced | May 25th, 28th & June 1st | Three 2.5h sessions | R |  

 Bioinformatics Support
 Need help getting software installed on new machines? Have you been planning to try Amazon Web Services (AWS) cloud computing?
 
 BioGrids can help you get started. We have expertise in bioinformatics, programming, workflow development and high performance computing.
 
 We improve the collection with feedback from the community.
 
 Want to see a new application in BioGrids?
 
 Let us know:       help@biogrids.org
 
 
 BioGrids is supported by Harvard Medical School and Boston Children's Hospital and relies on a framework that was developed by SBGrid.
 |