September 2018 Newsletter
Our September update brings eleven new software titles and two updated titles to the BioGrids software collection. In addition, the training section below lists eleven workshops and classes available in October.
BioGrids Office Hours
BioGrids will now hold office hours on Wednesdays from 9:00am - 10:00am via Zoom meeting. Want to see what BioGrids is and how it works? Join us for a three-minute demo. If you prefer to meet in person, feel free to stop by 202 LHRRB.
Zoom Meeting URL: https://zoom.us/j/653421485 Oct 3, 2018 9:00 AM Eastern Time
BioGrids is supported by a team of scientists and engineers at HMS. We provide direct support for BioGrids members.
Software Updates
AKT Ancestry and Kinship Tools (AKT) provides a handful of useful statistical genetics routines using the htslib API for input/output. This means it can seamlessly read BCF/VCF files and play nicely with bcftools.
New versions: 0.3.2 | Linux 64 0.3.2 | OS X INTEL
CellProfiler CellProfiler is designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.
Updated versions: 3.1.5 | Linux 64 3.1.5 | OS X INTEL
hapLOHseq hapLOHseq has been developed for the detection of subtle allelic imbalance events from next-generation sequencing data. hapLOHseq is a sequencing-based extension of hapLOH, which is a method for the detection of subtle allelic imbalance events from SNP array data. hapLOHseq is capable of identifying events of 10 megabases or greater occurring in as little as 16% of the sample using exome sequencing data (at 80x) and 4% using whole genome sequencing data (at 30x), exceeding the capabilities of existing software.
New versions: 0.1.2 | OS X INTEL 0.1.2 | Linux 64
HMCan HMCan is a tool specially designed to analyze histone modification ChIP-seq data produced from cancer genomes. HMCan corrects for the GC-content and copy number bias and then applies Hidden Markov Models to detect the signal from the corrected data. On simulated data, HMCan outperformed several commonly used tools developed to analyze histone modification data produced from genomes without copy number alterations.
New versions: 1.41 | Linux 64 1.41 | OS X INTEL
IsoTree IsoTree is an efficient de novo transcriptome assembler for RNA-Seq data. It can assemble transcripts from RNA-Seq reads (in fasta format). Unlike most de novo assembly methods, which build a de Bruijn graph or splicing graph by connecting k-mers (sets of overlapping substrings generated from reads), IsoTree constructs the splicing graph by connecting reads directly. For each splicing graph, IsoTree applies an iterative mixed integer linear programming scheme to build a prefix tree, called an isoform tree. Each path from the root node of the isoform tree to a leaf node represents a plausible transcript candidate, which is then pruned based on information from paired-end reads.
New versions: 1.1 | OS X INTEL 1.1 | Linux 64
Manta Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs. Manta discovers, assembles and scores large-scale SVs, medium-sized indels and large insertions within a single efficient workflow. The method is designed for rapid analysis on standard compute hardware: NA12878 at 50x genomic coverage is analyzed in less than 20 minutes on a 20 core server, and most WGS tumor/normal analyses can be completed within 2 hours.
New versions: 1.4.0 | OS X INTEL 1.4.0 | Linux 64
MarViN Rapid Genotype Refinement for Whole-Genome Sequencing Data using Multi-Variate Normal Distribution. Whole-genome low-coverage sequencing has been combined with linkage-disequilibrium (LD) based genotype refinement to accurately and cost-effectively infer genotypes in large cohorts of individuals.
New versions: 0.1.0 | OS X INTEL 0.1.0 | Linux 64
MIRA MIRA is a whole genome shotgun and EST sequence assembler for Sanger, 454, Solexa (Illumina), IonTorrent and PacBio data (the latter at the moment only CCS and error-corrected CLR reads). It can be seen as a Swiss army knife of sequence assembly, developed and used over the past 16 years to get assembly jobs done efficiently and, especially, accurately. That is, without actually putting too much manual work into finishing the assembly.
New versions: 4.0.2 | OS X INTEL 4.0.2 | Linux 64
PredCRP Predicting the regulatory role of CRP transcription factor in Escherichia coli. This work uses an optimal feature selection method to identify 12 informative features of CRP-binding sites in cooperation with a support vector machine. PredCRP achieved training and test accuracy of 0.98 and 0.93, respectively. This work screened and identified 23 previously unobserved regulatory interactions in Escherichia coli. PredCRP predicted the regulatory roles of CRP acting on the 23 sites and achieved test accuracy of 0.96 according to quantitative PCR validation.
New versions: ffa830c | Linux 64 ffa830c | OS X INTEL
PyMOL Open Source This is the open source version of the widely used molecular visualization package developed by Warren DeLano.
Updated versions: 2.2.0 | Linux 64
Strelka2 Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs. The germline caller employs an efficient tiered haplotype model to improve accuracy and provide read-backed phasing, adaptively selecting between assembly and a faster alignment-based haplotyping approach at each variant locus. The germline caller also analyzes input sequencing data using a mixture-model indel error estimation method to improve robustness to indel noise. The somatic calling model improves on the original Strelka method for liquid and late-stage tumor analysis by accounting for possible tumor cell contamination in the normal sample. A final empirical variant re-scoring step using random forest models trained on various call quality features has been added to both callers to further improve precision.
New versions: 2.9.9 | OS X INTEL 2.9.2 | OS X INTEL 2.9.2 | Linux 64
TrimGalore Trim Galore is a wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data.
New versions: 0.5.0 | Linux 64 0.5.0 | OS X INTEL
VarScan VarScan is a platform-independent mutation caller for targeted, exome, and whole-genome resequencing data generated on Illumina, SOLiD, Life/PGM, Roche/454, and similar instruments.
New versions: 2.3.9 | Linux 64 2.3.9 | OS X INTEL
BioGrids Installer
The BioGrids Installer is an easy-to-use application that makes installing and managing life sciences software simple and quick.
A command-line version is also available for Mac and Linux. Download and activate using the link button below.
The BioGrids team provides support, infrastructure and testing for scientific software packages. We currently provide 140 packages in five categories. The collection grows weekly. Learn more here: About BioGrids
Software Training
Training sessions available to HMS trainees:
Affymetrix and Illumina microarray data analysis using R/Bioconductor 10:00am - 12:00pm Wednesday, October 3, 2018
Countway Practical Presentation Skills 11:00am - 1:00pm Wednesday, October 10, 2018
Countway Practical Presentation Skills 5:00pm - 6:30pm Wednesday, October 17, 2018
Practical Steps for Increasing Openness and Reproducibility: A Day of Open Science 10:00am - 5:00pm Monday, October 22, 2018
Drop-In: Open Access & Open Data 10:00am - 12:00pm Tuesday, October 23, 2018
Managing, sharing and curating your research data in a digital environment 10:00am - 12:00pm Wednesday, October 24, 2018
Integrating reproducible best practices into biomedical and clinical research: A hands-on workshop for researchers 1:00pm - 3:00pm Wednesday, October 24, 2018
Countway Practical Presentation Skills 5:00pm - 6:30pm Wednesday, October 24, 2018
For reproducibility, we need the methods behind the data: A hands-on workshop with protocols.io 10:00am - 11:30am Friday, October 26, 2018
Reproducibility for Everyone 12:30pm - 2:00pm Friday, October 26, 2018
The Harvard Chan Bioinformatics Core
Introduction to RNA-seq Analysis Using High-Performance Computing and R October 31st - November 2nd and November 14th - 16th, 2018 Applications are due September 14, 2018.
Bioinformatics Support
Need help getting software installed on new machines? Have you been planning to try Amazon Web Services (AWS) cloud computing?
BioGrids can help you get started. We have expertise in bioinformatics, programming, workflow development and high-performance computing.
We improve the collection with feedback from the community.
Want to see a new application in BioGrids?
Let us know: help@biogrids.org
BioGrids is supported by the HMS TnT fund and is based upon SBGrid.org