
BioGrids Workshops

Learn how to use the BioGrids Installer with practical science applications.

Upcoming Workshops

Completed Workshops

May 31: Using BioGrids for RNA-Seq on AWS and your Laptop

This is a repeat of the RNA-Seq workshop. Sign up for the original workshop and you will be added to the waitlist to join this additional workshop.

June 6: Introduction to RStudio for Biomedical Researchers

This workshop features Nicholas LeCompte from BCH Research Computing's Bioinformatics and Genomics Team. This 2-hour course is designed for biomedical researchers who want to use R for their omics research but have no prior experience with R.

July 25, 2019 10:00am - 11:30am | TMEC 304

Using BioGrids for RNA-Seq on AWS and Your Laptop

October 31, 2019 10:00am - 11:30am | TMEC 304 | Class is full.

Workshop Details

Using BioGrids for RNA-Seq on AWS and Your Laptop

Complete workshop data files and scripts: biogrids_workshop2.tar.gz

Workshop presentation: PowerPoint, PDF

Log in to your AWS EC2 instance

Download this .pem file: PEM file

Move the file to your .ssh folder and change the permissions:

cd Downloads/
cp biogrids_workshop.pem ~/.ssh/
chmod 400 ~/.ssh/biogrids_workshop.pem

Click here for EC2 IP addresses. Use the machine name (IP address) handed to you in class: find it in this list and copy it.

Example login:

  ssh -i ~/.ssh/biogrids_workshop.pem ec2-user@ec2-5-44-101-187.compute-1.amazonaws.com

Temporary BioGrids Installer Account Credentials

biogrid-production jvincent3  lEcC1N8yiNnaqAjKvIPKkmA3YwTCsF87kYkavA==

1: Install and activate BioGrids CLI

Download and install the BioGrids CLI client, then run:

./biogrids activate <site name>  <user name>  <activation key>

Linux

curl -kLO https://biogrids.org/wiki/downloads/biogrids-1.0.694-Linux.tgz
tar zxf biogrids-1.0.694-Linux.tgz
cd biogrids-1.0.694-Linux

OSX

curl -kLO https://biogrids.org/wiki/downloads/biogrids-1.0.694-Darwin.tgz
tar zxf biogrids-1.0.694-Darwin.tgz
cd biogrids-1.0.694-Darwin
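The Linux and OSX commands above differ only in the tarball name, and the suffix matches what `uname -s` prints on each system. A minimal sketch that picks the name automatically is shown here; the function name `biogrids_tarball` is hypothetical, and it assumes that only the two builds listed above exist.

```shell
# Sketch: derive the tarball name from the OS reported by uname.
# Assumes only the two builds shown above (Linux, Darwin) are available.
biogrids_tarball() {
    version="1.0.694"            # version used in this workshop
    os="$(uname -s)"             # "Linux" on Linux, "Darwin" on macOS
    case "$os" in
        Linux|Darwin) echo "biogrids-${version}-${os}.tgz" ;;
        *) echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Usage (not run here):
#   tarball="$(biogrids_tarball)" &&
#   curl -kLO "https://biogrids.org/wiki/downloads/${tarball}" &&
#   tar zxf "$tarball" && cd "${tarball%.tgz}"
```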

Activate biogrids

./biogrids activate biogrid-production jvinent1  70rYFTDnmCr93VUklfbf1s3M4jdyC9bFVYHew==

2: Install software with BioGrids

./biogrids install fastqc trimmomatic samtools star@2.7.0f subread igv

When finished, verify applications are installed:

./biogrids installed
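Besides reading the `./biogrids installed` listing, you may want to confirm that the programs can actually be found on your PATH. The sketch below is one way to do that with plain shell. The helper name `missing_tools` is hypothetical, and the executable names in the usage comment (for example `STAR` for star and `featureCounts` for subread) are assumptions about how the packages are exposed, not something this page confirms.

```shell
# Print each named command that cannot be found on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Usage (not run here; executable names are assumptions):
#   source /programs/biogrids.shrc
#   missing_tools fastqc trimmomatic samtools STAR featureCounts igv
```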

3: Download and uncompress tutorial files

cd    # make sure we are in home dir
curl -kLO https://biogrids.org/wiki/downloads/biogrids_workshop2.tar.gz
tar zxf biogrids_workshop2.tar.gz
cd biogrids_workshop
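Because the curl command above does not use `-f`, a failed download can still leave a file on disk (for example an HTML error page saved under the archive's name). A small sketch for checking the archive before unpacking it follows; the helper name `archive_ok` is hypothetical.

```shell
# Succeed only if the file exists, is non-empty, and tar can list its contents.
archive_ok() {
    [ -s "$1" ] && tar tzf "$1" >/dev/null 2>&1
}

# Usage (not run here):
#   archive_ok biogrids_workshop2.tar.gz && tar zxf biogrids_workshop2.tar.gz
```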

4: Run RNA-Seq workflow

cd ~/biogrids_workshop             # workshop directory unpacked in step 3
source /programs/biogrids.shrc     # invoke the biogrids environment
./runMOV10.sh ./raw_fastq/ENCODE_MOV10.fq

Successful completion of the workflow will create a new directory named 'results'.

Output from the RNA-Seq workflow is stored there.
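A quick way to confirm that the workflow produced output is to test that the 'results' directory exists and is not empty. The following is only a sketch; the helper name `results_ready` is hypothetical, and it says nothing about whether the individual output files are complete.

```shell
# Succeed if the directory exists and contains at least one entry.
# Defaults to ./results when no argument is given.
results_ready() {
    dir="${1:-results}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Usage (not run here):
#   results_ready results && echo "workflow output found" || echo "no output yet"
```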

View results with IGV:

igv

Within IGV:

1: Genomes / Load Genome from File... (chr1_MOV10.fa)

2: File / Load from file... (chr1_MOV10.gtf)

3: File / Load from file... (.bam file)
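IGV needs an index file next to a .bam file before it can load it. Whether runMOV10.sh already creates one is not stated here, so the sketch below simply lists any .bam files that lack an index. The helper name `bams_missing_index` is hypothetical.

```shell
# List BAM files under a directory that have no index next to them.
# Checks both common index names: sample.bam.bai and sample.bai.
bams_missing_index() {
    dir="${1:-results}"
    find "$dir" -type f -name '*.bam' | while IFS= read -r bam; do
        if [ ! -e "${bam}.bai" ] && [ ! -e "${bam%.bam}.bai" ]; then
            echo "$bam"
        fi
    done
}

# Usage (not run here; samtools index expects a coordinate-sorted BAM):
#   bams_missing_index results | while IFS= read -r bam; do
#       samtools index "$bam"
#   done
```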

RNA-Seq Workflow on AWS

Start an EC2 Instance

Use the login you were assigned in class:

 https://sbgrid.signin.aws.amazon.com/console

 username: workshop21
 password: Biogrids_Workshop1

Start a t2.micro instance (free tier).

The last prompt when launching will ask you to select a key pair.

Create a new key pair if this is the first time you have used EC2 or select an existing key pair.

If creating a new keypair, you must download the .pem file. You will need this to access the EC2 instance launched with that key pair.

We recommend keeping your .pem key files in the .ssh directory in your home directory.

Copy the .pem file to .ssh and make it readable only by you:

cp my_key_file.pem ~/.ssh
cd ~/.ssh
chmod 400 my_key_file.pem
ls -l my_key_file.pem
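The `ls -l` output should begin with `-r--------`. ssh refuses to use a private key that other users can read, so this is worth checking before you try to log in. A minimal sketch of an automated check follows; the helper name `key_is_private` is hypothetical, and it also accepts mode 600 (`-rw-------`), which ssh allows as well.

```shell
# Succeed if the key file is readable (and optionally writable) by its owner only.
key_is_private() {
    # The first 10 characters of `ls -l` are the file type and permission bits.
    perms="$(ls -l "$1" 2>/dev/null | cut -c1-10)"
    [ "$perms" = "-r--------" ] || [ "$perms" = "-rw-------" ]
}

# Usage (not run here):
#   key_is_private ~/.ssh/my_key_file.pem || chmod 400 ~/.ssh/my_key_file.pem
```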

Set an AWS Budget Alert

Many beginning users of AWS are concerned about cost.

New users commonly incur unexpected costs, for example by leaving an EC2 instance running or by selecting a very expensive EC2 instance type by accident.

This is the reason new AWS accounts are very limited in the services they can use.

Set up an AWS budget with alerts to avoid unexpected charges.

Search for AWS Budget in the AWS console and follow the prompts to set up a budget. We recommend setting a limit of $1.00 for new accounts still under the free tier.

If charges are incurred and approach the $1.00 limit, you will receive an email. This lets you investigate the source of the cost and correct it if needed.
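The console prompts are the route described here. For those who prefer the command line, the same kind of budget can in principle be created with the AWS CLI `aws budgets create-budget` command. The sketch below only writes the two JSON files that command reads. The helper name `write_budget_files`, the budget name, the 80% alert threshold, the email address and the account ID in the usage comment are all illustrative assumptions, and a workshop login may not have permission to create budgets.

```shell
# Write the two JSON documents read by `aws budgets create-budget`.
# Arguments: limit in USD, alert email address, output directory (default: .).
write_budget_files() {
    limit="$1"; email="$2"; outdir="${3:-.}"
    # Monthly cost budget; the name "workshop-budget" is a placeholder.
    cat > "$outdir/budget.json" <<EOF
{
  "BudgetName": "workshop-budget",
  "BudgetLimit": { "Amount": "$limit", "Unit": "USD" },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST"
}
EOF
    # Email alert once actual spend passes 80% of the limit (assumed threshold).
    cat > "$outdir/notifications.json" <<EOF
[
  {
    "Notification": {
      "NotificationType": "ACTUAL",
      "ComparisonOperator": "GREATER_THAN",
      "Threshold": 80,
      "ThresholdType": "PERCENTAGE"
    },
    "Subscribers": [
      { "SubscriptionType": "EMAIL", "Address": "$email" }
    ]
  }
]
EOF
}

# Usage (not run here; needs the AWS CLI, credentials, and budget permissions):
#   write_budget_files 1.00 you@example.org .
#   aws budgets create-budget \
#       --account-id 123456789012 \
#       --budget file://budget.json \
#       --notifications-with-subscribers file://notifications.json
```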

Genome Resources

ENCODE data files for Caltech RNA-Seq can be found here: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/

Use this BAM file: wgEncodeCaltechRnaSeqK562R1x75dAlignsRep1V2

Region of MOV10 gene: chr1:113,214,934-113,243,900
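The region above is written with thousands separators, which is convenient for pasting into IGV. For command-line use it is safer to strip the commas first. A minimal sketch follows; the helper name `samtools_region` and the file names in the usage comment are hypothetical.

```shell
# Remove thousands separators from a genomic region string.
samtools_region() {
    printf '%s\n' "$1" | tr -d ','
}

# Usage (not run here; region queries need an indexed BAM, and file names are placeholders):
#   samtools view -b input.bam "$(samtools_region 'chr1:113,214,934-113,243,900')" > mov10.bam
```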

How to download whole genome:

UCSC FTP site: hgdownload.cse.ucsc.edu
UCSC web site: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/
UCSC recommends using an FTP client for large file downloads.
chr1 is only 70M.
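For a single chromosome, a plain download over HTTP is usually enough. The per-chromosome files in the web directory above are gzipped FASTA files named like chr1.fa.gz. The sketch below builds the URL for one chromosome; the helper name `ucsc_hg19_chrom_url` is hypothetical.

```shell
# Print the download URL for one hg19 chromosome (for example: chr1).
ucsc_hg19_chrom_url() {
    echo "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/${1}.fa.gz"
}

# Usage (not run here):
#   curl -LO "$(ucsc_hg19_chrom_url chr1)" && gunzip chr1.fa.gz
```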