Learn how to use the BioGrids Installer with practical science applications.
May 31: Using BioGrids for RNA-Seq on AWS and your Laptop
This is a repeat of the RNA-Seq workshop. Sign up for the original workshop and you will be added to the waitlist to join this additional workshop.
June 6: Introduction to RStudio for Biomedical Researchers
This workshop features Nicholas LeCompte from BCH Research Computing's Bioinformatics and Genomics Team. This 2-hour course is designed for biomedical researchers who want to use R for their omics research but have no experience with R.
July 25, 2019 10:00am - 11:30am | TMEC 304
October 31, 2019 10:00am - 11:30am | TMEC 304 | Class is full.
Using BioGrids for RNA-Seq on AWS and Your Laptop
Complete workshop data files and scripts: biogrids_workshop2.tar.gz
Workshop presentation: PowerPoint PDF
Download this .pem file: PEM file
Copy the file to your .ssh folder and change the permissions:
cd Downloads/
cp biogrids_workshop.pem ~/.ssh/
chmod 400 ~/.ssh/biogrids_workshop.pem
EC2 IP addresses: use the machine name (IP address) handed to you in class. Find it in this list and copy it.
Example login:
ssh -i ~/.ssh/biogrids_workshop.pem ec2-user@ec2-5-44-101-187.compute-1.amazonaws.com
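If you were handed a bare IP address rather than a full machine name, the host name in the example login can be derived from it. The sketch below assumes the instances use the us-east-1 style public DNS pattern seen in the example (`ec2-<ip-with-dashes>.compute-1.amazonaws.com`); the function name `ec2_hostname` is hypothetical.

```shell
# Build the public DNS name for an IPv4 address.
# Assumption: the instance uses the us-east-1 naming pattern shown in the example login.
ec2_hostname() {
    printf 'ec2-%s.compute-1.amazonaws.com\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# Example use (not run here):
#   ssh -i ~/.ssh/biogrids_workshop.pem "ec2-user@$(ec2_hostname 5.44.101.187)"
```

Instances in other regions use a different suffix, so check the name against the list given in class.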
biogrid-production jvincent3 lEcC1N8yiNnaqAjKvIPKkmA3YwTCsF87kYkavA==
Download and install the BioGrids CLI client, then run:
./biogrids activate <site name> <user name> <activation key>
Linux:
curl -kLO https://biogrids.org/wiki/downloads/biogrids-1.0.694-Linux.tgz
tar zxf biogrids-1.0.694-Linux.tgz
cd biogrids-1.0.694-Linux
macOS:
curl -kLO https://biogrids.org/wiki/downloads/biogrids-1.0.694-Darwin.tgz
tar zxf biogrids-1.0.694-Darwin.tgz
cd biogrids-1.0.694-Darwin
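The two download commands differ only in the operating system name, which matches what `uname -s` prints on Linux and macOS. A minimal sketch for picking the right tarball automatically follows; the function name `biogrids_tarball` is hypothetical, and the version number is simply the one used in the commands above.

```shell
# Print the BioGrids CLI tarball name for an OS name as reported by `uname -s`.
# With no argument, the current machine's OS is used.
biogrids_tarball() {
    os=${1:-$(uname -s)}
    case "$os" in
        Linux|Darwin) printf 'biogrids-1.0.694-%s.tgz\n' "$os" ;;
        *) echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Example use (not run here), mirroring the download command above:
#   curl -kLO "https://biogrids.org/wiki/downloads/$(biogrids_tarball)"
```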
./biogrids activate biogrid-production jvinent1 70rYFTDnmCr93VUklfbf1s3M4jdyC9bFVYHew==
./biogrids install fastqc trimmomatic samtools star@2.7.0f subread igv
When finished, verify applications are installed:
./biogrids installed
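As an extra check, you can confirm that the installed programs resolve on your PATH once the BioGrids environment has been sourced (`source /programs/biogrids.shrc`). This is a sketch, not part of the BioGrids CLI: the function name `missing_tools` is hypothetical, and the executable names in the example (for instance `STAR` for star and `featureCounts` for subread) are assumptions about how the packages expose their commands.

```shell
# Print each named command that cannot be found on PATH, one per line.
# No output means every command was found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example use (not run here; executable names are assumptions):
#   missing_tools fastqc trimmomatic samtools STAR featureCounts igv
```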
cd # make sure we are in home dir
curl -kLO https://biogrids.org/wiki/downloads/biogrids_workshop2.tar.gz
tar zxf biogrids_workshop2.tar.gz
cd biogrids_workshop
source /programs/biogrids.shrc # invoke the biogrids environment
./runMOV10.sh ./raw_fastq/ENCODE_MOV10.fq
Successful completion of the workflow will create a new directory named 'results'.
Output from the RNA-Seq workflow is stored there.
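Before opening IGV, it can help to confirm that the workflow actually produced alignment output. The sketch below searches a directory recursively for .bam files; the exact layout inside 'results' is not documented here, so the recursive search and the function names are assumptions.

```shell
# List every BAM file below a directory (default: ./results), sorted by path.
list_bam_files() {
    find "${1:-results}" -type f -name '*.bam' | sort
}

# Succeed only if the directory exists and contains at least one BAM file.
results_ready() {
    [ -d "${1:-results}" ] && [ -n "$(list_bam_files "${1:-results}")" ]
}

# Example use (not run here):
#   results_ready results && list_bam_files results
```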
View results with IGV:
igv
Within IGV:
1: Genomes / Load Genome from File... (chr1_MOV10.fa)
2: File / Load from file... (chr1_MOV10.gtf)
3: File / Load from file... (.bam file)
Log in to the AWS console: aws.amazon.com
Use the login you were assigned in class:
https://sbgrid.signin.aws.amazon.com/console
username: workshop21
password: Biogrids_Workshop1
Start a t2.micro instance (free tier).
The last prompt when launching will ask you to select a key pair.
Create a new key pair if this is the first time you have used EC2, or select an existing key pair.
If creating a new key pair, you must download the .pem file. You will need this to access the EC2 instance launched with that key pair.
We recommend keeping your .pem key files in the .ssh directory in your home directory.
Copy the .pem file to .ssh and set the mode to user read only:
cp my_key_file.pem ~/.ssh
cd ~/.ssh
chmod 400 my_key_file.pem
ls -l my_key_file.pem
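The `ls -l` output should show `-r--------` for the key file. The same check can be scripted. The sketch below tries GNU `stat` first and falls back to the BSD/macOS form, since this page covers both Linux and macOS; the function names are hypothetical.

```shell
# Print the octal permission bits of a file.
# GNU stat (Linux) is tried first, then BSD stat (macOS).
file_mode() {
    stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Succeed only if the key file is readable by its owner and nobody else (mode 400).
key_is_private() {
    [ "$(file_mode "$1")" = "400" ]
}

# Example use (not run here):
#   key_is_private ~/.ssh/my_key_file.pem || chmod 400 ~/.ssh/my_key_file.pem
```

ssh refuses to use a private key whose permissions are too open, so a failed check here usually explains an "unprotected private key file" error at login.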
Many beginning users of AWS are concerned about cost.
It is common for new users to accidentally incur costs they did not expect. This may happen by leaving an EC2 instance running or selecting a very expensive EC2 instance by accident.
This is the reason new AWS accounts are very limited in the services they can use.
Set up an AWS budget with alerts to avoid unexpected charges.
Search for AWS Budget in the AWS console and follow the prompts to set up a budget. We recommend setting a limit of $1.00 for new accounts still under the free tier.
If charges are incurred and approach the $1.00 limit, you will receive an email. This enables you to investigate the source of the cost and correct it if needed.
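A budget alert tells you after charges appear. To catch the two common mistakes mentioned above (an instance left running, or an unexpectedly large instance type) you can also list what is running. This is a sketch that assumes the AWS CLI is installed and configured for your account and region; the function names are hypothetical, and only the current region is checked.

```shell
# List running EC2 instances in the current region, one per line,
# as "<instance-id><TAB><instance-type>".
list_running_instances() {
    aws ec2 describe-instances \
        --filters "Name=instance-state-name,Values=running" \
        --query 'Reservations[].Instances[].[InstanceId,InstanceType]' \
        --output text
}

# Read lines in the format above and keep only instances whose type
# differs from the expected one (default: t2.micro).
unexpected_instance_types() {
    awk -v expected="${1:-t2.micro}" '$2 != expected'
}

# Example use (not run here):
#   list_running_instances | unexpected_instance_types
```

Any line printed by the pipeline is a running instance that is not a t2.micro and is worth a second look.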
ENCODE data files for Caltech RNA-Seq can be found here: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/
Use this bam file: wgEncodeCaltechRnaSeqK562R1x75dAlignsRep1V2
Region of MOV10 gene: chr1:113,214,934-113,243,900
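The region above is written with thousands separators, as a genome browser displays it. For scripting it is often convenient to strip the commas and to know how many bases the region spans. A small sketch follows; the function names are hypothetical.

```shell
# Strip thousands separators from a region such as "chr1:113,214,934-113,243,900".
plain_region() {
    printf '%s\n' "$1" | tr -d ','
}

# Print the number of bases covered by a region (start and end inclusive).
region_length() {
    coords=$(plain_region "$1")
    coords=${coords#*:}      # drop the "chr1:" prefix
    start=${coords%-*}       # text before the dash
    end=${coords#*-}         # text after the dash
    echo $(( end - start + 1 ))
}

# Example use (not run here; in.bam is a placeholder for an indexed BAM file):
#   samtools view -b -o mov10.bam in.bam "$(plain_region 'chr1:113,214,934-113,243,900')"
```

For the MOV10 region, `region_length 'chr1:113,214,934-113,243,900'` prints 28967.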
How to download whole genome:
chr1 is only 70M