Image analysis with ImageJ (Kathryn)

Posted on January 31, 2012 by Kathryn

See also this post by Allan (Dan E.)

ImageJ is a simple, free software package available for analyzing images. The download is available here.

Lab camera manual: Panasonic Lumix DMC-ZS7 (Rose)

Posted on January 31, 2012 by Rose

A couple of points:

1. There is a spare battery, so please swap out and charge the one that you have just used.

2. The photos stored on the card could be deleted at any time (if a big job needs more room on the storage card), so PLEASE download them before you take the camera back to the lab to avoid losing them.

3. The GPS should be turned on only when you need it (and set to OFF when you get on a plane).

DMCZS7 Basic Operating Instructions

DMCZS7 Operating Instructions

Macro Phenotyping (Brook)

Posted on January 30, 2012 by Brook

This post is meant to be a mostly comprehensive list and explanation of the non-molecular phenotypes that our lab measures on our various plant populations. At the moment, it is definitely not there yet (for one thing, it has a clear sunflower bias), so please contribute! Continue reading →

Bioportal (Rose)

Posted on January 30, 2012 by Rose

Bioportal is a free computing resource that provides several applications in our area. I’ve been running STRUCTURE on both the “low priority” and normal queues and it’s been fantastic (unlike Westgrid, who haven’t even responded to my application). For those of you who are struggling to find room on the cluster, it might be useful to you too. Much as I’d like to keep it to myself and exploit the hell out of it, here’s the address:

https://www.bioportal.uio.no/

Lab News–Curling Triumph(s) (ish) (Heather)

Posted on January 29, 2012 by Heather

Ok, “triumph” is relative. But we have to start somewhere.

For those of you who weren’t there to witness but have been breathless with curiosity all weekend, the Rieseberg Lab had a powerful presence at the Botany Bonspiel this last Saturday, fielding 2.5 teams.

Whether due to illness, ill-luck in the bracket placement, or possibly just lack of overall awesomeness at curling, the Riesburglars (me, Josh, Brook, and Rieseberg Ally and Token Canadian Jasmine Ono) failed to repeat our strong 4th place finish from last year. But Rieseburglars 2 (Kate, Kieran, GregO and Seb) kicked ass as an all-Canadian powerhouse, finishing undefeated in their 3-game streak (yet mysteriously not advancing to the finals???). Saving the best for last, the Crutburglars (mistakenly named “The Experiment” in the official register) (Dan, Vincenzo, and 2 Crutsingers) actually entered the finals–and left clutching fistfuls of “plastic gold” (otherwise called “Starbucks Cards”). Woot!

posting code, a warning (Greg B.)

Posted on January 26, 2012 by Greg B.

I’ve just noticed there is a problem with code that has been dropped into posts. If you copy and paste directly from the post, then depending on the syntax of the code you may lose important bits. For example “while ()” appears as “while ()” but that’s not what it is! Click on edit and see for yourself! You will not have this problem if you go to the edit post page and copy from there. Also, and this may just be the text editor I use, some commented lines were broken into multiple lines, and the continuation lines did not have a ‘#’.

There has to be a better way to host code here but I don’t know what it is. Any ideas?

Shearing DNA for WGS Sequencing- Bioruptor (Dan E.)

Posted on January 18, 2012 by Dan E.

This post is about fragmenting, or shearing, genomic DNA to a particular size range using the Rieseberg lab’s Bioruptor sonicator.

Most of the current whole genome shotgun (WGS) library preparation protocols for NGS applications start with fragmented DNA. Generally speaking, this starting DNA should be a certain size and, for multiple samples, consistently that size. This objective turns out to be quite a tricky thing to accomplish with the Bioruptor. Given that WGS sequencing will probably continue to be popular in the lab, I am posting here what I have learned so far about taking whole genomic sunflower DNA and smashing it to the size range that I want using the Bioruptor. If I discover anything else in future library preps I’ll add it below. If anybody else has useful tips please comment.

Continue reading →

Amplifying Large AT rich amplicons with Pfu type polymerases (Allan)

Posted on January 13, 2012 by Allan

This is a protocol for amplifying very large amplicons that are high in AT. I developed it in order to amplify the 8.2 Kb region containing the promoter of Arabidopsis FT, which is approximately 70% AT.

In my hands it worked reliably with an 8 Kb amplicon (DeBono notebook 1, July 2010) but can be easily modified for longer products using more dNTP and optimized template concentration.

Continue reading →

Merge SNP calls (Greg B.)

Posted on January 11, 2012 by Greg B.

Rose coded this up as a faster, more efficient way to combine all the SNP calls into one table. I’ve made a few modifications; hopefully it’s not broken. Updates are likely in the future.

Continue reading →

SNP calling with ML (Greg B.)

Posted on January 11, 2012 by Greg B.

Email for v7. A bug was found that printed G’s as C’s and vice versa.

Call SNPs from SAM files with a method similar to Hohenlohe et al. 2010. Updated to v4 Feb 9. The previous version had a bug.

It is now fixed up for all of BWA’s CIGAR flavors.

This only deals with reads that fit one of the following:
Full alignment (55M)
Soft clip at the start (10S45M)
Soft clip at end (45M10S)
One deletion (25M10D25M)
One insertion (20M10I20M)

This means it ignores reads whose CIGAR field contains N, H, P, = or X, and it ignores reads with a CIGAR more complicated than a single soft clip or a single indel. It also does not penalize reads adjacent to indels.

It ignores bases in soft-clipped parts of reads.
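The read filter described above can be sketched in a few lines of Python. This is only an illustration of the five accepted CIGAR shapes listed in the post, not the actual SNP-calling script; the function names (parse_cigar, is_supported, unclipped_read_bases) are made up for this example.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operation sequences the post lists as handled (hypothetical encoding).
ACCEPTED_SHAPES = {
    ("M",),           # full alignment, e.g. 55M
    ("S", "M"),       # soft clip at the start, e.g. 10S45M
    ("M", "S"),       # soft clip at the end, e.g. 45M10S
    ("M", "D", "M"),  # one deletion, e.g. 25M10D25M
    ("M", "I", "M"),  # one insertion, e.g. 20M10I20M
}


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs; raise ValueError if malformed."""
    ops = CIGAR_OP.findall(cigar)
    # Re-joining the matches must reproduce the input, otherwise something was skipped.
    if not ops or "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in ops]


def is_supported(cigar):
    """True if the CIGAR matches one of the five shapes listed in the post."""
    try:
        ops = parse_cigar(cigar)
    except ValueError:
        return False
    return tuple(op for _, op in ops) in ACCEPTED_SHAPES


def unclipped_read_bases(cigar):
    """Count read bases outside soft clips (M and I operations consume read bases)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MI")
```

For example, is_supported("10S45M") is true, while a read clipped at both ends (10S35M10S) or a spliced read (20M100N35M) is rejected.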

Continue reading →

RAM cache problems on Linux (Nolan)

Posted on January 11, 2012 by Nolan

I just wanted to tell you about a common, and as far as I know unavoidable, problem that occurs in all Linux distributions I have worked with on machines that have large memory: the RAM cache fills up and is not emptied properly.

The RAM cache is used extensively when large files are read and written. For some reason it gradually fills up during file transfers and is not cleared very quickly, so in some cases when another program requires memory the memory just isn’t there. You can see this now on hulk: no memory is being used by programs, yet over 100 GB is being used by the cache, so only about half of the RAM is listed as available. In many cases the cache uses up so much RAM that the computer has to switch to swap space, which happens regularly on the zoology cluster and all of our high-memory machines. I have seen this on RedHat and related distributions, Ubuntu, and OpenSUSE, among others.

The only way I have found to get rid of this problem is to manually clear the cache, which only works on our own machines because you need root access. Although this is not an ideal solution, it hasn’t caused me problems so far. When using high-RAM programs I frequently clear the RAM cache to make sure the free memory stays at the maximum. To do that:
sudo su
sync; echo 3 > /proc/sys/vm/drop_caches
exit

Don’t worry if a file transfer is occurring: the sync command writes pending data to disk first, and drop_caches only frees cache pages that are already clean. But don’t alter these commands unless you are sure you know what you are doing. I have talked to many very smart people about this and nobody has come up with a better solution. Let me know, though, if any of you have better ideas.
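To see how much memory the cache is holding before and after clearing it, the numbers can be read from /proc/meminfo without root access. The following is a minimal sketch; the helper names (parse_meminfo, cache_summary, read_meminfo) are made up for this example, and it assumes the usual "Name:   value kB" layout of that file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name:" prefix
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])  # memory fields are reported in kB
    return values


def cache_summary(info):
    """Return (free_kb, cache_kb, cache fraction of total RAM)."""
    cache_kb = info.get("Cached", 0) + info.get("Buffers", 0)
    return info["MemFree"], cache_kb, cache_kb / info["MemTotal"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

On a Linux machine, cache_summary(read_meminfo()) gives the free memory, the cache size, and the share of RAM the cache occupies, so you can compare the values around a drop_caches call.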

SNP table parsing (Greg B.)

Posted on January 10, 2012 by Greg B.

Ask me for the most current version if you want to use any of these!

A few perl scripts that take a SNP table and do the following:
1. Remove unwanted samples and rename the samples
2. Remove sites that do not have enough data
3. Order the sites based on a map
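As a rough illustration of steps 1 and 2 (not the perl scripts themselves), here is a short Python sketch. The table layout is an assumption made for this example: tab-delimited, one header row, two leading site columns (contig and position), one column per sample, and "NN" for a missing genotype. The function name and parameters are hypothetical.

```python
def filter_snp_table(lines, keep_samples, min_called, missing="NN", n_site_cols=2):
    """Keep and rename selected samples, then drop sites with too little data.

    lines        -- iterable of tab-delimited rows, header first (assumed layout)
    keep_samples -- dict mapping old sample name -> new sample name
    min_called   -- minimum number of non-missing genotypes required per site
    """
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    header, body = rows[0], rows[1:]

    # Column indices of the samples to keep, in their original column order.
    keep_idx = [i for i, name in enumerate(header)
                if i >= n_site_cols and name in keep_samples]

    out = [header[:n_site_cols] + [keep_samples[header[i]] for i in keep_idx]]
    for row in body:
        genotypes = [row[i] for i in keep_idx]
        called = sum(1 for g in genotypes if g != missing)
        if called >= min_called:  # site has enough data among the kept samples
            out.append(row[:n_site_cols] + genotypes)
    return out
```

Ordering the sites by map position (step 3) would need the map file and is left out here.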

Continue reading →

Population Genomics! (Greg B.)

Posted on January 10, 2012 by Greg B.

Ask me for the most current version of these if you want to use them.

Several people in the lab are now working with very similar data (in structure) and have similar questions (in technique). As I have discussed with several people, it would be very useful if we could all build up and share the tools needed to do these analyses. Understandably, people may want to do things on their own or in a specific manner, but I think there are several advantages to building this up together. The main one is that it will be more efficient in terms of personnel time: having each person re-invent the wheel does not make sense. I think the blog is a great place to do this. Below we can make our wish list and link to posts for solutions as they become available. As always, the principles (1, 2 and 3) covered here apply.

Continue reading →

AWK (Seb)

Posted on January 6, 2012 by Seb

What is awk?

“AWK is a language for processing files of text. A file is treated as a sequence of records, and by default each line is a record. Each line is broken up into a sequence of fields, so we can think of the first word in a line as the first field, the second word as the second field, and so on. An AWK program is a sequence of pattern-action statements. AWK reads the input a line at a time. A line is scanned for each pattern in the program, and for each pattern that matches, the associated action is executed.” – Alfred V. Aho
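For readers who think more easily in another language, the record/field/pattern-action model in the quote can be mimicked in Python. This is only a toy model of how AWK processes input, not a replacement for it; run_rules and the example rules are invented for illustration.

```python
def run_rules(lines, rules):
    """Apply (pattern, action) pairs to every record, the way AWK does.

    pattern(fields, line) returns True or False;
    action(fields, line) returns the value to emit for that record.
    """
    output = []
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split()  # AWK's default: split on runs of whitespace
        for pattern, action in rules:
            if pattern(fields, line):
                output.append(action(fields, line))
    return output


# Note that AWK's $1 is fields[0] here, $2 is fields[1], and $0 is the whole line.
example_rules = [
    # roughly: $2 == "chr1" { print $1 }
    (lambda f, line: len(f) >= 2 and f[1] == "chr1", lambda f, line: f[0]),
    # roughly: /^#/ { print "comment: " $0 }
    (lambda f, line: line.startswith("#"), lambda f, line: "comment: " + line),
]
```

Every line is checked against every rule in order, so a single line can trigger more than one action, just as in AWK.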

Why awk?

1. AWK is simpler to use than most conventional programming languages.
2. It is fast.
3. It has string manipulation functions, so it can search for particular strings and modify the output.
4. A version of the AWK language is a standard feature of nearly every modern Unix-like operating system available today.

Simple examples on how to use AWK:
Continue reading →

SNP summary statistics in R: ‘hierfstat’ is back and better than before! (Rose)

Posted on January 2, 2012 by Rose

After being disabled and not supported for several months, ‘hierfstat’ (by Jerome Goudet) now has lots of useful (and fast) calculations of summary statistics, including expected and observed heterozygosity, Fst and Jost’s Dest.
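As a reminder of what two of these statistics measure, here is a small Python sketch of observed and expected heterozygosity for one locus in one population. It uses the plain textbook definitions (proportion of heterozygous individuals, and 1 minus the sum of squared allele frequencies) and does not reproduce hierfstat’s own estimators, which may apply sample-size corrections. The function name and the input format are assumptions for this example.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes -- list of 2-tuples of alleles, one tuple per diploid individual
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")

    # Observed: fraction of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / len(genotypes)

    # Expected: 1 - sum of squared allele frequencies (no sample-size correction).
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected
```

For real analyses, use the package itself; this is only meant to make the two quantities concrete.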

Continue reading →

STACKS installation (Rose)

Posted on December 12, 2011 by Rose

Installing stacks on Ubuntu Natty Narwhal or Oneiric Ocelot

STACKS is a piece of software produced by Julian Catchen in the Cresko lab. It’s designed to identify loci and alleles from RAD (or GBS) reads either de novo or after alignment to a reference. It consists of several modules that can be run separately, but to completely install it as a pipeline, it relies on a web server, unfortunately. Many of the required instructions are given in the README file, but because nobody in our lab is an expert on this, we had to fiddle around to get the program running on our Ubuntu machines.

Continue reading →

BLAST databases (Seb)

Posted on December 9, 2011 by Seb

I’ve uploaded two BLAST protein databases and one nucleotide database here: /Linux/Loren/blast_database

Please look at the README file for more info and append to it if you modify things or add databases. Continue reading →

Ordered Transcriptomes (Greg B.)

Posted on December 8, 2011 by Greg B.

Here are the transcriptome assemblies ordered based on the map Chris made. The columns are contig name, linkage group, and position in centimorgans.

The Trinity Assembly ordered
The 16000 gene assembly

Greg B.

Removing adaptors, low-complexity and low-quality from fastq (Nolan)

Posted on December 7, 2011 by Nolan

Thuy and I have made several perl scripts to trim Illumina reads, available on the zoology cluster at:

/Linux/Loren/Seq/trimIlluminaFqQual20.pl

/Linux/Loren/Seq/trimIlluminaFqQual20Phred.pl

Continue reading →

BioDiv building manual (Kathryn)

Posted on December 6, 2011 by Kathryn

… is located here. Useful info about the location of icemakers and autoclaves, and who to ask about card access, etc.

Rieseberg Lab Resources

RLR: Technical resources for Rieseberglers
