Text File To kml – Perl Script

Google Earth reads and writes a special form of XML file called KML (Keyhole Markup Language). Many other geographic viewers and GIS packages can also read KML files, so it's handy to be able to make KML files for sample location data. I assume there are many ways to do this; the way I have done it is via a perl script that I wrote. This post provides that script and explains what it does.

Here is the script; it's called texttokml.pl.

It's very simple and heavily commented, so even the most naive perl programmer should be able to figure it out and change it. If you want me to hold your hand, just ask.
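
The actual script is perl, but to give a flavour of what the conversion looks like, here is a rough sketch of the same idea in R. The file and column names below are made up for illustration; they are not what texttokml.pl expects.

locs <- read.csv("sample_locations.csv", stringsAsFactors = FALSE)  # hypothetical file with columns: name, lat, lon

kml <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<kml xmlns="http://www.opengis.net/kml/2.2">',
  '<Document>',
  sprintf('  <Placemark><name>%s</name><Point><coordinates>%f,%f</coordinates></Point></Placemark>',
          locs$name, locs$lon, locs$lat),  # note: KML wants longitude,latitude order
  '</Document>',
  '</kml>')

writeLines(kml, "sample_locations.kml")   # open the result in Google Earth

The whole trick is just wrapping each sample location in a Placemark element inside one Document; everything else is boilerplate.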

Explanation follows . . .

Continue reading

Cheap, Easy DIY Barcodes in R

I couldn't believe how expensive the software for writing barcodes was, so I wrote a short program in R to do it for FREE. And frankly, it should be faster and easier if you already have your labels in an Excel file. You don't really need to understand the program, or even R, to use it, as long as you know how to run an R program.

Setup and Overview:

[UPDATED (see notes below)] – R code. Start with this. (Note: I could not upload a .R file, so this is a .txt file, but it is still an R program.)

Input – barcodes128.csv – You need this file to run the program. Save it in your working directory (see the comments in the R code for how to set this). AND labels.csv – This is a sample file showing the format for your labels. Even though it's a .csv, it is a single column with each label as a separate row, so there are no actual commas.

Output – BarcodesOut.pdf – A sample output: a PDF file for the 0.5″ x 1.75″ Worth Poly Label WP0517 (Polyester Label Stock), currently in the lab.
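
In practice, running it boils down to something like this in an R console. The path and script name here are hypothetical (the downloaded .txt saved with a .R extension); the two .csv names are the real ones from above.

setwd("C:/path/to/your/label/folder")  # hypothetical path; the folder must contain barcodes128.csv and labels.csv
source("barcode_program.R")            # hypothetical name for the saved script
# the script writes BarcodesOut.pdf into the same folder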

That's really all you need to know; everything that follows is extraneous info. If you have any problems, check out the Detailed Instructions, Troubleshooting Tips, or add a comment below. Continue reading

Old lab PC – new Ubuntu computer

I’ve installed the latest version of Ubuntu (12.04) on the old PC lab computer:

-Username, computer name and password are written on the computer itself, if needed.
-I’ve also installed on it a few of my favorite programs (LibreOffice, Inkscape, Gimp, R, Chrome).
-It boots in about 35 seconds, not bad for an “old piece of junk”!

Feel free to use it!

seb

GBS Protocol (GregO)

Kristin and I have been working on GBS for a long time, and since it now seems to be working, we wrote up a protocol. It is mostly the same as Greg Baute's previous protocol, but with a few key changes (more DNA, more PCR). I've made it look nice and included a diagram for ease of thought.

Also, the official pronunciation of GBS is ‘jibs’

Continue reading

RLR Image Library (Dan E.)

Hello all. I've created an image library here at RLR. I think it's a good idea and I'm hoping that you do too. If we accumulate a collection of good images, mostly photos I assume, it will become useful and interesting.

The idea is that we can post image galleries – collections of photos of or about a particular project/trip/experiment/event – to share with the lab. I hope this sharing will be informative and entertaining but also practical – we can share images for use in presentations and posters and the like.

Right off the bat I want it to be clear that if you use any image from RLR that you did not create yourself, that image must be attributed to its creator. It's easy: just give the person who made and uploaded the image a credit on or near the image in your presentation, poster, etc. Pretty obvious really.

The image library comprises galleries displayed on new pages added to RLR under the “Image Library” page. If you wish, these pages can be hidden such that only registered users who have logged in can see your images.

Check out Brook’s opening effort for an excellent example of what I’m talking about.

Greg O. has also put up some nice shots of Californian sunflowers.

I’ve added instructions on the “How to: contribute content” page and on the “How to: use RLR” page.

If my instructions are insufficient, or you can see obvious problems or improvements, please let me know.

Answers to some questions that you may have . . .

Continue reading

SnoWhite Tips and Troubleshooting (Thuy)

SnoWhite is a tool for cleaning 454 and Illumina reads. There are quite a few gotchas that can take you half a day to debug. This wiki has a lot of good tips.

SnoWhite invokes other bioinformatics programs, one of them being TagDust. If you get a segfault error from TagDust, it may be because you are searching for contaminant sequences larger than TagDust can handle. TagDust can only handle a maximum of 1000 characters per line in the contaminant fasta file and contaminant sequences of at most 1000 bases.

A segfault (or segmentation fault) happens when a program accesses memory it shouldn't. Once a line or sequence exceeds the 1000-character/1000-base limit, TagDust keeps writing past the 1000 slots it has allocated, so it may touch non-existent or off-limits memory locations. You need to edit the TagDust source code so it allocates enough memory for your sequences and does not wander into bad memory locations.

  • Go into your TagDust source code directory and edit file “input.c”.
  • Go to line 68:

char line[MAX_LINE];

  • Change MAX_LINE to a number larger than the number of characters in the longest line in your contaminant fasta file. You can probably skip this step if you are using the NCBI UniVec.fasta files, since the default of 1000 is enough.
  • Go to line 69:

char tmp_seq[MAX_LINE];

  • Change MAX_LINE to a number larger than the number of bases in the longest contaminant sequence in your contaminant fasta file.  I tried 1000000 with a recent NCBI UniVec.fasta file and it worked for me.
  • Recompile your TagDust source code
    • Delete all the existing executables by executing make clean in the same directory as the Makefile
    • Compile all your files again by executing make in the same directory as the Makefile
    • If you decided to allocate a lot of memory to your arrays, and the program's statically allocated data exceeds 2 GB, you may run into “relocation truncated to fit: R_X86_64_PC32 against symbol” errors at link time. This occurs when the statically allocated objects no longer fit in the 2 GB address range assumed by gcc's default code model. Edit the Makefile so that

CC = gcc
becomes
CC = gcc -mcmodel=medium

Compiled Sunflower QTLs (GregO)

Last year I worked on a project to see whether any of the domestication outlier genes overlapped with previously mapped QTLs. The project ultimately fell flat when new data showed that the outlier I was working on wasn't an outlier, but I did compile a large table of sunflower QTLs which may be useful. The table has 369 mapped QTLs.

I've shared this with a couple of people, but I'm posting it here as a Google Doc for everyone to use. Here is the link: https://docs.google.com/spreadsheet/ccc?key=0AgfXIvTZMEqPdHdJWTk3UVlVa3dkdGFTak9ySlUtNkE

A couple notes:
-It was compiled about a year ago, so it may be out of date. Also, although I tried to include every applicable study, I may have missed some. If you do find a study that I missed, I encourage you to add it to the table.
-It only covers annuus crosses, and the majority involve domesticated lines
-The position values are in cM

Anyway, read and enjoy. Change it if you find errors or new papers!

Jaatha – training data sets (Rose)

I’ve generated three training data sets, which will save you around 5 days if you decide to run Jaatha, a molecular demography program. It uses the joint site frequency spectrum of two populations to model various aspects of population history (split time, population size and growth, migration). Here’s the paper: Naduvilezhath et al 2011.

1. Using the default model, with the following maxima: tmax=20, mmax=5, qmax=10.

2. Alternative maxima: tmax=5, mmax=20, qmax=20.

3. Alternative maxima: tmax=5, mmax=20, qmax=5.

They can’t be uploaded because they’re compressed R data structures, but let me know if you’d like to give them a whirl.
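
Since they are ordinary R objects saved as compressed .Rda files, passing them around is just a matter of save() and load(). A minimal sketch, with hypothetical object and file names:

save(training_default, file = "jaatha_training_default.Rda")  # hypothetical object and file names
# ...and in a fresh R session:
load("jaatha_training_default.Rda")   # restores training_default into the workspace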

Making Illumina Whole Genome Shotgun Sequencing Libraries – (Dan E.)

I’ve been making whole genome shotgun sequencing libraries (for the purposes of this post: WGSS libraries) to sequence sunflower genomes on the Biodiversity Centre’s Illumina HiSeq. I haven’t been doing it for very long and its likely that my approach will change in the future as costs and products change but, as of early 2012, I’ve landed on a hybrid protocol based on kits from an outfit called Bioo Scientific. I use the Bioo Sci. adapter kit and their library prep kit up to the final PCR step at which point I switch to a PCR kit from another outfit called KAPA. I also use a KAPA kit to quant libraries with qPCR. In this post I give a little context then describe what I do to make WGSS libraries . . . Continue reading