Filtering the variants is a critical step in the pipeline, as most SNP callers are very inclusive by default. Below are the criteria I have used to filter SNPs:
Qual: This is the phred-scaled probability that the variant call at this site is wrong. A score of 20 means that there is a 1 in 100 chance that the SNP is a false positive; a score of 30 means a 1 in 1000 chance. There is one Qual score for each variant site.
MQ: This is the phred-scaled probability that a read is mapped to the wrong location (a low mapping score occurs when reads are not mapped uniquely at that site, i.e. they come from a region that is repeated in the genome). There is one MQ score for each variant site.
GQ: This is the phred-scaled probability that the genotype call is incorrect, given that there is a SNP at that site. There is a GQ for each individual.
Minimum and maximum individual read depth: I have sometimes found genotypes being called from only a small number of reads (e.g. 5) even though the GQ is relatively high (>20), so I will likely raise the GQ threshold or add a minimum depth requirement. Conversely, unusually high depth suggests that reads from repetitive regions are aligning to that site, so the SNP may not be real.
Minor allele frequency (>0.05): Low-frequency SNPs could be due to errors and are not useful for outlier tests and several other tests of selection (although they are for site frequency spectrum tests).
Heterozygosity (<0.7): High or fixed heterozygosity could indicate paralogy.
Missing data: I remove SNPs with a high proportion of missing data, as they are not useful for downstream analyses.
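As a quick reference, all three quality scores above (Qual, MQ, GQ) use the same phred scale, where a score Q corresponds to an error probability of 10^(-Q/10). A minimal sketch of the conversion:

```python
import math

def phred_to_prob(q):
    """Error probability implied by a phred-scaled score Q: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def prob_to_phred(p):
    """Phred-scaled score implied by an error probability P."""
    return -10 * math.log10(p)

print(phred_to_prob(20))  # 0.01  (1 in 100 chance the call is wrong)
print(phred_to_prob(30))  # 0.001 (1 in 1000 chance)
```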
For benchmarking the different options for this blog post I used:
All SNPs: Qual=20; MQ=10; GQ=20
Filtered SNPs: called in >10 individuals; minor allele frequency > 0.05; heterozygosity < 0.7
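A minimal sketch of how these thresholds might be applied to a single variant site (the data structure and helper names here are illustrative assumptions, not the actual perl scripts):

```python
# Hypothetical sketch of per-site filtering; thresholds match the values above.
MIN_QUAL = 20        # phred-scaled site quality
MIN_MQ = 10          # mapping quality
MIN_GQ = 20          # per-individual genotype quality
MIN_INDIVIDUALS = 10 # require calls in more than this many individuals
MIN_MAF = 0.05       # minor allele frequency
MAX_HET = 0.7        # maximum observed heterozygosity

def passes_filters(qual, mq, genotypes):
    """genotypes: one (alt_allele_count, gq) pair per individual, where
    alt_allele_count is 0, 1, or 2 for a diploid, or None if missing."""
    if qual < MIN_QUAL or mq < MIN_MQ:
        return False
    # Treat low-GQ genotypes as missing, then apply the missing-data cut.
    called = [alt for alt, gq in genotypes if alt is not None and gq >= MIN_GQ]
    if len(called) <= MIN_INDIVIDUALS:
        return False
    # Minor allele frequency over called genotypes (2 alleles per diploid).
    alt_freq = sum(called) / (2 * len(called))
    if min(alt_freq, 1 - alt_freq) < MIN_MAF:
        return False
    # High heterozygosity could indicate paralogy.
    het = sum(1 for alt in called if alt == 1) / len(called)
    return het < MAX_HET

# 12 individuals: 5 heterozygotes, 1 homozygous alt, 6 homozygous ref.
genos = [(1, 30)] * 5 + [(2, 30)] + [(0, 30)] * 6
print(passes_filters(50, 40, genos))  # True
```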
To filter the vcf file I have been using custom perl scripts. Vcftools is another option, as are SelectVariants and VariantFiltration from GATK. The Broad Institute also recommends variant quality score recalibration, but I have not yet explored these options.
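For the Vcftools route, a command along these lines would apply most of the thresholds above. The file names and the depth bounds are placeholder assumptions, and note that Vcftools has no direct MQ filter, so that step (and the heterozygosity cut) would still need another tool or a custom script:

```shell
# Sketch of a vcftools filtering run; file names and depth bounds are
# placeholders. --minQ filters on site QUAL, --minGQ sets low-quality
# genotypes to missing, --maf is minor allele frequency, and
# --max-missing 0.9 keeps sites genotyped in at least 90% of individuals.
vcftools --vcf raw_snps.vcf \
    --minQ 20 --minGQ 20 \
    --minDP 5 --maxDP 100 \
    --maf 0.05 --max-missing 0.9 \
    --recode --recode-INFO-all --out filtered_snps

# vcftools cannot filter on INFO/MQ; bcftools can handle that step, e.g.:
# bcftools view -e 'INFO/MQ<10' raw_snps.vcf
```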