GWAS (genome-wide association study)

QTL (quantitative trait locus)

How to identify QTLs

1. QTL/Linkage Mapping
  • Study population is families or experimental crosses
  • Hundreds of DNA markers
  • Current recombinations

2. GWAS (newer)
  • Uses a population sample
  • Tens of thousands of DNA markers
  • Historical recombinations

Linkage disequilibrium (LD): the non-random association of alleles between loci. LD "decays" with distance; the faster the decay, the higher the marker density has to be.
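To make "non-random association" concrete, here is a minimal Python sketch that computes the two common LD measures, D and r², for a pair of biallelic loci. The function name `ld_r2` and the example frequencies are made up for illustration and are not taken from the wolf or cow data.

```python
def ld_r2(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    # D is the deviation of the haplotype frequency from random association
    d = p_ab - p_a * p_b
    # r2 scales D squared by the allele frequencies at both loci
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2

# Hypothetical case 1: alleles A and B always occur together (complete LD)
d_full, r2_full = ld_r2(p_ab=0.3, p_a=0.3, p_b=0.3)

# Hypothetical case 2: haplotype frequency equals the product (no LD)
d_none, r2_none = ld_r2(p_ab=0.5 * 0.2, p_a=0.5, p_b=0.2)

print(round(r2_full, 3), round(r2_none, 3))
```

An r² close to 1 means one marker predicts the other almost perfectly, which is why slowly decaying LD lets fewer markers cover the genome.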

Linkage (QTL mapping)

Pros: can scan the genome with fewer markers

Cons: can only detect alleles with large effects; limited resolution (identifies a broad region, not individual genes); requires data on multiple family members

Association (GWAS)

Pros: can detect subtle effects; very fine resolution

Cons: requires 0.5 to 1 million markers to cover whole genome; requires large sample size

1. QTL/Linkage Mapping (inferior to GWAS)

For QTL Linkage Mapping you need:

  • Well recorded population with high quality DNA samples
  • DNA Markers across the genome
  • Map of Marker positions
  • High quality genotype

2. GWAS (best method)

QC of SNP Genotyping

For samples

For SNPs

Association analysis is often the quickest way to find genotyping errors (PLINK)

Significant SNP results and QC

Problems that bias your results (confounding factors)

Solutions:

  a) Genomic control - an inflation factor used to correct the p-values
  b) Structure - estimate the population structure
  c) Principal components - based on the genome-wide IBS matrix (in PLINK, the --cluster command)
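As an illustration of solution a), the following Python sketch estimates the genomic inflation factor (lambda) as the median of the observed 1-df chi-square statistics divided by the expected median under the null (about 0.4549), then divides every statistic by lambda. The function names and the example statistics are invented for this sketch; it is not PLINK's implementation.

```python
import statistics

# Approximate median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549

def genomic_inflation(chi2_stats):
    """Estimate lambda as median(observed chi-square) / expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

def gc_correct(chi2_stats):
    """Divide each statistic by lambda; a lambda below 1 is treated as 1."""
    lam = max(1.0, genomic_inflation(chi2_stats))
    return [c / lam for c in chi2_stats]

# Made-up 1-df chi-square statistics, for illustration only
observed = [0.2, 0.5, 0.91, 1.4, 12.0]
lam = genomic_inflation(observed)
corrected = gc_correct(observed)

print(round(lam, 2), [round(c, 2) for c in corrected])
```

A lambda well above 1 suggests that stratification or relatedness is inflating the test statistics across the whole genome rather than at a few true signals.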

The workflow is usually:

  1. Plan the study (choose a design that targets your hypothesis)
  2. Collect data (based on design, number of markers)
  3. Remove problematic data
  4. Identify other pedigree-related problems (population stratification)
  5. Association analysis
  6. Correction of results to minimize false positives
  7. Further validation of the region (e.g. identify possible genes)

Correction and binary file creation

Without phenotype file

plink --ped wolf.ped --map wolf.map --out wolf --geno 0.25 --maf 0.05 --mind 0.25 --dog --noweb --allow-no-sex --make-bed

With phenotype file

plink --ped cow.ped --map cow.map --out cow --geno 0.25 --maf 0.05 --mind 0.25 --cow --noweb --allow-no-sex --make-bed --no-pheno --pheno cow.phe
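To show what the three thresholds in the commands above mean, here is a conceptual Python sketch of the filters: --mind removes individuals with too many missing genotypes, --geno removes SNPs with too many missing genotypes, and --maf removes SNPs whose minor allele frequency is below the cutoff. The genotype coding (0, 1 or 2 copies of one allele, None for missing), the toy data and the order of the filters are assumptions of this sketch, not a description of PLINK's internals.

```python
def missing_rate(calls):
    """Fraction of missing (None) genotype calls; 1.0 for an empty list."""
    if not calls:
        return 1.0
    return sum(g is None for g in calls) / len(calls)

def minor_allele_freq(calls):
    """MAF from genotypes coded as 0, 1 or 2 copies of one allele."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)

def qc_filter(genotypes, mind=0.25, geno=0.25, maf=0.05):
    """genotypes: dict of sample name -> list of calls, one per SNP.

    Returns (kept sample names, kept SNP indices).
    """
    # Step 1: drop individuals with more than `mind` missing calls
    samples = [s for s, calls in genotypes.items() if missing_rate(calls) <= mind]
    n_snps = len(next(iter(genotypes.values())))
    kept_snps = []
    for j in range(n_snps):
        column = [genotypes[s][j] for s in samples]
        # Step 2: keep SNPs with acceptable missingness and MAF
        if missing_rate(column) <= geno and minor_allele_freq(column) >= maf:
            kept_snps.append(j)
    return samples, kept_snps

# Toy data (hypothetical): 4 individuals, 4 SNPs
toy = {
    "ind1": [0, 1, 2, None],
    "ind2": [0, 2, 1, None],
    "ind3": [0, 1, None, None],  # half of the calls missing
    "ind4": [0, 0, 1, 1],
}
kept_samples, kept_snps = qc_filter(toy)
print(kept_samples, kept_snps)
```

In the toy data, ind3 fails the individual filter, the first SNP is monomorphic and the last SNP is mostly missing, so only the two middle SNPs survive.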

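Various analyses

  • --freq - allele frequency, creates a .frq file
  • --missing - missing genotype rates per individual and per SNP (.imiss and .lmiss files)
  • --hardy - Hardy-Weinberg equilibrium test (.hwe file)
  • --assoc - basic association test (.assoc file, or .qassoc for a quantitative trait)

Add one of these flags to this basic command: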

plink --bfile wolf --out wolf --dog --noweb --allow-no-sex
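For intuition about what the allele frequency and Hardy-Weinberg reports measure, this Python sketch computes an allele frequency from genotype counts and a textbook chi-square statistic against Hardy-Weinberg expectations. The function names and counts are hypothetical, and PLINK's own test may be computed differently (the sketch uses the simple chi-square approximation).

```python
def allele_freq(n_aa, n_ab, n_bb):
    """Frequency of allele A from genotype counts AA, AB, BB."""
    n = n_aa + n_ab + n_bb
    return (2 * n_aa + n_ab) / (2 * n)

def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 df) of observed genotype counts vs HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = allele_freq(n_aa, n_ab, n_bb)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotype classes with zero expectation to avoid dividing by zero
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

# Hypothetical counts: first SNP matches HWE, second has too few heterozygotes
in_equilibrium = hwe_chi2(25, 50, 25)
het_deficit = hwe_chi2(40, 20, 40)
print(allele_freq(40, 20, 40), in_equilibrium, het_deficit)
```

A strong deviation from Hardy-Weinberg proportions at a SNP is often a hint of genotyping problems, which is why this check is part of QC.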

Population stratification

 plink --bfile wolf --out wolf --dog --noweb --allow-no-sex --cluster --mds-plot 2
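The clustering is based on how similar individuals are identity-by-state (IBS). As a rough sketch of that idea, the Python code below computes the average proportion of alleles two individuals share across SNPs and builds a pairwise matrix. The genotype coding (0, 1 or 2 copies of one allele, None for missing), the sample names and the data are assumptions made for illustration.

```python
def ibs_similarity(g1, g2):
    """Mean proportion of alleles shared IBS over SNPs called in both individuals."""
    # At one SNP two individuals share 2, 1 or 0 alleles: 2 - |difference|
    shared = [2 - abs(a - b) for a, b in zip(g1, g2)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("no SNPs genotyped in both individuals")
    return sum(shared) / (2 * len(shared))

def ibs_matrix(genotypes):
    """Pairwise IBS similarity as a nested dict: matrix[a][b]."""
    names = list(genotypes)
    return {a: {b: ibs_similarity(genotypes[a], genotypes[b]) for b in names}
            for a in names}

# Hypothetical genotypes for three animals at four SNPs
toy = {
    "wolf_a": [0, 1, 2, 2],
    "wolf_b": [0, 1, 2, 1],
    "dog_c": [2, 1, 0, 0],
}
matrix = ibs_matrix(toy)
print(matrix["wolf_a"]["wolf_b"], matrix["wolf_a"]["dog_c"])
```

Individuals from the same population tend to have higher pairwise similarity, which is what makes them group together in a multidimensional scaling plot.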

Plotting the result in Excel shows 3 distinct populations.

Basic reports, case/control phenotype

plink --bfile cow --allow-no-sex --cow --out cow --assoc --noweb --mperm 10000 --no-pheno --pheno cow.phe
plink --bfile cow --allow-no-sex --cow --out cow2 --assoc --noweb --adjust
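Because every SNP is tested, raw p-values have to be corrected for multiple testing. The Python sketch below shows two common corrections, Bonferroni and Benjamini-Hochberg (false discovery rate), on a made-up list of p-values. The function names and numbers are illustrative only; this is not PLINK's code.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for four SNPs
pvals = [0.001, 0.04, 0.03, 0.5]
bonf = bonferroni(pvals)
fdr = benjamini_hochberg(pvals)
print([round(p, 4) for p in bonf], [round(p, 4) for p in fdr])
```

Bonferroni is the more conservative of the two; with hundreds of thousands of markers it requires very small raw p-values before a SNP counts as significant.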

Plotting the data

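library(qqman)
# Read the PLINK association output (whitespace-delimited, with a header row)
x <- read.table("cow2.qassoc", header = TRUE)
# SNPs of interest to highlight in the plot
y <- c("rs3001", "SNP205", "rs3003")
# Manhattan plot of the p-values; highlight = y marks the SNPs listed above
manhattan(x, chr="CHR", bp="BP", snp="SNP", p="P", highlight = y)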

Results: * check this out for how to get GWAS into a plot or this figure