Forward-In-Time Simulation Mutation Events.


  Similar to the previous examples, the bash script below illustrates the impact of different EBV prediction methods on the number of segregating founder and new mutations by generation. For each EBV prediction method, a simulation is run that allows new mutations affecting the quantitative trait. The total number of mutations generated within each gamete is sampled from a Poisson distribution with a rate parameter equal to the mutation rate times the length of the chromosome in nucleotides. Of the mutations generated, only 10 percent are assumed to be non-neutral, that is, to have an effect on the quantitative trait. All new mutations are stored in the 'QTL_new_old_Class' along with the generation in which they appeared. The two prediction methods are pedigree-based BLUP (pblup) and genomic BLUP (gblup).
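
  The Poisson rate described above can be worked out by hand for the values used in this example. The sketch below is only an arithmetic check, not GenoDiver code. It assumes that CHR_LENGTH is given in megabases, as the parameter summary states, and that the rate applies per gamete and per chromosome. The variable names are illustrative.

```shell
#!/usr/bin/env bash
# Values copied from the parameter file in this example.
mut_rate="2.5e-8"   # MUTATION: per-nucleotide mutation rate
prop_qtl="0.10"     # MUTATION: proportion of new mutations that are non-neutral
chr_mb=200          # CHR_LENGTH: assumed to be in megabases
n_chr=5             # CHR: number of chromosomes

# Poisson rate for one chromosome = mutation rate x length in nucleotides.
lambda_chr=$(awk -v r="$mut_rate" -v mb="$chr_mb" 'BEGIN { printf "%.2f", r * mb * 1e6 }')
# Expected number of non-neutral mutations on that chromosome.
qtl_chr=$(awk -v l="$lambda_chr" -v p="$prop_qtl" 'BEGIN { printf "%.2f", l * p }')
# Expectations summed over all chromosomes of one gamete.
lambda_all=$(awk -v l="$lambda_chr" -v n="$n_chr" 'BEGIN { printf "%.2f", l * n }')
qtl_all=$(awk -v q="$qtl_chr" -v n="$n_chr" 'BEGIN { printf "%.2f", q * n }')

echo "Expected new mutations per gamete, one chromosome: $lambda_chr"
echo "Expected non-neutral mutations, one chromosome:    $qtl_chr"
echo "Expected new mutations per gamete, whole genome:   $lambda_all"
echo "Expected non-neutral mutations, whole genome:      $qtl_all"
```

  Under these assumptions each gamete is expected to carry 5 new mutations per chromosome, of which 0.5 on average affect the quantitative trait.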

##--------------------------------------------------------------------
## First generate parameter file. Put parameters to change at the very
## bottom of the file. That way you can use the 'head' linux command
## for the parameters that change and use the 'echo' linux command to
## add any new parameters.
##--------------------------------------------------------------------
## Remove Example18.txt in case it is left over from a previous run ##
rm -f Example18.txt
# Parameters that are the same #
echo '-| General |-' >> Example18.txt
echo 'SEED: 1500' >> Example18.txt
echo 'NREP: 15' >> Example18.txt
echo '-| Genome & Marker |-' >> Example18.txt
echo 'CHR: 5' >> Example18.txt
echo 'CHR_LENGTH: 200 200 200 200 200' >> Example18.txt
echo 'NUM_MARK: 4000 4000 4000 4000 4000' >> Example18.txt
echo 'QTL: 100 100 100 100 100' >> Example18.txt
echo 'MUTATION: 2.5e-8 0.10' >> Example18.txt
echo '-| Population |-' >> Example18.txt
echo 'FOUNDER_Effective_Size: Ne100_Scen1' >> Example18.txt
echo 'MALE_FEMALE_FOUNDER: 50 250 random 3' >> Example18.txt
echo 'VARIANCE_A: 0.25' >> Example18.txt
echo 'VARIANCE_D: 0.05' >> Example18.txt
echo '-| Selection |-' >> Example18.txt
echo 'GENERATIONS: 30' >> Example18.txt
echo 'INDIVIDUALS: 50 0.2 250 0.2' >> Example18.txt
echo 'PROGENY: 1' >> Example18.txt
echo 'SELECTION: ebv high' >> Example18.txt
echo 'CULLING: ebv 12' >> Example18.txt
echo 'MATING: random' >> Example18.txt
echo '-| Output Options |-' >> Example18.txt
echo 'GENOTYPES: no' >> Example18.txt

##--------------------------------------------------------------------
## first do pblup
##--------------------------------------------------------------------
echo 'START: sequence' >> Example18.txt
echo 'EBV_METHOD: pblup' >> Example18.txt
## Run GenoDiver ##
./GenoDiver Example18.txt
## rename replicate output to reps_pblup ##
mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_pblup

##--------------------------------------------------------------------
## then do gblup
##--------------------------------------------------------------------
## Keep only the 23 shared lines; this drops the pblup START and EBV_METHOD lines ##
head -n 23 Example18.txt > Example18a.txt
mv ./Example18a.txt ./Example18.txt
echo 'START: founder' >> Example18.txt
echo 'EBV_METHOD: gblup' >> Example18.txt
## Run GenoDiver ##
./GenoDiver Example18.txt
## rename replicate output to reps_gblup ##
mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_gblup
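
  The pattern used in the script (keep the shared lines with 'head', then append the scenario-specific lines with 'echo') can be checked on a throwaway file before running a long simulation. The sketch below is a minimal illustration: the file name toy_params.txt and the three-line shared block are hypothetical, and it does not call GenoDiver. The count passed to 'head' has to equal the number of shared lines, otherwise a keyword from the previous scenario is carried over.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
params="$tmpdir/toy_params.txt"   # hypothetical file name

# Shared block: lines that every scenario uses.
printf '%s\n' 'SEED: 1500' 'NREP: 15' 'GENOTYPES: no' > "$params"
n_shared=$(( $(wc -l < "$params") ))   # arithmetic strips any padding from wc

# Scenario 1: append the scenario-specific lines.
printf '%s\n' 'START: sequence' 'EBV_METHOD: pblup' >> "$params"

# Scenario 2: keep only the shared block, then append the new lines.
head -n "$n_shared" "$params" > "$params.tmp"
mv "$params.tmp" "$params"
printf '%s\n' 'START: founder' 'EBV_METHOD: gblup' >> "$params"

# Each keyword should now appear exactly once.
n_start=$(grep -c '^START:' "$params")
n_ebv=$(grep -c '^EBV_METHOD:' "$params")
echo "START lines: $n_start, EBV_METHOD lines: $n_ebv"
cat "$params"
```

  The same two 'grep -c' checks can be run against Example18.txt right before each call to GenoDiver.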

Parameter File Summary
  Sequence information is generated for five chromosomes, each with a length of 200 megabases. The genome simulated has a high degree of short-range LD (Ne100_Scen1). The SNP panel contains 20,000 markers (i.e. 4,000 markers per chromosome). For each chromosome, 100 randomly placed QTL were generated. The narrow- and broad-sense heritabilities for the quantitative trait are 0.25 and 0.30, respectively. The phenotypic variance is set to 1.0 by default, and therefore the residual variance is 0.70. The founder population consisted of 50 males and 250 females. In each generation, a total of 50 males and 250 females are in the population. A total of 10 male and 50 female parents (0.2 replacement rate) are culled and replaced by new progeny each generation. Random selection of progeny and culling of parents was conducted for 3 generations. After 3 generations, animals were selected or culled each generation based on their EBV, with high values favored. The EBV are estimated using either pedigree-based BLUP or genomic BLUP utilizing all the animals. Each mating pair produced one progeny and parents were mated at random. A mutation rate of 2.5e-8 was assumed and 10 percent of new mutations had an effect on the quantitative trait.
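
  The heritabilities and the residual variance follow directly from VARIANCE_A, VARIANCE_D and the default phenotypic variance of 1.0. The sketch below only reproduces that arithmetic with the standard definitions (narrow sense = additive / phenotypic, broad sense = (additive + dominance) / phenotypic). The variable names are illustrative and it is not part of GenoDiver.

```shell
#!/usr/bin/env bash
var_a="0.25"   # VARIANCE_A: additive genetic variance
var_d="0.05"   # VARIANCE_D: dominance genetic variance
var_p="1.0"    # phenotypic variance (default stated in the summary)

# Narrow-sense heritability: additive variance over phenotypic variance.
h2_narrow=$(awk -v a="$var_a" -v p="$var_p" 'BEGIN { printf "%.2f", a / p }')
# Broad-sense heritability: additive plus dominance variance over phenotypic variance.
h2_broad=$(awk -v a="$var_a" -v d="$var_d" -v p="$var_p" 'BEGIN { printf "%.2f", (a + d) / p }')
# Residual variance: what is left of the phenotypic variance.
var_e=$(awk -v a="$var_a" -v d="$var_d" -v p="$var_p" 'BEGIN { printf "%.2f", p - a - d }')

echo "Narrow-sense heritability: $h2_narrow"
echo "Broad-sense heritability:  $h2_broad"
echo "Residual variance:         $var_e"
```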

  Similar to Example 6, the important files are saved within the renamed replicate folder for each scenario, and the replicate number is appended to each file name. The R code below loops through each scenario and replicate and generates plots that describe the impact of the two EBV prediction methods on the number of founder mutations and new mutations segregating across generations.

R-Code
rm(list=ls()); gc()
library(ggplot2); library(tidyverse)
## Change to the folder that holds the reps_pblup and reps_gblup folders ##
wd <- "/Users/jeremyhoward/Documents/39_GenoDiver_C++Code/WebsiteExamples/Example18/"
## the directory name for each scenario ##
scen <- c("reps_pblup","reps_gblup")
reps <- c(1500:1514) ## Replicate numbers appended to the file names (NREP: 15 in the parameter file) ##
##################################################################
## Loop through and grab metric across scenarios and replicates ##
##################################################################
for(i in 1:length(scen))
{
for(j in 1:length(reps))
{
    filename <- paste(wd,scen[i],"/Summary_Statistics_QTL_",reps[j],sep="")
    ## read_table2() is deprecated in readr >= 2.0; read_table() is its replacement ##
    df <- read_table2(file=filename,col_names = TRUE,col_types = "iiiiiiiiidi") %>%
        mutate(Method = unlist(strsplit(scen[i],"_"))[2],
               Rep = reps[j]) %>%
        select(Generation,Method,Rep,Quant_Founder_Start,Mutation_Quan_Total)
    ## start the table with the first file, then append every later file ##
    if(j == 1 && i == 1){summarytable <- df}
    if(j > 1 || i > 1){summarytable <- rbind(summarytable,df)}
}
}
###########################################################
## Plot Number of Founder Mutations by Prediction Method ##
###########################################################
## generate mean and sd by generation and method
means <- aggregate(Quant_Founder_Start ~ Generation + Method, data=summarytable,FUN=mean)
sds <- aggregate(Quant_Founder_Start ~ Generation + Method, data=summarytable,FUN=sd)
plotdf <- cbind(means,sds[,3]); rm(means,sds)
names(plotdf) <- c("Generation","Method","Mean","SD")
pd <- position_dodge(0.20)

ggplot(plotdf, aes(x=as.factor(Generation), y=Mean, group=Method, colour=Method)) +
geom_point(size=2.0) + geom_line(size=0.50) + theme_bw() +
labs(title = "Founder QTL Segregating", x = "Generation", y = "Mean Number of Segregating QTL") +
theme(plot.title = element_text(size = 16,hjust = 0.5),axis.title = element_text(size = 12),
legend.position="bottom",axis.text=element_text(size=10))
###########################################################
## Plot Number of New Mutations by Prediction Method ##
###########################################################
## generate mean and sd by generation and method
means <- aggregate(Mutation_Quan_Total ~ Generation + Method, data=summarytable,FUN=mean)
sds <- aggregate(Mutation_Quan_Total ~ Generation + Method, data=summarytable,FUN=sd)
plotdf <- cbind(means,sds[,3]); rm(means,sds)
names(plotdf) <- c("Generation","Method","Mean","SD")
pd <- position_dodge(0.20)

ggplot(plotdf, aes(x=as.factor(Generation), y=Mean, group=Method, colour=Method)) +
geom_point(size=2.0) + geom_line(size=0.50) + theme_bw() +
labs(title = "New QTL Segregating", x = "Generation", y = "Mean Number of Segregating QTL") +
theme(plot.title = element_text(size = 16,hjust = 0.5),axis.title = element_text(size = 12),
legend.position="bottom",axis.text=element_text(size=10))