Different Genotype Scenarios.


  Similar to the previous examples, the bash script outlined below illustrates the impact of different genotyping strategies on the long-term genetic trend when breeding values are predicted using single-step genomic BLUP (ssgblup). The scenarios are no genotyping (pedigree-based BLUP), genotyping a proportion of the selection candidates (0.20, 0.40, 0.60 or 0.80, chosen either on EBV or at random) and genotyping every selection candidate. A total of 15 replicates were generated for each scenario. After a scenario finishes, the bash script renames the directory where the replicates are saved.

##--------------------------------------------------------------------
## First generate parameter file. Put parameters to change at the very
## bottom of the file. That way you can use the 'head' linux command
## to drop the parameters that change and the 'echo' linux command to
## add any new parameters.
##--------------------------------------------------------------------
## In case Example15.txt is still a file ##
rm -f Example15.txt
# Parameters that are the same #
echo '-| General |-' >> Example15.txt
echo 'SEED: 1500' >> Example15.txt
echo 'NREP: 15' >> Example15.txt
echo '-| Genome & Marker |-' >> Example15.txt
echo 'CHR: 5' >> Example15.txt
echo 'CHR_LENGTH: 87 87 87 87 87' >> Example15.txt
echo 'NUM_MARK: 1750 1750 1750 1750 1750' >> Example15.txt
echo 'QTL: 200 200 200 200 200' >> Example15.txt
echo '-| Population |-' >> Example15.txt
echo 'FOUNDER_Effective_Size: Ne250' >> Example15.txt
echo 'MALE_FEMALE_FOUNDER: 50 500 random 3' >> Example15.txt
echo 'VARIANCE_A: 0.35 0.15 0.35' >> Example15.txt
echo '-| Selection |-' >> Example15.txt
echo 'GENERATIONS: 15' >> Example15.txt
echo 'INDIVIDUALS: 50 0.4 500 0.2' >> Example15.txt
echo 'PROGENY: 1' >> Example15.txt
echo 'SELECTION: index_ebv high' >> Example15.txt
echo 'PHENOTYPE_STRATEGY1: 1.0 pheno_atselection 1.0 pheno_atselection' >> Example15.txt
echo 'PHENOTYPE_STRATEGY2: 1.0 pheno_afterselection 1.0 pheno_afterselection' >> Example15.txt
echo 'INTERIM_EBV: after_culling' >> Example15.txt
echo 'INDEX_PROPORTIONS: 0.20 0.80' >> Example15.txt
echo 'CULLING: index_ebv 12' >> Example15.txt
echo 'MATING: random' >> Example15.txt
echo '-| Output Options |-' >> Example15.txt
echo 'GENOTYPES: no' >> Example15.txt

## Used to loop across different genotype proportions ##
prop=("0.20" "0.40" "0.60" "0.80")

##--------------------------------------------------------------------
## first do pblup
##--------------------------------------------------------------------
echo 'START: sequence' >> Example15.txt
echo 'EBV_METHOD: pblup' >> Example15.txt
## Run GenoDiver ##
./GenoDiver Example15.txt
## rename replicate output to reps_pblup ##
mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_pblup

##--------------------------------------------------------------------
## loop across different proportions based on ebv
##--------------------------------------------------------------------
for i in 0 1 2 3
do
     ## Remove parameters that are changing (keep the 25 shared lines) ##
     head -n 25 Example15.txt > Example15a.txt
     mv ./Example15a.txt ./Example15.txt
     ## echo in new parameters ##
     echo 'START: founder' >> Example15.txt
     echo 'EBV_METHOD: ssgblup' >> Example15.txt
     echo "GENOTYPE_STRATEGY: 9 ${prop[i]} parents_ebv ${prop[i]} parents_ebv" >> Example15.txt
     ## Run GenoDiver ##
     ./GenoDiver Example15.txt
     ## rename replicate output to reps_ebv_prop ##
     mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_ebv_${prop[i]}
done

##--------------------------------------------------------------------
## loop across different proportions based on random
##--------------------------------------------------------------------
for i in 0 1 2 3
do
     ## Remove parameters that are changing (keep the 25 shared lines) ##
     head -n 25 Example15.txt > Example15a.txt
     mv ./Example15a.txt ./Example15.txt
     ## echo in new parameters ##
     echo 'START: founder' >> Example15.txt
     echo 'EBV_METHOD: ssgblup' >> Example15.txt
     echo "GENOTYPE_STRATEGY: 9 ${prop[i]} parents_random ${prop[i]} parents_random" >> Example15.txt
     ## Run GenoDiver ##
     ./GenoDiver Example15.txt
     ## rename replicate output to reps_random_prop ##
     mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_random_${prop[i]}
done

##--------------------------------------------------------------------
## Genotype everyone
##--------------------------------------------------------------------
## Remove parameters that are changing (keep the 25 shared lines) ##
head -n 25 Example15.txt > Example15a.txt
mv ./Example15a.txt ./Example15.txt
## echo in new parameters ##
echo 'START: sequence' >> Example15.txt
echo 'GENOTYPE_STRATEGY: 9 1.0 parents_offspring 1.0 parents_offspring' >> Example15.txt
echo 'EBV_METHOD: ssgblup' >> Example15.txt
## Run GenoDiver ##
./GenoDiver Example15.txt
## rename replicate output to reps_all ##
mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_all
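
  The 'head' step in the script depends on the exact number of shared lines in the parameter file. One alternative, sketched here under stated assumptions, keeps the shared parameters in a separate base file and rebuilds Example15.txt for every scenario. The file name Example15_base.txt and the function make_params are hypothetical names introduced for illustration, the base file is abbreviated to three lines, and the GenoDiver call is commented out so the snippet runs without the program installed.

```shell
# Hypothetical file and function names; only the parameter lines come from the example.
base=Example15_base.txt     # shared parameters, written once
params=Example15.txt        # file handed to GenoDiver

# Shared block (abbreviated; the full list is in the script above).
cat > "$base" <<'EOF'
-| General |-
SEED: 1500
NREP: 15
EOF

# Rebuild the parameter file: shared block plus one line per argument.
make_params() {
    cp "$base" "$params"
    printf '%s\n' "$@" >> "$params"
}

for p in 0.20 0.40 0.60 0.80
do
    make_params 'START: founder' 'EBV_METHOD: ssgblup' \
                "GENOTYPE_STRATEGY: 9 $p parents_ebv $p parents_ebv"
    ## Run GenoDiver and rename the output as in the script above ##
    # ./GenoDiver "$params"
    # mv ./GenoDiverFiles/replicates "./GenoDiverFiles/reps_ebv_$p"
done
```

  Because the base file is never modified, adding or removing a shared parameter does not require any line count to be updated.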

Parameter File Summary
  Sequence information is generated for five chromosomes, each with a length of 87 megabases. The simulated genome has a moderate degree of short-range LD (Ne250). The SNP panel contains 8,750 markers (i.e. 1,750 markers per chromosome). For each chromosome, 200 randomly placed QTL and zero FTL mutations were generated. Both quantitative traits were simulated with a narrow-sense heritability of 0.35 and only additive effects (i.e. no dominance). The residual variance for both traits is 0.65. The correlation between trait 1 and trait 2 is 0.15 for the additive genetic effects and 0.00 for the residual environmental effects.

  The founder population consisted of 50 males and 500 females, and each generation the population contains 50 males and 500 females. Each generation, 20 male parents (0.4 replacement rate) and 100 female parents (0.2 replacement rate) are culled and replaced by new progeny. Progeny were selected and parents were culled at random for the first 3 generations. After 3 generations, animals were selected or culled each generation based on a high index EBV (INDEX_PROPORTIONS: 0.20 0.80). The EBV are predicted using pedigree-based BLUP or single-step genomic BLUP, depending on the scenario, utilizing all the animals. Each mating pair produced one progeny and parents were mated at random.

  The phenotype for trait 1 was observed on all selection candidates at selection, while the phenotype for trait 2 was observed only after an animal was selected. Before determining which animals are genotyped, interim EBV were predicted for all animals.
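
  The derived numbers in this summary can be checked with a few lines of shell arithmetic. This is only a sanity check: the variable names are made up for illustration, and the heritability line assumes that the phenotypic variance is the sum of the additive (0.35) and residual (0.65) variances.

```shell
chr=5; markers_per_chr=1750; qtl_per_chr=200    # CHR, NUM_MARK, QTL
males=50; females=500                            # INDIVIDUALS: 50 0.4 500 0.2

total_markers=$(( chr * markers_per_chr ))
total_qtl=$(( chr * qtl_per_chr ))

# awk handles the decimal replacement rates; adding 0.5 rounds to the nearest integer
males_replaced=$(awk -v n="$males" -v r=0.4 'BEGIN { printf "%d", n * r + 0.5 }')
females_replaced=$(awk -v n="$females" -v r=0.2 'BEGIN { printf "%d", n * r + 0.5 }')

# Assumed relationship: h2 = additive variance / (additive + residual variance)
h2=$(awk -v va=0.35 -v ve=0.65 'BEGIN { printf "%.2f", va / (va + ve) }')

echo "markers on the panel:       $total_markers"      # 8750
echo "QTL across the genome:      $total_qtl"          # 1000
echo "male parents replaced:      $males_replaced"     # 20
echo "female parents replaced:    $females_replaced"   # 100
echo "narrow-sense heritability:  $h2"                 # 0.35
```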

  Similar to Example 6, the important files are saved within the renamed replicate folder for each scenario, and the replicate number is appended to the file name. Outlined below is a more detailed explanation of the major differences between the genotyping scenarios:
  • parents_ebv: Genotype a proportion of selection candidates with high ebv and all selected parents.
  • parents_random: Genotype a proportion of selection candidates at random and all selected parents.
  • parents_offspring: Genotype all selection candidates.
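
  Before summarizing the results, it can help to confirm that every scenario folder holds one summary file per replicate. The sketch below assumes the file prefix Summary_Statistics_DataFrame_Performance_ and the replicate numbers 1500 to 1514 that the plotting code in this example reads. The function name missing_files is hypothetical, and the root folder should be changed to wherever the renamed folders are kept.

```shell
# Build the list of ten scenario folders created by the bash script.
scen="reps_pblup reps_all"
for p in 0.20 0.40 0.60 0.80
do
    scen="$scen reps_random_$p reps_ebv_$p"
done

# missing_files ROOT: print the path of every expected summary file that is absent.
missing_files() {
    for s in $scen
    do
        for rep in $(seq 1500 1514)
        do
            f="$1/$s/Summary_Statistics_DataFrame_Performance_$rep"
            [ -f "$f" ] || echo "$f"
        done
    done
}

## Prints nothing when all 10 x 15 files are present ##
missing_files ./GenoDiverFiles
```
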
  The R code outlined below loops through each scenario and generates a plot that describes the impact of reducing the number of genotyped animals on the genetic trend in true breeding value.

R-Code
rm(list=ls()); gc()
library(ggplot2); library(tidyverse)
## Change to the directory that holds the renamed replicate folders ##
wd <- "/Users/jeremyhoward/Documents/39_GenoDiver_C++Code/WebsiteExamples/Example15/"
## The directory name for each scenario ##
scen <- c("reps_pblup","reps_random_0.20","reps_ebv_0.20","reps_random_0.40","reps_ebv_0.40","reps_random_0.60" ,"reps_ebv_0.60","reps_random_0.80" ,"reps_ebv_0.80","reps_all")
reps <- c(1500:1514) ## Replicate numbers appended to the file names (15 replicates) ##
##################################################################
## Loop through and grab metric across scenarios and replicates ##
##################################################################
for(i in 1:length(scen))
{
for(j in 1:length(reps))
{
    filename <- paste(wd,scen[i],"/Summary_Statistics_DataFrame_Performance_",reps[j],sep="")
    df <- read_table2(file=filename,col_names = TRUE,col_types = "dccccccccccccc") %>%
        mutate(index_tbv = as.numeric(matrix(unlist(strsplit(index_tbv, "[()]")), ncol = 2, byrow = TRUE)[, 1]),
                     Method = paste(unlist(strsplit(scen[i],"_"))[2:length(unlist(strsplit(scen[i],"_")))],collapse = '_'),
                     Rep = reps[j]) %>%
        select(Generation,Method,Rep,index_tbv)
    ## Start the table with the first file, then append every later file ##
    if(i == 1 && j == 1){summarytable <- df} else {summarytable <- rbind(summarytable,df)}
}
}
## generate mean and sd by generation and method
means <- aggregate(index_tbv ~ Generation + Method, data=summarytable,FUN=mean)
sds <- aggregate(index_tbv ~ Generation + Method, data=summarytable,FUN=sd)
#################################################
## Plot Genetic Trend across Different Methods ##
#################################################
plotdf <- cbind(means,sds[,3]); rm(means,sds)
names(plotdf) <- c("Generation","Method","Mean","SD")
pd <- position_dodge(0.20)

## Dodge points and lines by the same amount as the error bars so they line up ##
ggplot(plotdf, aes(x=as.factor(Generation), y=Mean, group=Method, colour=Method)) +
    geom_errorbar(aes(ymin=Mean-SD, ymax=Mean+SD), colour="black", width=.4, size = 0.5, position=pd) +
    geom_point(size=2.0, position=pd) + geom_line(size=0.50, position=pd) + theme_bw() +
    labs(title = "Genetic Trend", x = "Generation", y = "Mean True Breeding Value") +
    theme(plot.title = element_text(size = 16,hjust = 0.5),axis.title = element_text(size = 12),
          legend.position="bottom",axis.text=element_text(size=10))