Different Phenotype Scenarios.


  Similar to the previous examples, the bash script outlined below illustrates the impact of different phenotyping strategies on the long-term genetic trend when breeding values are generated using either pedigree BLUP (pblup) or single-step genomic BLUP (ssgblup). Within each breeding value method, phenotypes were collected on all or on a random 80, 60, 40 or 20 percent of the selection candidates. A total of 15 replicates were generated for each scenario. After a scenario is done, the bash script renames the directory where the replicates are saved.

##--------------------------------------------------------------------
## First generate parameter file. Put parameters to change at the very
## bottom of the file. That way you can use the 'head' linux command
## for the parameters that change and use the 'echo' linux command to
## add any new parameters.
##--------------------------------------------------------------------
## Remove Example16.txt in case it is left over from a previous run ##
rm -f Example16.txt
# Parameters that are the same #
echo '-| General |-' >> Example16.txt
echo 'SEED: 1500' >> Example16.txt
echo 'NREP: 15' >> Example16.txt
echo '-| Genome & Marker |-' >> Example16.txt
echo 'CHR: 5' >> Example16.txt
echo 'CHR_LENGTH: 87 87 87 87 87' >> Example16.txt
echo 'NUM_MARK: 1750 1750 1750 1750 1750' >> Example16.txt
echo 'QTL: 200 200 200 200 200' >> Example16.txt
echo '-| Population |-' >> Example16.txt
echo 'FOUNDER_Effective_Size: Ne250' >> Example16.txt
echo 'MALE_FEMALE_FOUNDER: 50 500 random 3' >> Example16.txt
echo 'VARIANCE_A: 0.35' >> Example16.txt
echo '-| Selection |-' >> Example16.txt
echo 'GENERATIONS: 15' >> Example16.txt
echo 'INDIVIDUALS: 50 0.2 500 0.2' >> Example16.txt
echo 'PROGENY: 1' >> Example16.txt
echo 'SELECTION: ebv high' >> Example16.txt
echo 'EBV_METHOD: pblup' >> Example16.txt
echo 'CULLING: ebv 12' >> Example16.txt
echo 'MATING: random' >> Example16.txt
echo '-| Output Options |-' >> Example16.txt
echo 'GENOTYPES: no' >> Example16.txt


## Used to loop across different phenotype proportions ##
prop=("0.20" "0.40" "0.60" "0.80")

##--------------------------------------------------------------------
## first do pblup; reducing percentage of phenotypes
##--------------------------------------------------------------------
echo 'START: sequence' >> Example16.txt
echo 'EBV_METHOD: pblup' >> Example16.txt
echo 'PHENOTYPE_STRATEGY: 1.0 pheno_atselection 1.0 pheno_atselection' >> Example16.txt
## Run GenoDiver ##
./GenoDiver Example16.txt
## rename replicate output to reps_pblup_all ##
mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_pblup_all


## loop across different proportions ##
for i in 0 1 2 3
do
     ## Remove parameters that are changing ##
     head -n 22 Example16.txt > Example16a.txt
     mv ./Example16a.txt ./Example16.txt
     ## echo in new parameters ##
     echo 'START: founder' >> Example16.txt
     echo 'EBV_METHOD: pblup' >> Example16.txt
     echo "PHENOTYPE_STRATEGY: ${prop[i]} random_atselection ${prop[i]} random_atselection" >> Example16.txt
     ## Run GenoDiver ##
     ./GenoDiver Example16.txt
     ## rename replicate output to reps_pblup_<proportion> ##
     mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_pblup_${prop[i]}
done

##--------------------------------------------------------------------
## then do ssgblup; reducing percentage of phenotypes
##--------------------------------------------------------------------
## Remove parameters that are changing ##
head -n 22 Example16.txt > Example16a.txt
mv ./Example16a.txt ./Example16.txt
echo 'START: founder' >> Example16.txt
echo 'EBV_METHOD: ssgblup' >> Example16.txt
echo 'PHENOTYPE_STRATEGY: 1.0 pheno_atselection 1.0 pheno_atselection' >> Example16.txt
echo 'GENOTYPE_STRATEGY: 6 1.0 parents_offspring 1.0 parents_offspring' >> Example16.txt
## Run GenoDiver ##
./GenoDiver Example16.txt
## rename replicate output to reps_ssgblup_all ##
mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_ssgblup_all


## loop across different proportions ##
for i in 0 1 2 3
do
     ## Remove parameters that are changing ##
     head -n 22 Example16.txt > Example16a.txt
     mv ./Example16a.txt ./Example16.txt
     ## echo in new parameters ##
     echo 'START: founder' >> Example16.txt
     echo 'EBV_METHOD: ssgblup' >> Example16.txt
     echo "PHENOTYPE_STRATEGY: ${prop[i]} random_atselection ${prop[i]} random_atselection" >> Example16.txt
     echo 'GENOTYPE_STRATEGY: 6 1.0 parents_offspring 1.0 parents_offspring' >> Example16.txt
     ## Run GenoDiver ##
     ./GenoDiver Example16.txt
     ## rename replicate output to reps_ssgblup_<proportion> ##
     mv ./GenoDiverFiles/replicates ./GenoDiverFiles/reps_ssgblup_${prop[i]}
done
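
  The script above repeats the same two steps before every run: trim the parameter file back to its shared lines with 'head', then append the scenario-specific lines with 'echo'. As a minimal sketch, those steps could be wrapped in a small function. The function name reset_params and the throwaway demo file are illustrative assumptions and are not part of GenoDiver; the demo keeps only 2 shared lines, whereas the script above keeps 22.

```shell
# Hypothetical helper (not part of GenoDiver): keep the first N shared lines
# of a parameter file, then append the scenario-specific lines given as the
# remaining arguments.
reset_params() {
    local file=$1 keep=$2
    shift 2
    local tmp
    tmp=$(mktemp)
    head -n "$keep" "$file" > "$tmp"
    printf '%s\n' "$@" >> "$tmp"
    mv "$tmp" "$file"
}

# Demonstration on a throwaway file so the real Example16.txt is not touched.
demo=$(mktemp)
printf '%s\n' 'SEED: 1500' 'NREP: 15' 'START: sequence' 'EBV_METHOD: pblup' > "$demo"

p=0.40
reset_params "$demo" 2 \
    'START: founder' \
    'EBV_METHOD: ssgblup' \
    "PHENOTYPE_STRATEGY: $p random_atselection $p random_atselection"

cat "$demo"
```

  Writing to a temporary file and moving it into place mirrors the Example16a.txt step in the script and avoids reading and writing the same file at once.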


Parameter File Summary
  Sequence information is generated for five chromosomes, each with a length of 87 megabases. The simulated genome has a moderate degree of short-range LD (Ne250). The SNP panel contains 8,750 markers (i.e. 1,750 markers per chromosome). For each chromosome, 200 randomly placed QTL and zero FTL mutations were generated. The simulated quantitative trait has a narrow-sense heritability of 0.35 and only additive effects are generated (i.e. no dominance). The phenotypic variance is set to 1.0 by default, and therefore the residual variance is 0.65. The founder population consisted of 50 males and 500 females. In each generation, the population contains a total of 50 males and 500 females. Each generation, 10 male and 100 female parents (0.2 replacement rate) are culled and replaced by new progeny. Random selection of progeny and culling of parents was conducted for 3 generations. After 3 generations, selection of progeny and culling of parents were based on EBV, favoring animals with a high EBV. The EBV are estimated using either pedigree-based BLUP or single-step genomic BLUP utilizing all the animals. Each mating pair produced one progeny and parents were mated at random.
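
  The derived numbers in this summary follow directly from the parameter file. As a rough check, the sketch below recomputes them in the shell. The variable names are illustrative only and are not GenoDiver parameters; the values are copied from the parameter file above, with the phenotypic variance taken as the default of 1.0.

```shell
# Values copied from the parameter file (variable names are illustrative).
chr=5
markers_per_chr=1750
qtl_per_chr=200
males=50
females=500
replacement=0.2
var_a=0.35
var_p=1.0   # phenotypic variance, the stated default

# Integer arithmetic is enough for the marker and QTL totals.
total_markers=$((chr * markers_per_chr))
total_qtl=$((chr * qtl_per_chr))

# awk handles the decimal arithmetic that shell integers cannot.
males_replaced=$(awk -v n="$males" -v r="$replacement" 'BEGIN { printf "%.0f", n * r }')
females_replaced=$(awk -v n="$females" -v r="$replacement" 'BEGIN { printf "%.0f", n * r }')
residual=$(awk -v a="$var_a" -v p="$var_p" 'BEGIN { printf "%.2f", p - a }')

echo "markers in panel:            $total_markers"
echo "QTL across all chromosomes:  $total_qtl"
echo "male parents replaced:       $males_replaced"
echo "female parents replaced:     $females_replaced"
echo "residual variance:           $residual"
```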

  Similar to Example 6, the important files are saved within the renamed replicate folder within each scenario. The replicate number is appended to the file name. Outlined below is a more detailed explanation of the major differences in the phenotyping scenarios:
  • pheno_atselection: All selection candidates have phenotype information, which is used when EBV are predicted.
  • random_atselection: Only a random subset of the selection candidates, set by the proportion given in the parameter file, has phenotype information, which is used when EBV are predicted.
  The R code outlined below loops through each scenario and generates plots that describe the impact of reducing the number of phenotypes collected on the true breeding value genetic trend for pblup and ssgblup breeding value predictions.
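
  Because the R loop reads one summary file per scenario and replicate, it may help to confirm beforehand that all of them exist. The sketch below builds the 150 expected paths (10 scenarios by 15 replicates) and reports any that are missing. The base directory is an assumption: the bash script leaves the renamed folders under ./GenoDiverFiles, whereas the R code reads them from its own working directory, so adjust 'base' to wherever the folders actually sit.

```shell
# Assumed location of the renamed replicate folders; adjust as needed.
base=./GenoDiverFiles

expected=0
missing=0
for method in pblup ssgblup; do
    for level in 0.20 0.40 0.60 0.80 all; do
        # Replicate labels 1500..1514 match those used in the R code.
        rep=1500
        while [ "$rep" -le 1514 ]; do
            f="$base/reps_${method}_${level}/Summary_Statistics_DataFrame_Performance_${rep}"
            expected=$((expected + 1))
            if [ ! -f "$f" ]; then
                missing=$((missing + 1))
                echo "missing: $f"
            fi
            rep=$((rep + 1))
        done
    done
done

echo "expected $expected files, $missing missing"
```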

R-Code
rm(list=ls()); gc()
library(ggplot2); library(tidyverse)
## Change to the directory that holds the renamed replicate folders ##
wd <- "/Users/jeremyhoward/Documents/39_GenoDiver_C++Code/WebsiteExamples/Example16/"
## the directory name for each scenario ##
scen <- c("reps_pblup_0.20","reps_pblup_0.40","reps_pblup_0.60","reps_pblup_0.80","reps_pblup_all" ,"reps_ssgblup_0.20","reps_ssgblup_0.40","reps_ssgblup_0.60","reps_ssgblup_0.80","reps_ssgblup_all")
reps <- c(1500:1514) ## Replicate labels appended to the file names (15 replicates) ##
##################################################################
## Loop through and grab metric across scenarios and replicates ##
##################################################################
for(i in 1:length(scen))
{
for(j in 1:length(reps))
{
    filename <- paste(wd,scen[i],"/Summary_Statistics_DataFrame_Performance_",reps[j],sep="")
    ## The bv column is read as text of the form 'a(b)'; keep the first number as tbv ##
    df <- read_table2(file=filename,col_names = TRUE,col_types = "dcccccc") %>%
        mutate(tbv = as.numeric(matrix(unlist(strsplit(bv, "[()]")), ncol = 2, byrow = TRUE)[, 1]),
                     Method = paste(unlist(strsplit(scen[i],"_"))[2:length(unlist(strsplit(scen[i],"_")))],collapse = '_'),
                     Rep = reps[j]) %>%
        select(Generation,Method,Rep,tbv)
    if(j == 1 & i == 1){summarytable <- df}
    if(j > 1 | i > 1){summarytable <- rbind(summarytable,df);}
}
}
## generate mean and sd by generation and method
means <- aggregate(tbv ~ Generation + Method, data=summarytable,FUN=mean)
sds <- aggregate(tbv ~ Generation + Method, data=summarytable,FUN=sd)
#################################################
## Plot Genetic Trend across Different Methods ##
#################################################
plotdf <- cbind(means,sds[,3]); rm(means,sds)
names(plotdf) <- c("Generation","Method","Mean","SD")
pd <- position_dodge(0.20)

plotdfa <- plotdf[which(plotdf$Method == "pblup_0.20" | plotdf$Method == "pblup_0.40" | plotdf$Method == "pblup_0.60" | plotdf$Method == "pblup_0.80" | plotdf$Method == "pblup_all"), ]
ggplot(plotdfa, aes(x=as.factor(Generation), y=Mean, group=Method, colour=Method)) +
geom_errorbar(aes(ymin=Mean-SD, ymax=Mean+SD), colour="black", width=.4, size = 0.5, position=pd) +
geom_point(size=2.0) + geom_line(size=0.50) + theme_bw() +
labs(title = "Genetic Trend PBLUP \n(+/- 1 SD)", x = "Generation", y = "Mean True Breeding Value") +
theme(plot.title = element_text(size = 16,hjust = 0.5),axis.title = element_text(size = 12),
legend.position="bottom",axis.text=element_text(size=10))

plotdfa <- plotdf[which(plotdf$Method == "ssgblup_0.20" | plotdf$Method == "ssgblup_0.40" | plotdf$Method == "ssgblup_0.60" | plotdf$Method == "ssgblup_0.80" | plotdf$Method == "ssgblup_all"), ]
ggplot(plotdfa, aes(x=as.factor(Generation), y=Mean, group=Method, colour=Method)) +
geom_errorbar(aes(ymin=Mean-SD, ymax=Mean+SD), colour="black", width=.4, size = 0.5, position=pd) +
geom_point(size=2.0) + geom_line(size=0.50) + theme_bw() +
labs(title = "Genetic Trend ssGBLUP \n(+/- 1 SD)", x = "Generation", y = "Mean True Breeding Value") +
theme(plot.title = element_text(size = 16,hjust = 0.5),axis.title = element_text(size = 12),
legend.position="bottom",axis.text=element_text(size=10))