GenoDiver Livestock Simulation Software

The effects generated for the quantitative and fitness traits, and the covariance between additive effects when simulating multiple traits, are important parameters that can have a large impact on the simulation results. The methods used to generate effects for both types of traits are similar to those used in previous articles and by other simulation programs. At present, additive effects are sampled from a gamma distribution and dominance effects are simulated from a normal distribution when generating the covariance between quantitative and fitness traits.

Quantitative Trait

The additive effect (a), defined as half the difference in genotypic value between alternative homozygotes, is generated from a gamma distribution. The default parameters for the gamma distribution (shape 0.4, scale 1.66) result in an L-shaped distribution of QTL effects, which implies that the majority of effects are small and only a few are large. Because the gamma distribution generates only positive values, the sign of each effect is assigned at random: with equal probability, one of the two alleles is chosen to be positive or negative based on a binomial distribution (p = 0.5).
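The sampling described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not GenoDiver's own code; the function name `sample_additive_effects` and its arguments are hypothetical, and the defaults assume that 0.4 is the shape and 1.66 the scale.

```python
import random


def sample_additive_effects(n_qtl, shape=0.4, scale=1.66, rng=None):
    """Return n_qtl signed additive effects (hypothetical helper)."""
    rng = rng or random.Random()
    effects = []
    for _ in range(n_qtl):
        # gammavariate(alpha, beta) takes shape then scale; draws are positive.
        magnitude = rng.gammavariate(shape, scale)
        # A single Bernoulli draw with p = 0.5 decides the sign of the effect.
        sign = 1.0 if rng.random() < 0.5 else -1.0
        effects.append(sign * magnitude)
    return effects


effects = sample_additive_effects(10_000, rng=random.Random(1))
mean_magnitude = sum(abs(a) for a in effects) / len(effects)
share_positive = sum(a > 0 for a in effects) / len(effects)
```

With these parameters the mean magnitude should be close to shape × scale = 0.664, and roughly half of the effects should be positive.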

The dominance effect, defined as the deviation of the value of the heterozygote from the mean of the two homozygotes, is generated using a multistep procedure. Independence between additive and dominance effects is the classical treatment (Falconer & Mackay, 1996), and it is convenient because it allows orthogonality of the additive and dominance estimates. However, this independence contradicts the phenomena of inbreeding depression and hybrid vigor, which indicate that dominance is directional (Lynch & Walsh, 1998), and results from real data (Wellmann & Bennewitz, 2011; Wellmann & Bennewitz, 2012), which suggest an a priori dependency between additive and dominance effects. Therefore, the degree of dominance (h) is sampled from a normal distribution, which allows the user to vary the proportion of positive or negative dominance effects by altering the mean. Next, dominance effects (d) are generated by multiplying the degree of dominance by the absolute value of the additive effect (d = h|a|). As a result, the additive and dominance effects are dependent on each other. Lastly, the parameters specifying the normal distribution and the minor allele frequency of the quantitative QTL affect the proportion of dominance effects that display partial or over-dominance. This proportion is reported near the beginning of the log file.
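As a minimal sketch of the relationship d = h|a|, the snippet below draws a degree of dominance for each additive effect and scales it by the effect's magnitude. The function name and the mean and standard deviation used in the example are hypothetical placeholders, not GenoDiver defaults.

```python
import random


def sample_dominance_effects(additive_effects, h_mean, h_sd, rng=None):
    """Return a list of (h, d) pairs, one per additive effect (hypothetical helper)."""
    rng = rng or random.Random()
    pairs = []
    for a in additive_effects:
        # Degree of dominance from a normal distribution; shifting h_mean
        # changes the share of positive versus negative dominance effects.
        h = rng.gauss(h_mean, h_sd)
        # The dominance effect scales with the magnitude of the additive effect.
        d = h * abs(a)
        pairs.append((h, d))
    return pairs


additive = [0.8, -0.1, 0.05, -1.3]
pairs = sample_dominance_effects(additive, h_mean=0.2, h_sd=0.3,
                                 rng=random.Random(7))
```

Because d is proportional to |a|, loci with large additive effects also tend to have large dominance effects, which is the dependency the text describes.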

Two quantitative traits with a given covariance structure between the additive effects are simulated using methods similar to Zhang et al. (2015) and Hayashi & Iwata (2013). Within each trait, as in the single-trait case, additive effects are sampled from a gamma distribution, and the marginal distributions of both traits are assigned the same shape and scale parameters. Because the marginal distributions are the same for the two traits, a correlation between the additive effects of the two traits can be generated by sampling from three independent gamma distributions and combining the samples to form the additive effects for trait 1 and trait 2. Assuming the marginal distributions of both traits have shape and scale parameters of 0.4 and 1.66, respectively, and a target correlation of r_g between the additive effects, the following gamma variables are generated:

x1 ∼ gamma(0.4*r_g,1.66)
x2 ∼ gamma(0.4*(1-r_g),1.66)
x3 ∼ gamma(0.4*(1-r_g),1.66)

Samples from the three gamma variables are then combined to generate the additive effects for Trait1 as x1 + x2 and for Trait2 as x1 + x3.
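The construction above can be sketched as follows. The sketch assumes the first gamma argument is the shape and the second the scale, so that sums of gamma variables sharing a scale remain gamma distributed and each trait keeps the 0.4 marginal shape. The function names are hypothetical, and only the unsigned magnitudes are generated.

```python
import math
import random


def sample_two_trait_effects(n_qtl, r_g, shape=0.4, scale=1.66, rng=None):
    """Return two lists of correlated gamma magnitudes (hypothetical helper)."""
    if not 0.0 < r_g < 1.0:
        raise ValueError("this sketch needs 0 < r_g < 1 so every shape is positive")
    rng = rng or random.Random()
    trait1, trait2 = [], []
    for _ in range(n_qtl):
        x1 = rng.gammavariate(shape * r_g, scale)          # shared component
        x2 = rng.gammavariate(shape * (1.0 - r_g), scale)  # trait 1 only
        x3 = rng.gammavariate(shape * (1.0 - r_g), scale)  # trait 2 only
        trait1.append(x1 + x2)
        trait2.append(x1 + x3)
    return trait1, trait2


def pearson(xs, ys):
    """Plain Pearson correlation, used here only to check the result."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


t1, t2 = sample_two_trait_effects(50_000, r_g=0.5, rng=random.Random(3))
observed = pearson(t1, t2)
```

The shared component x1 carries a fraction r_g of each trait's variance, so the observed correlation should approach r_g as the number of sampled loci grows.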

Fitness Trait

The generation of fitness effects is divided into lethal and sub-lethal genetic architectures to allow for full flexibility. Two competing hypotheses from the literature describe the distribution of fitness effects and their frequency in the genome. The first is based on the results of Mukai et al. (1972) and is referred to here as the “Mukai scenario”, in which mutations are assumed to be numerous and of small effect. The second is based on more recent results from mutation-accumulation studies and assumes that mutations are considerably less frequent but of larger effect (Caballero & Keightley, 1994; Garcia-Dorado & Caballero, 2000). For both lethal and sub-lethal FTL, fitness is defined as relative fitness and is parameterized by two coefficients: the selection coefficient (s) and the dominance coefficient (h). The s value measures how much worse the unfit allele is compared to the fittest allele. The h value measures the degree to which the heterozygote shows the reduced fitness of the unfit homozygote (Wright, 1931). The normalization procedure forces the fittest homozygote genotype to have a value of 1 and the other homozygote genotype to have a value of 1 - s. Lastly, heterozygote genotypes have a fitness value of 1 - hs.
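The three genotype fitness values can be written as a small lookup. The function name and the encoding of a genotype as a count of unfit alleles are assumptions made for this illustration.

```python
def genotype_fitness(n_unfit_alleles, s, h):
    """Relative fitness of one FTL genotype (hypothetical helper).

    n_unfit_alleles: 0 (fittest homozygote), 1 (heterozygote), 2 (unfit homozygote)
    s: selection coefficient, h: dominance coefficient
    """
    if n_unfit_alleles == 0:
        return 1.0
    if n_unfit_alleles == 1:
        return 1.0 - h * s
    if n_unfit_alleles == 2:
        return 1.0 - s
    raise ValueError("n_unfit_alleles must be 0, 1 or 2")


# Example with s = 0.2 and h = 0.5: the heterozygote sits halfway
# between the two homozygotes.
values = [genotype_fitness(k, s=0.2, h=0.5) for k in (0, 1, 2)]
```

With h = 0 the unfit allele is fully recessive (the heterozygote has fitness 1), and with h = 1 it is fully dominant (the heterozygote has fitness 1 - s).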

The selection coefficient is generated from a gamma distribution with different parameters for lethal and sub-lethal FTL. The log file outlines the mean selection coefficient for the lethal and sub-lethal FTL. As a reference when altering the shape and scale parameters, the mean of a gamma distribution is shape × scale.

The dominance coefficient is generated from a normal distribution with different parameters for lethal and sub-lethal FTL. The absolute value of the sample is taken as the dominance coefficient. The log file outlines the mean dominance coefficient for the lethal and sub-lethal FTL.
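A minimal sketch of sampling both coefficients for a set of FTL is given below. The function name is hypothetical and the parameter values are placeholders chosen for illustration, not GenoDiver defaults; no cap is applied to s because none is described here.

```python
import random


def sample_fitness_coefficients(n_ftl, gamma_shape, gamma_scale,
                                normal_mean, normal_sd, rng=None):
    """Return a list of (s, h) pairs, one per FTL (hypothetical helper)."""
    rng = rng or random.Random()
    coefficients = []
    for _ in range(n_ftl):
        # Selection coefficient: gamma draw with mean gamma_shape * gamma_scale.
        s = rng.gammavariate(gamma_shape, gamma_scale)
        # Dominance coefficient: absolute value of a normal draw.
        h = abs(rng.gauss(normal_mean, normal_sd))
        coefficients.append((s, h))
    return coefficients


# Placeholder parameters, not GenoDiver defaults.
draws = sample_fitness_coefficients(20_000, gamma_shape=2.0, gamma_scale=0.05,
                                    normal_mean=0.3, normal_sd=0.1,
                                    rng=random.Random(11))
mean_s = sum(s for s, _ in draws) / len(draws)
expected_mean_s = 2.0 * 0.05  # shape × scale
```

In practice the same routine would be called once with the lethal parameters and once with the sub-lethal parameters.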

The fitness of an individual is then calculated as the multiplicative effect of each fitness genotype across both lethal and sub-lethal FTL, with a maximum value of 1 and a minimum of 0. An individual with a value closer to 1 has higher fitness and is more likely to survive. To simulate environmental stochasticity, a random number is generated from a uniform distribution between 0 and 1 and compared with the individual's fitness value. If the fitness value is less than the random number, the individual does not survive to breeding age; if it is greater than or equal to the random number, the individual survives to breeding age.
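The survival rule can be illustrated with a short sketch. Both function names are hypothetical, and the input is assumed to be the list of per-locus genotype fitness values (1, 1 - hs or 1 - s) for one individual.

```python
import random


def individual_fitness(genotype_fitness_values):
    """Multiply per-locus fitness values and bound the result to [0, 1]."""
    fitness = 1.0
    for value in genotype_fitness_values:
        fitness *= value
    return min(1.0, max(0.0, fitness))


def survives_to_breeding_age(fitness, rng):
    """Compare fitness with a uniform draw; survive if fitness >= draw."""
    threshold = rng.random()  # uniform on [0, 1)
    return fitness >= threshold


rng = random.Random(5)
w = individual_fitness([1.0, 0.9, 0.8])  # three FTL: 1 * 0.9 * 0.8 = 0.72
n_trials = 20_000
survival_rate = sum(survives_to_breeding_age(w, rng)
                    for _ in range(n_trials)) / n_trials
```

Over many repeated draws the survival rate approaches the fitness value itself, so an individual with fitness 0.72 survives to breeding age about 72 percent of the time.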

Covariance Between Traits

The correlation between the quantitative trait and the fitness trait can be due to linkage or pleiotropy. Setting both COVAR parameters to 0 makes linkage the only possible source of correlation between fitness and quantitative traits. Setting the COVAR parameters to a value greater than 0 results in a pleiotropic correlation between the additive effects for the quantitative trait and the selection coefficients for the sub-lethal fitness trait. Because scaling the quantitative trait changes its additive effects, the covariance is generated using the Trivariate Reduction algorithm. The Trivariate Reduction algorithm only allows the correlation to be positive. For example, if high values of the quantitative trait are favorable, a positive correlation makes the two traits antagonistic. To alter the interpretation, change the favorable direction of the quantitative trait.

Trivariate Reduction for Gamma1 (a₁,b₁) and Gamma2 (a₂,b₂), where a is the shape and b is the scale
Correlation (ρ) bounded between: 0 ≤ ρ ≤ min(a₁,a₂) / √(a₁a₂).
- Steps:
1.) Generate Y1 ∼ gamma(a₁ - ρ√(a₁a₂), 1).
2.) Generate Y2 ∼ gamma(a₂ - ρ√(a₁a₂), 1).
3.) Generate Y3 ∼ gamma(ρ√(a₁a₂), 1).
4a.) Generate Value for Gamma1: b₁(Y1 + Y3).
4b.) Generate Value for Gamma2: b₂(Y2 + Y3).
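The steps above can be sketched in Python. The sketch assumes the standard form of trivariate reduction, in which the shared variable Y3 has shape ρ√(a₁a₂) and the two private variables have shapes a₁ - ρ√(a₁a₂) and a₂ - ρ√(a₁a₂), all with unit scale. The function names and the example parameters are hypothetical.

```python
import math
import random


def _gamma_or_zero(shape, rng):
    # random.gammavariate needs shape > 0; a zero shape is treated as the constant 0.
    return rng.gammavariate(shape, 1.0) if shape > 0.0 else 0.0


def trivariate_reduction(n, a1, b1, a2, b2, rho, rng=None):
    """Return n correlated draws from Gamma(a1, b1) and Gamma(a2, b2)."""
    shared_shape = rho * math.sqrt(a1 * a2)
    if rho < 0.0 or shared_shape > min(a1, a2):
        raise ValueError("need 0 <= rho <= min(a1, a2) / sqrt(a1 * a2)")
    rng = rng or random.Random()
    gamma1, gamma2 = [], []
    for _ in range(n):
        y1 = _gamma_or_zero(a1 - shared_shape, rng)  # private to Gamma1
        y2 = _gamma_or_zero(a2 - shared_shape, rng)  # private to Gamma2
        y3 = _gamma_or_zero(shared_shape, rng)       # shared, creates covariance
        gamma1.append(b1 * (y1 + y3))
        gamma2.append(b2 * (y2 + y3))
    return gamma1, gamma2


def pearson(xs, ys):
    """Plain Pearson correlation, used here only to check the result."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


g1, g2 = trivariate_reduction(50_000, a1=2.0, b1=1.0, a2=3.0, b2=0.5,
                              rho=0.5, rng=random.Random(9))
observed = pearson(g1, g2)
```

Under this parameterization the covariance is b₁b₂ρ√(a₁a₂) and the variances are a₁b₁² and a₂b₂², so the Pearson correlation equals ρ. The upper bound on ρ follows from requiring both private shapes to be non-negative.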

The Y3 value generates the covariance between the two traits. For FTL that have a covariance with the quantitative trait, the Y2 value is sampled for each FTL within an iteration and the rank correlation is calculated. Once the rank correlation is within 1.5 percent of the value specified, the selection coefficient and dominance values are generated using the current iteration's Y2 values.