Estimating Effects of Genes: One Size May Not Fit All
For decades, animal breeders have used databases of livestock performance records on economically important traits, such as milk yield or reproductive performance, together with pedigree information to accurately predict the genetic merit of animals. Genotype information from single nucleotide polymorphism (SNP) markers, which has become available within the past decade, particularly in dairy cattle, has further increased the accuracy of those predictions and the ability to select younger animals for breeding stock, thereby doubling genetic gain in milk yields per year.
One of the surveyors of heredity is Rob Tempelman, professor of quantitative genetics and animal breeding in the Michigan State University (MSU) Department of Animal Science and recipient of the 2017 Jay L. Lush Award in Animal Breeding from the American Dairy Science Association.
Tempelman describes his work as an attempt to enhance research reproducibility while recognizing that genetic effects and their associations with economically important traits can vary widely across different environments.
For the past couple of decades, Tempelman, along with several graduate students, has developed statistical models and computationally feasible algorithms to help assess the nature of this heterogeneity and how ignoring it affects predictions of genetic merit. His current research on the use of SNP markers to predict genetic merit for complex traits is factoring in influences such as management systems.
Feed efficiency of dairy cows is an example of an area that has benefited. Agricultural producers want to maximize milk output while minimizing feed cost without jeopardizing the health of the cow. Feeds with different chemical compositions (energy and protein content, for example) are used across farms or even between seasons on the same farm, so the genetic and non-genetic relationships among the component traits of feed efficiency can be substantially heterogeneous.
Tempelman considers himself incredibly fortunate to have been part of a multi-institutional research team (including principal investigator Mike VandeHaar, also from MSU) funded by the USDA that allowed him to address such issues with data from various collaborators worldwide. Some of this work was also conducted in conjunction with another USDA grant involving Juan Steibel at MSU and Tempelman’s former graduate student, Nora Bello, at Kansas State University. This project focused on developing statistical and computational tools to model heterogeneous genetic effects across environments.
“A need exists beyond classical quantitative genetic analyses on feed efficiency to better model the heterogeneous relationships between component traits of feed efficiency across locations,” Tempelman said. “This includes identifying chromosome regions whose genetic control on feed efficiency is sensitive to environmental influences such as temperature or herd management.”
Scientists understand that factors such as milk production, body weight and dry matter intake are elements of feed efficiency. The cow itself is a key variable. Genetic data on the domestic cow ballooned with the sequenced genome of Bos taurus reported in 2009.
As exciting as this breakthrough was, the data are only part of the means to understand traits such as milk production and feed conversion. With almost 3 billion base pairs in the genome assembly, the number of markers that can help locate genes associated with a trait could be overwhelming.
This is where Tempelman’s skills and statistical tools come in. By combining knowledge of genetics and dairy production with expertise in statistics, he can not only provide insights into predicting outcomes in breeding programs but also, possibly more important, recognize the pitfalls of erroneous assumptions.
“From one of my other responsibilities — as co-director of the CANR Statistical Consulting Center — I have run across many types of data sets,” Tempelman said. “Good experimental designs and data analyses can help reveal whether treatment effects are consistent versus heterogeneous across environments in a range of agricultural and natural resources projects.”
The number of genetic markers in a cow’s genotype is typically in the tens of thousands, yet the number of phenotypes from individual cows is comparatively small for estimating the effects of individual SNP markers on key traits, such as milk production, in a “genome-wide association” (GWA) analysis. A GWA analysis is typically a first step in identifying genes that are potentially important for the trait of interest, such as milk production or feed efficiency.
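To make the idea concrete, here is a minimal sketch of the simplest kind of GWA scan, in which the phenotype is regressed on one SNP at a time. It is an illustration only, not the models used by Tempelman's group: the data are simulated, and the function name `single_marker_scan`, the sample sizes and the effect sizes are all assumptions chosen for the example.

```python
import numpy as np

def single_marker_scan(genotypes, phenotypes):
    """Regress the phenotype on each SNP separately; return slopes and t-statistics."""
    n = genotypes.shape[0]
    x = genotypes - genotypes.mean(axis=0)   # centre each marker column
    y = phenotypes - phenotypes.mean()       # centre the phenotype
    sxx = (x ** 2).sum(axis=0)               # per-marker sum of squares
    slopes = (x.T @ y) / sxx                 # least-squares slope for each marker
    resid_ss = (y ** 2).sum() - slopes ** 2 * sxx
    std_err = np.sqrt(resid_ss / (n - 2) / sxx)
    return slopes, slopes / std_err

# Simulated data (hypothetical): far more markers than cows.
rng = np.random.default_rng(0)
n_cows, n_snps = 200, 2000
allele_freqs = rng.uniform(0.1, 0.9, n_snps)
genotypes = rng.binomial(2, allele_freqs, size=(n_cows, n_snps)).astype(float)

# Only a handful of markers truly affect the simulated trait.
true_effects = np.zeros(n_snps)
causal = rng.choice(n_snps, size=10, replace=False)
true_effects[causal] = rng.normal(0.0, 1.0, size=10)
phenotypes = genotypes @ true_effects + rng.normal(0.0, 1.0, n_cows)

slopes, t_stats = single_marker_scan(genotypes, phenotypes)
top_hits = np.argsort(-np.abs(t_stats))[:10]  # markers with the strongest signal
print("Top-ranked markers:", sorted(top_hits.tolist()))
print("Simulated causal markers:", sorted(causal.tolist()))
```

With thousands of tests and only a few hundred animals, some markers rank highly by chance alone, which is one reason models that analyze all markers jointly tend to be preferred.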
Some of the better-performing statistical GWA models are based on sophisticated “hierarchical Bayesian” analyses, but they require specification of the proportion of SNP markers that are believed to be associated with a particular trait. In these models, such a proportion is typically referred to as a “hyperparameter.”
Many scientists have arbitrarily “guessed” these and other hyperparameters rather than attempting to estimate them from the data at hand. Tempelman and his graduate students have demonstrated that estimating these hyperparameters is important to improving the sensitivity of GWA analyses and the accuracy of predictions of genetic merit.
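The difference between guessing and estimating such a proportion can be sketched with a deliberately simplified stand-in for the hierarchical Bayesian models described here. The example below assumes each marker yields an independent standardized test statistic that follows a two-component normal mixture, then estimates the proportion of associated markers with an expectation-maximization (EM) algorithm. The mixture form, the names `estimate_pi`, `pi` and `tau2`, and the simulated values are all assumptions made for illustration, not the published method.

```python
import numpy as np

def normal_pdf(z, var):
    """Density of a zero-mean normal distribution with variance `var`."""
    return np.exp(-0.5 * z ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def estimate_pi(z, pi=0.5, tau2=4.0, n_iter=500):
    """EM for z ~ (1 - pi) * N(0, 1) + pi * N(0, 1 + tau2).

    pi   : proportion of markers associated with the trait (the hyperparameter)
    tau2 : extra variance contributed by real marker effects
    """
    for _ in range(n_iter):
        # E-step: probability that each marker belongs to the associated group.
        alt = pi * normal_pdf(z, 1.0 + tau2)
        null = (1.0 - pi) * normal_pdf(z, 1.0)
        w = alt / (alt + null)
        # M-step: update the proportion and the effect variance.
        pi = w.mean()
        tau2 = max((w * z ** 2).sum() / w.sum() - 1.0, 1e-6)
    return pi, tau2

# Simulated statistics (hypothetical): 5 percent of markers carry a real effect.
rng = np.random.default_rng(1)
n_markers, true_pi, true_tau2 = 5000, 0.05, 16.0
is_associated = rng.random(n_markers) < true_pi
z = rng.normal(0.0, 1.0, n_markers)
z[is_associated] *= np.sqrt(1.0 + true_tau2)  # associated markers: N(0, 1 + tau2)

pi_hat, tau2_hat = estimate_pi(z)
print(f"estimated proportion: {pi_hat:.3f} (simulated value {true_pi})")
print(f"estimated effect variance: {tau2_hat:.1f} (simulated value {true_tau2})")
```

The starting values passed to `estimate_pi` play the role of the analyst's guess, and the algorithm revises them using the data rather than leaving them fixed.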
“To predict an animal’s performance with insight from the genome requires well-defined statistical models and computing power,” Tempelman said. “If we want to optimize traits like milk yield, reproductive performance or feed efficiency, no one discipline is solely sufficient on its own. We need to continue to work more seamlessly across fields such as genetics, nutrition and reproductive physiology in our scholarship to improve the performance, health and welfare of livestock.”
Reprinted from Michigan State University AgBio Research 2017 Annual Report