Pooled genotyping is one method of lowering the cost to breeders.

Consequently, recombination between the causal and marker loci will occur during the breeding process, and as allele frequencies change with each selection cycle, LD shifts, affecting the accuracy of predictive models over time. GS models therefore require regular updating, and model training becomes an important component of a modern breeding system. With the cost of genotyping rapidly decreasing and the recent release of multiple chromosome-scale, haplotype-phased genome assemblies for alfalfa, genomics is becoming a viable option for many smaller breeding programs. Recently, the use of GS has been investigated by alfalfa breeders for biomass yield, forage quality, and salinity tolerance. However, these studies were based on phenotypic and/or genotypic data at the individual plant level. Although such data are useful, alfalfa is often evaluated at the family level using half- or full-sib families and then marketed as a synthetic population. Andrade et al. proposed that GS may be better incorporated into an alfalfa breeding program by genotyping pooled families to obtain allele frequency marker data rather than individual genotype calls. One major takeaway from much of the work in GS is that the size of the training population plays a key role in the predictive ability of the final model. However, for less well-funded breeding programs, genotyping and phenotyping can quickly become prohibitively expensive as more material is included.
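To make the idea of allele frequency marker data from pooled families concrete, the following is a minimal sketch of one common way such a frequency can be estimated: as the proportion of sequencing reads carrying the alternate allele at a marker. It is an illustration only, not the procedure of Andrade et al. The function name, the `min_depth` threshold, and the read counts are hypothetical, and the estimate assumes each plant contributes roughly equal DNA to the pool.

```python
from typing import Optional


def pooled_allele_frequency(
    ref_reads: int, alt_reads: int, min_depth: int = 10
) -> Optional[float]:
    """Estimate the alternate-allele frequency at one marker in one pooled family.

    Assumption (not from the source): the frequency is taken as the share of
    reads carrying the alternate allele, and markers with fewer than
    `min_depth` total reads are treated as missing.
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return None  # too few reads to trust the estimate
    return alt_reads / depth


# Hypothetical read counts for one marker in two pooled families.
family_a = pooled_allele_frequency(ref_reads=30, alt_reads=10)
family_b = pooled_allele_frequency(ref_reads=3, alt_reads=2)
```

A continuous frequency per family replaces the discrete dosage call per plant, which is why a single pooled sample can stand in for many individual genotypes.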

Another is incorporating remote sensing and high-throughput phenotyping to reduce the expensive labor component of phenotyping the training population. Plant phenotyping is a core foundation of plant breeding and has evolved through the years. Accurate and rapid measurement of phenotypic data is essential to understanding the genetic basis for plant traits and for the subsequent generation of improved germplasm. For biomass yield in alfalfa, this traditionally required the destructive sampling, drying, and weighing of hundreds to thousands of experimental units multiple times over the 2-4 year lifespan of alfalfa breeding trials. This process is labor-intensive, time-consuming, and costly. Recently, improvements in camera technology, aerial photography, and data processing have resulted in the broad adoption of remote sensing and high-throughput phenotyping in agriculture, which can significantly reduce the high labor cost of phenotyping. Remote sensing allows for the accurate, efficient, and non-destructive estimation of biomass and has been shown to be useful for high-throughput phenotyping in breeding applications, including the prediction of biomass yield in large alfalfa breeding plots. What is not yet clear, however, is whether the same predictive ability transfers to the other plot types used in alfalfa breeding, namely family rows and minisward plots.

This is of particular interest for training a genomic selection model, where upwards of 1000 families need to be evaluated for optimal predictive ability. The benefits of successfully incorporating HTP into an alfalfa breeding program will be twofold. First, it will enable trial sizes to increase, which benefits training populations for genomic selection by allowing greater prediction accuracies and will also be useful for upscaling traditional evaluation trials. Second, non-destructive biomass measurement will allow the tracking of growth rates and other temporal traits throughout the season, something that is not currently feasible with traditional destructive harvest methods.

Alfalfa is one of the most widely grown perennial forage legumes in temperate and Mediterranean-climate regions worldwide, owing to its exceptional yield, high nutrition, broad adaptability, nitrogen fixation, and host of beneficial ecosystem services. Alfalfa hay grown in California predominantly supports the largest dairy sector in the USA, but also provides forage for sheep, beef, and horse production as well as a growing export market. Alfalfa is an allogamous autotetraploid and is characterized by severe inbreeding depression. Alfalfa is highly heterozygous, and cultivars are synthetic populations that exhibit high variability. Most breeding programs currently utilize recurrent phenotypic selection, where the best genotypes are recombined following evaluation trials that typically last 2-4 years. Despite the numerous benefits of alfalfa, its economic viability is under threat from an increasing yield gap relative to major cereal crops and other potential substitutes in the dairy ration. This yield gap has developed due to low rates of genetic gain for forage yield in alfalfa, particularly over the last 30 years, where progress has stalled completely.
This lack of yield improvement can be ascribed to a range of factors common in outcrossing perennial forages, namely long selection cycles, multiple harvests per year, small breeding investment, the inability to develop hybrids, the harvesting of all aboveground biomass, the need to maintain forage nutritive value, and significant genotype by environment interaction. However, yield improvement has occurred in other perennial forages such as perennial ryegrass and white clover; therefore, progress should be possible in alfalfa. Lamb et al. suggested that the lack of yield improvement in alfalfa is because less breeding focus has been placed on yield; instead, there has been a focus on improving tolerance to biotic and abiotic stresses.

Although this enables alfalfa to reach its yield potential, it is not increasing yield per se in populations under improvement. Furthermore, alfalfa yield is often selected indirectly based on evaluation of vigor on spaced plants or on family rows, which has been shown to be a poor proxy for forage yield in the dense swards used in commercial alfalfa production. Marker-assisted selection (MAS) is a useful tool for plant breeding programs and may be one way to improve the rate of genetic gain. Early research enabled breeders to identify molecular markers strongly linked to quantitative trait loci for a variety of important traits in alfalfa. However, MAS is primarily effective for traits controlled by relatively few genes with large effects. Complex traits, including yield, are usually controlled by many loci with small effects. In this case, genomic selection (GS) offers a compelling alternative to MAS by using a model that includes the effects of all markers in computing a genomic estimated breeding value (GEBV) for each individual in the population. Genomic selection can address one of the largest impediments to faster genetic gain in alfalfa: the need for multi-year evaluations that extend the length of each selection cycle. Selection can be made on genotypic information alone without the need for phenotypic evaluation, reducing the cycle length from 3-5 years to less than 6 months. With the cost of high-throughput sequencing decreasing and the recent publication of multiple chromosome-scale, haplotype-phased genome assemblies for tetraploid alfalfa, a robust genomic selection program is now feasible for many alfalfa breeding programs. Various studies have investigated the use of GS in alfalfa breeding for a range of traits including biomass yield, forage quality, and salinity tolerance.
Moderate prediction accuracies were obtained for biomass yield, stem digestible neutral detergent fiber, and leaf protein content, ranging from 0.3 to 0.4. The vigor of alfalfa under salt stress has also been assessed and a predictive model developed with a prediction accuracy of 0.793. Although the results of these studies suggest GS could be used to increase the rate of genetic gain for a range of traits in alfalfa, no empirical demonstration of GS has been published to date. We hypothesized that using genomic selection for high yield, based on a model developed from the phenotypic evaluation of clonally replicated genotypes, would result in higher yield than a population developed by genomic selection for low yield or by phenotypic selection. The objective of this study was to empirically test populations developed from a genomic selection model for forage dry matter yield in densely sown sward plots of alfalfa.

The germplasm used for genomic selection derived from a population called NY0358, created by Dr. Don Viands at Cornell University in the 1990s. We previously described this population, the NE-1010 clonal selection population, in an experiment using SSR markers for association analysis. Briefly, NY0358 was formed by intercrossing three elite, semi-dormant cultivars and recombining the resulting population twice. The NY0358 population underwent two cycles of selection for biomass yield using clonal evaluations at multiple locations.

About 200 individual plants were included at the beginning of each cycle. These plants were clonally propagated using stem cuttings, and three replications were planted in the field at each location. In each replication, three clones were included in a plot; thus, each individual genotype was replicated nine times at each location in each cycle. Yield data were collected across multiple harvests and multiple years on individual plots, bulking the biomass of the three clones within the plot. In the first cycle, data were obtained from Ithaca, New York; Ste.-Foy, Québec; and Ames, Iowa. The top-yielding 10% of genotypes, selected based on an across-location analysis of total annual yield, were recombined to form NY0847. A second cycle of phenotypic selection was conducted using NY0847; genotypes were clonally propagated as for Cycle 1 and yield data collected from Ithaca, NY and Ste.-Foy, Québec. The best 10% of genotypes from NY0847, based on a statistical analysis of total annual biomass yield from NY only, were intercrossed to form NY1221.

For genomic selection, we used a model developed from the initial clonal evaluation cycle total annual yield measured in NY only, because these data were more robust than those from the other locations. We based the model on total annual yield, which is a more important trait than yield of any individual harvest. We applied the model to seedlings from the population NY0847 and subsequently conducted a second cycle of genomic selection. We grew 19 or 20 individual seedlings from each of the 20 maternal families composited to create NY0847, for a total of 384 individual seedlings genotyped using GBS, as described previously, multiplexing 100 genotypes in a single lane of a HiSeq 2000 DNA sequencer. We aligned sequences with previously determined sequence tags to analyze only SNPs that had been part of the model. Following SNP scoring and imputation, we computed GEBVs for each individual plant.
Based on GEBVs, we selected the top 20 genotypes, restricting selections to no more than four individuals from any given maternal half-sib family to maintain variation in the population. These 20 individuals were intermated in the greenhouse by hand without emasculation to form the GSC1H population. An analogous population, GSC1L, based on the lowest 20 GEBVs, was also formed. In addition, a random selection of 20 plants from the 400-plant population was intermated as a control population, GSC1R. For the second cycle of selection, seeds of each maternal half-sib family used to form GSC1H were germinated, and DNA from 19 or 20 plants from each of the 20 families was isolated, for a total of 384 plants analyzed with GBS markers. We again selected the top and bottom 20 individuals based on GEBVs, as done for Cycle 1, and intercrossed them separately in the greenhouse to create GSC2H and GSC2L, respectively.

Experiments were established in April 2017 at two locations in the United States, each consisting of ten replications laid out in a randomized complete block design. The two locations were the Cornell University Research Farm in Ithaca, NY, on a Niagara silt loam with 973 mm average annual rainfall, and Tulelake, CA, on a Tulebasin mucky silty clay loam. The sowing rate was 20 kg ha⁻¹, with plots measuring approximately 1.5 m × 5 m. Each plot contained 8 rows spaced 17 cm apart. An alfalfa border was sown around the entire experimental plot area. Soil tests were conducted at each location and fertilizer applied to maintain P and K at recommended levels for high-yielding perennial forages. Trials were monitored for weeds, insects, and mammalian pests, with control measures conducted accordingly. Forage yields were estimated by mowing a swath through each plot, leaving a residual of 7 cm. Prior to harvest, alleyways between plots were mown to remove edge effects and ensure plots were of uniform length. The target maturity for harvest was bud to early flowering stage.
In Ithaca, the harvest area was 1 m by 4 m. There were nine harvests in total, three in each of 2018, 2019, and 2020, with no data collected in the establishment year. In Tulelake, the harvest area was 0.9 m by 4 m. There were 12 harvests in total: three in 2017, four in each of 2018 and 2019, and a single harvest in 2020.
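The GEBV-based selection step described above (the top 20 genotypes, with no more than four from any maternal half-sib family) can be illustrated with a short sketch. This is one plausible greedy reading of that rule, not the procedure actually used. The `Plant` record, the function name, and the GEBV values are hypothetical, and applying the same family cap when selecting the lowest GEBVs is an assumption.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Plant(NamedTuple):
    plant_id: str
    family: str  # maternal half-sib family
    gebv: float  # genomic estimated breeding value


def select_with_family_cap(
    plants: List[Plant],
    n_select: int = 20,
    max_per_family: int = 4,
    highest: bool = True,
) -> List[Plant]:
    """Greedily pick plants in GEBV order, skipping families already at the cap.

    highest=True ranks from the largest GEBV down; highest=False ranks from
    the smallest up (assumed here to mirror the low-yield selection).
    """
    ranked = sorted(plants, key=lambda p: p.gebv, reverse=highest)
    picked: List[Plant] = []
    per_family = defaultdict(int)
    for plant in ranked:
        if per_family[plant.family] >= max_per_family:
            continue  # family cap reached; move to the next-best plant
        picked.append(plant)
        per_family[plant.family] += 1
        if len(picked) == n_select:
            break
    return picked


# Made-up GEBVs: 20 families of 19 seedlings each, purely for illustration.
demo = [
    Plant(f"F{f}_{i}", f"F{f}", float(100 - 10 * f - i))
    for f in range(20)
    for i in range(19)
]
high_group = select_with_family_cap(demo)
low_group = select_with_family_cap(demo, highest=False)
```

The cap trades a small amount of expected GEBV for broader family representation among the intermated parents, which is the stated reason for the restriction.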