A second closely related principle is naturalization: genetic adaptation to local conditions

Using the number of cities in which a tephritid species has been detected as a proxy for area infested, figure 2a shows that in 1960 there were only two California cities in which tephritids had been detected. However, by 1970 the number of cities with a tephritid detection had increased to 13, by 1990 to a remarkable 200 cities, and by 2010 to more than 300 cities. Although 10 different species contributed to these totals, A. ludens, C. capitata and B. dorsalis contributed the most, appearing in 77, 168 and 245 new cities, respectively, by 2012.

Reintroductions versus established populations

A long-standing explanation for recurring fruit fly detections is that flies are continually being reintroduced, either in cargo shipments or by people carrying infested fruit from fruit-fly-infested regions of the world. We test this hypothesis and an alternative one to account for the recurring detections. Both hypotheses were originally framed by Carey for the medfly: the reintroduction hypothesis (recurring tephritid detections are due to repeated introductions) and the established population hypothesis (recurring detections are due to resident fly populations).

We assess the strength of these two hypotheses by comparing and identifying inconsistencies in the relative numbers, diversity and frequency of detections in California and in other fruit-fly-friendly regions.

Tephritids are intercepted at all airports across the USA, including all airports located in the southern states considered at risk for tephritid introductions. California ports of entry accounted for less than 20% of all insects intercepted in at-risk states. Assuming that insect interceptions can serve as proxies for the relative propagule pressure, if reintroductions were the primary source of detections, then the number of fruit fly detections in fruit-fly-friendly regions of the USA outside of California, compared with detections in California, should be roughly five to one, because California contributes about 20% of interceptions. Yet no tephritids were detected in the majority of states that are deemed at risk for fruit fly introduction and maintain robust monitoring programmes.

Although the medfly and the olive fly are the only two tropical tephritid species that are long-term residents of fruit-fly-friendly regions in the European Union, interception rates of other tephritid species at ports of entry throughout the EU are quite high. One source of evidence for this is EUROPHYT, the European notification system for plant health interceptions. This system’s database revealed that, of the total number of interceptions of harmful organisms in plants and plant products imported into the EU in 2011, fully one-third were tephritids, and showed that, from 2007 to 2009, more than 700 individual tephritids in three genera and nine species not established in Europe were intercepted at Paris’s International Airport.

If the diversity and number of tephritid interceptions at the scores of international airports located in fruit-fly-friendly southern Europe, northern Africa and the Middle East are similar to those at the Paris airport, then the tephritid propagule pressure throughout this world region is far greater than in California. Yet, despite this pressure, with the exception of the peach fruit fly, discovered in 1998 in Egypt, no other tropical tephritids have been detected throughout the Mediterranean Basin for a century.

Several lines of evidence support the hypothesis that from five to nine tephritid species have become self-sustaining populations in the state: their abrupt first appearance in the mid-1950s followed by high incidence of repeat detections, their marked seasonality and northward spread, the lack of new detections and/or introductions of new species in most other at-risk regions in the USA and Mediterranean Basin, and the high probabilities of repeatedly detecting many of the tephritid species in California while at the same time not detecting them in other at-risk areas. These findings do not rule out the possibility of multiple introductions into the state for tephritids such as the medfly. However, the multiple detections of several species in nearly the same location anywhere from 10 to 30 years after they were first detected, without any captures during interim years, suggest that, as for many other invasive species, tephritids can be present in low numbers for decades. Indeed, one of the important features of lags in invasion biology that probably also applies to the tephritid invasion of California is that invasions are often not recognized until they are over.
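The pattern described above, repeat detections separated by long intervals without captures, can be summarised with a few simple metrics. The Python sketch below is illustrative only: the function name `detection_metrics`, the metric names and the example years are hypothetical and are not taken from the detection records analysed here.

```python
from typing import Dict, List


def detection_metrics(years: List[int]) -> Dict[str, int]:
    """Summarise a detection record from the years in which a species was captured.

    Hypothetical helper for illustration; not the authors' method.
    """
    if not years:
        raise ValueError("at least one detection year is required")
    unique_years = sorted(set(years))
    # Number of capture-free years between consecutive detection years.
    gaps = [later - earlier - 1
            for earlier, later in zip(unique_years, unique_years[1:])]
    return {
        "first_year": unique_years[0],
        "last_year": unique_years[-1],
        # Years from first to last detection, inclusive.
        "capture_span": unique_years[-1] - unique_years[0] + 1,
        "years_captured": len(unique_years),
        "longest_gap": max(gaps, default=0),
    }


# Made-up record: an initial find followed by repeat finds after long silent intervals.
example = detection_metrics([1966, 1967, 1984, 1984, 2008])
print(example)
```

A long `longest_gap` combined with a small `years_captured` is the kind of signature the text attributes to low-level resident populations, although any threshold separating the two hypotheses would have to be chosen by the analyst.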

Our findings that multiple species of tropical tephritids have self-sustaining and thus established populations in California have profound economic implications. For example, a 1995 study estimated that medfly establishment alone would result in $493 million to $875 million in annual direct costs, and the imposition of an embargo would cause an additional loss of $564 million. The state economy could lose $1.2 billion in gross revenue and more than 14 000 jobs. However, two aspects of the invasions are advantageous for planners, programme directors and policy makers. The first is that local population sizes for all species are extremely small, and therefore likely to continue to be subdetectable. Therefore, based on phytosanitary standards of the International Plant Protection Convention, most regions of the state should continue to be classified as risk-free by trading partners. The second aspect of the invasions that can be exploited for the longer term involves the invasion lags, which imply that there can be relatively long windows of opportunity for developing new protocols and programmes. Commodity certification protocols can be developed for the creation of fly-free and low-prevalence zones, as can long-term research programmes on tephritid biology and management.

Because the likelihood of slowing the spread of or eradicating an alien pest depends heavily upon its residency time, a basic invasion biology canon is that early detection is critical for rapid response. Our results reveal that, because the sources of repeat detections are captures from established populations rather than from reintroduced ones, in most cases ‘early detection’ is a misnomer when applied to tephritid detections at all scales.
Because this expression is often inaccurate, it is also misleading inasmuch as it implies that a policy primarily directed at preventing new introductions will solve the problem of recurrent detections or infestations.

As is true for many alien insect populations, the majority of tephritid population growth and spread in the state is subdetectable because of the small size and cryptic habits of all life stages, the slow pace of naturalization processes, and suppression of populations by intervention programmes. In cancer diagnostics, this is referred to as the ‘rare-event detection problem’; in the context of fruit fly detections, the parallel concept is the difficulty in discovering exceedingly rare, scattered, ultra-small populations of tephritids that are mostly in pre-adult stages, hidden among millions of properties and tens of millions of micro-niches. The scores of examples of repeat tephritid finds within a small region of California, separated by decades, suggest that the efficiency of detecting small populations of fruit flies is grossly overestimated, and that the actual chances of discovering populations that are so tiny and scattered are vanishingly small. Our findings are consistent with two interrelated invasion biology principles that underlie the ability of tephritid populations to establish and maintain residency at ultra-low, cryptic and insidious population levels. The first involves what Simberloff refers to as the ‘mysterious lag phase’ in which new populations experience delayed growth. It is unlikely that the magnitude of the lag period in tephritids would be similar to the 150+ years reported for some introduced plant species. However, it is likely that tropical species of tephritids that are introduced to different climatic regions experience major population lags much like the multidecade lags observed in the melon fly in Africa and the cherry fruit fly in Europe.
Recent studies suggest that many species’ invasion success may depend more heavily on their ability to respond to natural selection than on broad physiological tolerance or plasticity, and could also result from the need for multiple invasions to facilitate a sufficient evolutionary response.

Although it is widely believed that human movement of infested plant material plays a major role in spreading introduced pests, capture patterns for the Mexican fruit fly suggest that this is not the case for this species and, by extension, may not be the case for many of the other invasive tephritids. For example, in 2011, 43 million vehicles and 17 million pedestrians crossed the six ports of entry from Mexico to California.

Assuming that the direction of movement for roughly half of these vehicles and people was from Mexico to California, and if humans entering and dispersing around the state were responsible for the Mexico-to-California as well as the within-state movement of the Mexican fruit fly, then this species should have been detected more or less randomly throughout the state. But the vast majority of all A. ludens detections for nearly 60 years have been in the same areas in which this species continually reoccurs. At the same time, there have been virtually no discoveries of A. ludens in the regions of the state with extraordinarily high movement of Latino populations, such as the main agricultural areas in the Central, Salinas and Imperial Valleys.

We know of no historical precedent in the invasion biology literature similar to the tephritid situation in California, where not only are several insect species within a single family invading a region at the same time, but the group also contains species within multiple genera. The California tephritid invasion thus provides unique opportunities to compare the invasive properties of species across different genera with similar life histories, to explore reasons why 17 tephritid species have been detected in California but few to none in many other fruit-fly-friendly regions of the USA and the world, and to develop new population theory for ultra-low, cryptic populations.

CDFA and USDA declared 100% success for each of the several hundred eradication programmes that were launched against fruit flies in California. These declarations were accurate according to legal criteria specified by the USDA and the International Phytosanitary Commission; that is, a region is declared fruit-fly-free when no flies have been detected for a time period corresponding to three generations.
Although these legal criteria are required for regulatory compliance to enable growers to ship their produce, our results reveal that the more stringent ecological requirements for eradication declaration were not met in the majority of cases. This underscores the continuing problem in the insect eradication literature of loosely and inaccurately applying a term that has a clear definition. Those interested in insect eradication can learn much from the epidemiological literature on eradication programmes regarding frameworks for evaluating systematically the potential for eradication, clear definitions of concepts and terms, and perspectives on the preconditions, difficulties and challenges of successfully eradicating insects.

Although some authors have characterized population establishment as self-sustaining populations, none has attempted to specify criteria. The likely reason is that, because of the uncertainty resulting from a combination of demographic stochasticity and detection constraints, it is virtually impossible to define a precise point at which a small population becomes self-sustaining. In the light of this problem, we propose that early-stage invasions can be categorized using methods similar to those we used for Californian tephritids. The establishment category for each species is necessarily subjective and can be based on a combination of detection metrics, including capture span, total number of years captured, inter-year frequency of detections, total numbers of individuals detected, within-state distribution and spatial patterns of apparent spread.

The organoleptic properties of tomato fruit are defined by a set of sensory attributes, such as flavor, fruit appearance and texture. Flavor is defined as the combination of taste and odor.
Intense taste is the result of an increase in gluconeogenesis, hydrolysis of polysaccharides, a decrease in acidity and accumulation of sugars and organic acids, while aroma is produced by a complex mixture of volatile compounds and degradation of bitter principles, flavonoids, tannins and related compounds. Fruit color is mainly determined by carotenoids and flavonoids, while textural characteristics are primarily controlled by the cell wall structure in addition to cuticle properties, cellular turgor and fruit morphology. In recent years, tomato fruit organoleptic quality has been investigated at both the genetic and biochemical levels in order to obtain new varieties with improved taste. Recently, the genomes of traditional tomato cultivars such as San Marzano and Vesuviano, considered important models for fruit quality parameters, have been sequenced. SM, originating from the Agro Sarnese-Nocerino area in southern Italy, produces elongated fruits with a peculiar bittersweet flavor.

The pre-drying sulfur treatment had a significant impact on the survival of these pathogens

Taking dried peaches made without sulfur as an example, Salmonella survived the entire 180 days of storage with a final level of 4.38 ± 0.08 log CFU/g when the storage temperature was 5 °C. When the storage temperature was 20 °C, Salmonella fell below the limit of detection after 90 days of storage, although it could still be recovered by enrichment from two of six samples that were tested. The presence of sand together with the presence of sulfur sped up the die-off of Salmonella. The impact of dry inoculation might be due to the low initial inoculation level or the limited nutrients available when Salmonella cells were dried on the surface of the sand. As shown in Figure 2.1D and Table 2.3, Salmonella fell below the limit of detection after 30 days of storage at 5 °C and could no longer be detected even by enrichment. When the storage temperature was 20 °C, Salmonella fell below the limit of detection and could not be detected by enrichment after 15 days of storage. Injured cells were observed starting from Day 0. The difference between the counts obtained from TSAR and XLT-4R indicated the formation of injured cells. These differences were as large as 3.68 log CFU/g.

Survival of E. coli O157:H7 on dried peaches.

Figure 2.2 and Table 2.4 show the survival of E. coli O157:H7 on the dried peaches. The E. coli O157:H7 populations on the wet-inoculated peaches and peaches with sulfur immediately after inoculation were 9.40 ± 0.32 and 9.43 ± 0.45 log CFU/g, respectively.

The initial E. coli population on the wet-inoculated dried peaches was 8.70 ± 0.26 log CFU/g. After 5 days of storage, an approximately 1 log reduction was observed on the inoculated dried peaches made without sulfur and stored at 5 and 20 °C. From Day 5 to Day 15, a greater reduction was observed on peaches made without sulfur that were stored at 20 °C: a 3.53 log reduction was observed at 20 °C, while a 0.97 log reduction was observed on samples stored at 5 °C. From Day 15 to Day 60, E. coli O157:H7 further declined to 3.77 ± 0.40 log CFU/g on inoculated dried peaches without sulfur stored at ambient temperature and was maintained at similar levels from Day 15 to Day 60. The surviving E. coli O157:H7 on wet-inoculated dried peaches made without sulfur fell below the limit of detection after 90 days of storage at both refrigerated and ambient temperatures and could no longer be detected by enrichment after 150 days of storage. The addition of sulfur sped up the die-off of E. coli O157:H7. When the storage temperature was 5 °C, E. coli O157:H7 on dried peaches made with sulfur decreased to 5.74 ± 0.12 log CFU/g after 15 days of storage. After an additional 15 days, E. coli O157:H7 could not be detected by either direct plating or enrichment. When stored at ambient temperature, E. coli O157:H7 fell below the limit of detection after 15 days of storage and could not be detected via enrichment after 30 days of storage. Similar to what was observed for Salmonella, lower initial inoculation levels were seen on sand-inoculated samples.
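The log reductions quoted above are differences between counts expressed in log CFU/g. As a minimal sketch, the Python snippet below computes one such difference and converts it to a surviving fraction. The function names are hypothetical, and the pairing of 8.70 log CFU/g (initial) with 3.77 log CFU/g (read here as the Day 60 level at ambient temperature on peaches without sulfur) is an interpretation of the text rather than a value from the tables.

```python
def log_reduction(initial_log_cfu: float, final_log_cfu: float) -> float:
    """Reduction in log10 CFU/g between two sampling points."""
    return initial_log_cfu - final_log_cfu


def surviving_fraction(reduction_log: float) -> float:
    """Fraction of the initial population remaining after a given log10 reduction."""
    return 10.0 ** (-reduction_log)


# Values taken from the paragraph above; their pairing is an assumption.
initial = 8.70        # log CFU/g, wet-inoculated dried peaches before storage
day60_ambient = 3.77  # log CFU/g, without sulfur, ambient storage

reduction = log_reduction(initial, day60_ambient)
print(f"log reduction: {reduction:.2f}")
print(f"surviving fraction: {surviving_fraction(reduction):.2e}")
```

Working in log units keeps reductions additive across storage intervals, which is why the text can report successive reductions for Day 0 to 5 and Day 5 to 15 separately.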

When the storage temperature was 5 °C, E. coli O157:H7 on dry-inoculated peaches made without sulfur gradually decreased from 6.52 ± 0.45 log CFU/g to 3.27 ± 0.06 log CFU/g on Day 120. No E. coli O157:H7 could be detected via plating or enrichment on Day 180. When the storage temperature was 20 °C, E. coli O157:H7 decreased by only approximately 1.3 log by Day 60. After Day 60, a sharp decrease was seen on Day 90, when no E. coli O157:H7 could be detected by plating. The pathogen could be detected by enrichment until Day 150. No pathogen could be detected on Day 180. On the dry-inoculated peaches made with sulfur treatment, a 2.15 log reduction was observed in the first 5 days of storage at 5 °C. While E. coli O157:H7 could still be detected by enrichment on Day 15, it could not be detected after Day 30. Storing at ambient temperature increased the reduction seen on Day 5: a greater than 4.25 log reduction was observed in the first 5 days. E. coli O157:H7 could not be detected after 15 days of storage. Differences between counts obtained from TSAR and MACR were also observed, owing to the formation of injured cells during inoculation and storage.

Survival of L. monocytogenes on dried peaches.

Figure 2.3 and Table 2.5 show the survival of L. monocytogenes on the dried peaches. The L. monocytogenes populations on the wet-inoculated peaches and peaches with sulfur immediately after inoculation were 9.52 ± 0.70 and 9.54 ± 0.34 log CFU/g, respectively. The initial L. monocytogenes population on the wet-inoculated dried peaches was 8.57 ± 0.04 and 8.20 ± 0.23 log CFU/g as determined on TSAR and MOXR, respectively. When the storage temperature was 5 °C, L. monocytogenes decreased to 7.92 ± 0.06 log CFU/g and remained at similar levels from Day 5 to Day 90. When the storage temperature was 20 °C, a 2-log reduction was observed during the first 5 days of storage.
Another sharp decrease in the number of surviving L. monocytogenes was observed between Day 30 and Day 60.

An approximately 2.4 log reduction was seen on TSAR. Starting from Day 90, L. monocytogenes could not be recovered by either direct plating or enrichment on peaches made without sulfur. The pre-drying sulfur treatment sped up the die-off of L. monocytogenes. Starting from Day 15, L. monocytogenes could not be recovered from any dried peaches made with sulfur treatment stored at either temperature. Dry inoculation yielded lower initial inoculation levels compared with wet inoculation. There were 6.56 ± 0.09 log CFU/g of L. monocytogenes on peaches made without sulfur treatment and 5.81 ± 0.03 log CFU/g of L. monocytogenes on peaches made with sulfur treatment. This lower inoculation level together with the presence of sulfur led to a significant reduction of L. monocytogenes during the first 5 days of storage regardless of the storage temperature. Starting from Day 5, L. monocytogenes could only be detected on dry-inoculated dried peaches made with sulfur treatment by enrichment. On dried peaches made without sulfur treatment, the surviving L. monocytogenes gradually decreased from Day 0 to Day 120 when the storage temperature was 5 °C. Starting from Day 150, L. monocytogenes could no longer be recovered from dry-inoculated dried peaches without sulfur treatment. When the storage temperature was 20 °C, L. monocytogenes could not be detected by either plating or enrichment starting from Day 120.

Sulfur measurement.

The amounts of total sulfur dioxide and free SO2 were measured during the first 30 days of ambient storage. The measurement was suspended after Day 30 due to the “shelter-in-place” order issued in March 2020. As shown in Table 2.6, the initial levels of free SO2 and total SO2 in the dried peaches made with sulfur treatment were 830 ± 32 mg/kg and 2,108 ± 32 mg/kg, respectively. The wet inoculation led to a loss of approximately 122 mg/kg of free SO2 and approximately 73 mg/kg of total SO2.
The 2-day drying after inoculation only impacted the total SO2 level and did not impact the free SO2 level. No significant change was observed in free SO2 from Day 0 to Day 5. A significant loss of free SO2 was seen from Day 5 to Day 15. On Day 30, there were 393 ± 46 mg/kg of free SO2 and 1,544 ± 12 mg/kg of total SO2 present in these dried peaches.

The survival of common bacterial pathogens, Salmonella, E. coli O157:H7, and L. monocytogenes, was monitored on dried peaches made without and with sulfur pre-drying treatment. Two inoculation carriers were applied, and two storage temperatures were tested. The pre-drying sulfur treatment had a significant impact on the survival of these pathogens, and this impact applied to all pathogens tested. For example, Salmonella could not be recovered by enrichment from wet-inoculated dried peaches made with sulfur treatment after 60 days of storage at 5 °C and 15 days of storage at 20 °C. When Salmonella was inoculated onto dried peaches made without sulfur treatment, 4.95 ± 0.07 log CFU/g survived on these samples by the end of 180 days of storage at 5 °C. Even when the storage temperature was 20 °C, Salmonella was still detected via enrichment on dried peaches made without sulfur.

The impact of sulfur treatment on the survival of pathogens was also reported by Liu et al. Similarly, no Salmonella cells were recovered from sulfur-treated apricots via enrichment after 90 days of storage at 22 °C, while ~2.5 log CFU/g of Salmonella was recovered from dried apricots made without sulfur dioxide. Although sulfur dioxide treatment facilitates bacterial die-off, it has the potential to induce asthmatic reactions in some people. The free SO2 levels detected in the dried peaches made with sulfur used in this study were higher than previously reported numbers. Although the processors did label the packages with “made with sulfur treatment”, additional studies may be needed to gain insight into the sulfur levels present in samples sold at farmers markets and the changes in free and total SO2 during storage. Storage temperature is another factor that has a significant impact on pathogen survival. In general, pathogens survived at a higher level for a longer period of time at low temperatures than at ambient temperature. For example, Salmonella survived on dried peaches made without sulfur at 5 °C for up to 180 days with a final level of 4.59 log CFU/g on wet-inoculated ones and 4.38 ± 0.08 log CFU/g on dry-inoculated ones. On the same samples stored at ambient temperature, Salmonella could only be detected via enrichment after 90 days of storage, indicating that the surviving level was below 1.9 log CFU/g. Cuzzi et al. found similar results. In their study, L. monocytogenes was inoculated onto dried apples, strawberries and raisins with sand and stored at 4 °C and 23 °C. Since L. monocytogenes could not be recovered from inoculated dried apples at Day 0, only the survival in dried strawberries and raisins was monitored in that study. When the storage temperature was 23 °C, L. monocytogenes decreased rapidly by greater than 4 and 3.6 log CFU/g after 14 and 7 days of storage on raisins and dried strawberries.
However, when the storage temperature was 4 °C, L. monocytogenes only decreased by approximately 0.1 and 0.2 log CFU/g/month. After 336 days of storage, L. monocytogenes only decreased by 1.4 and 3.1 log CFU/g on raisins and strawberries, respectively. The impact of inoculation carriers on the survival of pathogens was complicated by the fact that different carriers sometimes led to different initial inoculation levels before storage. For example, the wet inoculation delivered 9.45 ± 0.06 log CFU/g of Salmonella to dried peaches made without sulfur on Day 0, while the dry inoculation delivered an initial inoculation level of 7.26 ± 0.14 log CFU/g to dried peaches. Lower initial inoculation levels on sand-inoculated samples were also seen for E. coli O157:H7. In this case, the impact of carriers cannot be fully studied. In the study conducted by L. R. Beuchat and Mann, two inoculation methods were used to inoculate the dried fruits. One was misting dried fruits with an aqueous suspension of a 5-serotype cocktail of Salmonella, and the other was mixing the dried fruits with sand on which a 5-serotype cocktail had been dried. The authors found that the survival of Salmonella on dried cranberries, raisins, strawberries, and date paste inoculated using the dry carrier and wet carrier followed similar trends. In the study conducted by Feng et al., plate-grown E. coli O157:H7 were inoculated onto in-shell hazelnuts via wet or dry carriers. After that, samples were stored at 24 ± 1 °C for 12 months. Their results showed that E. coli O157:H7 declined more rapidly on sand-inoculated hazelnuts than on wet-inoculated ones, although the initial inoculation levels before storage were similar. In the study conducted by Liu et al., Salmonella was inoculated onto dried apricots made with or without sulfur. The liquid inoculum was diluted for wet inoculation so that the same initial inoculation levels were achieved for both wet- and dry-inoculated dried apricots.
Based on their results, Salmonella survived at higher levels for a longer period of time on sand-inoculated dried apricots.

The ABA-treated fruit also showed a higher number of stained xylem vessels

The ABA-sprayed plants had an average fruit phloem sap uptake of 0.778 ml fruit⁻¹ d⁻¹ and an average phloem sap solute concentration of 301 mg ml⁻¹. Therefore, the solute accumulation per day was 1.04 × 144.3 = 150.07 mg fruit⁻¹ d⁻¹ in non-ABA-sprayed plants and 0.778 × 301 = 234.18 mg fruit⁻¹ d⁻¹ in ABA-sprayed plants. The results thus showed that ABA-sprayed plants had higher phloem solute accumulation per fruit than non-ABA-sprayed plants from 15 to 30 DAP. According to the present data and other studies, ABA could also be acting at the whole-plant level as a signal triggering carbohydrate accumulation and osmotic adjustment in sink organs. In addition, spraying peach fruit with ABA has been shown to increase the activity of sorbitol oxidase, a predominant enzyme in the metabolism of the translocated sugar alcohol sorbitol, which was followed by an enhanced sugar accumulation in the fruit. The higher phloem sap solute concentration in ABA-sprayed plants can decrease fruit apoplastic solute potential, which is then equilibrated by a parallel decline in fruit total water potential. Under these conditions, higher fruit solute accumulation can increase the water potential gradient between the fruit and stem, favouring fruit xylem sap uptake.
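The solute accumulation figures above are the product of daily phloem sap uptake and sap solute concentration. The short Python sketch below reproduces that arithmetic from the values quoted in the text; the function and variable names are hypothetical.

```python
def solute_accumulation(sap_uptake_ml_per_day: float,
                        solute_conc_mg_per_ml: float) -> float:
    """Phloem solute accumulation per fruit per day (mg), as uptake x concentration."""
    return sap_uptake_ml_per_day * solute_conc_mg_per_ml


# Values quoted in the text (ml per fruit per day, mg per ml).
non_aba = solute_accumulation(1.04, 144.3)
aba = solute_accumulation(0.778, 301.0)

print(f"non-ABA-sprayed: {non_aba:.2f} mg per fruit per day")
print(f"ABA-sprayed:     {aba:.2f} mg per fruit per day")
print(f"ratio ABA / non-ABA: {aba / non_aba:.2f}")
```

The calculation makes the trade-off explicit: ABA-sprayed plants took up less phloem sap per day, but the roughly twofold higher solute concentration more than offset the lower volume.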

Accordingly, the present results show that whole-plant ABA spray treatment not only increased phloem solute accumulation per fruit, but also decreased leaf transpiration, maintaining a higher stem water potential and higher total fruit water uptake, compared with non-ABA-sprayed plants.

Following irrigation in the morning, tomato plants treated with ABA had a smaller increase in xylem sap flow rate into the leaves compared with non-treated plants, presumably due to suppression of stomatal opening. In all treatments, the increase in xylem sap flow after the time of irrigation in the morning probably reflected the combined effects of leaf rehydration as well as increasing light intensity stimulating stomatal opening, and increasing VPD from increased air temperatures and decreased relative humidity in the greenhouse environment. The reduction in leaf xylem sap flow after 15:30h to 16:30h was presumably a result of the reverse changes in the environmental conditions observed early in the day after irrigation. These results and other studies also show a direct relationship between high leaf transpiration and higher leaf Ca2+ uptake, suggesting that leaf Ca2+ accumulation is dependent only on leaf xylem sap uptake triggered by leaf transpiration rates. In addition, the data also show that Ca2+ concentrations in the leaf xylem sap extracted by pressurizing the leaves or by inducing leaf guttation were similar, suggesting that there is no significant Ca2+ contamination of leaf xylem sap when the leaves are cut and pressurized in the pressure chamber for xylem sap extraction.

In that case, ABA treatment may have also reduced the hydraulic resistance within the fruit, favouring xylemic water movement in the fruit towards the blossom-end tissue, provided a hydrostatic gradient responsible for xylem sap flow was present in the fruit. Since Ca2+ is believed to be mobile in the plant exclusively through the xylem vessels, the observed increase in xylem sap flow towards the fruit in the pedicel, and a reduced hydraulic resistance within the fruit, may explain the observed higher fruit Ca2+ accumulation in ABA-treated plants. Neither effect was observed in ABA-dipped fruit, suggesting that changes in Ca2+ partitioning in the plant are responsive only to whole-plant ABA treatment. Fruit xylem sap uptake increased after irrigation in all treatments until 15:30h to 16:30h and decreased thereafter at both 15 and 30 DAP. Similar to the leaves, this pattern could be explained by the combined effects of an increase in plant water content right after irrigation and an increase in VPD early in the day that increased the evaporative demand due to increasing air temperatures and decreasing relative humidity during the daytime. Late in the day, decreasing SWP due to plant water loss, together with a decrease in the VPD and consequently in the evaporative demand due to decreasing air temperatures and increasing relative humidity, could limit xylem sap flow into the fruit, resulting in the observed decrease in fruit xylem sap uptake. At night, the VPD was low but not zero, and continued plant water loss under these conditions may have been associated with the observed reverse flow of fruit xylem sap at 15 DAP for non-ABA-treated plants. Although a reverse xylem sap flow was observed later in the irrigation cycle, the fruit growth rate was always positive, indicating that phloem sap uptake maintained the positive growth rates even under reverse xylem sap flow conditions.
The reverse flow of fruit xylem sap was not observed at 30 DAP, possibly because of higher fruit solute content compared with 15 DAP. The higher solute content decreased fruit water potential, which possibly increased the strength of the fruit as a sink for xylemic sap uptake under limited xylem conductivity conditions. The present results showed that fruit xylem sap uptake decreased from 15 to 30 DAP in tomato.

Previous studies have shown that phloem may represent 76–83% and xylem may represent 17–24% of fruit peduncle water uptake at early stages of growth and development. Consistent with the data presented, other studies have also shown that at later stages of growth and development, the xylem contribution to fruit water uptake decreases due to loss of xylem functionality and/or reduction in the hydrostatic gradient responsible for xylemic sap uptake and movement in the fruit. However, other studies have shown that xylem transport into trusses of tomato fruit cultivar Gourmet remained functional throughout the first 8 weeks of growth. In addition, these studies showed that ~75% of water net influx into the fruit occurred through the external xylem and ~25% via the perimedullary region, which contains both phloem and xylem. Differences in the phloem/xylem ratio of fruit sap uptake presented in the literature could be attributed to different genotypes and/or the growing conditions of each study. In future studies, direct measurements of phloem sap uptake into the fruit using nuclear magnetic resonance should be carried out for the same tomato cultivar and growing conditions as used in the present study to compare precisely the methods and the results obtained.

Although no statistically significant changes in Ca2+ concentrations in stem xylem sap were observed among the treatments, spraying plants with ABA increased the Ca2+ concentration in the xylem sap moving into the fruit. The movement of Ca2+ in the xylem vessels depends on adsorption and desorption of Ca2+ from active exchange sites within the cell walls. In that case, fruit of ABA-sprayed plants possibly had exchange sites within the xylem cell walls that were more saturated with Ca2+, maintaining higher levels of soluble Ca2+ in the xylem sap stream. In addition, evidence suggests that special nutrient transport systems exist at the interface between living cells and xylem vessels.
The higher Ca2+ concentration observed in the xylem sap of the peduncle of fruit from ABA-sprayed plants could be the result of the higher flow rate of xylem sap into the fruit, leading to a higher saturation of Ca2+ binding sites in the xylem vessels and cell uptake requirements that reduced Ca2+ binding to active exchange sites in the cell walls as well as the Ca2+ uptake into living cells at the interface with the xylem vessels.

Spraying tomato plants with ABA increased the Ca2+ concentration and Ca2+ accumulation in the pericarp tissue at the fruit peduncle end by increasing fruit xylem sap uptake, decreasing fruit phloem sap uptake, increasing Ca2+ concentration in the xylem sap moving into the fruit, and possibly by increasing phloem Ca2+ transport into the fruit. The results show that, compared with water spray treatment, ABA spray treatment increased fruit xylem sap uptake 4.72-fold, fruit xylem sap Ca2+ concentration 1.28-fold, and fruit growth 1.41-fold. These results suggest that the increase in fruit xylem sap uptake was the most important effect of ABA spray treatment leading to the observed higher fruit Ca2+ accumulation from 15 to 30 DAP. The Ca2+ accumulation in fruit tissue estimated by multiplying the xylem sap Ca2+ concentration in the fruit peduncle by its respective flow rate into the fruit from 15 to 30 DAP was ~84% of the Ca2+ accumulation quantified by the difference in total fruit Ca2+ content observed at 30 DAP minus the total fruit Ca2+ content observed at 15 DAP.
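The comparison between estimated and quantified Ca2+ accumulation can be illustrated with a minimal sketch. The function names, units, and input values below are hypothetical and are not measurements from this study; the numbers were chosen only so that the xylem-based estimate comes out near the ~84% reported above.

```python
def estimated_ca_accumulation(ca_conc, flow_rate, days):
    """Xylem-delivered Ca2+ = sap Ca2+ concentration x sap flow rate x duration.

    Units are assumed for illustration: mg Ca2+ per mL sap, mL sap per day, days.
    """
    return ca_conc * flow_rate * days


def quantified_ca_accumulation(total_ca_start, total_ca_end):
    """Ca2+ gained by the fruit = total content at 30 DAP minus content at 15 DAP."""
    return total_ca_end - total_ca_start


# Hypothetical inputs (not data from the study).
estimated = estimated_ca_accumulation(ca_conc=0.02, flow_rate=1.4, days=15)
quantified = quantified_ca_accumulation(total_ca_start=0.30, total_ca_end=0.80)

# Share of the quantified accumulation explained by xylem delivery alone.
xylem_share = estimated / quantified
# The remainder is the part that xylem delivery does not account for.
unexplained_share = 1.0 - xylem_share
```

Under these assumed inputs, the unexplained share corresponds to the portion of Ca2+ accumulation that the xylem estimate does not cover, which is the gap the text attributes to a possible phloem contribution.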

Considering that fruit water uptake is via the xylem and phloem, the results suggest that the phloem may have also contributed to fruit Ca2+ uptake under the experimental conditions described in this study. The results also show a greater difference between the quantified and estimated Ca2+ accumulation in the fruit of ABA-sprayed plants than in the fruit of other treatments, suggesting that spraying plants with ABA also enhanced fruit phloem Ca2+ uptake. Considering that spraying tomato plants with ABA decreased fruit phloem sap uptake, it is possible that this treatment increased Ca2+ concentration in the phloem sap to increase fruit Ca2+ uptake to compensate for the reduction of phloem sap uptake. These results agree with previous studies suggesting that phloem can also have an important contribution to fruit Ca2+ uptake depending on the phloem sap Ca2+ concentration and phloem sap flow rate into the fruit. In the present study, it was assumed that fruit transpiration rates were similar among all treatments. Future studies related to the effect of ABA on xylem and phloem fruit water uptake should include direct measurements of fruit transpiration rates. In the xylem vessels, after reaching the peduncle end of the fruit, Ca2+ can be taken up by the cells, bind to active exchange sites within the cell walls, or remain soluble in the xylem vessels to be translocated towards the blossom-end tissues of the fruit. Accordingly, the present results show that higher xylem sap and tissue Ca2+ content at the fruit peduncle end resulted in higher fruit Ca2+ translocation to and Ca2+ accumulation in the blossom-end tissues in response to whole-plant ABA treatment.
Dipping the fruit in ABA did not affect xylem sap or tissue Ca2+ content at the fruit peduncle end, but resulted in higher Ca2+ accumulation and higher Ca2+ in the apoplast in the blossom-end tissue at 15 DAP, suggesting that ABA also triggered a fruit-specific mechanism that favoured Ca2+ translocation from the peduncle end towards the blossom-end region of the fruit. This latter effect was not observed at 30 DAP. According to the present data, spraying the whole plant with ABA or dipping the fruit in ABA maintained a higher number of functional xylem vessels that reduced the resistance to xylemic water and Ca2+ movement into the blossom-end tissue, which could help to explain the observed higher Ca2+ content in the distal end of the fruit. In ABA-dipped fruit, the increase in Ca2+ concentration in the blossom-end tissue was only observed at 15 DAP, possibly due to the reduction in any ABA effect on maintaining a higher number of functional xylem vessels at late stages of fruit growth and development. It is possible that ABA could also increase the number of functional xylem vessels connecting the fruit to the plant, which should be determined in future studies. In addition, higher cuticular wax content in epidermal cells at 30 DAP compared with fruit at 15 DAP could limit fruit ABA uptake during the later dip treatments.

At the whole-plant level, ABA treatment triggered stomatal closure, decreasing xylemic water and Ca2+ flow to the leaves, which maintained higher stem water potential. Under such conditions, whole-plant ABA treatment favoured xylemic water and Ca2+ movement into the rapidly expanding fruit, resulting in higher Ca2+ content reaching the fruit peduncle end.
However, the data suggest that xylem sap uptake could not fully explain fruit Ca2+ accumulation due to the difference between the observed total fruit Ca2+ accumulation and the estimated fruit Ca2+ accumulation based on the Ca2+ concentration in the xylem sap and xylem sap flow rate into the fruit. These results suggest that phloem could have acted as a source of Ca2+ to the fruit under the experimental conditions described in this study. More detailed studies should include direct measurements of fruit transpiration rates to better characterize the role of phloem in fruit Ca2+ uptake. In addition, a better understanding of phloem contributions to fruit Ca2+ uptake can be accomplished by developing efficient methods to extract and quantify Ca2+ in the phloem sap moving into the fruit.

Our results support the indirect interaction between the TFs and ABA during ripening

Cnr also displayed similar functional enrichments to WT among their respective ripening-related DEGs, including photosynthesis-related pathways, carbohydrate and amino acid metabolism, and plant hormone signal transduction. Compared to Cnr, rin shared a smaller number of ripening-related DEGs and functional enrichments with WT fruit. The number of ripening-related DEGs shared between nor and WT fruit was negligible, and no functional enrichments were detected in this set of DEGs. Similar to our previous analysis, we mined the ripening-related DEGs to determine the patterns of expression of key genes involved in fruit quality traits. We observed that Cnr and WT showed similar gene expression of SlPSY1, SlLCY1, POLYGALACTURONASE 2A, pectate lyase, PECTIN METHYLESTERASE 1, and ACTINATE HYDRATASE. Fruit from nor and rin did not have similar ripening expression patterns to WT fruit for those genes, except for SlPG2A and SlPME1 in rin. Altogether, these data indicate that Cnr fruit undergo the most similar ripening progression to WT fruit, while nor and rin fruit have moderate to minimal changes between the MG and RR stages.

The changes in gene expression of CNR, NOR, and RIN in the ripening mutants indicate that the genes are interconnected during fruit development. In addition, Cnr consistently showed earlier defects in fruit traits, gene expression, and hormone pathways. To characterize the combined genetic effects of the mutations on tomato fruit, we generated homozygous double mutants through reciprocal crosses of the single mutants. We then phenotyped the double mutants for fruit traits and ethylene production. Because the reciprocal crosses produced fruit indistinguishable from each other, we report them as only one double mutant. Fruit of nor/rin double mutants were almost indistinguishable from both nor and rin fruit in appearance and external color. Fruit resulting from any cross with Cnr as a parent presented similar visual characteristics. We also performed a PCA of the color measurements to compare the double mutants to their parental lines at the RR stage and confirmed this observation. Based on these observations and our earlier phenotypic and transcriptional data, we confirmed that the Cnr mutation affects early fruit development. In contrast, the nor and rin mutations act during fruit ripening. If defects in Cnr occur earlier in fruit development than those caused by nor or rin, we expected the Cnr/rin and Cnr/nor double mutants to behave similarly to Cnr and display similar phenotypes. Cnr/rin fruit were significantly less firm than either parent at the MG stage but performed most similarly to Cnr at the RR stage. Cnr/nor fruit were not distinguishable from either parent in firmness at MG but were firmer than Cnr RR fruit. Interestingly, Cnr/nor fruit exhibited high ethylene production at the MG stage like the Cnr fruit. At the RR stage, Cnr/nor showed a less pronounced decrease in ethylene production, resulting in higher hormone levels than either parent.
Although some phenotypic differences were detected, we verified that Cnr/rin and Cnr/nor resembled the Cnr parent for most of the fruit traits measured.

If nor and rin act synergistically during ripening, the rin/nor double mutants would have a more extreme phenotype than either single mutant. At the MG stage, rin/nor fruit firmness was statistically similar to rin but became an intermediate phenotype at the RR stage. For ethylene, rin/nor fruit produced less than either parent at both stages, although the difference was not significant, suggesting a combined effect of both mutations.

The spontaneous ripening mutants, Cnr, nor, and rin, are essential genetic tools to untangle the complexity of climacteric fruit ripening and to breed for extended shelf-life or field harvest traits in tomato. However, thorough phenotyping of the fruit traits affected by these mutants using plants grown under field conditions has been neglected. Here, we produced an extensive quantitative study of fruit quality in the tomato ripening mutants and corroborated it across multiple field seasons. We were able to carefully describe physiological and molecular differences between the mutants by sampling large numbers of fruit and surveying distinct stages through ripening in ways not feasible with greenhouse experiments.

We determined that some ripening events in the mutants nor and rin were not completely blocked but severely delayed. By examining the OR stage, we found that the mutation in nor may strongly affect firmness and taste while pigment accumulation was only delayed and slightly perturbed. These phenotypes were supported by higher expression of carotenoid biosynthesis genes in nor RR than WT and an increase in SlPSY1 between the MG and RR stages. The accumulation of pigments in nor fruit, particularly at late stages in development, has gone unnoticed in previous studies, but it partially resembles the CRISPR-NOR mutants. In contrast, rin fruit showed strong inhibition of pigment accumulation but less dramatic alterations to fruit taste-related traits, only delaying the accumulation of sugars and the decrease in acidity.
The lack of upregulation of SlPSY1 in rin appears to contribute to the color defects, consistent with evidence that RIN directly regulates this gene. Both nor and rin exhibited severe delays or inhibition of ripening-related gene expression changes. While highly similar to WT at the MG stage, nor and rin fruit showed large deviations from WT at the RR stage. In fact, the gene expression profiles of nor and rin RR fruit remained similar to those from WT MG fruit. The physiological data generated in this study show nor and rin mutations have different impacts on fruit quality traits. Soluble solids and acid accumulation are negatively impacted in both mutants, but more dramatically in nor fruit. In addition, previous reports have demonstrated a similar pattern among volatile profiles of the mutants at the red ripe stage, with rin again showing more similarity to WT in flavor-related traits. This suggests rin fruit are less likely to hinder flavor profiles than nor fruit when breeding for fresh-market hybrid varieties with extended shelf-life. Although nor showed lower quality flavor attributes, its coloration at overripe stages was most similar to WT compared to rin; thus, it can be useful in breeding hybrid varieties when coloration is a critical fruit trait, such as in the case of processing tomato varieties. Overall, this knowledge will provide valuable information on the tradeoffs of using either locus in breeding programs. Because the Cnr, nor, and rin mutants never acquire equivalent colorations to WT, their ripening stages have been determined based on the fruit’s age expressed as days after anthesis or days after the breaker stage. Sometimes described as BR + 7 days, the RR stage has been the primary developmental time employed for studying the ripening mutants. As we showed here, the OR stage could provide better comparisons against WT RR fruit for mutants with delayed ripening phenotypes.
We demonstrated that in the nor fruit, the RIN and CNR genes only begin to increase in expression in a way comparable to WT at the OR stage. This observation corresponds to over a 10-day delay for some of the ripening processes to begin. The delayed ripening events observed in the OR fruit have not been described before in the spontaneous nor mutant.

Although the Cnr mutant has been assumed to have normal fruit development before ripening, there have been indications that the Cnr mutant displays defects that are not ripening-specific, such as earlier chlorophyll degradation and altered expression of CWDE. We showed that the Cnr mutation causes substantial defects in fruit prior to ripening, as seen through statistically significant deviations in fruit size, color, firmness, TA, ethylene production, and gene expression at the MG stage.

Therefore, we propose Cnr may be more accurately described as a developmental mutant and not exclusively a ripening mutant. Further complementing these results, the Cnr fruit displayed large transcriptional deviations from WT that can be traced back as far as 7 dpa. These early development defects are likely a result of reduced CNR expression in the mutant, which is typically expressed in locular tissue before fruit maturity. Our analysis of ripening-related gene expression in Cnr showed striking similarities to WT in the number and functions of genes changing between stages. Moreover, 69.5% of ripening-related DEGs in Cnr were shared with WT. These results further support the hypothesis that Cnr is not exclusively a ripening mutant. Instead, Cnr fruit undergo gene expression changes consistent with WT “ripening.” However, the ripening-related changes in gene expression that occur in Cnr are not enough to compensate for the large defects accumulated in the fruit during growth and maturation. In a recent report, a knockout mutation to the gene body of CNR yielded few visible effects on fruit development and ripening, which suggests that the Cnr mutant phenotype may result from more than just a reduced expression of the CNR gene as previously reported. It has also been demonstrated that Cnr fruit have genome-wide methylation changes that inhibit ripening-related gene expression. The developmental defects observed in Cnr are likely caused by these methylation changes, directly or indirectly caused by the Cnr mutation. Thus, to better understand the Cnr mutation, more physiological data at earlier stages of development need to be analyzed and complemented with more in-depth functional analysis of gene expression alterations at the corresponding stages.
In addition, further molecular and genetic studies need to be performed and compared against complete CNR knockout mutants.

Previous reports have shown ethylene levels to be very low or even undetectable in the ripening mutants. Our data support that the mutants never produce a burst in ethylene production, even at the OR stage where more ripening phenotypes are observed. The orange-red pigmentation in nor OR fruit and the similarities of OR fruit in texture and taste-related attributes to WT RR fruit occur independently of an ethylene burst. These observations provide evidence that other regulatory mechanisms exist to initiate ripening events outside of ethylene. Unlike previous reports, our data consistently showed that Cnr presented increased ethylene levels at the MG stage compared to WT. Interestingly, Cnr fruit produced more of the ethylene precursor ACC than WT at the RR stage. Also, rin made equivalent levels to WT fruit. Ethylene biosynthesis is divided into two programs: System 1 produces basal levels of the hormone during development, and System 2 generates the climacteric rise in ethylene during ripening. Each of these systems is catalyzed by a different set of ethylene biosynthetic enzymes. It is clear that all mutants show defects in System 2 of ethylene biosynthesis, but they also appear to have alterations specific to System 1. For example, we observed that SlACO3, a System 1-specific ACC oxidase, was more highly expressed in Cnr fruit than in WT.

The role of ABA in climacteric ripening is not as well explored but has been reported to be complementary to ethylene. Previous reports in WT fruit have shown that ABA increases until the breaker stage, just before the ethylene burst. ABA has also been shown to induce ethylene production and has been linked to the NOR transcription factor. We found that nor and rin fruit did not show decreases in ABA concentration during ripening like WT did.
For nor, the constant levels of ABA between MG and RR stages are another example of how fruit ripening events are delayed or inhibited. RIN and ABA have been demonstrated to have an inverse relationship where RIN expression is repressed with the induction of ABA. The significant increase of ABA accumulation in rin during ripening suggests that ABA biosynthesis and metabolism are misregulated in this mutant. rin fruit appear to present a delayed peak in ABA levels compared to WT fruit. More developmental stages, genetic manipulations, and exogenous hormone treatments are needed to investigate further the trends of ABA accumulation seen in the ripening mutants.

The interactions between CNR, NOR, and RIN in ripening have been debated in the literature. The TF RIN directly interacts with NOR and CNR, binding to their respective promoters, and therefore has been proposed to be the most upstream TF among the three regulators. Here we provided evidence that the three TFs display at least indirect effects on each other. We have argued that the Cnr mutant shows a wide breadth of defects across fruit development before ripening begins, and thus, we propose the Cnr mutation is acting before NOR or RIN. This further supports the hypothesis made in Wang et al. that Cnr acts epistatically to nor and rin.

Pooled genotyping is one method of lowering the cost to breeders

Consequently, recombination between the causal and marker loci will occur during the breeding process and, as allele frequencies change with each selection cycle, LD shifts, impacting the accuracy of predictive models over time. GS models therefore require regular updating and, as such, model training becomes an important component of a modern breeding system. With the cost of genotyping rapidly decreasing and the recent release of multiple chromosome-scale, haplotype-phased genome assemblies for alfalfa, genomics is becoming a viable option for many smaller breeding programs. Recently the use of GS has been investigated by alfalfa breeders for biomass yield, forage quality and salinity tolerance. However, these studies were based on phenotypic and/or genotypic data at the individual plant level. Although useful, alfalfa is often evaluated at the family level using half- or full-sib families and then marketed as a synthetic population. Andrade et al. proposed GS may be better incorporated into an alfalfa breeding program by genotyping pooled families to obtain allele frequency marker data rather than individual genotyping calls. One major takeaway from much of the work in GS is that the size of the training population plays a key role in the predictive ability of the final model. However, for lesser funded breeding programs, genotyping and phenotyping can quickly become prohibitively expensive with the inclusion of more material.
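To make the idea of allele frequency marker data from pooled families concrete, the following is a minimal sketch. It assumes, purely for illustration, that a pooled sample yields counts of reads supporting the reference and alternate allele at a marker; the function name, the read-count inputs, and the minimum-depth threshold are hypothetical and are not taken from any of the cited studies.

```python
def pooled_allele_frequency(alt_reads, ref_reads, min_depth=10):
    """Estimate the alternate-allele frequency of a pooled family at one marker.

    Returns None when the total read depth is below `min_depth`
    (an assumed quality filter), so that the marker can be treated as missing.
    """
    depth = alt_reads + ref_reads
    if depth < min_depth:
        return None
    return alt_reads / depth


# Hypothetical read counts for one family pool at two markers.
freq_marker_1 = pooled_allele_frequency(alt_reads=30, ref_reads=70)
freq_marker_2 = pooled_allele_frequency(alt_reads=2, ref_reads=3)  # too shallow
```

In this framing each family contributes one frequency per marker instead of one genotype call per plant, which is why pooling reduces the number of samples that must be genotyped.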

Another is incorporating remote sensing and high throughput phenotyping to reduce the expensive labour component of phenotyping the training population.

Plant phenotyping is a core foundation of plant breeding and has evolved through the years. Accurate and rapid measurement of phenotypic data is essential to understanding the genetic basis for plant traits and for the subsequent generation of improved germplasm. For biomass yield in alfalfa this traditionally required the destructive sampling, drying and weighing of hundreds to thousands of experimental units multiple times over the 2-4 year lifespan of alfalfa breeding trials. This process is labor-intensive, time-consuming, and costly. Recently, improvements in camera technology, aerial photography, and data processing have resulted in the broad adoption of remote sensing and high throughput phenotyping in agriculture, which can significantly reduce the high labor cost of phenotyping. Remote sensing allows for the accurate, efficient, and non-destructive estimate of biomass and has been shown to be useful for high throughput phenotyping in breeding applications, including the prediction of biomass yield in large alfalfa breeding plots. What is not yet clear, however, is whether the same predictive ability transfers to the variety of other plot types used in alfalfa breeding: family rows and minisward plots.

This is of particular interest for training a genomic selection model where upwards of 1000 families need to be evaluated for optimal predictive ability. The benefits of successfully incorporating HTP into an alfalfa breeding program will be twofold. Firstly, it will enable trial sizes to increase, benefitting not only training populations for genomic selection, by allowing greater prediction accuracies, but also the upscaling of traditional evaluation trials. Secondly, non-destructive biomass measurement will allow the tracking of growth rates throughout the season and other temporal traits, something that is not currently feasible with traditional destructive harvest methods.

Alfalfa is one of the most widely grown perennial forage legumes in temperate and Mediterranean-climate regions worldwide, owing to its exceptional yield, high nutrition, broad adaptability, nitrogen fixation, and host of beneficial ecosystem services. Alfalfa hay grown in California predominantly supports the largest dairy sector in the USA, but also provides forage for sheep, beef, and horse production as well as a growing export market. Alfalfa is an allogamous autotetraploid and is characterized by severe inbreeding depression. Alfalfa is highly heterozygous, and cultivars are synthetic populations that exhibit high variability. Most breeding programs currently utilize recurrent phenotypic selection, where the best genotypes are recombined following evaluation trials that typically last 2-4 years. Despite the numerous benefits of alfalfa, the economic viability of alfalfa is under threat from an increasing yield gap relative to major cereal crops and other potential substitutes in the dairy ration. This yield gap has developed due to low rates of genetic gain for forage yield in alfalfa, particularly over the last 30 years where progress has stalled completely.
This lack of yield improvement can be ascribed to a range of factors common in outcrossing perennial forages, namely long selection cycles, multiple harvests per year, small breeding investment, the inability to develop hybrids, the harvesting of all above ground biomass, the need to maintain forage nutritive value, and significant genotype by environment interaction. However, yield improvement has occurred in other perennial forages such as perennial ryegrass and white clover; therefore, progress should be possible in alfalfa. Lamb et al. suggested that the lack of yield improvement in alfalfa is because less breeding focus has been placed on yield; instead, there has been a focus on improving tolerance to biotic and abiotic stresses.

Although this enables alfalfa to reach its yield potential, it is not increasing yield per se in populations under improvement. Furthermore, alfalfa yield is often selected indirectly based on evaluation of vigor on spaced plants or on family rows, which has been shown to be a poor proxy for forage yield in the dense swards used in commercial alfalfa production. Marker-assisted selection is a useful tool for plant breeding programs and may be one way to improve the rate of genetic gain. Early research enabled breeders to identify molecular markers strongly linked to quantitative trait loci for a variety of important traits in alfalfa. However, MAS is primarily effective for traits controlled by relatively few genes with large effects. Complex traits, including yield, are usually controlled by many loci with small effects. In this case, genomic selection offers a compelling alternative to MAS by using a model that includes the effect of all markers in computing a genomic estimated breeding value for each individual in the population. Genomic selection can address one of the largest impediments to faster genetic gain in alfalfa – the need for multi-year evaluations that extend the length of each selection cycle. Selection can be made on genotypic information alone without the need for phenotypic evaluation, reducing the cycle time from 3-5 years to less than 6 months. With the cost of high-throughput sequencing decreasing and the recent publication of multiple chromosome-scale, haplotype-phased genome assemblies for tetraploid alfalfa, a robust genomic selection program is now possible for many alfalfa breeding programs. Various studies have investigated the use of GS in alfalfa breeding for a range of traits including biomass yield, forage quality and salinity tolerance.
Moderate prediction accuracies were obtained for biomass yield, stem digestible neutral detergent fiber, and leaf protein content, ranging from 0.3 to 0.4. The vigor of alfalfa under salt stress has also been assessed and a predictive model developed with a prediction accuracy of 0.793. Although the results of these studies suggest GS could be used to increase the rate of genetic gain for a range of traits in alfalfa, no empirical demonstration of GS has been published to date. We hypothesized that using genomic selection for high yield based on a model developed from the phenotypic evaluation of clonally replicated genotypes would result in higher yield than a population selected by genomic selection for low yield or than phenotypic selection. The objective of this study was to empirically test populations developed from a genomic selection model for forage dry matter yield in densely sown sward plots of alfalfa.

The germplasm used for genomic selection derived from a population created by Dr. Don Viands at Cornell Univ. in the 1990s called NY0358. We previously described this population, the NE-1010 clonal selection population, in an experiment using SSR markers for association analysis. Briefly, NY0358 was formed by intercrossing three elite, semi-dormant cultivars and recombining the resulting population twice. The NY0358 population underwent two cycles of selection for biomass yield using clonal evaluations at multiple locations.
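As a rough illustration of how a genomic estimated breeding value combines the effects of all markers, the sketch below sums marker effects weighted by allele dosage. It is a generic additive form and not the specific model used in this study; the function name, the example dosages (0 to 4 copies, as would be possible in an autotetraploid), and the marker effects are all hypothetical.

```python
def genomic_estimated_breeding_value(dosages, marker_effects, intercept=0.0):
    """Additive GEBV: intercept plus the sum over markers of dosage x effect.

    `dosages` holds the allele dosage of one plant at each marker;
    `marker_effects` holds the effect estimated for each marker by a
    previously trained model (assumed to be available).
    """
    if len(dosages) != len(marker_effects):
        raise ValueError("dosages and marker_effects must have the same length")
    return intercept + sum(x * b for x, b in zip(dosages, marker_effects))


# Hypothetical plant genotyped at three markers.
example_gebv = genomic_estimated_breeding_value(
    dosages=[0, 2, 4],
    marker_effects=[0.5, -0.25, 0.1],
)
```

Because every marker contributes, many small effects can add up to a useful prediction even when no single marker is strongly associated with yield, which is the contrast with MAS drawn above.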

About 200 individual plants were included at the beginning of each cycle. These plants were clonally propagated using stem cuttings, and three replications were planted to the field at each location. In each replication, three clones were included in a plot; thus, each individual genotype was replicated nine times at each location in each cycle. Yield data were collected across multiple harvests and multiple years on individual plots, bulking the biomass of the three clones within the plot. In the first cycle, data were obtained from Ithaca, New York; Ste.-Foy, Québec; and Ames, Iowa. The top yielding 10% of genotypes selected based on an across location analysis of total annual yield were recombined to form NY0847. A second cycle of phenotypic selection was conducted using NY0847; genotypes were clonally propagated as for Cycle 1 and yield data collected from Ithaca, NY and Ste.-Foy, Québec. The best 10% of genotypes from NY0847 based on a statistical analysis of total annual biomass yield from NY only were intercrossed to form NY1221.

For genomic selection, we used a model developed from the initial clonal evaluation cycle total annual yield measured in NY only, because these data were more robust than those from the other locations. We based the model on total annual yield, which is a more important trait than yield of any individual harvest. We applied the model to seedlings from the population NY0847 and subsequently conducted a second cycle of genomic selection. We grew 19 or 20 individual seedlings from each of the 20 maternal families composited to create NY0847, for a total of 384 individual seedlings genotyped using GBS, as described previously, multiplexing 100 genotypes in a single lane of a HiSeq 2000 DNA sequencer. We aligned sequences with previously determined sequence tags to only analyze SNPs that had been part of the model. Following SNP scoring and imputation, we computed GEBVs for each individual plant.
Based on GEBVs, we selected the top 20 genotypes, restricting selections to no more than four individuals from any given maternal half-sib family to maintain variation in the population. These 20 individuals were intermated in the greenhouse by hand without emasculation to form the GSC1H population. An analogous population, GSC1L, based on the lowest 20 GEBVs was also formed. In addition, a random selection of 20 plants from the 400 plant population was intermated as a control population, GSC1R. For the second cycle of selection, seeds of each maternal half-sib family used to form GSC1H were germinated and DNA from 19 or 20 plants from each of the 20 families was isolated for a total of 384 plants analyzed with GBS markers. We again selected the top and bottom 20 individuals based on GEBVs as done for Cycle 1 and intercrossed them separately in the greenhouse to create GSC2H and GSC2L, respectively.

Experiments were established in April 2017 at two locations in the United States, each consisting of ten replications laid out in a randomized complete block design. The two locations were the Cornell University Research Farm in Ithaca, NY, on a Niagara silt loam with 973 mm average annual rainfall, and Tulelake, CA, on a Tulebasin mucky silty clay loam. The sowing rate was 20 kg ha-1 with plots measuring approximately 1.5 m × 5 m. Each plot contained 8 rows spaced 17 cm apart. An alfalfa border was sown around the entire experimental plot area. Soil tests were conducted at each location and fertilizer applied to maintain P and K at recommended levels for high yielding perennial forages. Trials were monitored for weeds, insects, and mammalian pests, with control measures conducted accordingly. Forage yields were estimated by mowing a swath through each plot leaving a residual of 7 cm. Prior to harvest, alleyways between plots were mown to remove edge effects and ensure plots were of uniform length. The target maturity for harvest was bud to early flowering stage.
In Ithaca the harvest area was 1 by 4 m. There was a total of nine harvests, three in each of 2018, 2019 and 2020, with no data collected in the establishment year. In Tulelake the harvest area was 0.9 by 4 m. There was a total of 12 harvests, three in 2017, four in each of 2018 and 2019, and a single harvest in 2020.
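The family-restricted selection step described in the methods (top 20 plants by GEBV, with no more than four from any maternal half-sib family) can be sketched as a greedy pass over the ranked candidates. This is an assumed reading of the procedure, not the authors' code; the function name and the tuple layout are hypothetical.

```python
from collections import Counter


def select_with_family_cap(candidates, n_select=20, max_per_family=4, highest=True):
    """Pick plants by GEBV while capping the number taken from each family.

    `candidates` is a list of (plant_id, family_id, gebv) tuples.
    With highest=True the best GEBVs are taken (as for a high-yield population);
    with highest=False the lowest GEBVs are taken (as for a low-yield population).
    """
    ranked = sorted(candidates, key=lambda c: c[2], reverse=highest)
    chosen = []
    taken_per_family = Counter()
    for plant_id, family_id, _ in ranked:
        if taken_per_family[family_id] >= max_per_family:
            continue  # family already contributes the maximum allowed
        chosen.append(plant_id)
        taken_per_family[family_id] += 1
        if len(chosen) == n_select:
            break
    return chosen
```

The cap trades a small amount of expected gain for retained variation: a plant with a slightly lower GEBV from an under-represented family is preferred over a fifth plant from a family that is already well represented.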

The highest average weight was found in seeds sired by father 3 in stylar positions

To avoid contamination when handling anthers and pollen, I cleaned my fingers and tweezers with fresh alcohol before and after every use. When possible, each of the mixed-pollen crosses was replicated at least twice on each plant. Seed paternity – We collected at most five young leaves when available, or any leaves otherwise, from all father, mother, and offspring plants. The tissue collected in the field or at the greenhouse was immediately packed inside labeled clear plastic envelopes, placed inside a cooler with dry ice and promptly transferred to storage at -80 °C until the DNA extractions were initiated. We determined paternity by genotyping microsatellite loci (short tandem repeats) previously used for Brassicaceae species. Total genomic DNA was extracted from 300 mg of leaf tissue using the DNAeasy Plant Mini kit. We followed the kit’s instructions, only modifying the elution step by reducing the amount of buffer to 50 µl to yield 100 µl of final product. Ten pairs of previously developed primers for Brassicaceae were initially tested and screened for amplification and detection of polymorphism among the five fathers. DNA concentration was quantified using a micro-volume UV-vis spectrophotometer, the Nanodrop 2000. Of those ten primer pairs, I chose the four that were most informative and polymorphic among the five fathers.

Polymerase chain reaction amplifications of the four loci were performed in a 20 µL total volume with 0.3 U of Taq polymerase, 2 µL of 10X buffer, 10 mM dNTP, 10 µM/L primers, 10 µM M13 dye and 1.2 µL of ~5-40 ng total DNA. For each locus, the forward primer had an M13 tail labeled with a fluorescent dye.

A pigtail sequence was attached to each reverse primer to avoid scoring problems due to genotyping errors resulting from adenosine addition artifacts. Amplification was performed as follows: 94 °C for 5 min; 30 cycles of 94 °C for 30 s, 56 °C for 45 s, and 72 °C for 45 s; 8 additional M13 tail cycles of 94 °C for 30 s, 53 °C for 45 s and 72 °C for 45 s; and a final extension at 72 °C for 10 min. Microsatellite fragment sizes for all four loci were analyzed using Big Dye Terminator v3.1 sequencing chemistry.

Within-fruit seed characteristics – Data were normalized when needed and feasible with log-normal or Box-Cox transformations. Significant probability values were adjusted a posteriori with sequential Bonferroni tests to control type I error. One-way analyses of variance were used to compare seed weight at the lineage and population level. Significant results were followed by Tukey HSD post hoc tests for multiple paired comparisons of means at the lineage and population levels. To compare within-fruit seed characteristics among different seed positions, I used Kruskal-Wallis tests on seed weight, within-fruit seed weight percentage, and relative within-fruit seed fecundity. For the purpose of these tests, I distinguished among the three types of crosses: mixed hand pollination, single hand pollination, and control open-pollinated plants. Multiple regressions followed by the associated ANOVA were performed to assess the effect of lineage, population, type of cross, maternity, paternity and seed weight.
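The sequential Bonferroni adjustment mentioned above is commonly implemented as Holm's step-down procedure. The sketch below assumes that variant; the text does not say which implementation was used (in R, `p.adjust(p, method = "holm")` is the usual route), and the function name is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm (sequential Bonferroni) adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A test is then declared significant when its adjusted value falls below the chosen threshold (for example 0.05).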

Comparisons among cross types and paternal siring frequencies at different fruit positions were assessed with goodness-of-fit chi-square tests, which were followed by Pearson’s chi-square with 10,000 permutations when the differences among rows were lower than 5. All statistical tests were implemented using the R statistical program, and extra statistical R packages were downloaded from the Comprehensive R Archive Network. Paternity – We scored genotypes of father, mother and offspring individuals by visualizing the results in GeneMapper Software 3.7. A genotype with a single PCR fragment was considered a homozygote having two identical alleles. Visual inspection of allele assignments and manual corrections were done systematically. We employed exclusion parentage analysis to determine which father, from the pool of fathers used, sired a particular seed by comparing the genotypes of the three or four candidate fathers and the known mother to the focal progeny. We determined multiple paternity by comparing the siring fathers at different seed positions within the same fruits. We also assessed whether the fathering occurred in a non-random manner by measuring the frequency at which the siring occurred. Finally, I compared the performance of the fathers by calculating, for their offspring, siring times, seed weight, within-fruit seed weight percentage and within-fruit relative fecundity per father.

Knowing that lineages and populations have a significant influence on weight, I moved on to compare, within lineages, whether the type of cross and within-fruit seed position influence seed weight, within-fruit seed weight percentage and within-fruit seed fecundity. We tested this by using Kruskal-Wallis tests independently for each seed position and seed position bin. The tests were done for each type of cross individually within each lineage. Figures 3.2, 3.3, and 3.4 graphically show the average values from our data set.
Within-fruit seed positioning has no statistically significant influence on weight. A trend for heavier seeds at peduncular positions for the hybrid-derived lineage CAwr in control fruits can be seen in figure 3.2. The opposite trend seems to be true for the wild radish Rr. The cultivar Rs has a sinusoidal trend. In the case of the percentage of within-fruit seed weight, seed position per se significantly influences control, single-cross and mixed-cross fruits of CAwr, significantly influences mixed-cross Rr fruits, and slightly influences single-cross Rr fruits. Seed position bins only moderately influence within-fruit seed weight percentage in control CAwr fruits. No effects of seed position were found on within-fruit seed fecundity. One-way analyses of variance followed by Tukey post hoc tests were performed to test whether seed position bins for each lineage and type of cross had an effect on seed weight.
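The exclusion parentage step described above can be illustrated with a small sketch: a candidate father is excluded as soon as, at any locus, the offspring genotype cannot be explained by one maternal allele plus one allele from that candidate. The data layout (dicts of locus name to a pair of allele sizes) and all names are assumptions made for illustration, and the sketch ignores null alleles and scoring error.

```python
def compatible_fathers(offspring, mother, candidates):
    """Return names of candidate fathers not excluded at any locus.

    offspring, mother -- dict mapping locus -> (allele_1, allele_2)
    candidates        -- dict mapping father name -> genotype dict of the same shape
    A homozygote is written with the same allele twice, e.g. (200, 200).
    """
    retained = []
    for name, father in candidates.items():
        excluded = False
        for locus, (a1, a2) in offspring.items():
            maternal = set(mother[locus])
            paternal = set(father[locus])
            # One offspring allele must come from the mother and the other from the father.
            if not ((a1 in maternal and a2 in paternal) or (a2 in maternal and a1 in paternal)):
                excluded = True
                break
        if not excluded:
            retained.append(name)
    return retained
```

Within a mixed-pollen fruit, finding different retained fathers for different seeds is the evidence of multiple paternity; a seed with more than one retained father stays unresolved at the loci used.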

The results suggest that only in the case of CAwr control plants is there a significant effect of the bins on seed weight. In this particular case, the Tukey post hoc test reveals that the difference lies between the stylar end bin and the peduncular end bin, with a significant negative effect on seed weight.

Fecundity and relative fitness – A total of 540 crosses formed viable fruits out of the 595 crosses that I performed. Among the 949 seeds found viable after extraction and first visual inspection, 312 seeds were transplanted to the common gardens at AgOps, and among those 247 survived to the end of the experiment. Within lineages, multiplicative fitness functions for mixed and single crosses reveal that total relative fitness was not significantly different among seeds from either cross for both progenitors and marginally different for the hybrid-derived lineage. For Rs the difference was significant as a result of lower viable seed/pod. When the fitness functions for mixed and single crosses were compared within each lineage, differences were found in all three cases. Surprisingly, mixed crosses had lower fitness than single crosses for Rr and CAwr, and the opposite is true for Rs. Differences in fecundity, number of viable pods per pollination and seed viability, while not always significant, affect the overall fitness.

Paternity – We found evidence of within-fruit multiple paternity, i.e. seeds sired by different fathers within the same fruit, for all three lineages: 11 out of 11 fruits for CAwr, 9 out of 11 for Rr and 1 out of 2 for Rs. Fruits from single crosses had all seeds sired by the chosen father. As mentioned before, seed viability was an issue for our experiment. Very few fruits with all or most seeds survived to the end of the experiment, in particular for Rs. For this reason I cannot accurately assess whether the multiple paternity is random.
Also, because so few Rs plants survived until the end of the experiment, I eliminated them from the rest of the analysis. The percentage of siring, seed weight average, within-fruit seed percentage average, and within-fruit seed fecundity average for CAwr and Rr for mixed and single crosses are presented in appendices I.1, I.2, J.1, and J.2. The siring success of the five CAwr fathers is provided in table 3.7. Each column contains the results by individual father, and each row within a division is the value per seed position bin. All of the values in table 3.7 pertain only to mixed crosses with CAwr and Rr mothers because there was insufficient sample size to analyze the crosses with Rs mothers.

The number of seeds sired in each of the three seed position bins did not differ significantly by father. However, when the number of seeds sired by each father was expressed relative to the number of times a particular father was included in a mixed pollination, the percentages of siring were significantly different. Father 1 and father 2 had higher siring success than the other three fathers. Their total percentage of siring exceeds 100% because they sired seeds at least twice within the same fruits. Father 2 was the most successful at siring at both stylar and peduncular portions of the fruits. Seed weight average and within-fruit seed weight did not differ significantly among fathers. In contrast, the highest average within-fruit seed weight was found in seeds sired by father 2 at the peduncular position. The within-fruit fecundity did differ significantly among fathers. Father 2 had higher fecundity relative to the other fathers at all seed position bins. In mixed crosses, when I assessed each father at different seed position bins, I found that father 2 has higher, and moderately to significantly different, siring percentage, average seed weight and percentage of seed weight at peduncular portions of the fruits. However, its within-fruit relative fecundity was the same in all seed position bins. Father 3 has a moderately higher average seed weight at stylar positions and significantly higher within-fruit relative fecundity at peduncular positions. In single crosses, the only significantly different performances across seed position bins were for average seed weight of seeds sired by father 4 at middle positions and for within-fruit relative fecundity of seeds sired by father 4 and father B at peduncular positions. Father performances vary between single and mixed pollen crosses.
When I assessed fathers and seed position bins lineage by lineage, I found that, for CAwr offspring resulting from mixed-cross fruits, father 2 appears to have the highest siring percentage for seeds at peduncular ends, with the highest average within-fruit seed weight and fecundity at all sections of the fruit. Father 3 has the highest seed weight at stylar and peduncular ends but high average fecundity only for stylar end seeds. These results are not replicated in the case of single pollen crosses. In the case of Rr offspring resulting from mixed crosses, father 2 also sired the most seeds, but this time at the stylar end, with the highest average within-fruit seed fecundity. Here also the results were not replicated in the single pollen crosses. Allele frequencies in father, mother and offspring are compiled in appendix K.

Maternal and paternal effects – Maternal effects are significant at the phenological life cycle level, with significant effects on days to germination, days to first true leaf emergence, and final plant weight. Paternal effects significantly influence reproductive output, such as total fruit production and potential reproduction, as well as offspring seed weight. Fathers also marginally influence cotyledon width and days to flower bud emergence. Lineage and population influence both morphological and fitness-related characters, including seed weight, which is consistent with our previous results. Seed weight is also influenced by the type of cross but not by seed position bins. Seed weight influences cotyledon width and days to first true leaf emergence.

Previous studies have demonstrated non-random multiple paternity for the hybrid-derived CAwr fruits. Our results show that multiple paternity also occurs in both progenitor lineages.
Because very few whole fruits were represented in the offspring that survived until the end of the experiment, I was unable to determine whether the distribution of paternal DNA is non-random with respect to seed position within the pod. Across lineages, for mixed crosses only, the father that sired the most seeds was the one whose offspring were the most fecund. Mixed and single hand pollinations gave different results for individual fathers at different sections of the fruits.

Lesions were smaller and mycelium appeared later in Royal Royce than the other parents

TA percentages were quantified with a Metrohm Robotic Titrosampler System from 1 to 5 ml of the defrosted homogenized fruit juice. SSC was measured from approximately 200 ml of juice on an RX-5000a-Bev Refractometer. Total AC was measured from a 25 ml sample of juice in 200 ml 1% HCl in methanol by reading absorption at a wavelength of 520 nm on a Synergy HTX platereader equipped with Gen5 software. A standard curve, y = sx + i, was calculated for quantifying AC using a dilution series of pelargonidin from 0 to 300 mg/ml in 50 mg/ml increments, where y was the absorption reading for the pelargonidin dilution series, s was the slope, x was the concentration of pelargonidin in the dilution series, and i was the intercept. AC was estimated by (A - i)/s, where A was the absorption reading.

Gray mold incidence ranged from 0.0% to 2.7% among cultivars at 14 dph, a typical postharvest storage duration for LSL cultivars. The five day-neutral cultivars were screened out to 21 dph to develop insights into the postharvest storage limits for modern LSL cultivars. Although the fruit were still marketable at 14 dph, they became marginally marketable or unmarketable by 17–18 dph.
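The standard-curve calculation described above for AC (fit absorbance against known pelargonidin concentrations, then invert the line for a sample) can be sketched in a few lines. The function names are hypothetical and the absorbance readings in the example are invented placeholders, not data from the study; only the 0 to 300 dilution series in steps of 50 comes from the text.

```python
def fit_standard_curve(concentrations, absorbances):
    """Ordinary least-squares fit of y = s*x + i; returns (s, i)."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, absorbances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(absorbance, slope, intercept):
    """Invert the standard curve: concentration = (A - i) / s."""
    return (absorbance - intercept) / slope


# Dilution series from the text (0 to 300 in steps of 50); readings are made up.
standards = [0, 50, 100, 150, 200, 250, 300]
readings = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65]
s, i = fit_standard_curve(standards, readings)
sample_ac = estimate_concentration(0.40, s, i)
```

Readings outside the range spanned by the standards would be extrapolations and are usually handled by diluting the sample and re-reading it.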

We observed an exponential increase in gray mold incidence beyond 17–18 dph for every cultivar, with means ranging from 10.3% to 36.7% among day-neutral cultivars at 21 dph. These studies showed that gray mold was ubiquitous and eventually rendered the fruit unmarketable, but that the natural incidence of gray mold was negligible on fruit of LSL cultivars grown in coastal California within the 14 dph storage window. From common knowledge and earlier surveys of phenotypic diversity for resistance to gray mold, we hypothesized that the low incidence of gray mold on commercially produced fruit of LSL cultivars might be genetically correlated with fruit firmness and other fruit quality traits affecting shelf life. Although phenotypic correlations have been reported, genetic correlations have not. The fruit of LSL cultivars are typically much firmer than the fruit of SSL cultivars commonly grown for local or direct-market consumption, as exemplified by Earlimiss, Madame Moutot, and Primella in the present study. The latter are sweeter, softer, and perish more rapidly than “Royal Royce” and other LSL cultivars under normal postharvest storage conditions. To explore how these phenotypic differences affect resistance to gray mold, we developed a training population for GS studies by crossing “Royal Royce,” one of the LSL cultivars assessed for natural infections, with four SSL cultivars, in addition to crossing a pair of LSL parents with differences in fruit firmness and AC.

These full-sib families were phenotyped for resistance to gray mold using an artificial inoculation protocol and genotyped with a 50K Axiom SNP array.

Natural infections are too inconsistent and unreliable for analyses of the genetics of resistance to gray mold in strawberry. To overcome this problem, we developed a highly repeatable artificial inoculation protocol for gray mold resistance phenotyping that involved puncturing fruit with a 3-mm probe, propagating spores of a single B. cinerea strain, introducing a known concentration of spores into the wound site, and monitoring disease development on individual fruit stored undisturbed under high humidity. Two quantitative B. cinerea disease symptoms were recorded on multiple fruits harvested from training population individuals: water-soaked LD in mm and the number of dpi when EM was observed on the surface of the fruit. We found that incubating artificially inoculated fruit at 10 °C and 95% humidity in the dark yielded highly repeatable results with minimal contamination from other postharvest decay pathogens. LD and EM were recorded daily from 1 to 14 dpi. This protocol produced highly reproducible results, with repeatability estimates in the 0.66–0.83 range for LD and the 0.68–0.71 range for EM. Although critical for maximizing repeatability, this protocol produced more severe disease symptoms than those commonly observed from natural infection, especially on non-wounded fruit of firm-fruited LSL cultivars.

To study the genetics of resistance to gray mold in strawberry, artificially inoculated fruit of individuals in the multifamily and Royal Royce × Tangi populations were phenotyped daily for LD and EM over 14 days in cold storage. The speed of fungal development and symptom severity differed among individuals in both populations. The phenotypic extremes we observed are illustrated in time-series photographs of four individuals from the upper and lower tails of the LD and EM distributions in the multifamily population. Lesions became visible and had enlarged to 10.0 mm by 5 dpi in one of the most susceptible individuals, whereas lesions were not visible until 8 dpi and developed the slowest in one of the least susceptible individuals. Lesions spanned the entire fruit surface of the most susceptible individuals by 8 dpi, thereby resulting in significant postharvest fruit deterioration and fungal decay. Consequently, our genetic analyses of LD were applied to phenotypes observed at 8 dpi, the last day in the study on which resistance phenotypes could be observed for every individual. As expected, our analyses confirmed that resistance to gray mold is genetically complex in strawberry, a finding consistent with observations in other hosts. Although statistically significant differences were observed among individuals for LD and EM in both populations, every individual was susceptible and the phenotypic ranges were comparatively narrow. LDs were approximately normally distributed and ranged from 7.0 to 33.5 mm at 8 dpi in the multifamily population and 7.0 to 34.0 mm at 8 dpi in the Royal Royce × Tangi population (Supplementary Figure S2).
Similarly, the speed of appearance of mycelium on the surface of the fruit was approximately normally distributed and ranged from 4.0 to 12.5 dpi in the multifamily population and 5.5 to 12.5 dpi in the Royal Royce × Tangi population. The repeatabilities for LD and EM among individuals in these populations suggested that a portion of the variation observed for gray mold resistance was genetically caused. Narrow-sense genomic heritability estimates ranged from 0.38 to 0.71 for LD and 0.39 to 0.44 for EM.

A few interesting candidate gene associations were identified when short QTL-associated haploblocks were searched in the reference genome for genes with biotic stress and disease resistance annotations. A cluster of 11 tandemly duplicated genes encoding pathogenesis-related proteins was found in close proximity to the most significant SNP associated with a QTL on chromosome 4A. These genes share sequence homology to FcPR10, a Fragaria chiloensis ribonuclease-encoding gene previously predicted to reduce the severity of gray mold disease in strawberry. The other QTL-associated candidate genes that might warrant further study encode peroxidases reported to modulate reactive oxygen species levels and inhibit fungal growth during B. cinerea infections, and transcription factors reported to signal pathogen-triggered immunity, e.g., WRKY and AP2/ERF, that might target pathogenicity factors, e.g., chitinases and protease inhibitors. While these genes are worthwhile candidates for further study, the effects of the associated QTL were too small and insignificant for direct selection. This was, nevertheless, a first attempt to identify loci underlying resistance to B. cinerea in strawberry through a genome-wide search for genotype-to-phenotype associations in the octoploid genome. Royal Royce, the firm-fruited LSL parent, was more resistant to gray mold than the soft-fruited SSL parents.

Royal Royce was the more resistant parent for both traits in the four full-sib families with that parent. For the 05C197P002 × 16C108P065 full-sib family, 05C197P002 was more resistant than 16C108P065 for LD and vice versa for EM. The LD and EM differences were highly significant, with individuals transgressing the phenotypic ranges of the parents. Transgressive segregation was primarily bidirectional for both traits; however, the EM distributions for Royal Royce × Tangi and 05C197P002 × 16C108P065 were right-skewed toward more resistance and lacked individuals in the lower tails distal to the more susceptible parent. These results suggested that favorable alleles were transmitted by both parents for both traits and that favorable alleles for different loci segregated in most of the families.

Genomic prediction accuracies for different WGR methods ranged from 0.28 to 0.47 for LD and 0.37 to 0.59 for EM when estimated by cross-validation from 100% of the subsamples. The differences in accuracy among WGR methods for each trait-population combination ranged from 0.00 to 0.07. RKHS produced the highest accuracy for two of the trait-population combinations and was equal in accuracy to G-BLUP for the other two trait-population combinations. SVM often performed intermediate to G-BLUP and RKHS. The prediction accuracy was greater for LD than EM in the Royal Royce × Tangi population, whereas the reverse was observed in the multifamily population. Using cross-validation with 100% of the subsamples, clear differences in prediction accuracy and shrinkage were observed between disease symptoms within and between populations. The prediction accuracy for LD was markedly different between the multifamily and Royal Royce × Tangi populations. The GEBV range for LD in the multifamily population was half as wide, and the kernel density was flatter and more vertical, than that observed in the Royal Royce × Tangi population.
Notably, the LD phenotypes of the most resistant individuals in the Royal Royce × Tangi population were well predicted. Their EM phenotypes, however, were not as well predicted: the GEBV range for EM was half that of the phenotypic range and the kernel density distribution was flatter and more vertical. One of the challenges of breeding for resistance to gray mold and other postharvest traits is phenotyping throughput. Collectively, 2563 fruit were harvested and individually stored, tracked, and phenotyped in our study. Our expectation was that multiple fruit per individual were needed to more accurately estimate EMMs and GEBVs and nominally increase heritability. To assess the effect of subsamples on prediction accuracy and explore the feasibility of applying selection for resistance to gray mold from a single subsample per individual, GEBVs and prediction accuracies were estimated from a single randomly selected subsample per individual. We observed a significant decrease in narrow-sense genomic heritability for LD and EM in the single subsample analyses, e.g., h² decreased from 0.38 to 0.13 for LD and 0.39 to 0.16 for EM in the multifamily population. Naturally, prediction accuracies plummeted in the single subsample analyses too. This is clearly illustrated by the kernel density distributions for GS accuracy estimated for G-BLUP, RKHS, and SVM by cross-validation with a single subsample per individual. GEBV ranges were narrower and kernel density distributions were flatter and more vertical for the single subsample vs multiple subsample analyses for LD and EM in both populations. Hence, we concluded that breeding values cannot be accurately predicted without subsampling fruit.

One of our working hypotheses was that selection for increased fruit firmness and other shelf life-associated fruit quality traits pleiotropically decreased susceptibility to gray mold in strawberry.
The additive genetic correlations support this hypothesis and highlight between-family differences driven by breeding history, the phenotypic diversity of the parents, and transgressive segregation. The pairwise breeding value distributions further highlight the family structure and phenotypic diversity within and among families. The fruit size, firmness, and TA by LD and EM breeding value distributions for the only elite × elite family in our study were distinct from those of the four elite × exotic families. LD and EM were weakly negatively genetically correlated and weakly to strongly genetically correlated with fruit quality traits in directions predicted by our hypotheses. Because gray mold resistance increases as LD decreases and EM increases, signs of the additive genetic correlations have different interpretations for LD and EM and can be antagonistic or synergistic. The interpretation depends on the specific phenotypes targeted for a particular market, e.g., SSL vs LSL. LD was negatively genetically correlated with titratable acidity and positively genetically correlated with pH; hence, LD increased as titratable acidity decreased and pH increased. The effect of titratable acidity on resistance phenotypes was the motivation for screening additional individuals from the Royal Royce × Tangi family, which had significant genetic variation for TA and yielded more accurate genomic predictions for LD than were observed in the multifamily population. EM was more strongly positively genetically correlated with fruit size and firmness than LD and negatively genetically correlated with total soluble solids.

Isolated ovules were observed with a stereomicroscope and photographed by a digital camera

For whole genome sequencing, a wild-type sugar apple fruit was purchased from a retail source in the United States, seeds from the fruit were planted, and one plant grown in the UC Davis Conservatory was sampled for sequencing, with voucher herbarium samples stored as DAV225058 and DAV225059. The Hawaiian seedless line was obtained as budwood from Frankie’s Nursery, grafted to a wild-type A. squamosa rootstock and grown in the UC Davis Conservatory, with a voucher sample stored as DAV225060.

For genetic inheritance studies, three different wild types (M1, M2 and M3) and a seedless Bs line were used. The authors planned the crosses with different wild types for two purposes: inheritance studies and initial steps in the production of desirable seedless lines. Plants were grown at the Experimental Farm, and molecular analysis was performed at the Molecular Biology laboratory of the State University of Montes Claros, latitude 15°48′09″S, longitude 43°18′32″W and altitude 516 m. For phenotypic characterization of seedless versus seeded plants, two strategies were applied: fruits were harvested, pulped, and examined for the presence or absence of seeds; or flowers, either fresh or fixed in formalin-acetic acid-alcohol, were dissected to separate the ovules from the carpel tissue.

The wild-type ovules present a domed shape opposite the funiculus, while the mutant ovules come to a point at this position. Filial generations, self-fertilizations, backcrosses with wild-type parents M1, M2, M3, and backcrosses with the mutant parent Bs were obtained. Segregations were evaluated for conformity to predicted ratios with the Chi-square test using the Genes statistical software.

DNA was extracted from young leaf samples with hexadecyltrimethylammonium bromide buffer as described by Doyle and Doyle and separated from polysaccharides as described by Cheung et al. Primers used in PCR are listed in Supplementary Table 1. PCR was performed with DreamTaq and the included reagents with an initial denaturation at 94 °C for 3 min; 35 cycles with denaturation at 94 °C for 30 s, annealing at 56 °C for 30 s, and extension at 72 °C for 1.5 min; and a final extension at 72 °C for 4 min. For reactions using the AsINODel primers, a 60 °C annealing temperature was used. PCR products were electrophoresed on 1.2% agarose buffered with 1×TBE or SB, and DNA was visualized by staining with ethidium bromide and illumination with ultraviolet light. For sequencing, PCR products were processed with ExoSAP-IT or Quiapure and sequenced using amplification primers on an ABI 3500 or 3730 genetic analyzer at Análises Moleculares Ltda. or the University of California Davis CBS DNA Sequencing Facility.

Whole genome sequencing was performed by the North American author prior to initiating the current collaborative effort.

The lines available for sequencing were a wild-type North American commercial line and the Hs line, and these were used for this part of the analysis. DNA for whole genome sequencing was isolated from young leaves by grinding in 100 mM TRIS–Cl, 20 mM EDTA, 1.4 M NaCl, 2% CTAB, 1% each polyvinylpyrrolidone and sodium metabisulfite, pH 8.0. Samples were treated with 70 µg/ml RNase A, extracted with a 1:24 mixture of isoamyl alcohol and chloroform, and precipitated with isopropanol. Samples were dissolved in 10 mM TRIS pH 8.0, 1 mM EDTA, adjusted to 0.3 M Na acetate, pH 4.8, precipitated with 2 volumes of ethanol and dissolved in 10 mM TRIS pH 8. Wild-type A. squamosa DNA was processed and sequenced at the University of California, Davis Genome Center. For PacBio sequencing, DNA fragments greater than 10 kb were selected by BluePippin electrophoresis and were sequenced on a PacBio RSII or Sequel Single Molecule, Real-Time device. This resulted in 2.46 million reads with an average read length of 8 kb comprising more than 29 Gbases, or approximately 37X genome representation. For Illumina sequencing, the DNA was sheared and fragments of an average size of 400 bp were selected and sequenced on a HiSeq 4000 apparatus by the paired-end 150 bp method, resulting in approximately 390 million sequences. The sequences were trimmed of poor quality regions and primer sequences with Sickle and Scythe, respectively, resulting in 229 Gbases or approximately 124X genome representation. Hs DNA was similarly processed and sequenced by QuickBiology, resulting in 408 million sequences and approximately 130X genome representation.

The PacBio reads of wild-type DNA were assembled using Canu with default settings, producing 3519 contigs. The wild-type Illumina reads were aligned with the assembly using BWA MEM with default settings, and Pilon used the alignment to correct the contigs, changing 148 k single nucleotide errors and adding a net of more than 1.8 Mbases of insertions for a final assembly of 707.7 Mbases with an average contig length of 201 kb and an N90 of 93.9 kb. A BLAST search with the known A. squamosa INO gene sequence identified a 587 kb contig containing the INO gene. BWA MEM aligned the Hs Illumina reads with the assembly, and Tablet was used to examine the alignment with the 587 kb contig containing INO. BLAST was used to search one half of the set of Hs Illumina sequence reads for those extending across a detected deletion, and the resulting sequences were aligned and assembled using Sequencher 5.4.1.

Genetic diversity was assessed among the seedless sugar apple varieties Bs, Ts, and Hs, with the fertile parent M2 as a contrasting control. Sixty-seven pairs of SSR microsatellite markers described for A. cherimola were used, with fifteen having been described by Escribano et al. and fifty-two by Escribano et al. DNA extraction was performed as described above for the analysis of the association of the seedless trait with the INO deletion. Amplification utilized an initial denaturation at 94 °C for 1 min; 35 cycles of denaturation at 94 °C for 30 s, annealing at 48–57 °C depending on the primer, and extension at 72 °C for 1 min; and a final extension at 72 °C for 7 min. The amplification products were separated by 3.0% agarose gel electrophoresis, buffered, stained and visualized as above. To calculate diversity, the amplification data of the SSR primers were converted into a numerical code per locus for each allele. The presence of a band was designated by 1 and the absence by 0.
Although the microsatellite markers can be codominant, molecular analyses of the locus were performed based on the presence/absence of each amplified fragment. The established binary matrix was used to obtain estimates of genetic similarities between genotype pairs, based on the Jaccard coefficient. The Genes statistical program was used for data processing.

The results of the phenotypic analysis of the parents M2 and Bs, the F1 and F2 generations, and the backcrosses with the wild-type parent M2 and with the mutant parent Bs are displayed in Table 1. In the F1 generation, all individuals presented fruits with seeds. In the F2 population, among the plants in the reproductive stage during the evaluation period, 48 formed fruits with fully developed seeds and ten presented only seed rudiments, characterized by the absence of seeds. Considering the segregation hypotheses expected for one, two and three genes, the Chi-square test revealed that the trait under study segregated at a 3:1 ratio, consistent with monogenic inheritance. These results were corroborated by data from backcrosses with the parent M2, where all plants that produced fruits had seeds, consistent with the 1:0 ratio, while the plants evaluated through backcrossing with the mutant parent Bs showed 1:1 segregation for presence and absence of seeds. Taken together, these results corroborate the monogenic inheritance found in the analyses of the F2 generations, indicating that a single recessive locus controls the seedless trait in Bs A. squamosa.
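The 3:1 test on the F2 counts reported above (48 seeded, 10 seedless) can be reproduced with a short calculation. The following is a minimal Python sketch, not the authors' analysis: the function name `chi_square_gof` is hypothetical, the 15:1 ratio is included only as an assumed example of a two-gene alternative (the text does not list the ratios tested), and the 5% critical value for one degree of freedom (3.841) is hard-coded.

```python
def chi_square_gof(observed, ratio):
    """Pearson chi-square goodness-of-fit statistic for counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-square distribution at alpha = 0.05, df = 1.
CRITICAL_5PCT_DF1 = 3.841

# F2 counts from the text: 48 plants with seeded fruit, 10 with seedless fruit.
f2_counts = [48, 10]

# One-gene hypothesis (3 seeded : 1 seedless).
chi2_3_1 = chi_square_gof(f2_counts, [3, 1])

# Assumed two-gene alternative (15 : 1), for illustration only.
chi2_15_1 = chi_square_gof(f2_counts, [15, 1])

for label, stat in (("3:1", chi2_3_1), ("15:1", chi2_15_1)):
    verdict = "not rejected" if stat < CRITICAL_5PCT_DF1 else "rejected"
    print(f"{label}: chi2 = {stat:.3f} ({verdict} at 5%)")
```

Under these assumptions the 3:1 statistic falls below the critical value while the 15:1 statistic exceeds it, which matches the direction of the conclusion drawn in the text.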

Previously described molecular markers for the presence of the INO gene were tested on parents M1, M2, M3, and Bs and displayed the expected band patterns. These markers generated amplification products only in the three wild-type parents, with no amplification of any fragment in Bs for any of the primer pairs used. The dominant marker LMINO primer set was also used to amplify DNA from F1 plants obtained from crosses between genotypes of A. squamosa and the mutant Bs. All evaluated F1 individuals produced fruits with seeds in the field and yielded amplification products with all primer pairs, as shown in Supplementary Fig. 3B. The same procedure was applied to genotype segregating generations at the seedling stage in the nursery. Figure 2 shows a sample of individuals amplified with the LMINO1/2 primers, and the results confirm the discriminatory capacity of those genetic markers. Field confirmation of the presence/absence of seeds in the fruits of the F2, BCM and BCBs generations was obtained later. In the F2 generations of the three crosses, the amplification products of the LMINO markers segregated in a manner that correlated exactly with the presence/absence of seeds. Fertile plants in this generation uniformly produced an amplification product with the LMINO1/2 primer set, while plants producing no product produced only seedless fruit. The same complete cosegregation of seed presence/absence and PCR product seen in F2 individuals was also observed in the BCBs backcross populations. For BCM backcross plants, INO amplification products were observed in all DNA samples tested for these uniformly fertile, seed-bearing plants. The χ2 test was performed on the data generated in the molecular analysis to confirm the segregation of the dominant amplification. F1 plants displayed the expected genotypic ratio of 1:0 that had been linked to the trait of seeded fruits.
In the F2 generations, six segregation hypotheses expected for one, two and three genes were tested. At a significance level of 5%, the genotype frequencies fit a 3:1 ratio and allowed rejection of the other predicted ratios, confirming the hypothesis that a single locus confers the phenotype for the trait under study, with the dominant allele responsible for the presence and the recessive allele for the absence of the amplification product. To assess homogeneity among the F2 crosses, statistical techniques were applied to verify whether the differences observed in the results could be explained by chance. The heterogeneity test was not significant and indicated, with a 55% likelihood, that the results of the χ2 test were consistent for the populations of the three families studied, confirming the expected segregation. To further support the hypothesis of segregation in the F2 generations, the BCM and BCBs backcrosses were used. Similarly, the heterogeneity of segregation between the families of the BCBs backcross was not significant. BCM and BCBs progenies analyzed separately displayed segregation consistent with the hypothesis of a single gene. In the BCBs backcross, carried out between the F1 generation and the parent Bs, the proportion was close to 1:1 presence/absence of seeds in the fruits. The χ2 test was applied and the deviations between the observed and expected frequencies were not significant. In the BCM backcross, between the F1 generation and the wild-type parents, the proportion was 1:0 presence/absence of seeds in the fruits. These results confirmed the monogenic inheritance found in the analyses of the F2 generations, consistent with a single recessive allele being responsible for the seedless trait in A. squamosa under the 3:1 segregation hypothesis.

Whole genome shotgun sequencing was used to determine the characteristics of the INO gene deletion event. A draft wild-type A. squamosa genome was assembled through sequencing of total DNA isolated from a plant grown from seed derived from commercially available A. squamosa fruit. Genomic DNA was sequenced by both long-read Single Molecule, Real-Time sequencing and short-read paired-end 150 base methods. The long reads were assembled into a draft sequence that was corrected with the higher-coverage short-read sequences. The resulting assembly comprised 707 Mbases of DNA in 3,519 contigs, with an average contig length of 201 kb. A BLAST search with a previously published A. squamosa INO gene sequence was used to identify a 587 kb contig that included the INO gene. Total Hs A. squamosa DNA was used to produce a second short-read sequence set, and this was aligned with the assembled wild-type sequence. Visualization of the alignment of the Hs sequences with the 587 kb contig including INO revealed a clear absence of reads over a region of 16,020 bp, indicating a 16 kb deletion that included the INO gene. The alignment program truncates read sequences where they do not align with the reference sequence, so a deletion or a deletion with a heterologous insertion would appear similar in this visualization.

The leaf complexity measures included all leaflets present on the leaf

These cultivars were selected based on leaf shape as described in Tatiana’s TOMATObase and The Heirloom Tomato. Tomato seeds were treated, germinated, and field planted as previously described. In both the 2014 and 2015 seasons, plants were laid out in a randomized block design and were planted and grown in soil, with furrow irrigation once weekly.

Gas exchange and intercepted PAR measurements

Gas exchange measurements were done in the field on attached leaves after the plants had recovered from transplanting. Measurements were made weekly from week 10 to week 15, on week 17, and on weeks 18–21, on c. 60 plants each week, on three plants per cultivar wk−1. Measurements were made on three leaves per plant, taken from the upper and lower portions of the plant to eliminate positional bias within the plant. The A, gst, transpiration, and ɸPS2 of a 6 cm2 area of the leaflet were measured using the LI-6400 XT infrared gas exchange system and a fluorescence head. The chamber was positioned on terminal leaflets such that the midvein was not within the measured area. Light within the chamber was provided by the fluorescence head at 1500 µmol m−2 s−1 photosynthetically active radiation, and the chamber air flow volume was 400 µmol s−1, with the chamber atmosphere mixed by a fan.

CO2 concentration within the chamber was set at 400 µmol mol−1. Humidity, leaf and chamber temperature were allowed to adjust to ambient conditions; however, the chamber block temperature was not allowed to exceed 36°C. Measured leaflets were allowed to equilibrate for 2–3 min before measurements were taken, allowing sufficient time for photosynthetic rates to stabilize with only marginal variation. The amount of intercepted PAR was measured in four orientations per plant and an average PARi calculated. PARi was measured by placing a Line Quantum Sensor onto a base made from ¼” PVC piping, and a Quantum Sensor approximately 1 m above the plant on the PVC rig. Measurements from both sensors were taken simultaneously for each sample using a Light Sensor Logger. This allowed variation in overall light intensity, such as that caused by cloud movement, to be measured and accounted for in the total PARi.

After gas exchange measurements, three plants per cultivar were destructively harvested each week. The final yield and fresh vegetative weight (FW) of each plant harvested were measured using a hanging scale in the field. Five leaves were collected at random from the bottom and top of the plant to capture all canopy levels, and approximately nine fruit were collected for BRIX measurements. FW was used owing to the large number of plants and because measurements were done in situ in the field setting.

All measurements were made in kg. To measure the BRIX value of the tomatoes, the collected fruit was taken to the laboratory, where the juice was collected and measured on a refractometer. The yield and BRIX for each plant were multiplied together to get the BRIX × yield index (BY), which gives an overall fruit quality measure, accounting for variations and extreme values in either measurement. It should be noted that while BRIX is used as a standard quality measure, BY is a composite value that folds in yield to assess the weight of soluble solids per plant and is being used to measure commercial quality and not consumer quality. BY measurements were done for both the 2014 and the more detailed 2015 fields. These data were compared to test for reproducibility of results. Subsequently, primary leaflets were used for imaging and analysis of shape and size as previously described, and the images were then processed in IMAGEJ. The images were cropped to individual leaflets, maintaining the exact pixel ratio of the original image, and then cropped again to include only the single leaflet using a custom Java script written for FIJI. Single leaflet images were converted to binary images (black on a white background) and smoothed to exclude any particulates in the image, and were then processed in R using MOMOCS, a shape analysis package. Leaflet images were imported and then aligned along their axes so that all images faced the same direction. They were then processed using elliptical Fourier analysis based on the number of harmonics calculated by the MOMOCS package. Principal component analysis was performed on the resulting eFourier analysis and the principal components (PCs) were used for subsequent analysis. Traditional shape measures such as leaflet area, circularity, solidity, and roundness were also taken, with the area measurement based on pixel density. These measures were compared with the PCs to determine the characteristics captured by each PC.
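The text does not spell out how the traditional shape measures were computed. As a hedged illustration, the Python sketch below uses the definitions that ImageJ documents for two of them, circularity = 4π × area / perimeter² and solidity = area / convex hull area, applied to a polygon outline rather than a pixel mask. All function names and the two example outlines are hypothetical; roundness is omitted because it requires a fitted ellipse.

```python
import math


def polygon_area(points):
    """Shoelace formula; points are (x, y) vertices listed in order."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths around the closed outline."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def circularity(points):
    """4*pi*area / perimeter**2; equals 1.0 for a perfect circle."""
    return 4.0 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2


def solidity(points):
    """Outline area divided by the area of its convex hull."""
    return polygon_area(points) / polygon_area(convex_hull(points))


# Hypothetical outlines: a plain square and the same square with one corner notched out.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
notched = [(0, 0), (2, 0), (2, 2), (1, 2), (1, 1), (0, 1)]

for name, outline in (("square", square), ("notched", notched)):
    print(name, round(circularity(outline), 3), round(solidity(outline), 3))
```

In this sketch a lobed or notched outline lowers solidity, whereas circularity responds to any departure from a circular outline, which is one way such measures can be related back to the PCs.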

The PC values were used for all subsequent leaflet shape and size analyses. Total leaf area for each plant was measured by imaging the whole plant together with a 4 cm2 red square; the images were then processed in the EASY LEAF AREA software.

Five plants per line were used to analyze leaflet sugar content. The plants were grown under the same conditions as field plants, with the following exceptions. Plants remained in the glasshouse after transfer to 1 gallon pots. All plants were watered with nutrient solution and grown until mature leaves could be sampled. Using a hole punch, a disk with an area of 0.28 cm2 was taken from the leaflets, and sugars were extracted from the disks using a modified extraction method from the Ainsworth laboratory. Leaf disks were placed in 2 mM HEPES in 80% EtOH and heated to 80°C for 20 min, and the liquid was collected and stored at 20°C. The entire process was repeated twice. The disks were then placed in 2 mM HEPES in 50% EtOH and heated, with the liquid collected and stored at 20°C, followed by another treatment with 2 mM HEPES in 80% EtOH. The collected liquid was then used to measure the amount of sugar present per area of disk. To measure leaf sugar content, a working solution of 100 mM HEPES, 6.3 mM MgCl2, and 3 mM ATP and NADP at pH 7 was prepared. From the working solution, an assay buffer was made by adding 50 U of glucose-6-phosphate dehydrogenase, and 295 or 280 µl of the working solution was added to a 96-well plate for sucrose standards or samples, respectively. Standards were added at a 60-fold dilution and samples were added at a 15-fold dilution. Then 0.5 U of hexokinase, 0.21 U of phosphoglucoisomerase, and 20 U of invertase were added to each well and the plates were allowed to sit overnight to reach equilibrium. The plates were measured on a UV spectrometer at 340 nm, followed by analysis in JMP.

All statistical analyses were performed using JMP software.
To determine statistical significance, measurements were modeled using a general linear regression model and tested by a one-way ANOVA followed by Tukey’s honestly significant difference test, if necessary. These modeled data for all measured values were compiled into a table and used to create a model using partial least-squares path modeling (PLS-PM) in SMARTPLS 3.0. Modeled data were used for the statistical analyses because many measurement types varied in number of data points; a set of generated predicted values of equal size was therefore used to make an equal data matrix. PLS-PM was used to explore the cause-and-effect relationships between the measured variables through latent values. PLS-PM is effective both in exploring unknown relationships and in combining large-scale data, such as field, physiological, and morphological data, that otherwise are not well described together. In addition to running the PLS-PM, 1000 bootstraps were performed to obtain statistical significance and confidence intervals of the path coefficients and the R2 values of each latent variable (LV). The path coefficients are the standardized partial regression coefficients, and represent the direction and strength of causal relationships of direct effects. Indirect effects are the multiplied coefficients between the predictor variable and the response variable of all possible paths other than the direct effect. To determine the best path model, the latent variables were combined using our best understanding of biological relationships, and a general model using all data was generated. The paths between LVs were altered until a best-fit model was found. PLSPREDICT was then used on the dataset to ensure that the model did not over- or underfit the data, and to assess the predictive performance of each manifest variable. This structural model, and not the fit values, was retained for use in predictive modeling of a separate dataset.
PLSPREDICT, with the structural model developed as described earlier, was used on a separate dataset to determine the efficacy of the model.
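The definition of indirect effects given above can be made concrete with a small worked example. The Python sketch below is illustrative only and is not SMARTPLS output: the latent variable names and path coefficients are hypothetical, the path model is assumed to be acyclic, and indirect contributions from different paths are summed, which is the usual path-analysis convention rather than something stated in the text.

```python
def indirect_effect(paths, source, target):
    """Sum, over every multi-step path from source to target, of the product
    of the path coefficients along that path. The single-step (direct) path
    is excluded. Assumes the path model contains no cycles."""
    total = 0.0

    def walk(node, product, steps):
        nonlocal total
        for (start, end), coef in paths.items():
            if start != node:
                continue
            if end == target:
                if steps >= 1:  # skip the direct source -> target edge
                    total += product * coef
            else:
                walk(end, product * coef, steps + 1)

    walk(source, 1.0, 0)
    return total


# Hypothetical standardized path coefficients between latent variables.
paths = {
    ("leaf_shape", "photosynthesis"): 0.5,
    ("photosynthesis", "BY"): 0.4,
    ("leaf_shape", "leaf_sugar"): 0.2,
    ("leaf_sugar", "BY"): 0.5,
    ("leaf_shape", "BY"): 0.3,  # direct effect
}

direct = paths[("leaf_shape", "BY")]
indirect = indirect_effect(paths, "leaf_shape", "BY")
print(f"direct = {direct:.2f}, indirect = {indirect:.2f}, total = {direct + indirect:.2f}")
```

With these made-up coefficients the two indirect routes contribute 0.5 × 0.4 and 0.2 × 0.5, so the indirect effect matches the direct effect in size even though neither intermediate path is strong on its own.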

Two commercial cultivars, M82 and ‘Lukullus’, were used and only the leaf shape values were entered as exogenous variables. The predicted values for each output variable were compared with the actual measured values to determine how well the model predicted these variables.

To perform phylogenetic analysis, all single nucleotide polymorphisms (SNPs) detected by CLC Genomics Workbench 11.0 from whole genome sequencing were exported as a vcf file. The SNPRELATE package for R was used to determine the variant positions that overlapped between cultivars, and all sequences were then combined into a single gds file. This file was run through SNPhylo with the following parameters: the linkage disequilibrium was set to 1.0, as we wanted to exclude as few variants as possible based on this factor, the minor allele frequency was set to 0.05, and the missing rate was set to 0.1. In all, 1000 bootstraps were performed for confidence intervals and significance. Solanum pimpinellifolium was used as the outgroup. The bootstrapped output tree was displayed in MEGA7. Analysis of c gene flow was performed using PHYLONETWORKS. All common SNPs from chromosome 6 were run through the TICR pipeline and then analyzed using PHYLONETWORKS with default settings, except for the number of runs, which was set to 20. After the hybrid network for chromosome 6 was obtained, bootstrap analysis was done in PHYLONETWORKS using default settings with the following exceptions: ftolRel was set to 0.01, ftolAbs was set to 0.001, liktolAbs was set to 0.0001, and Nfail was set to 5. These adjustments were made to decrease processing time. The bootstrapped tree was output in DENDROSCOPE.

Tomato is one of the highest-value and most extensively used vegetable crops worldwide. However, to meet increasing demand, modern tomato cultivars have been selected for qualities such as size and firmness instead of taste. Consequently, most modern commercial varieties have lost their flavor and are often tasteless.
Fruit flavor is the sum of interactions between taste and aroma: sugars and acids are the two primary components that activate taste receptors, while aroma components such as volatile compounds activate olfactory receptors. Though the relative contribution of taste and aroma to fruit flavor has not been clearly defined, many studies have shown the importance of sugars and acids in determining fresh fruit flavor. For tomato, the levels of sugars and acids not only contribute to taste, but are also major factors affecting overall flavor intensity, and increasing the sugar content of the fruit will enhance tomato flavor. Recent studies have shown that fruit sugar accumulation in modern tomato is two- to three-fold less than that in wild species, which can account for the decline in flavor quality of tomato fruit. Fruits are the primary photosynthetic sinks, and over 80% of sugars in the fruit are produced in the leaf through photosynthesis and subsequently translocated through the phloem. Therefore, factors involved in regulating leaf photosynthesis, as well as sugar biosynthesis and sugar transport, would influence sugar levels in fruit. Leaves are the principal site of plant photosynthesis, and leaf traits directly impact the efficiency of light capture and photosynthetic carbon fixation. Thus, changes in leaf traits could have an effect on fruit yield and quality. Studies evaluating the influence of leaf area on tomato yield have shown that a high leaf area index can lead to an increase in tomato yield as a result of better light interception. Recently, leaf shape was shown to be strongly correlated with fruit sugar levels in tomato, with rounder and more circular leaves having higher sugar content in their fruit.

The other two methods include probabilistic graphical models and meta-prediction

Consistently, six genes involved in chlorophyll biosynthesis, CrPORA, CrCAO, CrCHIH, CrCHLM, CrGGDR and CrMPEC, were more highly expressed, while two genes involved in chlorophyll degradation, CrCLH1 and CrRCCR, were less expressed in WT than in MT. Similarly, auxin could also affect the expression of carotenoid and chlorophyll metabolism genes during tomato fruit ripening. In this study, we did not measure the IAA content in fruit peels for technical reasons. Although peel coloration and pulp maturation are two different processes in the same ripening fruit, they are normally synchronized in the majority of the world's citrus growing areas. Similarly, in our case, the changes in peel color were closely coupled with the changes in pulp sugar and acid content. It is therefore relatively safe to assume that the changes in IAA content in pulps should similarly occur in peels, and that the absence of fruit peel IAA data should not interfere with drawing a reliable conclusion.

It must be pointed out that the citrus fruit may delay or stop CC during fruit ripening in rare cases, such as in some very early satsuma mandarins, and in hot, tropical regions.

Organisms attain their form and function by readouts from an intricate web of regulatory relationships between DNA, RNA, proteins and metabolites. The era of large-scale biology promises to provide insights into this web of regulation at the whole genome level and has spurred growth of computational methods that allow us to look at diverse readouts and generate a comprehensive framework for how molecules generate morphological phenotypes. The number of sequenced genomes grows apace; NCBI currently lists >2500 genomes, and the number of plant genomes is currently listed at >100. However, the challenges associated with whole genome sequencing and assembly have caused many researchers to turn to other types of genome scale data. Since 2009, when RNAseq was described as a recently developed technology that had the potential to revolutionize our understanding of the complexities of eukaryotic transcriptomes, the technology has evolved and proved useful for identifying links between transcription factor activity and transcript abundance, for the generation of transcriptomes in non-model species through de novo assembly methods, for detection of genomic variants, and for identification of splicing variants.

Continued improvements in efficiency combined with reduction in cost of sequencing have made sequencing technology available even to fields that traditionally did not rely on it. In a recent review, Alvarez and co-authors reported on >500 studies that relied on either microarray or RNAseq methods in the last 10 years and presented the potential to analyze gene expression in an ecological context across multiple taxa. In the field of evolutionary developmental biology the numbers are even more staggering. How can this explosion of genome scale data be leveraged to better understand how organisms develop, evolve, respond to biotic and abiotic stimuli and function in the context of their environment? Network analysis, an offshoot of graph theory used in mathematics and computer science to model relationships between objects, was utilized extensively in the social sciences and has become a method of choice for identifying relationships between units of biological data.

Early gene networks, such as those produced by Davidson et al., were generated using perturbation assays and direct experimental data to create a directed network of developmental regulatory control. While these networks were small, they were a large step in advancing our understanding of developmental processes, one that was not obtainable by analysis of just one or two genes at a time.

Many early network models consisted of one of two types of mathematical analysis, the Logical or Boolean model, and the dynamic systems model. The Boolean (Switch) model consisted of genes being either in an ‘On’ or ‘Off’ state, which could be regulated by other genes. This method also demonstrated feedback loops, with genes regulating the activity of other genes, but could not allow for variable states of expression. Dynamic systems utilizing differential equations allowed for variable expression states of genes and nonlinear interactions, but were limited by computational power and the lack of large-scale transcriptomic data. Transcription factor–promoter interaction studies provided an additional method of gene regulatory network construction. These interactions showed a multi-tiered, or hierarchical, structure to gene regulation in networks, with a top, core, and bottom tier of transcription factors and their targets. This tiered structure revealed an interesting aspect of gene networks and biological processes, as the top-tier transcription factors and their targets tended to be noisy or have a high degree of variability in expression, while the bottom tier showed very low noise and stricter regulation of expression states. Jothi et al. hypothesized that the increased variability in top-level gene regulation allowed for greater adaptability, while low variability in the bottom tier acted as a buffer against inadvertent changes in the higher tiers that could be detrimental.

In the post-genomic era, and with the large volume of whole genome transcriptional data available, gene network construction has become readily available to most researchers. Figure 1 represents a flowchart of the potential analyses discussed below. Before constructing a network, genes often need to be subset into interest groups in order to facilitate data visualization and focus analysis on specific biological questions.
This involves differential expression analysis in conjunction with dimensionality reduction and clustering methods such as PCA, k-means, hierarchical, self-organizing maps, and t-distributed stochastic neighbor embedding. Each of these methods attempts to reduce high dimensionality data such as gene expression patterns, either over time, different tissue types or treatments, into a representative and more easily interpretable two dimensional structure.

PCA has been the most utilized dimensionality reduction method, using Euclidean distances to measure dissimilarity and determine placement within a two dimensional space. However, the output often does not represent the actual relationship of objects from higher dimensional space, as it measures distinct orthogonal components within each PC representing the greatest amount of variance without regard to overall gene-to-gene correlations. With k-means, a user-defined number of clusters is used, and a mean vector is calculated for a cluster to assign new members, and then the mean is recalculated. This iterative process reduces the level of dissimilarity of objects within the cluster, thus giving a better representation of object relationships in higher dimensional space within the limitation of the defined number of clusters. Despite this improvement, k-means is highly susceptible to noise distortion, or the influence of outliers on the overall mean and structure of a cluster. Hierarchical clustering builds a tree with nodes that represent clusters through multiple different methods, including matrix construction by gene pair similarity measures and then identifying those genes with the highest degree of similarity. While hierarchical clustering provides a more informed output, merging errors or smaller cluster merging can result in the loss of more interesting local groups of genes. Each of the previous methods relies on Euclidean distances for similarity/dissimilarity measures between genes within higher dimensional space, which does not conform to a linear relationship by its nature. SOM and t-SNE employ non-linear distance measures to approximate the relationship between genes within higher dimensional space, often providing a much more realistic representation of gene similarity in two dimensions.
Once distinct clusters of genes with similar expression patterns have been identified, gene ontology or gene set enrichment tests can be performed to identify the nature of genes within clusters. Networks can then be constructed from an individual cluster or multiple clusters sharing similar biological functions. There are three primary network types: gene regulatory networks (GRNs), which give directionality to the interaction between nodes or genes; association networks, which are non-directional but show direct interaction between associated genes; and gene coexpression networks (GCNs), which are non-directional and can show direct or indirect interactions between associated genes. With transcriptomic datasets, GCNs offer the most versatile exploratory tool for gene interactions, using gene expression patterns to determine potential associations and modularity. This is especially useful in non-model organisms where the function of most or many genes has not been determined, and regulatory interactions remain unknown. Of the four primary network construction methods, the two most commonly utilized are correlation and supervised networks. Correlation network construction consists of determining a correlation between two genes based on expression changes, with Pearson’s product-moment correlation coefficient (PMCC) being the most common method. PMCC identifies linear correlations, but suffers from the inability to deal with outliers or genes which may have a nonlinear relationship. Spearman’s rank correlation coefficient deals with both of these issues, as it is more robust to outliers and accommodates non-linear relationships. Maximum information correlation allows detection of the strength of any type of linear or nonlinear correlation between genes, and the partial correlation coefficient can be employed to quantify the association between two genes when conditioning on other genes, to infer direct dependencies among variables in a network.
In addition, it has been reported that Network deconvolution can allow one to infer direct effects from an observed correlation matrix containing both direct and indirect effects. On the other hand, supervised network construction utilizes regression models , which deal with the response of genes to a set of predictor genes. Supervised network construction deals well with cascade expression changes, but is less reliable when dealing with feedback loops, a feature of the regression analysis where response and predictor variables are set and not necessarily interchangeable during construction. A combination of several mathematical techniques is preferable to obtain a more accurate representation of the gene associations.
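The contrast drawn above between Pearson's and Spearman's coefficients can be illustrated on a toy expression table. The Python sketch below is a minimal example under stated assumptions and does not reproduce any specific published pipeline: the gene names, expression values, 0.95 edge threshold, and the helper `coexpression_edges` are all hypothetical. Spearman's coefficient is computed as Pearson's coefficient on average ranks.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def coexpression_edges(expr, method, threshold):
    """Return (gene, gene, r) for every pair with |r| at or above the threshold."""
    genes = sorted(expr)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = method(expr[g], expr[h])
            if abs(r) >= threshold:
                edges.append((g, h, r))
    return edges


# Hypothetical expression values across six samples.
expr = {
    "geneA": [1, 2, 3, 4, 5, 6],
    "geneB": [1, 2, 4, 8, 16, 32],  # monotonic but nonlinear in geneA
    "geneC": [6, 1, 5, 2, 4, 3],    # no consistent trend with geneA
}

pearson_edges = coexpression_edges(expr, pearson, 0.95)
spearman_edges = coexpression_edges(expr, spearman, 0.95)
print("Pearson edges:", [(g, h) for g, h, _ in pearson_edges])
print("Spearman edges:", [(g, h) for g, h, _ in spearman_edges])
```

In this toy case the monotonic but nonlinear pair is kept as an edge only under the rank-based measure, which is the behavior the text attributes to Spearman's coefficient; real analyses would also need to consider multiple-testing and threshold selection, which this sketch ignores.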

There are two types of PGM methods, Bayesian and Markov, with the former providing interaction directionality of gene relationships, and the latter using neighborhood selection methods similar to linear regression in supervised learning. Bayesian PGM is highly sensitive to experimental design and requires computationally intensive methods for interpreting Bayesian networks. The possibility of misinterpreting causal relationships among genes from gene expression data makes this method less appealing. However, when applied correctly to large-scale, high dimensionality data, the method can provide gene relationship information not obtained with some other methods. Meta-prediction includes meta-analysis and ensemble learning; each utilizes multiple methods of network construction and then creates a consensus relationship among gene expression patterns. Meta-prediction methods, through the use of multiple methods, may provide a more robust network than any one method on its own.

Once the GCN has been constructed, the interaction among genes can be determined, and other information such as gene function and the biological processes regulated can be obtained. Since transcriptionally coordinated genes are often functionally related, GCNs can be used for gene function prediction. In particular, a comparative GCN analysis across species can yield more accurate gene function predictions because conserved gene modules are more likely to be functionally relevant. Hub genes, modularity and network restructuring are discussed in the following section. One of the more appealing aspects of GCNs is that whole transcriptome data can be combined with other large scale networks, such as metabolic or protein–protein interaction networks, to give a wider view of the biological processes to which specific clusters of genes belong.
Interestingly, the transcriptomic data available is outstripping the computational capabilities of many researchers, creating a technological bottleneck rather than a biological one.

A major challenge in biology is to understand the genetic basis of morphological evolution. Evo-devo studies aim to understand the developmental mechanisms that are modulated over time to give diverse phenotypic outputs. Most evo-devo studies, even though pursued on a gene-by-gene level, have underscored the importance of gene expression regulation, suggesting that rewiring of developmental GRNs should be a crucial factor driving morphological evolution. Large-scale genomics tools can be used to investigate this rewiring. Studies determining GRNs within an evo-devo context help us determine how developmental GRNs are reorganized to generate morphological diversity. Recent interaction mapping studies have shown the ability of differential analysis to reveal massive rewiring in the architecture of an interactome during cellular or adaptive responses.

Our previous GCN analysis using cross-species and tissue-specific RNA-seq data revealed the modular structure of the GRN controlling leaf development in the domesticated tomato and its wild relatives. Comparisons of the networks among species with experimental data showed that changes in a module regulating the key KNOX1 TFs made a significant contribution to the variation in leaf complexity.