Root growth is extremely variable and highly sensitive to the growth environment, which could explain the non-significant differences in root growth between treatments and the differences between field and pot observations. Several studies have described the impact of water deficit on root growth and the variability of root length for annual crops, such as wheat, cotton and sorghum, and for tree species. These results encourage long-term experiments to clarify the link between root growth and canopy transpiration under water stress.
Grapevine is a perennial plant that has been cultivated for more than 7000 years in many environments and under many different viticultural practices. It is a globally important crop, eaten fresh or processed into various products including wine. Like other crops, it faces changing biotic and abiotic stresses linked to climate change or the introduction of exotic pests. The grape and wine industries must, in addition, cope with societal demands to reduce environmental impacts and improve product safety while maintaining cost-effective and sustainable production. Thus, the major challenges for viticulture and enology are to control the final berry composition at vintage in variable environments and to sustain yield and quality while limiting the use of pesticides, water and other inputs.
To address the scientific questions related to these challenges, the grapevine research community is increasingly using high-throughput experimental techniques that generate large and heterogeneous data sets describing genotypes, phenotypes and the environment. Indeed, during the last 15 years, several high-throughput data sets from grapevine have been published, including Expressed Sequence Tags, simple sequence repeat and single-nucleotide polymorphism molecular markers, QTL maps and transcriptomes. The determination of the genome sequence of grapevine in 2007 created new possibilities for transcriptomic and proteomic studies and for better describing and understanding grapevine genetic diversity, either through genotyping/re-sequencing studies or de novo sequencing of new genotypes.
Phenotypes of different kinds have been studied and here too, throughput has notably increased in recent years: for example, the study of single metabolites has been increasingly replaced by metabolomics studies, and manual field or greenhouse scoring by more automated processes. The greatest value of these data sets depends on their integration to generate new knowledge, and therefore on the ability to combine the results of different experiments. To allow this, data should be Findable, Accessible, Interoperable and Reusable (FAIR). An emblematic model in the plant community is Arabidopsis thaliana, for which rich data sets are available and which has been used to derive working hypotheses for gene function in crop species. This has been supported by the TAIR portal and the more recent Arabidopsis Information Portal.
However, in grapevine, the increasing wealth of data is highly dispersed and often poorly accessible, hindering its effective exploitation beyond the scope of its initial production. Moreover, in the absence of dedicated funding and sufficient international collaboration, there is no information portal targeted at the grapevine research community. Although large international repositories do exist for molecular biological data, these do not systematically capture the detailed knowledge related to genome function, the plant material used or the non-molecular phenotyping data that are the specific expertise of grape researchers. Instead, these data are at best published along with research papers and managed in regional and local databases, or at worst isolated on individual researchers' computers and completely inaccessible to the wider community. There is a clear need for research policies that create incentives favoring data sharing to improve the quality of research results and foster scientific progress.
The interpretation of previously published data always requires additional 'metadata' to provide the appropriate context. In addition, both data and metadata should be formatted in standardized representations so that they can be processed automatically, avoiding the errors generated by manual manipulation, especially in the case of very large data sets. This requires community-wide agreement on guidelines for annotation, tools for data preparation, and the dedicated custodianship of important or exemplar data. Although generic solutions exist for many data types individually, much grapevine data is still far from FAIR, and little support is available for community members to make it so. In 2014, in response to the demands of the grapevine research community, the International Grapevine Genome Program consortium launched an action to define a strategy for the stewardship of grapevine genomic data to allow their easy access and reuse.
The first output was the proposal of a gene nomenclature; the second expected output is a strategy for the broader management of diverse grape data in accordance with the FAIR principles. In this paper, we outline such a strategy for the development of a global Grape Information System (GrapeIS), a platform to enable access to a broad collection of data sets and reference data from a wide variety of sources, with a flexibility that promotes the rapid introduction of new data sources derived from new and emerging technologies. To meet these objectives, we have devised a plan inspired in part by the experiences of the international Wheat IS initiative, which provides a portal for wheat data, and by the transPLANT infrastructure for plant genomic science, which allows data integration from nine distinct European databases. The GrapeIS will comprise an open federation of independent information systems interconnected by a central web portal, and will provide a tool set to reduce the costs of data publication and interrogation. This will provide a robust, cost-effective model for data integration by exploiting the expertise of existing resources, and the best practices and data standards of related research communities grappling with similar problems.
Discovering data stored in distinct databases from a single entry point: interoperability of the infrastructures
One model for providing integrated access to diverse data sources features a single data custodian, who takes comprehensive responsibility for the storage and integration of all relevant data. An alternative model is to provide an integrated query engine that offers a common entry point to dispersed resources, each of which might contain different data. The second model has the advantage of exploiting existing resources.
Such a common entry point should allow the discovery of different data types or data sets of the same type, facilitate their integration, and ease the import of these data into diverse analysis or visualization tools. Achieving this requires a commitment from all contributing resources to serve data in accordance with a set of common standards, such that they can be automatically interrogated in a standard way. The first step in providing FAIR data is 'findability'. A model for findability for plant-focused resources has been established by the transPLANT project. The transPLANT integrated search engine uses the generic Solr search engine to provide search facilities over remote data files published by each participating resource and conforming to a minimal standard schema.
To support more advanced knowledge extraction, the automatic manipulation of data sets, and the efficient and correct reanalysis and re-use of data, a more advanced model is required. Data need to be annotated with detailed and accurate metadata, requiring both manual curation and automated quality control. Where multiple resources are collaborating, agreement on a common set of controlled vocabularies is required; if vocabulary terms are structured as ontologies, the power of potential queries is increased. In developing such a model, the grape community will be able to draw on other ongoing efforts. Moreover, standard formats must be agreed for publishing such data, and appropriate forums identified for publicizing their availability.
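The findability model and the benefit of ontology-structured vocabularies can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the field names of the minimal schema, the toy term hierarchy and the function names `load_records`, `expand_term` and `search` are all hypothetical, and a simple in-memory substring match stands in for a real search engine such as Solr.

```python
from dataclasses import dataclass

# Hypothetical minimal schema: these field names are illustrative assumptions,
# not the schema actually used by any existing portal.
REQUIRED_FIELDS = ("id", "name", "entry_type", "description", "url", "source")


@dataclass(frozen=True)
class Record:
    id: str
    name: str
    entry_type: str
    description: str
    url: str
    source: str  # name of the contributing database


def load_records(raw_records):
    """Split raw dictionaries into schema-compliant records and rejected entries."""
    valid, rejected = [], []
    for raw in raw_records:
        # A record is accepted only if every required field is present and non-empty.
        if all(raw.get(field) for field in REQUIRED_FIELDS):
            valid.append(Record(**{field: raw[field] for field in REQUIRED_FIELDS}))
        else:
            rejected.append(raw)
    return valid, rejected


# Toy ontology fragment (child term -> parent term), invented for this example.
PARENT = {
    "berry weight": "berry trait",
    "berry colour": "berry trait",
    "berry trait": "trait",
}


def expand_term(term):
    """Return the term together with every term below it in the toy hierarchy."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def search(records, term):
    """Return records whose name or description mentions the term or a narrower term."""
    terms = expand_term(term.lower())
    hits = []
    for record in records:
        text = f"{record.name} {record.description}".lower()
        if any(t in text for t in terms):
            hits.append(record)
    return hits
```

In this sketch, a query for the broad term "berry trait" also retrieves a record from any contributing source that is annotated only with "berry weight", which is the kind of gain in query power that a shared ontology provides over a flat vocabulary.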
Standard formats already exist for many types of data: for example, General Feature Format and GenBank for genome and aligned data, Variant Call Format for nucleotide sequence variants, Binary Alignment/Map format for next-generation sequence alignments, BioPAX and Systems Biology Markup Language for pathways and networks, and the PSI-MI XML standard for proteomic data. A suite of standards is also being proposed by the Data Standards and Metabolite Identification Task Groups of the international Metabolomics Society for metabolite analysis because, in untargeted metabolomics, robust and standardized structural annotation of metabolites appears crucial to maximize their interpretation and impact. Moreover, international initiatives are ongoing to agree on data models that specify APIs for different types of data in relation to plant breeding, genomics and other specific purposes. Other initiatives, such as BioSharing, exist to publicize resources with a commitment to providing open data.
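To illustrate why an agreed format makes data automatically interrogable, the following minimal Python sketch reads one data line of a Variant Call Format file using its eight fixed, tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The function name `parse_vcf_line` and the example values are assumptions made for illustration; a production pipeline would normally rely on an established VCF library rather than hand-written parsing.

```python
def parse_vcf_line(line):
    """Parse the eight fixed columns of a single VCF data line into a dictionary."""
    if line.startswith("#"):
        raise ValueError("header lines are not data lines")
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated columns")
    chrom, pos, variant_id, ref, alt, qual, filter_status, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_dict = {}
    if info != ".":
        for item in info.split(";"):
            key, sep, value = item.partition("=")
            info_dict[key] = value if sep else True

    return {
        "chrom": chrom,
        "pos": int(pos),                                # 1-based position
        "id": None if variant_id == "." else variant_id,  # "." means missing
        "ref": ref,
        "alt": alt.split(","),                          # one entry per alternate allele
        "qual": None if qual == "." else float(qual),
        "filter": filter_status,
        "info": info_dict,
    }
```

Because every compliant file places the same information in the same column, a call such as `parse_vcf_line("chr18\t1234567\t.\tA\tG,T\t50.0\tPASS\tDP=35;DB")` (with a hypothetical chromosome name and position) works regardless of which laboratory produced the file.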
With limited resources, a sensible strategy for the grapevine community is to promote the use of existing international repositories for common data types, such as MetaboLights, PRIDE and so on, which already require submission of standards-compliant data, and to utilize these data in specialized services targeted at the specific needs of grapevine researchers. This has been the strategy of the grapevine community from the start for molecular data. For instance, 3971 grapevine transcriptomic data sets have so far been submitted to the GEO database. In contrast, phenotypic data are not currently concentrated in any generic resource, nor is there an obvious repository to which submission can be recommended. The grapevine community must therefore assist in the coordination of multiple resources and should contribute to the definition of international standards in the domain. As many of the data will have features in common with those produced by other crop communities, coordination with wider initiatives such as the European Plant Phenotyping Infrastructure is a sensible course.
Looking backward, the grapevine community has been increasingly active in the production of data in the life science area, as shown by a simple search of recent publications in the PubMed database. The data described in the papers are very diverse, covering genomes, genotypes, genomic variation, genetic maps, QTLs, association genetics, transcriptomics, proteomics, metabolomics and phenotype characterizations, and the quantity of data produced by a single experiment is increasing rapidly over time. The development of a common policy for data standardization has lagged, and this gap is impairing progress in grapevine research.
The foundation of data sharing is a good understanding of what is about to be shared.
For certain common types of experiments, agreement should be possible about the information that needs to be provided alongside the experimental results in order for the data to be useful and interpretable by others. This idea has been captured, for many experimental types, in 'Minimum Information' papers, in which the conceptual metadata needed to support an experiment of that type are defined. Several metadata standards that might be of interest to the grapevine community are already in common use, including the Minimal Information About a Microarray Experiment, now evolving into the Minimal Information about high-throughput SEQuencing experiments, and the Minimal Information About Proteomic Experiments; the Metabolomics Standards Initiative has also developed a standard for Core Information for Metabolomics Reporting. Such papers have formed the basis for the subsequent development of exchange formats and databases. Other standards are still emerging, such as the Minimal Information for QTLs and Association Studies, the Minimal Information about a Genotyping experiment or the Minimal Information About Plant Phenotyping Experiments. Experimental metadata within omics experiments can be conveniently standardized and shared with the ISA-Tab protocols.
The success of these standards obviously depends on their adoption by the community, which is determined by many factors, such as their enforcement by publishers and the existence and ease of use of an associated toolset. Widespread adoption requires that correct formatting of data be as simple as possible.
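As an indication of how lightweight such tooling can be, the following Python sketch checks a metadata record against a 'Minimum Information' style checklist before submission. The checklist fields and the function name `missing_metadata` are hypothetical and chosen for illustration only; they do not reproduce the content of any published standard.

```python
# Hypothetical checklist (field name -> expected Python type).
# These fields are illustrative assumptions, not those of a published standard.
CHECKLIST = {
    "investigation_title": str,
    "genotype": str,
    "location": str,
    "growth_facility": str,
    "observed_variable": str,
    "unit": str,
    "replicates": int,
}


def missing_metadata(metadata, checklist=CHECKLIST):
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    for field, expected_type in checklist.items():
        value = metadata.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"{field}: missing")
        elif not isinstance(value, expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
    return problems
```

A data producer could run such a check locally and correct the reported fields before depositing a data set, which keeps the cost of compliance low.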
On the other hand, if time-consuming development of specific tools is required, there is a risk that a format will be slow to evolve and will fall out of step with the needs of data producers in a period when technologies are evolving very rapidly.
Inevitably, understanding the processes that underlie sustainable crop production under varying environmental conditions requires experimentation with a wide diversity of genetic material. This could include the use of mutants or individuals carrying extreme phenotypes to decipher physiological mechanisms, progenies derived from controlled crosses or diversity panels to determine the genetic control of trait variation, individuals collected in situ for the study of the adaptation of populations to environments, the evaluation of wild relatives and so on. In the grapevine community, association studies that exploit natural diversity through large-scale sequencing and phenotyping have enormous potential to compensate for the lack of large mutant collections, and are widely implemented to complement other approaches in identifying candidate genes for traits in physiological processes.