Glutamate synthase plays a vital role in nitrogen metabolism, and its ortholog in the ascomycete Magnaporthe oryzae, MoGLT1, is required for conidiation and complete virulence on rice. GMC oxidoreductase belongs to auxiliary activity family 3 (AA3) according to the Carbohydrate-Active enZymes (CAZy) database and is required for the induction of asexual development in Aspergillus nidulans. An extensive approach was used for the global annotation of CAZyme genes in the P. pachyrhizi genomes, and after comparison with other fungal genomes, we also found clear expansions in glycoside hydrolase family 18 (GH18) and glycosyltransferase family 1 (GT1). GH18 chitinases are required for fungal cell wall degradation and remodelling, as well as multiple other physiological processes, including nutrient uptake and pathogenicity. The Phakopsoraceae, to which P. pachyrhizi belongs, represents a new family branch in the order Pucciniales. With three P. pachyrhizi genome annotation replicates available, in addition to the CAFE analysis above, we can directly track gene family expansions and contractions in comparison to previously sequenced genomes. We therefore compared P. pachyrhizi to the taxonomically related families Coleosporiaceae, Melampsoraceae and Pucciniaceae, which in turn may reveal unique lifestyle adaptations. The largest uniquely expanded gene family in P. pachyrhizi comprises sequences containing the Piwi domain. Typically, the Piwi domain is found in the Argonaute complex, where its function is to cleave ssRNA when guided by dsRNA. Interestingly, classes of longer-than-average miRNAs known as Piwi-interacting RNAs (piRNAs), which are 26-31 nucleotides long, are known in animal systems.
In Drosophila, these piRNAs function in nuclear RNA silencing, where they associate specifically with repeat-associated small interfering RNAs that originate from TEs. As in other fungal genomes, the canonical genes coding for large AGO proteins with canonical Argonaute, PAZ and Piwi domains can be observed in the genome annotation of the three P. pachyrhizi isolates. The hundreds of expanded predicted Piwi genes consist of short sequences of less than 500 nt containing only a partial Piwi domain that aligns with the C-terminal part of the Piwi domain in the AGO protein. Some of these genes are pseudogenes marked by stop codons or encoding truncated protein forms, while others exhibit a partial Piwi domain starting with a methionine and, in some cases, a strong prediction for an N-terminal signal peptide. These expanded short Piwi genes are surrounded by TEs, and several hundred of them, but not all, are found in close proximity to specific TE consensus sequences identified by the REPET analysis in the three P. pachyrhizi isolates. However, no systematic and significant association could be made due to the numerous nested TEs present within the genome. Moreover, none of the expanded short Piwi domain genes are expressed in the conditions we tested. However, in many systems, Piwis and piRNAs play crucial roles during specific developmental stages where they influence epigenetic, germ cell, stem cell, transposon silencing, and translational regulation. Finally, the domain present in these short Piwi genes is partial, and we do not know whether they retain any RNase activity. Therefore, we cannot validate the function of this family at this stage. It warrants further study and attention, as it may represent either a new type of TE-associated regulator within P. pachyrhizi or an expansion of a control mechanism to deal with this highly repetitive genome.
Several families related to amino acid metabolism have expanded greatly when compared to the respective families in other rust fungi, most notably asparagine synthase, which has ~75 copies in P. pachyrhizi compared to two copies in Pucciniaceae and one copy in Melampsoraceae.
Similarly, expanded gene families can be observed for citrate synthase, malate synthase and NAD-dependent malate dehydrogenase. These enzymes are involved in energy production and conversion via the citrate cycle, which is required to produce certain amino acids and the reducing agent NADH. In addition to the molecular dialogue with effector proteins, plant-pathogen interactions are a "tug-of-war" over resources between the host and the pathogen. Therefore, the expansion in amino acid metabolism may reflect an adaptation to become more effective at securing this resource. Alternatively, the expanded categories may also reflect the metabolic flexibility needed to facilitate the broad host range of P. pachyrhizi, which to date comprises 153 leguminous species in 56 genera. Associations with TEs are often a sign of adaptive evolution, as they facilitate the genetic leaps required for rapid phenotypic diversification. Gene duplication and gene family expansion can be directly linked to transposition activity through imprecise excisions and re-insertions that carry other genetic sequences along. Transposition-independent mechanisms may also promote structural rearrangements leading to gene family expansions through the recombination of homologous regions between TE copies. The TEs in these expansions may potentially be inactive. We therefore investigated whether the expansion in amino acid metabolism could reflect a more recent adaptation by studying the TEs in these genomic regions. Furthermore, as described above, a distinction can be made between more recent bursts of TE activity and older TE bursts leading to degeneration of the TE sequence consensus. However, despite the presence of several copies of specific TE subfamilies in the vicinity of the surveyed expanded families, such as amino acid metabolism, CAZymes and transporter-related genes, no significant enrichment could be observed for any particular TE when compared to the overall TE content of the genome.
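The comparison described above, TE subfamily copies near an expanded gene family versus the overall TE content of the genome, can be illustrated with a one-sided hypergeometric test. This is only a minimal sketch: the text does not state which statistical test was applied, so the choice of test, the function name and all counts below are assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_p(total_te, subfamily_total, near_te, near_subfamily):
    """One-sided hypergeometric p-value, P(X >= near_subfamily).

    total_te        -- all TE copies in the genome (population size)
    subfamily_total -- genome-wide copies of the TE subfamily of interest
    near_te         -- TE copies of any kind near the expanded gene family
    near_subfamily  -- copies of the subfamily among those nearby TEs
    """
    upper = min(subfamily_total, near_te)
    denominator = comb(total_te, near_te)
    tail = sum(
        comb(subfamily_total, k) * comb(total_te - subfamily_total, near_te - k)
        for k in range(near_subfamily, upper + 1)
    )
    return tail / denominator


# Hypothetical counts, not taken from the study.
p_value = hypergeom_enrichment_p(
    total_te=2000, subfamily_total=100, near_te=60, near_subfamily=5
)
print(f"one-sided enrichment p-value: {p_value:.3f}")
```

In practice one such test would be run per TE subfamily and the resulting p-values corrected for multiple testing before calling any enrichment significant.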
This may reflect the challenge of making such clear associations due to the continuous transposition activity, which results in a high plasticity of the genomic landscape and a highly nested TE structure. Alternatively, it may suggest a more ancient origin of these expansions that have subsequently been masked by repetitive episodes of relaxed TE expression.
The high molecular weight genomic DNA was extracted using a carboxyl-modified magnetic bead protocol for K8108, a CTAB-based extraction for MT2006, and a modified CTAB protocol for UFV02. For K8108, a 20-kb PacBio SMRTbell library was prepared by Genewiz, with 15-kb Blue Pippin size selection performed prior to sequencing on a PacBio Sequel system. The K8108 PacBio Sequel genomic reads, yielding 69 Gbp of sequence data, were error corrected using MECAT; following parameter optimization for contiguity and completeness, the longest corrected reads yielding 50x coverage were assembled with MECAT's mecat2canu adaptation of the Canu assembly workflow, using an estimated genome size of 500 Mbp and an estimated residual error rate of 0.02. The resulting assembly underwent further base pair-level error correction using the Arrow polishing tool from PacBio SMRTTools v5.1.0.26412. The MT2006 genome was sequenced using the Pacific Biosciences platform. The DNA, sheared to >10 kb using Covaris g-Tubes, was treated with exonuclease to remove single-stranded ends and with DNA damage repair mix, followed by end repair and ligation of blunt adapters using SMRTbell Template Prep Kit 1.0. The library was purified with AMPure PB beads and size selected with BluePippin at a >6 kb cutoff size. PacBio sequencing primer was then annealed to the SMRTbell template library, and sequencing polymerase was bound to them using Sequel Binding kit 2.0.
The prepared SMRTbell template libraries were then sequenced on a Pacific Biosciences Sequel sequencer using v2 sequencing primer, 1 M v2 SMRT cells, and Version 2.0 sequencing chemistry with 1 × 360 and 1 × 600 sequencing movie run times. The Phakopsora pachyrhizi MT2006 v1.0 genome was sequenced with PacBio, assembled with MECAT, polished with Arrow, and annotated with the JGI Annotation Pipeline. For UFV02, the PromethION platform of Oxford Nanopore Technologies was used for long-read sequencing at KeyGene N.V. The libraries with long DNA fragments were constructed and sequenced on the PromethION platform. For the UFV02 genome assembly, subsets of the longest nanopore reads at 15, 20, 25, 30, 34, 40 and 56x coverage were assembled using the Minimap2 and Miniasm pipeline. To improve the consensus, error correction was performed three times with Racon using all the nanopore reads. The resulting assembly was polished with 50x Illumina PCR-free 150 bp paired-end reads, mapped with bwa and processed with Pilon; this polishing was repeated three times. We assessed the BUSCO scores after each step to compare the improvement in the assemblies.
The gene predictions and annotations were performed in the P. pachyrhizi genomes K8108, MT2006 and UFV02 in parallel using the JGI Annotation Pipeline.
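The read-selection step used for the K8108 and UFV02 assemblies, keeping the longest reads up to a target coverage of the estimated genome size, can be sketched as follows. This is an illustrative sketch only, not the code used by MECAT or Miniasm; the function name and the toy read lengths are hypothetical. For scale, 69 Gbp of raw K8108 reads over an estimated 500 Mbp genome corresponds to roughly 138x, from which a 50x subset of the longest corrected reads was assembled.

```python
def select_longest_reads(read_lengths, genome_size, target_coverage):
    """Keep the longest reads until their summed length reaches
    target_coverage * genome_size.

    Returns the selected read lengths (longest first) and the coverage
    actually reached, which can slightly exceed the target because whole
    reads are kept.
    """
    target_bases = genome_size * target_coverage
    selected = []
    total_bases = 0
    for length in sorted(read_lengths, reverse=True):
        if total_bases >= target_bases:
            break
        selected.append(length)
        total_bases += length
    return selected, total_bases / genome_size


# Toy example with hypothetical read lengths (bp) and a 100 bp "genome".
reads = [10, 50, 20, 40, 30]
subset, coverage = select_longest_reads(reads, genome_size=100, target_coverage=1)
print(subset, coverage)
```

Repeating the selection for several target coverages (for example 15x to 56x, as for UFV02) and comparing the resulting assemblies is one way to choose a subset that balances contiguity and completeness.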
TE masking was done during the JGI procedure, which detects and masks repeats and TEs. Later, the extensive TE classification performed with REPET was imported and visualized as a supplementary track on the genome portals. RNAseq data from each isolate were used as intrinsic support information for the gene callers from the JGI pipeline. The gene prediction procedure identifies a series of gene models at each gene locus and proposes the best gene model to define a filtered gene catalogue. Translated proteins deduced from gene models are further used for functional annotation according to international reference databases. All the annotation information is collected in an open public JGI genome portal in MycoCosm with dedicated tools for community-based annotation. In total, 18,216, 19,618 and 22,467 gene models were predicted from K8108, MT2006 and UFV02, respectively, of which 10,492, 10,266 and 9,987 genes were functionally annotated. We performed differential expression analyses using the germinated spores as a reference point in each of the three isolates. A total of 3,608 common differentially expressed genes were identified in at least one condition shared between two or more isolates.
For expression analysis, 11 different stages were evaluated, with eight stages having an overlap of two or more isolates. These stages were numbered 1-11, as illustrated in Fig. 3c. For K8108, seven in vitro, one on planta and eight in planta samples, each with three biological replicates, were generated and used to prepare RNA libraries. To obtain in vitro germ tubes and fungal penetration structures, a polyethylene foil was placed in glass plates and inoculated with a spore suspension. Each biological replicate corresponded to 500 cm² of foil and ~4 mg of urediospores. The plates were incubated at 22 °C in the dark at saturated humidity for 0.5, 2, 4 or 8 h. After incubation, the spores were collected using a cell scraper.
For the appressoria-enriched sample, the urediospore concentration was doubled and the plates were rinsed with sterile water after 8 h of incubation prior to collection. The material was ground with mortar and pestle in liquid nitrogen. The 0.5 h sample was considered as spore, the 2 h sample as germinated spore, and the rinsed 8 h sample as the appressoria-enriched sample in vitro. The samples of spores collected after 4 and 8 h were not used for expression analysis. To obtain on planta fungal structures, three-week-old soybean plants were inoculated as mentioned above. At 8 hours post-inoculation (HPI), liquid latex was sprayed until complete leaf coverage. After drying, the latex was removed. It contained the appressoria and spores from the leaf surface but no plant tissue. This sample was considered as enriched in appressoria on planta and is exclusive to the K8108 isolate. Three middle leaflets of different plants were bulked for each sample and ground in liquid nitrogen using a mortar and pestle. The inoculated leaf samples were harvested at 10, 24, 72 and 192 HPI for the in planta gene expression studies. For MT2006, the germ tubes and appressoria were produced on polyethylene (PE) sheets, onto which urediospores were finely dusted with household sieves held in a double layer of sifting. The PE sheets were then sprayed with water using a chromatography vaporizer and were kept at 20 °C and 95% humidity in the dark. For germ tubes, the structures were scratched from the PE sheets after 3 h, and for appressoria after 5 h. The formation of both germ tubes and appressoria was checked microscopically. The in vitro samples were only used when there were at least 70% germ tubes or appressoria. The structures were dried by vacuum filtration and stored in 2-ml micro-centrifuge tubes at −70 °C after freezing in liquid nitrogen. The resting spores came directly from storage at −70 °C.
For the in planta samples, 21-day-old plants of the soybean cultivar Thorne were sprayed with a suspension containing 0.01% Tween-20, 0.08% milk powder and 0.05% urediospores. The inoculated plants were kept as mentioned previously.