Phylogenomic analyses and distribution of terpene synthases among Streptomyces

Terpene synthases are widely distributed among microorganisms and have been studied mainly in members of the genus Streptomyces. However, little is known about the distribution and evolution of the genes encoding terpene synthases. Here, we performed a whole-genome-based phylogenetic analysis of Streptomyces species and compared the distribution of terpene synthase genes among them. Overall, our study revealed that ten major types of terpene synthases are present within the genus Streptomyces, namely those for geosmin, 2-methylisoborneol, epi-isozizaene, 7-epi-α-eudesmol, epi-cubenol, caryolan-1-ol, cyclooctat-9-en-7-ol, isoafricanol, pentalenene and α-amorphene. Based on their whole genomes, the Streptomyces species divide into three phylogenetic groups, across which the distribution of the ten terpene synthases was analysed. Geosmin synthases were the most widely distributed and were found to be under positive selection. Other terpene synthases were specific to one of the three clades or to a subclade within the genus Streptomyces. The phylogenies of the most widely distributed classes of Streptomyces terpene synthases are discussed in comparison with the phylogenomic analysis of this genus.


Tables

Table S1. Summary of the orthologue analysis based on 93 Streptomyces genomes using OrthoFinder. Number of single-copy orthogroups: 575.
Table S2. Streptomyces genomes used for constructing the phylogenetic trees in Figure 1 and Figure S1.
Table S3. List of geosmin synthases used to build the phylogenetic tree in Figure 3.
Table S4. List of 2-methylisoborneol synthases used to build the phylogenetic tree in Figure 4.
Table S5. List of epi-isozizaene synthases used to build the phylogenetic tree in Figure 5.
Table S6. List of geosmin synthases used to build the DTL tree in Figure S6.
Table S7. List of 2-methylisoborneol synthases used to build the DTL tree in Figure S7.
Table S8. List of epi-isozizaene synthases used to build the DTL tree in Figure S8.
Table S9. Habitats of the Streptomyces species represented in the whole-genome-based phylogenetic tree. The species are separated according to the three phylogenetic clades (green, blue and red) shown in the phylogenetic tree in Figure 1.

[...] node. Node labels in grey indicate gene loss, and node labels in black (bold) indicate congruence between the enzyme tree and the species tree. The names on the node labels refer to a particular enzyme and the species harbouring it (see Table S6).

Figure S7. DTL analyses of 2-methylisoborneol synthases. T, Transfer node. Blue square, Speciation node. Node labels in grey indicate gene loss, and node labels in black (bold) indicate congruence between the enzyme tree and the species tree. The names on the node labels refer to a particular enzyme and the species harbouring it (see Table S7).

Figure S8. DTL analyses of epi-isozizaene synthases. T, Transfer node. Blue square, Speciation node. Node labels in grey indicate gene loss, and node labels in black (bold) indicate congruence between the enzyme tree and the species tree. The names on the node labels refer to a particular enzyme and the species harbouring it (see Table S8).

[...] domain of geosmin synthases were detected under positive selection. In accordance with this, the N-terminal part of geosmin synthase was shown to be highly conserved among Streptomyces and essential for the conversion of FPP to germacradienol and germacrene [95].

Phylogeny of terpene synthases does not correspond to species-level taxonomy

NOTUNG analyses [96] were performed to reconcile each terpene synthase tree (the associate tree) with the species tree (the reference tree). Under the cost matrix for duplications, transfers and losses used by TreeFix, NOTUNG recovered most parsimonious scenarios with 25 putative transfers and 13 corresponding losses for the geosmin synthases. However, NOTUNG failed to infer any events for the epi-isozizaene and 2-methylisoborneol (2-MIB) synthases, which can be explained by inherent topological errors in the maximum-likelihood (ML) trees for these enzymes. TreeFix-DTL (duplication-transfer-loss) was therefore used to correct topological inconsistencies in all available terpene synthase trees, including the geosmin synthase tree. The DTL reconciliation problem is typically solved in a parsimony framework, in which costs are assigned to DTL events and the goal is to find the reconciliation with minimum total cost [97]. For each category, TreeFix-DTL minimised the DTL cost and returned the tree with minimum reconciliation cost among all candidate trees whose likelihood is statistically equivalent to that of the ML tree (Figures S6-S8). Accordingly, the subsequent NOTUNG analysis successfully recovered a minimal number of events in all three categories. NOTUNG inferred a total of 19 transfers and 10 losses when reconciling the geosmin synthase tree with the whole-genome-based Streptomyces species tree.
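
To make the parsimony criterion concrete, the following minimal Python sketch computes the total cost of a reconciliation as the weighted sum of its duplication, transfer and loss events, and applies it to the two geosmin synthase scenarios reported above. The event costs (duplication 2, transfer 3, loss 1) are an assumption chosen for illustration, since the cost matrix is not given in the text, and the duplication count of zero is likewise assumed because no duplications are reported. The names `EventCounts` and `reconciliation_cost` are hypothetical and are not part of NOTUNG or TreeFix-DTL.

```python
from dataclasses import dataclass

# Assumed event costs for illustration only; the text does not state the values used.
COST_DUPLICATION = 2
COST_TRANSFER = 3
COST_LOSS = 1


@dataclass(frozen=True)
class EventCounts:
    """Number of inferred events in one gene-tree/species-tree reconciliation."""
    duplications: int
    transfers: int
    losses: int


def reconciliation_cost(events: EventCounts,
                        dup_cost: int = COST_DUPLICATION,
                        transfer_cost: int = COST_TRANSFER,
                        loss_cost: int = COST_LOSS) -> int:
    """Total parsimony cost: each event type weighted by its assigned cost."""
    return (events.duplications * dup_cost
            + events.transfers * transfer_cost
            + events.losses * loss_cost)


# Geosmin synthase scenarios from the text; zero duplications is an assumption.
ml_tree = EventCounts(duplications=0, transfers=25, losses=13)
corrected_tree = EventCounts(duplications=0, transfers=19, losses=10)

print("ML tree cost:", reconciliation_cost(ml_tree))
print("Corrected tree cost:", reconciliation_cost(corrected_tree))
```

Under these assumed costs the corrected tree yields the lower total, which matches the direction of the reported change from 25 transfers and 13 losses to 19 transfers and 10 losses. The absolute values depend entirely on the chosen cost matrix.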
Similarly, the number of transfer/loss in epi-isozizaene-species tree reconciliation