DNA-dependent RNA polymerases IV and V (Pol IV and Pol V) are multi-subunit enzymes that occur in plants. The origin of Pol V, specific to angiosperms, from Pol IV, which is present in all land plants, is linked to the duplication of the gene encoding the largest subunit and the subsequent subneofunctionalization of the two paralogs (NRPD1 and NRPE1). Additional duplication of the gene encoding the second-largest subunit, NRPD2/NRPE2, has happened independently in at least some eudicot lineages, but its paralogs are often subject to concerted evolution and gene death, and little is known about their evolution or their affinity with Pol IV and Pol V.
We sequenced a ~1500 bp NRPD2/E2-like fragment from 18 Viola species, mostly paleopolyploids, and 6 non-Viola Violaceae species. Incongruence between the NRPD2/E2-like gene phylogeny and the species phylogeny indicates a first duplication of NRPD2 relatively basally in Violaceae, with subsequent sorting of paralogs in the descendants, followed by a second duplication in the common ancestor of Viola and Allexis. In Viola, the mutation pattern suggested (sub-)neofunctionalization of the two NRPD2/E2-like paralogs, NRPD2/E2-a and NRPD2/E2-b. The dN/dS ratios indicated that a 54 bp region was under strong positive selection in both paralogs immediately following duplication. This 54 bp region encodes a domain that is involved in the binding of the Nrpd2 subunit to other Pol IV/V subunits, and may be important for correct recognition of subunits specific to Pol IV and Pol V. Across all Viola taxa, 73 NRPD2/E2-like sequences were obtained, of which 23 (32%) were putative pseudogenes, all occurring in polyploids. The NRPD2 duplication was conserved in all lineages except the diploid MELVIO clade, in which NRPD2/E2-b was lost, and its allopolyploid derivatives from hybridization with the CHAM clade, section Viola and section Melanium, in which NRPD2/E2-a occurred in multiple copies while NRPD2/E2-b paralogs were either absent or pseudogenized.
Our data indicate that, following the relatively recent split of Pol IV and Pol V, these two multi-subunit enzymes are still in the process of specialization, each acquiring fully subfunctionalized copies of its subunit genes. Even after specialization, the NRPD2/E2-like paralogs are prone to pseudogenization and gene conversion, and NRPD2 and NRPE2 copy number is highly dynamic, modulated by allopolyploidy and gene death.