Protein–RNA condensation kinetics via filamentous nanoclusters
Ramon Peralta-Martinez and Araceli Visentin contributed equally to this study.
Review Editor: Jean Baum
Abstract
Protein–RNA phase separation is at the center of membraneless biomolecular condensates governing cell physiology and pathology. Using an archetypical viral protein–RNA condensation model, we determined the sequence of events that starts with sub-second formation of a protomer with two RNAs per protein dimer. Association of additional RNA molecules to weaker secondary binding sites in this protomer kickstarts crystallization-like assembly of a molecular condensate. Primary nucleation is faster than the sum of secondary nucleation and growth, which is a multistep process. Protein–RNA nuclei grow over hundreds of seconds into filaments and subsequently into nanoclusters with approximately 600 nm diameter. Cryoelectron microscopy reveals an internal structure formed by incoming layers of protein–RNA filaments made of ribonucleoprotein oligomers, reminiscent of genome packing of a nucleocapsid. These nanoclusters progress to liquid condensate droplets that undergo further partial coalescence to yield typical hydrogel-like protein–RNA coacervates that may represent the scaffold of large viral factory condensates in infected cells. Our integrated experimental kinetic investigation exposes rate-limiting steps and structures along a key biological multistep pathway present across life kingdoms.
1 INTRODUCTION
Dynamic biochemical compartmentalization within cells is achieved by the formation of biomolecular condensates based on liquid–liquid phase separation (LLPS) principles from polymer chemistry and tightly associated with the concept of membraneless organelles (Banani et al. 2017; Hyman et al. 2014). Examples of biomolecular condensates relate to most aspects of biology both in physiology and pathology (Banani et al. 2017; Chakravarty et al. 2022; Mehta and Zhang 2022; Shin and Brangwynne 2017). Since the early discovery of the liquid nature of P granules (Brangwynne et al. 2009), protein–RNA condensation has been at the center of most of these sub-cellular entities (Guo and Shorter 2015; Putnam et al. 2023), many associated with gene function (Hirose et al. 2023), stress and immunity (Glauninger et al. 2022; Goswami et al. 2023; Maharana et al. 2022; Ripin and Parker 2022; Wang and Zhou 2024), and often associated with protein aggregopathies (Alberti and Dormann 2019). A wealth of reports has described how protein–RNA condensation is modulated by RNA sequence, length, and reentrant behavior, with an impact on condensate size, shape, viscosity, surface tension, and composition (Laghmach et al. 2022; Maharana et al. 2018; Netzer et al. 2024; Portz and Shorter 2021; Roden and Gladfelter 2021; Stringer et al. 2023; Yamazaki et al. 2022).
Viral replication sites, also known as viral factories, viral inclusion bodies, or viroplasms, were shown to be liquid in nature in rabies and respiratory syncytial viruses (Nikolic et al. 2017; Rincheval et al. 2017). Since then, biomolecular condensation of viral factories has emerged as a common theme among most viruses investigated to date, including SARS-CoV-2 (Brocca et al. 2020; Chau et al. 2023; Lopez et al. 2021; Martin et al. 2024; Sehgal et al. 2020; Wu et al. 2022a; Zhang et al. 2023). The nucleocapsid protein from SARS-CoV-2 (N) was shown to be a main driver for LLPS together with RNA (Carlson et al. 2020; Cubuk et al. 2021; Jack et al. 2021; Perdikari et al. 2020). Since the initial studies during the COVID-19 pandemic, a vast number of reports on N condensation have produced a wealth of information, thoroughly reviewed and systematically compared by Cascarina and Ross (2022). The conclusions of that review are a fair summary of the current knowledge on the system: (i) a role of N in virion assembly, genome packing, and polymerase recruiting; (ii) condensation is RNA-dependent, electrostatically driven with little sequence specificity; (iii) RNA concentration, length, base composition, and structure can alter material properties and produce reentrant behavior; (iv) although it bears an ideal LLPS-prone architecture including RNA binding, dimerization, and substantial intrinsic disorder, there is no consensus on the participation of the different domains in condensation; (v) proteolysis, mutation, and post-translational modification can influence N protein phase separation behavior in vitro and in vivo; (vi) condensate morphology, size, and material properties are highly condition-sensitive and variable in different reports.
The proposed and likely most relevant functions for this strong tendency of N to undergo heterotypic LLPS with RNA are recruitment of N protein to stress granules, selective condensation of the viral genome (gRNA), regulation of host-cell innate immune pathways, transcription modulation, recruitment of the viral RNA-dependent RNA polymerase (RdRp) to viral replication centers, and anchoring of N-gRNA condensates at the endoplasmic reticulum–Golgi membrane prior to virus budding (Cascarina and Ross 2022).
Genome condensation can be linked both to replication by the polymerase and accessory proteins and to the formation of a nucleocapsid that must pack into a virion. In single-stranded RNA virus assembly, genome packaging and capsid assembly are tightly linked and are considered to take place concomitantly (Bailey et al. 2009). This process is thought to involve weak protein–protein and non-specific electrostatic genome–protein interactions, including packing initiation signals in some cases, and participation of RNA secondary structure (Cascarina and Ross 2022). A recent model paradigm involves the action of multiple cooperative contacts between nucleocapsid protein and RNA genome (Bailey et al. 2009). Some of these concepts are shared with protein–RNA condensation and suggest a link between the two processes. A recent quantitative investigation of the self-assembly kinetics of a viral capsid around its genome revealed details on primary protein binding to the genome, via a nucleation and growth mechanism (Garmann et al. 2019).
In this work, we focus on N-RNA as a minimal model for investigating the kinetic mechanism, that is, the sequence of events and species involved in a protein–RNA condensation reaction, which impacts RNA virus gene function and assembly. We uncover the sequence of events that involve nucleation and a combination of first- and second-order reactions, applying established models of self-assembly. We were able to obtain snapshots of intermediate species that can be characterized as self-limited and monodisperse nanoclustered condensates, with intriguing filament-like composition, which subsequently evolve into partially coalescing hydrogel-like coacervates. We discuss our findings in the light of fundamental protein–RNA condensation mechanisms and viral genome condensation and virion packing.
2 RESULTS
2.1 RNA binding and oligomerization equilibria preceding condensation of N
A myriad of reports with high variations in experimental conditions yielded different, sometimes conflicting, results, which suggest a high sensitivity of the system to the conditions (Cascarina and Ross 2022). In addition, to obtain a homogeneous and pure protein, free of any trace of RNA or conformational heterogeneities, we performed a stringent purification method involving two unfolding–refolding steps (see section 4). Two particularly sensitive parameters that impact LLPS are the oligomeric state of the protein and RNA binding (Cascarina and Ross 2022). The oligomeric state in our selected buffer conditions was evaluated by performing multi-angle light scattering size exclusion chromatography (SEC-MALS). The protein is present predominantly as a dimer (ND) (Figure 1a). However, variations in protein concentration indicate that a monomer–dimer equilibrium is established with a slow exchange between the species that allow us to observe them as individual peaks (Figure 1b). Although the characteristics of the SEC technique do not allow us to obtain an accurate KD, the ratio of both species shows a dissociation transition occurring with a midpoint around 0.5 μM of the injected sample (Figure 1c). Considering a 20-fold dilution in the column, this would correspond to a KD around 25 nM, in agreement with a previous report (Zhao et al. 2021).

It is therefore safe to assume that, in our experimental conditions, the protein will not populate monomeric species (NM) above 2 μM. Dilution experiments using dynamic light scattering (DLS) allowed us to observe two consecutive transitions at higher protein concentrations (Figure 1d). At 5 μM, the dimer is predominant; from 20 to 90 μM, the predominant species is a tetramer; and above 100 μM, an octameric species is formed.
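As a rough check on the monomer–dimer estimate above, the following minimal sketch computes the monomer fraction expected for an ideal two-state 2M ⇌ D equilibrium. It assumes that KD = [M]²/[D], that concentrations are expressed in monomer units, and that the 20-fold column dilution applies uniformly; the function name `monomer_fraction` is hypothetical and not part of the original analysis.

```python
import math

def monomer_fraction(total_um, kd_um):
    """Fraction of protein mass present as monomer for an ideal 2M <-> D equilibrium.

    total_um: total protein concentration in monomer units (uM)
    kd_um:    dimer dissociation constant, KD = [M]^2 / [D] (uM)
    """
    # Mass balance: total = [M] + 2[D] = [M] + 2[M]^2 / KD, solved for [M]
    monomer = (-kd_um + math.sqrt(kd_um ** 2 + 8.0 * kd_um * total_um)) / 4.0
    return monomer / total_um

# 0.5 uM injected midpoint with an assumed 20-fold column dilution -> 0.025 uM (25 nM)
kd = 0.5 / 20.0

for total in (0.025, 0.5, 2.0, 5.0):
    print(f"{total:6.3f} uM total -> {100 * monomer_fraction(total, kd):4.1f}% monomer")
```

Under these assumptions, half of the protein mass is monomeric when the total concentration equals KD, and less than 10% of the protein is expected to be monomeric at 2 μM.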
RNA binding is also a fundamental aspect for the condensation mechanism, and therefore we tackled a quantitative assessment. For this, we followed the intrinsic fluorescence changes of the tryptophan signal upon the addition of RNA, which yields a substantial quenching. Three different RNAs were used: (i) TRS-10 (UCUCUAAACG), a 10-mer sequence previously described as relevant for coronavirus life cycle (Alonso et al. 2002); (ii) TRS-15 (UUCUCUAAACGAACU), which extends five additional nucleotides; and (iii) Random-15 (AGUUGAGUUGAGUUG), a 15-mer RNA formed by a repeat of a 5-mer random sequence (AGUUG). We determined a binding stoichiometry of 1:1 NM:RNA from a titration/saturation experiment under this concentration regime (Figure 1e, inset). Under N-RNA dissociation conditions, we determined the affinity for the three different RNAs using a single-site binding model and observed no significant differences in affinity (Figure 1e), suggesting that the protein does not display high sequence specificity in short RNA sequences (10–15 mer). Additionally, to gain insight into the binding kinetics, we determined an observed rate of the RNA binding at 0.4 μM NM and 50 nM of TRS-10 RNA to be 0.88 s−1, corresponding to a t1/2 of 0.78 s, indicating the primary binding is completed in under 5 s (Figure 1f).
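The relation between the observed binding rate and its half-time, and the single-site binding model used for the titrations, can be sketched as follows. This is an illustration only: it assumes a single-exponential (pseudo-first-order) binding trace and uses the exact quadratic solution for 1:1 binding. The KD of 0.165 μM is an assumed illustrative value, and the function names are hypothetical.

```python
import math

def half_time(k_obs):
    """Half-time (s) of a single-exponential process with rate k_obs (s^-1)."""
    return math.log(2.0) / k_obs

def fraction_complete(k_obs, t):
    """Fraction of a single-exponential reaction completed at time t (s)."""
    return 1.0 - math.exp(-k_obs * t)

def complex_conc(protein, rna, kd):
    """Concentration of 1:1 complex from the exact single-site quadratic solution.

    All arguments share the same concentration unit (uM here).
    """
    s = protein + rna + kd
    return (s - math.sqrt(s * s - 4.0 * protein * rna)) / 2.0

k_obs = 0.88  # s^-1, observed at 0.4 uM NM and 50 nM TRS-10
print(f"t1/2 = {half_time(k_obs):.2f} s")
print(f"fraction bound after 5 s = {fraction_complete(k_obs, 5.0):.3f}")

# Assumed illustrative KD (uM); not a fitted value from this sketch
c = complex_conc(0.4, 0.05, 0.165)
print(f"fraction of RNA bound at equilibrium = {c / 0.05:.2f}")
```

With these numbers, more than 98% of the binding amplitude has developed by 5 s, which is the sense in which primary binding is complete on that time scale.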
2.2 Modulation of N condensation by short versus structured RNA models
As the oligomeric state of the protein depends on concentration, we refer to monomeric concentrations in this section. The tendency for homotypic LLPS of N was evaluated by a phase diagram composed of protein and crowder concentration, where few if any droplets can be observed in the absence of crowder, even at a high concentration at which the tetrameric species is expected (47 μM) (Figure S1a, Supporting Information). In the presence of PEG, we observe a transition from small bead-like sticky droplets to regular spherical droplets, suggestive of different physicochemical properties between 1.9% and 3.8% crowder (Figure 2a,b).

Next, we analyzed the formation and modulation of heterotypic N:RNA LLPS by performing phase diagrams varying the concentration of both components. The 15-mer RNA (which shows no difference in binding affinity to TRS15, as shown in Figure 1e) forms coalescent droplets that increase in size with higher protein and RNA concentrations, as expected (Figure 2e). The increase in droplet size correlates with the absorbance signal at 370 nm (Figure S1c; Pearson correlation coefficient 0.92). Noticeably, there is a sharp transition boundary that shows a condensation onset at 2:1 RNA:N (Figure 2c, red boundary) and not 1:1 RNA:N stoichiometric binding ratio (Figure 1e), not even at high concentrations (Figure 2c). Below the red boundary, no LLPS is observed in excess N, and the opposite applies to excess RNA above the red boundary. Condensates at 2 RNA15:1N ratio disappear upon dilution (Figure 2c, diagonal arrow down). The propensity of N to condense with RNA10 is negligible compared to RNA15 under a similar concentration regime (Figure 2d) and requires lowering the ionic strength (Figure S1b).
N protein wraps and condenses the 30 kb genome of the virus and binds to RNA transcripts, implying interaction with longer and heterogeneous sequences. To test a simple model for the heterogeneous, variable sequences and structures encountered in the cellular environment, we made use of total purified yeast tRNA, with an average length of 70 bases. The phase diagram shows that condensation has an optimal stoichiometric window of 1 tRNA:3.5N (Figure 2e, red boundaries), with a transition to a single phase above and below that ratio. The difference between RNA15 and tRNA is clearly observed when we monitor turbidity along the RNA:protein ratio range, showing a finely tuned reentrant phase for the tRNA but not for the 15-mer (Figure 2c,e,g). Notably, RNA15-N condensates are regular coalesced droplets, while the tRNA condensates have a more rigid appearance and tend to stick together with only partial coalescence at higher concentration, typical of hydrogel-like coacervates. We confirmed that both protein and RNA are present in the droplets of RNA15 as well as tRNA (Figure 2f), but we chose not to use chemically modified N to avoid or minimize possible effects of the fluorophore labeling.
While the tRNA-N condensates are highly sensitive to an increase in the ionic strength (Figure 3a, left) as expected for a charge coacervate of oppositely charged polymers, the crowder enhanced homotypic condensate is increased at high salt (Figure 3a, right), strongly suggesting that condensation leads on average to formation of unfavorable electrostatic interactions relative to N in solution. Conversely, the heterotypic condensate is less stable at high salt concentrations. This suggests that condensation of N and tRNA, which have charges of opposite signs, leads on average to formation of favorable electrostatic interactions relative to the two molecules in solution. Overall, this set of experiments allows us to define controlled solution conditions and macromolecular ratios, based on the different phase diagrams observed, for tackling a quantitative mechanistic investigation.

2.3 Kinetic mechanism of N-RNA condensation
Comprehensive understanding of a condensation mechanism involves the dissection of the initial, intermediate, and endpoint species present, their rates of conversion, and the sequence of events. To this end, we investigated the evolution of turbidity with time, measured as apparent absorbance due to light scattering at 370 nm, which is indicative of condensation (Figure 2g). We triggered the reaction by adding RNA15 (Figure 4a) or tRNA (Figure 4b) to pre-incubated N at different concentrations, while keeping a fixed RNA:N ratio (2:1 and 1:3.5, respectively, Figure 2c,e). We monitored the change in the absorbance signal over hundreds to thousands of seconds (Figure 4a,b) and found no evidence of fast burst-phase condensation events in either case, as no significant changes in absorbance can be observed within the experimental deadtime (~15 s). The kinetic traces of both condensation processes can be described empirically using a sum of two exponential terms (Figure S2a for RNA15 and Figure S2b for tRNA). The fitted amplitudes of the fast and slow processes are close to 50% each. The empirical rate constants in the case of N-RNA15 condensation are 0.0085 ± 0.0001 s−1 (kobs1) and 0.00084 ± 0.00001 s−1 (kobs2) at 1 μM ND (4 μM RNA15). Rates of 0.0081 ± 0.0002 s−1 (kobs1) and 0.0021 ± 0.0001 s−1 (kobs2) were determined for tRNA condensation at 1 μM ND (0.6 μM tRNA). Thus, although no significant differences are observed for kobs1, tRNA condensation showed a kobs2 about 2.5-fold higher than RNA15 condensation. Moreover, the total amplitude of both processes decreases at low protein concentrations and displays a similar condensation onset at ~0.7 μM N-RNA (Figure S2c). No lag phase was observed for N condensation under these conditions (Figure 4a,b). However, a lag phase could take place within our experimental deadtime, depending on the conditions and the values of the rate constants.
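The empirical two-exponential description can be illustrated with a short sketch that uses the rate constants quoted above. It assumes a normalized signal with equal amplitudes (50% each) for the fast and slow phases; the helper names `double_exponential` and `time_to_fraction` are hypothetical and serve only to illustrate the comparison between the two RNAs.

```python
import math

# Empirical rate constants (s^-1) quoted in the text at 1 uM N dimer: (kobs1, kobs2)
RATES = {"RNA15": (0.0085, 0.00084), "tRNA": (0.0081, 0.0021)}

def double_exponential(t, k_fast, k_slow, amp_fast=0.5, amp_slow=0.5):
    """Normalized turbidity progress as a sum of two exponential phases."""
    return (amp_fast * (1.0 - math.exp(-k_fast * t))
            + amp_slow * (1.0 - math.exp(-k_slow * t)))

def time_to_fraction(target, k_fast, k_slow):
    """Bisection for the time (s) at which the normalized signal reaches `target`."""
    lo, hi = 0.0, 20.0 / k_slow  # the signal is essentially complete at the upper bound
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if double_exponential(mid, k_fast, k_slow) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for name, (k1, k2) in RATES.items():
    t50 = time_to_fraction(0.5, k1, k2)
    early = double_exponential(15.0, k1, k2)
    print(f"{name}: t50 = {t50:.0f} s, normalized signal at 15 s = {early:.3f}")

print("kobs2 ratio (tRNA / RNA15):", round(RATES["tRNA"][1] / RATES["RNA15"][1], 2))
```

Because kobs1 is nearly identical for the two RNAs, the difference in the overall half-time in this sketch arises almost entirely from the slow phase.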

To gain insight into the N-RNA condensation mechanism, we first applied the model of Martins and coworkers, originally developed to describe amyloid fibril formation but applicable to any association process involving multiple molecules (Crespo et al. 2012; Sárkány et al. 2024). This model describes condensation by a general mechanism involving supersaturation-dependent nucleation and growth steps. The concentration dependence of condensation can be used to extract information on primary and secondary nucleation, growth, and the presence of parallel pathways. The model considers the formation of a primary nucleus from molecules in the solution, growth of the primary nucleus, and the formation of secondary nuclei on the surface of the primary nucleus. The analysis of experimental data quantifies the relative importance of the kinetic steps of primary nucleation, secondary nucleation, and growth. We fitted the model to normalized mass-progress curves (Figure 4a,b, data in red, fitting in black) and to the scaling of t1/2 versus the initial N dimer concentration (Figure 4c,d). The model describes the concentration dependence of N-RNA15 condensation very well (R2 0.987; Figure 4c), with a fitted value for the critical solubility (Cc) of 0.698 μM ND, in excellent agreement with that obtained from extrapolation of the total condensation amplitude (Figure S2c). The fitted autocatalytic rate (ka) is 0.0031 s−1, which corresponds to the sum of the rate constants for growth (k+) and secondary nucleation (k2). Mass-progress curves alone are unable to distinguish between these two processes; doing so requires a detailed analysis using multiple size-progress curves at different concentrations. The dimensionless nucleation rate (kb = kn/ka) is 0.999. This value is greater than 0.1, indicating that, according to the model, primary nucleation is fast relative to the sum of the rate constants for growth and secondary nucleation.
A global fit of the data yielded a R2 value lower than 0.95 (not shown), suggesting plausible parallel processes such as coalescence and/or off-pathway condensation. Overall, we find that N-RNA15 condensation can be described using the crystallization-like assembly model, where primary nucleation is relatively fast and parallel processes are likely present.
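The shape of the mass-progress curve implied by the fitted parameters can be sketched as below. The closed form used here, α(t) = 1 − 1/(kb(e^{ka·t} − 1) + 1), is the expression commonly quoted for the crystallization-like model and is taken as an assumption that should be checked against Crespo et al. (2012). The sketch also ignores the explicit supersaturation (Cc) dependence of the rate constants, and the function names are hypothetical.

```python
import math

def alpha(t, ka, kb):
    """Normalized condensed mass at time t (s).

    ka: autocatalytic rate (growth + secondary nucleation), s^-1
    kb: dimensionless primary nucleation rate, kn / ka
    Assumed closed form: alpha = 1 - 1 / (kb * (exp(ka * t) - 1) + 1)
    """
    return 1.0 - 1.0 / (kb * (math.exp(ka * t) - 1.0) + 1.0)

def t_half(ka, kb):
    """Time (s) at which alpha = 0.5, obtained by inverting the expression above."""
    return math.log(1.0 + 1.0 / kb) / ka

ka, kb = 0.0031, 0.999  # fitted values reported for N-RNA15 condensation
print(f"t1/2 = {t_half(ka, kb):.0f} s")
for t in (0, 60, 300, 1200):
    print(f"t = {t:5d} s  alpha = {alpha(t, ka, kb):.3f}")

# For comparison: a small kb (slow primary nucleation) delays the half-time
print(f"t1/2 with kb = 0.01: {t_half(ka, 0.01):.0f} s")
```

In this form, kb close to 1 reduces the curve to a simple exponential with rate ka and no lag phase, whereas small kb values give a sigmoidal trace with a pronounced lag.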
On the other hand, the model describes the concentration dependence of N-tRNA condensation poorly, due to the process slowing down from 2 to 3 μM ND (Figure 4d; R2 –0.76). The poor quality of the fit prevented us from extracting values for the autocatalytic rate (ka) or the dimensionless nucleation rate (kb = kn/ka). However, the v-shaped dependence of t1/2 does carry relevant mechanistic information, since it suggests the dominance of secondary nucleation over nucleus growth in the crystallization-like assembly process and the additional presence of soluble and stable off-pathway aggregates (see Silva et al. 2018, fig. 6e,f).
As a complementary approach, we made use of the model by Zlotnick et al. (1999) for the kinetically limited assembly of a molecular condensate. Although this model was originally developed to describe the assembly of viral capsids, it may be used to study other association processes. The kinetically limited assembly model describes condensation as a cascade of low-order association reactions, where a rate-limiting “nucleation” step is followed by faster elongation steps.
The concentration dependence of condensation can be used to extract information on both the nucleation and the elongation steps. The size of the nucleus can be calculated from the slope of a double logarithmic plot of the concentration of condensed versus free N-RNA at a fixed time point early in the reaction, typically at a time when the kinetics is well described by a straight line (Zlotnick et al. 1999). In the case of N-RNA15 and N-tRNA condensation, we calculated the concentration of condensed and free N-RNA in the linear range. Using this approach, we obtain a slope of 1.94 ± 0.21 for N-RNA15 condensation, corresponding to a nucleus formed by two N-RNA15 protomers (Figure 4e). Each protomer corresponds to the stable N dimer–RNA15 stoichiometric complex (Figure 1f) and is referred to as the nucleation protomer.
In the case of N-tRNA condensation, a slope of 0.96 ± 0.40 indicates a nucleus formed by one N-tRNA dimer as the protomer (Figure 4f). We interpret that the slow early step likely involves association of two N dimers after RNA binding in N-RNA15 condensation, and a conformational rearrangement of the initial N dimer after tRNA binding in N-tRNA condensation.
Moreover, the reaction order of the faster elongation step can be determined from the slope of a double logarithmic plot of the initial rate in the reaction (typically at a time when the kinetics are well described as a straight line) versus the initial concentration of N dimer protomers. This analysis yields slopes of 2.89 ± 0.17 and 2.92 ± 0.55 for N-RNA15 and N-tRNA condensation, respectively (Figure 4g,h). This indicates that the reaction order of the elongation reactions for both N-RNA15 and N-tRNA condensation is close to three. There are two possible interpretations for this number. The first one is that the elongation step is a single-step, elementary reaction that involves simultaneous collision of three molecular protomers. We interpret this as improbable because ternary collisions with the right orientation and energy are rare, and the resulting reaction would likely be slower than the competing processes. The second interpretation is that the elongation step in both N-RNA15 and N-tRNA condensation is a multi-step process that involves the incorporation of multiple N-RNA protomers to the growing nucleus. In summary, the kinetically limited assembly model allowed us to describe the nucleation and elongation steps for both N-RNA15 and N-tRNA condensation. The two processes show clear differences and additional complexities to be characterized in future work.
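Both the nucleus size and the elongation order are read from the slope of a double logarithmic plot. A minimal sketch of that slope estimate is shown below, using ordinary least squares on log-transformed values. The concentrations and rates are synthetic, hypothetical numbers generated from an exact power law; they are not data from this study, and the function name `loglog_slope` is an assumption for illustration.

```python
import math

def loglog_slope(x, y):
    """Least-squares slope of log10(y) versus log10(x)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den

# Hypothetical initial N dimer concentrations (uM)
conc = [1.0, 1.5, 2.0, 3.0]

# Synthetic initial rates following an exact third-order dependence (arbitrary prefactor)
initial_rates = [2e-4 * c ** 3 for c in conc]
print("apparent elongation order:", round(loglog_slope(conc, initial_rates), 2))

# Synthetic condensed-versus-free relation for a nucleus of two protomers
condensed = [5e-3 * c ** 2 for c in conc]
print("apparent nucleus size:", round(loglog_slope(conc, condensed), 2))
```

With experimental data, the uncertainty of the slope would also need to be estimated, for instance from the residuals of the linear fit, before interpreting it as an integer reaction order.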
To analyze the contribution of electrostatic interactions to the kinetics of N-tRNA condensation, we evaluated the effect of modifying the salt concentration on the observed rates from exponential analysis (Figure 3c). The absorbance amplitude of the reaction decreased with increasing sodium chloride concentrations (Figure 3b), as expected from the salt dependence of the condensation droplets observed under the microscope (Figure 3a). The two fitted rate constants remain approximately constant upon going from 50 to 100 mM sodium chloride. Since sodium chloride does not have a net effect on the kinetics of N-tRNA condensation, we interpret that the electrostatic interactions stabilizing the condensate are not formed in the rate-limiting transition states along the pathway.
2.4 Characterization of intermediate species along the kinetic condensation pathway
We next addressed the characterization of the intermediates involved along both N-RNA condensation pathways by analyzing the sizes of the species involved using dynamic light scattering (DLS). We used an N concentration of 5 μM monomer, which allows for an accurate determination of sizes, and the same N:RNA ratio as in the kinetic experiments. Under these conditions, the apparent hydrodynamic radius (rh) of the uncondensed N dimer is between 6.5 and 7.0 nm (Figure 5a,b, gray peaks). Upon addition of RNA15 or tRNA, we observe the formation of larger mono-disperse species within 2 min, with apparent rh of 120 and 261 nm, respectively (Figure 5a,b, 2 min red and blue). The apparent hydrodynamic radius of these species increases to 320 nm for RNA15 and 836 nm for tRNA after 30 min. For both reactions, we observe intermediate-sized species of 22.8 ± 2.51 nm and 46.03 ± 5.38 nm on average for RNA15 and tRNA, respectively (Figure 5a, magenta peaks in insets). Notably, the peaks are mono-disperse, indicative of discrete sizes, compatible with a nucleation-growth model. However, tRNA peaks show increased polydispersity as time increases. Finally, at longer times we observe a poly-disperse species larger than 2 μm in size, above the measurement limit of the instrument (Figure 5a, orange peak in insets). We ascribe this species to droplet coalescence events, which become evident in microscopy when the droplets settle.

The evolution of the apparent radius of the major species was also followed in detail at 10 s intervals by DLS, with the data shown in Figure 5b,c, overlaid on mass-progress turbidity curves measured in the same conditions. The time evolution of the apparent rh could be fitted to a single exponential function for both N-RNA15 condensation (kDLS of 0.0024 ± 0.003 s−1) and N-tRNA condensation (0.0006 ± 0.00045 s−1). These values are in very good agreement with the rate constants for the slowest process in the absorbance traces (kobs2) under the same conditions, which are 0.0027 ± 0.0002 s−1 for N-RNA15 condensation and 0.0016 ± 0.00045 s−1 for N-tRNA condensation. This strongly suggests that the slowest process in the mass-progress curves corresponds to the growth of condensates.
With the goal of obtaining a further layer of structural information about the condensation intermediates, we analyzed them by cryoelectron microscopy (cryo-EM). We triggered N-RNA15 condensation by adding 20 μM RNA15 to 5 μM N, and flash-froze the reaction after 30 min, when the turbidity experiments reach a steady state (Figure 4a). Two-dimensional cryo-EM images show semi-regular spherical structures of sizes ranging from 300 to 700 nm (Figure 5d), in agreement with the mono-disperse species found by DLS, considering the expected differences between techniques. Interestingly, close inspection of the images shows a marked internal arrangement in the shape of layered filaments, as opposed to the homogeneous aspect one would expect from a liquid. These filaments are clearly in close contact and in a defined arrangement. Although their precise size and distribution would merit a more detailed investigation, their length ranges between 180 and 300 nm, while the width is more homogeneous, with a value of 27 ± 3 nm (Figure 5d). Moreover, in all images inspected, the edges of these structures show filaments shedding off the border, reminiscent of a ball of wool (Figure 5e, white arrows). We ascribe these to incoming loose filaments in the process of formation of these structures. Control images with either N or RNA alone show no evidence of any type of structure (Figure S3), suggesting that the filaments originate from N-RNA interactions. Thus, cryo-EM images of N-RNA15 revealed structures built from filaments, whose overall size is in excellent agreement with the mono-disperse species detected by DLS.
3 DISCUSSION
In this work, we have dissected the kinetic condensation mechanism of N-RNA using mass-progress curves, size-progress curves, a crystallization-like assembly model, a kinetically limited assembly model, DLS, and cryoelectron microscopy. In our working conditions, N is a dimer and binds a cognate sequence and a random sequence with similar affinities close to 165 nM. This suggests that the degree of RNA sequence selectivity of the N dimer is low, although experiments with additional RNA sequences may refine this result. The difference between the condensation and binding stoichiometries reflects the presence of secondary lower-affinity RNA binding sites that operate as the protein concentration increases. Thus, primary RNA binding and condensation reactions are well separated by time and by concentration.
We made use of two different types of RNA to investigate the mechanism. The 15-mer RNA oligonucleotide yielded a fine stoichiometric dependence for condensation to occur, showing a threshold at 2RNA15:1N ratio, with no reentrant behavior. The heterogeneous and structured ~70 base tRNA also showed a stoichiometrically controlled condensation but displayed a strong reentrant behavior that supports the idea that the modulated assembly–dissolution of tRNA condensates represents a better model for the in-cell scenario. The lack of sequence discrimination capacity under condensation concentrations suggests that the protein would not discriminate genomic from transcript RNA.
By combining DLS and cryoelectron microscopy, we were able to characterize the nature of the condensation intermediates. Both RNA model reaction pathways populate a major intermediate species with remarkable mono-dispersity within 30 min of reaction, particularly for the chemically homogeneous RNA15. In addition, cryo-EM images revealed roughly round particles of sizes compatible with those from DLS (300–600 nm diameter), composed of a surprising network of layered filaments. These filaments vary in length but show homogeneous widths (~27 nm); further analysis will be needed to reveal their structural details. We propose that the frayed edges observed (Figure 5e) correspond to incoming filaments on the way to the early particle growth, in agreement with our experimentally based kinetic model supported by DLS. We know that the subsequent event down the pathway is the coalescence into droplet coacervates, which requires the fibrillar intermediates to be assembling rather than disassembling. The layered filamentous organization of these particles, as opposed to a homogeneous density phase (Tollervey et al. 2023), suggests a non-liquid nature, which leads us to consider these structures as nanoclusters. Altogether, N-RNA15 condensation proceeds through a filamentous nanocluster intermediate, where the multiple oligomeric species in equilibrium that we and others (Ribeiro-Filho et al. 2022; Zhao et al. 2021) observed for N across the concentration range, stabilized by RNA, are the building blocks for the filaments. Further characterization of the species detected in this work, including the soluble dimers/tetramers/octamers (Figure 1d), the protein–RNA complexes (Figure 1e), and the condensation nuclei, filaments, and nanoclusters (Figure 6), will be the subject of future work and should help elucidate the molecular determinants of N condensation.

We combined our results in a full mechanistic model described in Figure 6. Our aim was to build the simplest model that can account for all the current evidence (a minimal kinetic model). N-RNA15 condensation requires fast primary binding of RNA, which takes place in the sub-second time window, typical of an electrostatically driven collision, largely preceding condensation. At concentrations under saturation, we expect the main binding event to be that of the RBD. As concentration increases over saturation, we expect additional RNA binding events to occur at low-affinity sites present in other regions of the protein (Estelle et al. 2023). The saturation of these sites triggers the formation of a nucleus formed by two protomers. This resembles what was previously reported (Zhao et al. 2021), where in excess of RNA a tetramer is formed with a 2:1 binding stoichiometry, which correlates with the 2:1 condensation ratio we observed (Figure 2c). This nucleation process is fast relative to the processes of secondary nucleation and growth. The relationship between the nucleus and the native tetramer observed at higher concentrations remains to be elucidated. Further kinetic complexity, such as off-pathway intermediates, cannot be excluded but is not necessary at this point to describe the data. Also, our evidence for secondary nucleation suggests the presence of parallel pathways where a new filament grows on an existing filament rather than from a newly formed nucleus. The transition from the nucleus to the larger condensates observed in light microscopy goes through a multi-step elongation reaction, involving the formation of a filamentous intermediate which further associates into a nanocluster. The further growth of these nanoclusters likely involves parallel processes such as coalescence and/or off-pathway condensation.
An increase in ionic strength destabilizes the condensate but does not affect the condensation kinetics, suggesting that the overall favorable interactions between the negatively charged RNA and the positively charged N consolidate only in the final condensate. Moreover, the tendency of the protein to form homotypic condensates in the presence of crowder, and the insensitivity of these condensates to ionic strength, reveal that hydrophobic protein–protein self-interaction likely plays a role in heterotypic condensation with RNA. Taken together, and considering also the inability of the protein to condense with RNA10, these observations indicate that N-RNA condensation occurs through a combination of homotypic and heterotypic interactions, as previously reported (Nguyen et al. 2024).
Interestingly, the difference in the concentration dependence of the condensation kinetics between the two RNAs studied reveals that the molecular details of the process, including nucleus size, rate constants, and the presence of off-pathway intermediates, depend on the identity of the condensing RNA molecule. Thus, on one hand, the occurrence of condensation shows little sequence specificity and seems to be a robust feature of the virus life cycle. On the other hand, the exact way in which it takes place may be perturbed by changes in the RNA arising from natural mutations, cellular state, or other external factors.
The monodispersity of the nanoclusters observed by DLS supports a nucleation mechanism where particles grow from individual nuclei to reach a size plateau. This also suggests that these structures are self-limited in size. These structures resemble the internal organization in virion nucleocapsids (Wu et al. 2022b), and the self-limitation further supports that the reaction may represent the condensation of a packed genome on the path to virion formation. These nanoclusters are significantly larger than the virion, 300–700 nm in diameter compared to 50 nm. However, we highlight the capacity for self-limitation and the presence of an internal capsid-like structure. We speculate that large liquid-like condensates represent the viral replication factories as large reaction vessels in a different liquid phase, where most viral RNA synthesis takes place (Lopez et al. 2021), while the self-limited layered nanoclusters may represent the type of structural organization in a virion proto-nucleocapsid. These events could be sequential, that is, the nanoclusters can either evolve to large liquid condensates or be an endpoint in nucleocapsid formation. The fate of each pathway may be dictated by viral and host factors, or by whether the constituent RNA is transcript or genomic. Since there is little if any RNA sequence specificity in condensation, there must be other distinctive features in both RNAs, including expression timing and modification of their relative levels.
In this work, we uncovered an elementary mechanistic pathway in a protein–RNA condensation reaction, with implications for the myriad biomolecular condensates linked to cell physiology and pathology, particularly regarding early events and the definition of sequential steps. In addition, our findings have a direct impact on the understanding of SARS-CoV-2 N protein and of N-RNA condensation in RNA viruses in general. Understanding biomolecular condensation at the molecular and physicochemical level requires mapping experimental kinetic pathways with starting, intermediate, and final species, their energy barriers, and the thermodynamics involved. The ability to dissect and reproduce a robust protein–RNA condensation reaction contributes to the ability to design and tune condensates, with further relevance for nanotechnology and synthetic biology.
4 MATERIALS AND METHODS
4.1 Protein expression and purification
In this work we used the N nucleoprotein from severe acute respiratory syndrome coronavirus 2 (2019-nCoV) (SARS-CoV-2), Uniprot code P0DTC9. The sequence encoding the SARS-CoV-2 N protein was obtained from Genscript in a pET-28 plasmid. N protein vectors were transformed into Escherichia coli C41 for expression. Freshly transformed cells were grown at 37°C in LB-Kanamycin until OD 0.8. Protein expression was induced with 1 mM IPTG for 12 h. Harvested cells were resuspended in lysis buffer (50 mM Tris–HCl pH 8, 1 M NaCl, 1 mM PMSF) and sonicated. Urea (6 M) was added to the soluble fraction, which was left overnight at 25°C and loaded onto a Ni2+ affinity column in 50 mM sodium phosphate pH 8, 0.4 M NaCl, 6 M urea. Protein was then eluted with 50 mM sodium phosphate pH 8, 0.4 M NaCl, 6 M urea, 400 mM imidazole, and dialysed against 50 mM sodium phosphate pH 8, 0.4 M NaCl. For His-Tag cleavage and removal, the protein was incubated overnight with a 1:500 TEV:N ratio at 25°C. Then 6 M urea was added to the solution, which was left overnight at 25°C and loaded onto a Ni2+ affinity column in 50 mM sodium phosphate pH 8, 0.4 M NaCl, 6 M urea. The cleaved protein was collected in the flow-through and then dialysed against 25 mM sodium phosphate pH 8, 0.2 M NaCl. Protein was concentrated using Millipore centrifugal Amicon filters and stored at −80°C until use.
4.2 Size exclusion chromatography and SEC-MALS
Phosphate buffer (25 mM pH 8, 0.4 M NaCl) was used for both SEC-FPLC and SEC-MALS experiments. For SEC-FPLC, a Superdex-200 column was used on Shimadzu SPD-10A equipment. For SEC-MALS, Wyatt Mini DAWN + Optilab equipment coupled to a Jasco UV-4075 HPLC was used, with a Wyatt column (5–1,250 kDa separation range, 500 Å pore size).
4.3 Equilibrium binding assays
For determination of the apparent dissociation constant, the concentration of N protein was fixed at 500 nM and the tryptophan fluorescence signal was recorded as increasing concentrations of RNA were added. The sample was excited at 290 nm and emission was detected at 330 nm. The excitation slit was set to 5 nm and the emission slit to 8 nm. The blank was subtracted and the data were fitted to a reversible one-site model using Pro-Fit software (QuantumSoft).
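As an illustration of what a reversible one-site fit involves, the sketch below uses the quadratic (ligand-depletion) form of the one-site binding equation and a coarse grid search over candidate dissociation constants. This is not the Pro-Fit procedure used in this work: the exact functional form, the function names, the synthetic data, and the grid values are all assumptions made for the example.

```python
import math

def fraction_bound(rna_total, protein_total, kd):
    """Fraction of protein sites occupied for reversible one-site binding.

    Uses the quadratic solution that accounts for ligand depletion.
    All concentrations must share the same units (e.g. nM).
    """
    b = protein_total + rna_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * protein_total * rna_total)) / 2.0
    return complex_conc / protein_total

def fluorescence(rna_total, protein_total, kd, f_free, f_bound):
    """Blank-subtracted signal as a linear mix of free and bound protein signals."""
    fb = fraction_bound(rna_total, protein_total, kd)
    return f_free + (f_bound - f_free) * fb

def fit_kd(rna_totals, signals, protein_total, f_free, f_bound, kd_grid):
    """Return the kd in kd_grid with the smallest sum of squared residuals."""
    def sse(kd):
        return sum(
            (fluorescence(r, protein_total, kd, f_free, f_bound) - s) ** 2
            for r, s in zip(rna_totals, signals)
        )
    return min(kd_grid, key=sse)

# Synthetic, noise-free titration (hypothetical numbers, protein fixed at 500 nM).
PROTEIN_NM = 500.0
rna_nm = [0.0, 100.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0]
true_kd = 200.0
data = [fluorescence(r, PROTEIN_NM, true_kd, 1.0, 0.4) for r in rna_nm]
best_kd = fit_kd(rna_nm, data, PROTEIN_NM, 1.0, 0.4, [50.0, 100.0, 200.0, 400.0, 800.0])
```

In practice the end-point signals would be fitted together with the dissociation constant by a nonlinear least-squares routine; the grid search is used here only to keep the example free of external dependencies.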
4.4 Light microscopy
For microscopy imaging, 96-well non-binding bottom Greiner plates were used. The buffer was HEPES 20 mM pH 7.5, NaCl 100 mM (unless otherwise indicated). Reactions were triggered by the addition of RNA and incubated for 30 min at room temperature before imaging. The images were acquired using an Axio Observer 3 inverted microscope with a 40×/0,750.3M27 objective and, if needed, a Colibri 5 LED illumination system. Images were processed using Fiji (a distribution package of ImageJ software, USA).
4.5 Phase separation kinetic assays
Scattering measurements were performed using a Jasco V-750 spectrophotometer set to 370 nm. Reactions were done in HEPES 20 mM pH 7.5, NaCl 100 mM, at 25°C. Data were further analyzed. For mechanistic analysis, turbidity kinetic traces were fitted using a two-exponential function plus a drift, using the following equation: $y = A_1 \exp(k_{\mathrm{obs}1}\, t) - A_2 \exp(k_{\mathrm{obs}2}\, t) + m\, t$ (2), where $A_i$ is the amplitude of phase $i$ and $k_{\mathrm{obs}i}$ is its empirical rate constant.
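The two-exponential function plus drift can be written down directly, which may help when reproducing the fits. The sketch below evaluates the model exactly as printed above, so the sign of each exponent is carried by the fitted rate constant, and it provides a sum-of-squares objective that any least-squares optimizer could minimize. Function names and the numerical parameter values are hypothetical.

```python
import math

def two_exp_drift(t, a1, kobs1, a2, kobs2, m):
    """Evaluate y = A1*exp(kobs1*t) - A2*exp(kobs2*t) + m*t as printed in the text."""
    return a1 * math.exp(kobs1 * t) - a2 * math.exp(kobs2 * t) + m * t

def sum_squared_residuals(times, signal, params):
    """Objective to minimize when fitting a turbidity trace; params = (A1, kobs1, A2, kobs2, m)."""
    return sum((two_exp_drift(t, *params) - y) ** 2 for t, y in zip(times, signal))

# Hypothetical parameters, used only to generate a synthetic trace.
example_params = (1.0, -0.1, 0.4, -0.01, 0.001)
times_s = [float(t) for t in range(0, 600, 10)]
synthetic_trace = [two_exp_drift(t, *example_params) for t in times_s]
```

Keeping the model and the objective as separate functions makes it straightforward to swap in the single-exponential variant by fixing one amplitude to zero.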
4.6 Dynamic light scattering measurements
DLS measurements were carried out using a DynaPro NanoStar II DLS device (Wyatt Technology). Phase separation measurements were performed in HEPES 20 mM pH 7.5, NaCl 100 mM. For this, 5 μM N was first measured to obtain the protein-only measurement. Then either 20 μM RNA15 or 1.4 μM tRNA was added, and the kinetics were followed by doing 30 measurements every 12 s, each composed of 5 acquisitions with 20 s acquisition time. The temperature was maintained at 25°C by a Peltier control system. Results were processed employing the software package included with the equipment. The DLS trace was fitted using a single-exponential function plus a drift, using the following equation: $y = A \exp(k_{\mathrm{obs}}\, t) + m\, t$.
The hydrodynamic radius of a molecule is inversely proportional to the medium viscosity as per the Stokes-Einstein equation. In turn, viscosity increases as biopolymer concentration increases (Lefebvre 1982). Thus, we would expect that the concentration increase along the X-axis of Figure 1d would lead to an increase in viscosity and a decrease in the apparent hydrodynamic radius. As a result, the estimated radius in Figure 1d would be an underestimation. Similarly, viscosity increases as biopolymers condense (Lilyestrom et al. 2013). Thus, we would expect that formation of N:RNA condensates during our time course experiments would lead to an increase in viscosity and a decrease in the apparent hydrodynamic radius. As a result, the estimated radius in Figure 5a,b,d,e would be an underestimation. For these reasons, we refer to our measurements as “apparent (hydrodynamic) radius.”
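For reference, the Stokes-Einstein relation invoked above is commonly written as follows. The symbols are not defined in the original text and are introduced here only for clarity: $R_h$ is the hydrodynamic radius, $k_B$ the Boltzmann constant, $T$ the absolute temperature, $\eta$ the viscosity of the medium, and $D$ the translational diffusion coefficient obtained from DLS.

```latex
\[
  R_h = \frac{k_B T}{6 \pi \eta D}
\]
```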
4.7 Cryo-EM
The condensation process was triggered by adding 20 μM RNA15 to 5 μM N. After 45 min, 3 μL was applied to glow-discharged Lacey grids (01895-TedPella) and vitrified using a Vitrobot Mark IV system (Thermo Fisher Scientific) operated at 22°C and 100% humidity. The images were acquired in a JEOL JEM 1400Plus 120 kV cryogenic transmission electron microscope equipped with a OneView camera 4 k × 4 k (Gatan).
4.8 Zlotnick analysis
4.8.1 Determination of the nucleus size
Plotting the log of the scattering signal at time t versus the log of the free protomer concentration at time t, and performing a linear fit, yields the nucleus size from the slope. For RNA15, t = 30 s, and for tRNA, t = 24 s.
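The log-log fit described above amounts to an ordinary least-squares line through the log-transformed data. The following minimal sketch shows one way to obtain the slope; the function name is hypothetical, and the data are synthetic with a built-in exponent of 2 chosen only for illustration, not measured values.

```python
import math

def loglog_slope(x_values, y_values):
    """Least-squares slope and intercept of log10(y) versus log10(x).

    All x and y values must be strictly positive.
    """
    lx = [math.log10(x) for x in x_values]
    ly = [math.log10(y) for y in y_values]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic example: scattering at a fixed time scales as (free protomer)**2.
free_protomer_um = [1.0, 2.0, 4.0, 8.0]
scattering_at_t = [0.002 * c ** 2 for c in free_protomer_um]
nucleus_size, intercept = loglog_slope(free_protomer_um, scattering_at_t)
```

With real data, the scattering value for each trace would be read at the fixed time point stated above (30 s for RNA15, 24 s for tRNA) before taking logarithms.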
4.8.2 Determination of the elongation reaction order
Linear fittings were done for the linear region of the 370 nm light scattering traces. The log of these slopes was plotted against the log of the initial protomer concentration. The slope of this plot yields the elongation reaction order.
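The two-stage procedure above (a linear fit of each trace, then a log-log fit of the resulting slopes) can be sketched as follows. The time window that defines the linear region, the function names, and the synthetic traces with a built-in order of 1.5 are assumptions made for the example.

```python
import math

def linear_slope(x, y):
    """Ordinary least-squares slope of y versus x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def elongation_order(initial_concs, traces, window):
    """Reaction order from log(rate) versus log(initial protomer concentration).

    traces: list of (times, signal) pairs, one per concentration.
    window: (t_start, t_end) bounding the linear region of each trace.
    Rates must be positive for the logarithm to be defined.
    """
    log_conc, log_rate = [], []
    for conc, (times, signal) in zip(initial_concs, traces):
        pts = [(t, s) for t, s in zip(times, signal) if window[0] <= t <= window[1]]
        rate = linear_slope([p[0] for p in pts], [p[1] for p in pts])
        log_conc.append(math.log10(conc))
        log_rate.append(math.log10(rate))
    return linear_slope(log_conc, log_rate)

# Synthetic traces: the linear-phase rate scales as concentration**1.5.
times_s = [float(t) for t in range(0, 101, 10)]
concs_um = [1.0, 2.0, 4.0]
synthetic = [(times_s, [0.001 * c ** 1.5 * t for t in times_s]) for c in concs_um]
order = elongation_order(concs_um, synthetic, window=(20.0, 80.0))
```

Choosing the window by inspection of each 370 nm trace is the step most likely to affect the result, so it is worth checking that the fitted order is stable when the window bounds are varied.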
ACKNOWLEDGMENTS
We thank LNNano/CNPEM for access to the EM facility (Proposals 20233398 and 20231800). RVP and MAAA thank FAPESP (Grants 2020/06062-1 and 2022/05088-2). We thank the Chan Zuckerberg Initiative (CZI) for supporting RPM's visit to LNNano/CNPEM (Grant 2021-240156/5022).
Open Research
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.