Volume 33, Issue 9 e5136

TOOLS FOR PROTEIN SCIENCE

Open Access

ARCIMBOLDO at low resolution: Verification for coiled coils and globular proteins

Iracema Caballero

Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona Science Park, Barcelona, Spain

Contribution: Investigation, Writing - original draft, Software, Methodology, Data curation, Validation

Albert Castellví

Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona Science Park, Barcelona, Spain

Contribution: Investigation, Methodology

Josep Triviño

Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona Science Park, Barcelona, Spain

Contribution: Software, Investigation, Methodology

Elisabet Jiménez

Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona Science Park, Barcelona, Spain

Contribution: Investigation, Methodology, Software

Nicolas Soler

Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona Science Park, Barcelona, Spain

Contribution: Investigation, Methodology

Rafael Junqueira Borges

orcid.org/0000-0001-6049-8806

Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona Science Park, Barcelona, Spain

Contribution: Writing - original draft, Investigation, Methodology, Software

Corresponding Author

Isabel Usón

[email protected]

orcid.org/0000-0003-2504-1696

Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona Science Park, Barcelona, Spain

ICREA: Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain

Correspondence

Isabel Usón, Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona Science Park, Baldiri Reixach 15, Barcelona 08028, Spain.

Email: [email protected]

Contribution: Software, Investigation, Writing - original draft, Funding acquisition, Supervision, Methodology

First published: 16 August 2024

https://doi.org/10.1002/pro.5136

Review Editor: Nir Ben-Tal

Abstract

Crystallography at low resolution must determine the atomic model from fewer experimental observations, which is challenging in the absence of a model. In addition, model bias is more severe when independent experimental data are scarce. Our methods solve the phase problem by combining the location of accurate model fragments using Phaser with density modification and interpretation of the resulting maps using SHELXE. From a partial, correct structure, the density modification process and the stereochemical constraints draw the rest of the structure, validating the result. This same principle is now exploited at low resolution. Coiled coils are important, ubiquitous structures but notoriously difficult to phase and to predict. Correct and incorrect solutions are poorly discriminated by the crystallographic figures of merit as long as helices are correctly oriented. We incorporate coiled-coil verification, designed to set up competing, incompatible structural hypotheses, both to probe the results and to establish the power of the data to discriminate them. Efficiency of coiled-coil phasing and validation in test cases from 3 to 4 Å is demonstrated in ARCIMBOLDO_LITE, placing single helices, and in ARCIMBOLDO_SHREDDER, with fragments derived from AlphaFold models. SHELXE tracing at low resolution has been enhanced, maintaining its local character but extending the environment assessment. For non-helical structures, verification is demonstrated in the fragment location process. Its use is exemplified with the solution of the VSR1 structure at 3.5 Å, depending on LLG optimization and the emergence of new features in the electron density. Relying on verification, we have extended the use of the ARCIMBOLDO software to low resolution.

1 INTRODUCTION

Crystallography provides an accurate experimental atomic structure, but on account of the phase problem, errors in the model may bias the determination (Ramachandran & Srinivasan, 1961; Terwilliger, 2004). Only the intensities of the diffracted beams are recorded in the diffraction experiment. The phases needed to compute the electron density map from which the model is built are usually provided by a starting model, used in molecular replacement (MR) phasing (Read, 2001; Rossmann, 1972). The use of predicted models (Baek et al., 2021; Jumper et al., 2021) has been extensively incorporated in crystallographic determination methods (Medina et al., 2022; Simpkin et al., 2023). Low resolution reduces the number of independent experimental data, and the crystallographic determination relies heavily on stereochemical prior knowledge (Urzhumtsev & Lunin, 2019). Experimental data become outnumbered by the parameters expressing the atomic positions and their average displacement typically at around 3 Å, taking solvent into account (Wlodawer et al., 2008). This raises the question at low resolution of whether the final model is truly experimental, adding further information, or whether the initial, virtually complete model simply could not be disproved by the data.

The absence of a model precludes model bias, and thus ab initio phasing from the recorded intensities alone was long a quest, achieved by enforcement of atomicity as a constraint (Usón & Sheldrick, 1999). The use in ARCIMBOLDO of ubiquitous fragments, such as main chain alpha helices or libraries of small beta sheets, to substitute the atomicity constraint in direct methods extended ab initio phasing to medium resolution around 2.5 Å (Millán et al., 2015). A partial, accurate structure, correctly located in the asymmetric unit with Phaser (Read & McCoy, 2016), can be extended into a full structure from the density-modified map (Usón & Sheldrick, 2018), further enhanced through model building (Usón & Sheldrick, 2024). Structure completion and expansion, marked by a high correlation coefficient (Fujinaga & Read, 1987) of the SHELXE trace, is used as an indication of a successful solution. This principle forms the core of the ARCIMBOLDO methods (Rodríguez et al., 2009), which have been embraced and diversified by software tools like AMPLE (Bibby et al., 2012) and Fragon (Jenkins, 2018). A minimal starting hypothesis reveals previously unknown structural features and thus validates the solution. Conversely, a wrong solution will not evolve correct inferences.

The same principle applies at low resolution, between 3 and 6 Å, but unsurprisingly, partial models need to provide more scattering, and atomic model extension into the new density becomes uncertain (Borges et al., 2020). Proof of principle where SHELXE density modification was instrumental in drawing a solution from a partial model at low resolution was attained by phasing FtsH (Vostrukhina et al., 2015) or FasR (Lara et al., 2020).

The trade-off between starting information and available resolution is well understood for the assembly of a starting model (McCoy et al., 2017) and can be referred to as eLLG (expected Log-Likelihood Gain versus an uninformative model that a particular model could reach for particular experimental data) (Oeffner et al., 2018). For the expansion of the partial structure, atomic interpretation of a coarser map by sequentially building amino acids becomes unreliable, and model bias becomes a prime concern.

Agreement with the experimental data and good stereochemistry (Chen et al., 2010) are the main validation criteria to judge the correctness of the model established in a crystallographic determination. High resolution is desirable but is inherently limited by the crystals; at low resolution, the lack of experimental data has always compromised the determination of the atomic model (Jorda et al., 2016; Urzhumtsev et al., 2000). The recent advent of predicted atomic models has eased structure solution but a practically complete model, globally judged by limited experimental data, and a priori conforming to proper stereochemistry may mask errors.

Crystallography builds a model upon consistency with experimental data and prior knowledge. Frequently, a popular figure of merit is relied on, such as correlation coefficient (CC) between model and data above 30%. Furthermore, emergence of new correct features in the electron density supports a partial model. When figures of merit become unreliable or the hypothesis leaves little to infer, a more active method to challenge the determination is needed.

We have expanded the resolution scope for reliable structure solution, implementing dedicated verification strategies in ARCIMBOLDO. In particular, we have extended the resolution limit to 4 Å for phasing coiled coils in ARCIMBOLDO_LITE with ideal helices or using AlphaFold multimeric predictions as models in ARCIMBOLDO_SHREDDER. In globular proteins, LLG-based fragment verification is demonstrated in the solution process from density modified maps.

2 RESULTS AND DISCUSSION

2.1 Verification in ARCIMBOLDO

Verification can be defined as a systematic attempt to disprove the model produced. Ways will vary depending on the scenario and on the more common errors to be expected, but the core idea is that while there are many ways in which a structure can be wrong, the correct solution must be unique. Hence, competing solutions are set up and scored in comparison. The process may correct errors, leading perturbed starts to develop into correct solutions. Equivalently, correct solutions should score higher than incorrect ones. When incompatible solutions reach comparable scores, only one might be right, so possibly none is. In any case, disambiguation will need to come from a different use of the data or from prior knowledge. If data are not able to add information, it is questionable whether we have an experimental determination or an initial model our data could not disprove (Read et al., 2020).

The general mode of ARCIMBOLDO (Rodríguez et al., 2012) imposed a resolution limit of 2.5 Å, intended to prevent false solutions that might go unidentified. To overcome this concern, we introduce verification in our programs, which is here extended to low resolution.

In the case of predicted models at medium resolution, we introduced a procedure in ARCIMBOLDO_SHREDDER to systematically eliminate the starting model in favor of its inferences (Medina et al., 2022). Surely, verification is most needed in cases where data are limited by resolution or by errors. Phasing peptides with microED data where resolution, accuracy, and completeness are compromised, we set up ARCIMBOLDO_BORGES (Sammito et al., 2013) to use heterogeneous fragment libraries rather than geometric variations on a common fold (Richards et al., 2023). Consistency in model selection by phasing success is used as verification.

Coiled coils are ubiquitous structures involved in many cellular processes. Their determination is typically limited by data quality and by the lack of models, as they remain difficult to predict. Given their high helical content, fragment-based phasing would appear suitable (Thomas et al., 2020). Besides ARCIMBOLDO, alternative implementations can be found in the CCsolve (Rämisch et al., 2015) and AMPLE (Sánchez Rodríguez et al., 2020; Thomas et al., 2015) pipelines, the latter reporting cases up to 3.3 Å resolution (Thomas et al., 2020). Even though phasing may succeed at lower resolution, a severe obstacle is that wrong solutions, featuring mistranslated or reversed helices, typically render high figures of merit. Hence, the need for a verification procedure was recognized and tentatively introduced for the structure solution of coiled coils up to 3 Å (Caballero et al., 2018). Our verification procedure generates perturbations on the substructure leading to the best solution and compares their scores after submitting them to the same density modification and autotracing procedure.

Even at resolutions close to 2 Å, we have observed that single ideal helices were occasionally placed in the correct position but in a reversed direction, as helical periodicity accounts for the main low resolution diffraction features set in either direction. ARCIMBOLDO_LITE addresses this issue by phasing with substructures with reversed helices. Overcoming the lock, in effect in model building, where the biased map would be traced with wrongly reversed helices again and again, was the primary reason to generate the perturbed substructures. Setting up all possible directions and letting them compete showed improved convergence. Then, these same helices are used for generating realistic perturbations for verification. However, when using multimeric predictions, generating a model traced backwards would be contrived, and we have not seen the case where tracing was reverting parts of the model either. Thus, random translations, which constitute realistic perturbations occurring in practice in wrong solutions, are used as a method for verification in ARCIMBOLDO_SHREDDER. They also occur in single helix searches and are used as well in ARCIMBOLDO_LITE, as a baseline for a wrong solution.

In ARCIMBOLDO_LITE (Sammito et al., 2015), perturbations are generated in two ways: random translation and reversing the direction of helices (Figure 1), whereas in ARCIMBOLDO_SHREDDER (Millán et al., 2018; Sammito et al., 2014), only a randomly translated solution is used. The whole solution is shifted, except for the space group P1, where the origin choice is unconstrained and half of the helices are translated with respect to the others. The resulting phase differences are assessed to validate randomness. From the Phaser substructure that led to the best CC after SHELXE, a sparse but systematic reversal of helices is performed. Subsequently, rigid-body refinement and rescoring in Phaser and CC assessment in SHELXE are used to select a subset along with the randomly translated solution and the best solution. Results are compared after the expansion procedure. If the discrimination between the best solution and the random solution persists or the final solutions are equivalent, confidence in this solution will be justified. Thus, the best solution is validated if it can be clearly discriminated from the random solution or if different perturbations develop into a group of equivalent solutions. Conversely, it is not validated if it cannot be discriminated from the random solution, and it is inconclusive when structurally different extensions are characterized by comparable figures of merit.

**FIGURE 1**

Perturbations induced in the verification step. (a) Substructure leading to the best solution. (b) Random solution generated applying on the best solution (transparent) the fractional translation vector (0.1, 0.1, 0.1). (c) Sparse group of substructures with reversed helices, selected from the 16 possibilities with 4 fragments. Helices are shown as sticks; arrows indicate helix direction.
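The two kinds of perturbation in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ARCIMBOLDO implementation: the function names are hypothetical, helices are reduced to a forward/reversed flag, and the rule used to pick a "sparse" subset (at most one reversed helix, plus the all-reversed case) is an assumption, since the text does not specify how the subset is chosen.

```python
from itertools import product

def helix_direction_combinations(n_helices):
    """Enumerate every forward (False) / reversed (True) assignment for n helices."""
    return list(product((False, True), repeat=n_helices))

def sparse_subset(combos, max_reversed=1):
    """Hypothetical selection rule: keep assignments with few reversals,
    plus the one where every helix is reversed."""
    return [c for c in combos if sum(c) <= max_reversed or all(c)]

def translate_fractional(coords, shift=(0.1, 0.1, 0.1)):
    """Shift fractional coordinates by a fixed vector, wrapping into [0, 1)."""
    return [tuple((x + s) % 1.0 for x, s in zip(xyz, shift)) for xyz in coords]

combos = helix_direction_combinations(4)
subset = sparse_subset(combos)
shifted = translate_fractional([(0.25, 0.50, 0.95)])

print(len(combos))   # all direction assignments for 4 fragments
print(len(subset))   # size of the illustrative sparse subset
print(shifted)
```

With four fragments the enumeration yields the 16 possibilities mentioned in the caption; the translation vector (0.1, 0.1, 0.1) is the one quoted for panel (b).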

Figure 2 illustrates how to interpret the verification plot output by ARCIMBOLDO. SHELXE may successfully rebuild all the helices originally reversed (Figure 2a), all the helices or none (Figure 2b), all the helices, some helices or none (Figure 2c). In all these cases, the best solution is clearly discriminated from the random solution by a CC difference exceeding 15%. In the unsolved structures, all the traces are random, and the figures of merit are similar to the wrong solution (Figure 2d). In these cases, the best solution cannot be discriminated from the random solution, and the difference between their CC falls under 9%.

There is an interval where the difference between the CC of the best and the wrong solution does not discriminate whether a structure is solved or not. The group of substructures with reversed helices makes it possible to assess whether the top CC corresponds to a structurally unique solution or whether different structures render comparable scores. If unique, the sensitivity of the CC to discriminate can be trusted, concluding that the structure is solved. Otherwise, the structure presumably remains unsolved. Error remediation shows the power of the data to discriminate.

The difference between the minimum and maximum MPD (Mean Phase Difference calculated with respect to the best solution) values of the reverse group is smaller in solved structures and larger in unsolved structures. Furthermore, for solved structures, the difference between the MPD of the random solution and the maximum MPD value of the reverse group is larger than in the case of unsolved structures. This is illustrated by the cases of PDB entries 3vir (solved) and 4pna (not solved) in Figure 2e,f, respectively.
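As a rough illustration of the two MPD quantities described above, the following Python sketch computes a mean phase difference and the two derived values: the spread within the reverse group, and the gap between the random solution and the reverse group. The function names are hypothetical and the numbers are made up for the example. The sketch also assumes that both phase sets are already on a common origin and cover the same reflections, which a real comparison must ensure.

```python
def phase_difference(a, b):
    """Smallest absolute difference between two phases, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return 360.0 - d if d > 180.0 else d

def mean_phase_difference(phases, reference, weights=None):
    """Optionally weighted mean of per-reflection phase differences."""
    if len(phases) != len(reference):
        raise ValueError("phase sets must cover the same reflections")
    if weights is None:
        weights = [1.0] * len(phases)
    total = sum(w * phase_difference(p, r)
                for p, r, w in zip(phases, reference, weights))
    return total / sum(weights)

def mpd_summary(reverse_group_mpd, random_mpd):
    """Return (spread within the reverse group, gap between random and reverse group)."""
    spread = max(reverse_group_mpd) - min(reverse_group_mpd)
    gap = random_mpd - max(reverse_group_mpd)
    return spread, gap

# Illustrative values only, not taken from the test cases.
example_mpd = mean_phase_difference([0.0, 90.0], [10.0, 270.0])
spread, gap = mpd_summary([10.0, 20.0, 25.0], random_mpd=85.0)
print(example_mpd, spread, gap)
```

Following the text, a small spread together with a large gap would be the pattern expected for a solved structure.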

Figure 3 shows a decision workflow to classify solved, unsolved, or inconclusive results. The verification step will only be performed if the best scoring solution from ARCIMBOLDO_LITE or ARCIMBOLDO_SHREDDER has reached a CC above 25%. From our tests, a structure is considered solved when the CC difference between the best (CCbest) and a random solution (CCw) exceeds 15% and not solved if this difference falls below 9%. In ARCIMBOLDO_SHREDDER for intermediate values, no conclusive discrimination is reached. In ARCIMBOLDO_LITE, evolution of the substructures with reversed helices is assessed in terms of CC discrimination and MPD consistency.
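The thresholds in this workflow can be summarized in a short Python sketch. It is a simplified reading of the text rather than the program's actual logic: the function name and return strings are hypothetical, and the ARCIMBOLDO_LITE assessment of the reversed-helix group (CC discrimination and MPD consistency) is collapsed into a single boolean supplied by the caller.

```python
CC_MIN = 25.0         # verification runs only above this CC (%)
SOLVED_DIFF = 15.0    # CCbest - CCw above this: solved
UNSOLVED_DIFF = 9.0   # CCbest - CCw below this: not solved

def classify(cc_best, cc_random, program="SHREDDER", reversed_group_unique=None):
    """Classify a run from the CC of the best and the random solution (both in %)."""
    if cc_best <= CC_MIN:
        return "verification not run"
    diff = cc_best - cc_random
    if diff > SOLVED_DIFF:
        return "solved"
    if diff < UNSOLVED_DIFF:
        return "not solved"
    # Intermediate interval: only LITE has the reversed-helix group to consult.
    if program == "LITE" and reversed_group_unique:
        return "solved"
    return "inconclusive"

print(classify(40.0, 20.0))
print(classify(30.0, 25.0))
print(classify(35.0, 25.0, program="SHREDDER"))
print(classify(35.0, 25.0, program="LITE", reversed_group_unique=True))
```

In the intermediate interval the sketch returns "inconclusive" whenever the reversed-helix group is not reported as structurally unique; this is a conservative choice made for the example.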

2.2 Phasing coiled coils at low resolution

Coiled-coil dedicated modes for low resolution phasing, up to 4 Å, have been implemented both in ARCIMBOLDO_LITE for use of individual alpha helices as search models and in ARCIMBOLDO_SHREDDER employing fragments from predicted models. In both programs, verification fulfills its aim. It identifies a solution (Figure 4a,d), flags possible but inconclusive cases to avoid false positives (Figure 4b,e), and discards wrong solutions (Figure 4c,f).

2.2.1 Results for ARCIMBOLDO_LITE: Advantages of searching single helices

Phasing coiled-coil structures is complicated by data modulation and anisotropy, generating problems to differentiate genuine intermolecular tNCS from Patterson artifacts (Caballero et al., 2021), overlapping solutions, helical placement in the correct position but in reversed direction, poor side chain discrimination, and wrong solutions with high figures of merit.

Coiled-coil particularities are addressed in a specific mode (Caballero et al., 2018), which involves deactivating the placement of tNCS-related helices, implementing a new packing filter in Phaser to overcome overlapping solutions, addressing generation and testing of reversed helices, and improving the autotracing in SHELXE. Finally, to verify solutions, perturbations are made and compared for discrimination.

In addition, the lack of suitable MR models was traditionally a problem, given that in an extended structure small changes in torsion angles give rise to large deviations in the overall geometry. Coiled coils may also be versatile in their association (parallel or antiparallel; number of helices involved), the same sequence giving rise to different architectures, both in nature and in different crystal forms (Leonardo et al., 2021). Their structure prediction may thus fail or simply not correspond to the species in the crystal.

Figure 5a summarizes the performance of the coiled_coil mode implemented in ARCIMBOLDO_LITE on a set of 30 test structures in the resolution range spanning 3–4 Å. Anisotropic diffraction-limit truncation and scaling, possibly with STARANISO (Tickle et al., 2018), had already been applied to 18 of these datasets. In 9 cases, the diffraction limit of the deposited data was isotropic and remained unchanged by STARANISO. The other three cases were corrected by STARANISO, and both corrected and uncorrected datasets were used.

Previously, the verification step allowed us to extend the resolution limit from 2.5 to 3 Å. Here, we assess the validity of verification in the range of 3–4 Å. A structure was considered solved when achieving: a weighted mean phase error (wMPE) versus the reference deposited with the PDB below 65°, a CC of the partial structure against the experimental data of 25% or higher, and confirmation through positive verification. According to this, of the 30 structures, 19 (63.3%) were solved using the default parameterization of the coiled_coil mode. Importantly, we did not encounter any case where verification falsely indicated a wrong structure to be solved. The only parameter that varied across runs was the number (2–4) and the length of the helices (18–40 residues), adapted to the expected ASU contents. In general, fragment selection would rely on the eLLG, but for example, 3mqb, a 423 residue structure at 3.2 Å resolution, yields an encouraging CC of 40% for a random phase set characterized by MPE 88.3° after placement of helical fragments of 60 residues, rendering very high LLGs of 154, 279, 384, and 464, respectively. A first search configured to find four helices of 18 residues demonstrated in our previous study was used for sampling; indeed, 10 of 16 solved proteins were thus solved. The other six were solved with longer helices of 30–40 residues. Regarding anisotropy correction, in two of the three cases where comparison was possible, no practical difference was observed in our runs; for one of them (PDBid: 4zry), results improved for the anisotropically truncated data, and a verified solution was reached in ARCIMBOLDO_LITE. A table details the characteristics and results for each of the structures probed (Table S1).

At lower resolution, increasing the signal from side chains by using polyserine (Schwarzenbacher et al., 2004) instead of polyalanine helices might be expected to improve the discrimination of helix directionality and phasing success. We construct an ideal polyserine helix using the most frequent rotamers of the amino acids represented. Results with these search models showed no improvement versus polyalanine helices. Indeed, in 4 of 16 cases, the structure solution was not accomplished with the polyserine helices (Table S1, Data S1).

Benchmarking and computing time reduction limiting the number of clusters are given in Data S1.

2.2.2 Results for ARCIMBOLDO_SHREDDER using fragments from multimeric predictions

ARCIMBOLDO_SHREDDER, originally developed for phasing using fragments from remote homologs, has been adapted to use predicted models (Medina et al., 2022). The predicted_model mode automatically preprocesses the AlphaFold or RoseTTAFold model, extracts overlapping compact fragments of a size determined from the expected LLG (Oeffner et al., 2018), and gives them internal degrees of freedom to improve the model (McCoy et al., 2018). Finally, if the structure is a multimer and expansion of a first placement does not suffice to provide a solution, the sequential search for several copies (multicopy) will be activated. This mode should be activated along with the new SHREDDER coiled_coil mode to activate its particular verification and other dedicated features.

AlphaFold multimer (Liu et al., 2023) was used to predict the coiled coils, subsequently employed as search models, selecting those with the highest average per-residue model confidence score (pLDDT). Figure 5b summarizes the performance of ARCIMBOLDO_SHREDDER on the same set of test structures. Among these structures, the verification step determined successful solution in 16 cases (53.3%), all exhibiting wMPE versus their reference PDB below 65°, indicative of correct solutions. In seven cases, verification determined that the structures were not solved; in all these cases, the wMPE was above 65°. Furthermore, of the seven cases where the verification step was inconclusive, six have wMPE below 65° and one above 65°. This cautious approach is adopted to minimize the risk of false positives. Finally, it is worth mentioning that two structures (PDBid: 1t8b and 5f4y) could only be solved through the multicopy search. Dataset characteristics and results are compiled in Table S1.

The default pre-processing in ARCIMBOLDO_SHREDDER trims the side chains to alanine residues and sets a common B-value of 25 Å² for all atoms, resulting in a library of fragment models with equivalent scattering. In deep-learning protein predictions, the side chains are typically accurate as long as the backbone prediction is accurate (Jumper et al., 2021), so for predicted models the side chains are preserved by default instead of being trimmed to polyalanine. Furthermore, the B factors are set to a common value of 25 Å² for the main chain and 50 Å² for the side chains. Finally, H atoms are removed.
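The side-chain-preserving pre-processing described above can be sketched on PDB-format ATOM records as follows. This is a minimal illustration and not the ARCIMBOLDO_SHREDDER code: the function names are hypothetical, main-chain atoms are assumed to be N, CA, C and O, and hydrogens are recognized from the element columns, with a simple fallback on the atom name when those columns are blank.

```python
MAIN_CHAIN = {"N", "CA", "C", "O"}  # assumed definition of main-chain atoms

def atom_line(serial, name, resname, chain, resseq, x, y, z, b, element):
    """Build a fixed-column PDB ATOM record (helper for the example only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{b:6.2f}          {element:>2s}")

def is_hydrogen(line):
    """Use the element columns (77-78); fall back on the atom name if blank."""
    element = line[76:78].strip().upper()
    if element:
        return element in ("H", "D")
    return line[12:16].strip().lstrip("0123456789").startswith("H")

def preprocess(pdb_lines, b_main=25.0, b_side=50.0):
    """Drop hydrogens and reset B values: b_main for main chain, b_side otherwise."""
    out = []
    for raw in pdb_lines:
        if not raw.startswith("ATOM"):
            out.append(raw)          # leave other records untouched
            continue
        line = raw.rstrip("\n").ljust(80)
        if is_hydrogen(line):
            continue
        b = b_main if line[12:16].strip() in MAIN_CHAIN else b_side
        out.append(line[:60] + f"{b:6.2f}" + line[66:])  # B factor: columns 61-66
    return out

model = [
    atom_line(1, "N", "SER", "A", 1, 1.0, 2.0, 3.0, 80.0, "N"),
    atom_line(2, "OG", "SER", "A", 1, 2.0, 3.0, 4.0, 90.0, "O"),
    atom_line(3, "HG", "SER", "A", 1, 2.5, 3.5, 4.5, 95.0, "H"),
]
processed = preprocess(model)
for line in processed:
    print(line)
```

Only the B-factor columns are rewritten, so coordinates and residue labels of the predicted model pass through unchanged.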

A comparison of the performance of the predicted models, with and without side chains, is shown through the wMPE of the ARCIMBOLDO_SHREDDER solution against the deposited structure (Table S1). In two cases, side chains were crucial to solve the structure. For PDBid 1ovu, this resulted in a reduction of 25.9° in wMPE (from 90° to 64.1°). Similarly, for PDBid 6ixg, the wMPE improves by 11.5° (from 72.6° to 61.1°). However, in another case, the use of side chains prevented solution: for PDBid 4w80, wMPE increased from 39.4° to 90°. This outcome is justified considering that, notwithstanding an average pLDDT of 86, 63% of the model superposes with the deposited structure, rendering an RMSD of 3.77 Å, in a position where the sequences do not coincide at all.

Fragment location applies the standard packing filter to exclude probes occupying the same space as some symmetry equivalents. For fragments, given the smaller fraction of the asymmetric unit occupied, this filter is not as effective as for complete models. Even in the absence of clashes, some packing arrangements may appear intuitively unlikely, for example, coplanar, orthogonal arrangements of close helices. The simplest and most robust idea, implemented in ARCIMBOLDO_SHREDDER, superposes the original model onto the corresponding residues in the probe and calculates the clashes for this full model. For globular proteins, this has a significant effect with minimal assumptions, as it helps to discard wrong probes at early stages, prioritizing the correct ones for expansion while saving run time. This smart packing option is unsuitable for coiled coils, due to two recurring situations, illustrated in Figure S1. The large RMSD encountered in predicted or homologous models, along with the extended packing contacts, may cause a large number of clashes even for close to correct placements and models. Also, fragments of a helix bundle may fit the structure at several places and better account for the scattering at a location different from the one the fragment would match in the final model. This would be unlikely for nonperiodic structures, but in the case displayed in Figure S2, the two misplaced helices render a phase set characterized by 60° wMPE, comparable to placement on the corresponding residues, which produces 55° wMPE. Hence, smart packing is disabled by default for coiled coils.

Phase combination of consistent datasets with ALIXE (Millán et al., 2020) was also deactivated: when we tested its impact on structure solution, no significant differences were observed other than an increase in computing time.

2.2.3 Comparison between performance of ARCIMBOLDO_LITE and ARCIMBOLDO_SHREDDER

A comparison between the performance of ARCIMBOLDO_LITE and ARCIMBOLDO_SHREDDER is shown in Figure 6. Both programs succeed in 13 cases out of 30. Furthermore, 6 cases were only solved with ARCIMBOLDO_LITE and 3 cases were only solved with ARCIMBOLDO_SHREDDER. For only 8 structures did verification in both programs conclude that they were not solved or that the result was inconclusive.

For example, PDBid 4w80 (Figure 7a) was only solved with ARCIMBOLDO_LITE, along with PDBid 6c4x, whose prediction has an average pLDDT of 46 and an incorrect coiled-coil association (Figure 7b). Since pLDDT in coiled-coil predictions is boosted by the accurate secondary structure, a bad prediction can have a high confidence score. Conversely, three cases were only solved by ARCIMBOLDO_SHREDDER; for PDBid 6gbr, the average pLDDT of the predicted model is 87, similar to the previous case (4w80), but the RMSD compared to 98% of the deposited structure is 2.46 Å. For the other two structures, 1t8b and 5f4y, composed of 416 and 319 residues, respectively, solution was only accomplished by searching two copies in the asymmetric unit. ARCIMBOLDO_LITE with ideal helices is advantageous regarding the more sophisticated verification and model independence, but ARCIMBOLDO_SHREDDER may succeed on larger structures or in the presence of several copies in the asymmetric unit.

2.2.4 SHELXE with clustering of helical seeds

The autotracing algorithm has been enhanced in SHELXE to be effective in map interpretation and extension of coiled-coil structures at low resolution. At the same time, the atomic character of the algorithm has been retained, since accuracy is essential for structure extension. Torsions within tripeptides are refined with a short helical fragment tethered at the end to avoid losing connectivity at weaker map regions. This choice is automatically triggered within the coiled_coil mode and leads to all autotracing cycles apart from the last being seeded from longer helices and extension of the main chain with helical restraints for Ramachandran angles or helical sliding. It was recognized that best scoring seeds tended to map the same reduced region in the structure, so a more diverse sampling and skipping seeding on solvent-assigned voxels was incorporated. Also, larger radius values are used for the sphere of influence, and extrapolation of unmeasured data is used in all cycles (Usón et al., 2007).

Hence, we developed a new helical tracing algorithm in SHELXE, based on a different choice of seeds: seeds placed in voxels assigned as solvent are blocked, and seeds are clustered rather than just ranked, since the higher-scoring ones all corresponded to the same helix and the search needed to be broadened. Finally, the extension algorithm was more constrained, refining torsions of helical amino acids linked to a short helical rigid body of five residues to avoid losing connectivity at weaker map regions.

The improvement over the previous SHELXE version is reflected in practice in two of the ARCIMBOLDO_LITE cases, where successful solutions previously ruled out (one as not solved and the other inconclusive) are now recognized as solved. In ARCIMBOLDO_SHREDDER, four successful solutions previously deemed inconclusive are now recognized as solved, and two successful solutions previously considered not solved have been upgraded to inconclusive (Table S1).

2.3 Phasing globular structures at low resolution

2.3.1 SHELXE low resolution extension in the case of the Zika virus NS5

An ARCIMBOLDO solution was obtained in the case of the orthorhombic form of the Zika virus NS5 structure (Ferrero et al., 2019). Merging 11 partial datasets rendered complete data, where the resolution extended to 4 Å in the best direction but reached only 7.4 Å in the worst direction, due to severe anisotropy, as estimated and corrected by STARANISO. The asymmetric unit contained six copies of the full-length NS5 protein composed of two domains, leaving space for 80% solvent. ARCIMBOLDO_LITE typically uses main chain helices, but it can take any custom model. In this case, an experimental model was available for the methyltransferase domain (5KQR; Coloma et al., 2016), but the homolog from Japanese encephalitis virus had to be used for the RNA-dependent RNA-polymerase domain (JEV RdRP; PDBid 4k6m; Lu & Gong, 2013). Default MR placement of the 12 domains did not work. The coordinates of the partial solution provided starting phases for a map, which, upon density modification with SHELXE (Sheldrick, 2002), allowed manual fitting of the remaining domains (Figure 8).

As the case illustrates, even with good models, in multidomain structures it may be difficult to locate the ones characterized by higher B-values, contributing less to the overall scattering. In practice, this can be achieved with a combination of ARCIMBOLDO_LITE to place the components of a partial solution and likelihood-based docking in Phaser (Millán et al., 2023) using the resulting SHELX maps.

2.3.2 ARCIMBOLDO methodology to solve the full lumenal region of vacuolar sorting receptor 1 at 3.5 Å

For the full lumenal region of Vacuolar Sorting Receptor 1 (VSR1) at 3.5 Å resolution, which could not be solved by other methods (Borges et al., 2024), we developed a dedicated methodology within the ARCIMBOLDO principle. It targets this low resolution scenario through systematic verification: a more constrained map interpretation and comparative scoring of alternative hypotheses.

We use the density-modified map calculated with SHELXE from a partial and reliable solution to assemble structural hypotheses consistent with prior knowledge. Multiple alternative fragments derived from remote homologs and secondary structure prediction are evaluated, considering different secondary structure elements, fragment reversion, and side chain inclusion using the most probable sequences generated with the SEQUENCE SLIDER (Borges et al., 2020; Borges et al., 2022) functions. Fragments were rigid-body refined, allowing internal degrees of freedom using Gimble in Phaser (McCoy et al., 2018). Each fragment was scored based on its LLG_contribution to the overall value, roughly estimated from LLG change upon fragment omission. Relative differences in the LLG_contribution of each fragment guided the choice between different possibilities. Fragments with lower values were optimized by changing their size, orientation, curvature, or secondary structure. Emergence of new features in unmodeled map regions confirmed partial model correctness.
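The omission-based estimate of LLG_contribution can be sketched as follows. This is a schematic illustration, not the authors' scripts: `score_llg` is a hypothetical callable standing in for a Phaser LLG calculation on a given set of fragments, and the fragment names are placeholders.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def llg_contributions(
    fragments: Iterable[str],
    score_llg: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Estimate each fragment's contribution as the LLG lost when it is omitted.

    score_llg is an assumed stand-in for an external LLG calculation.
    """
    all_fragments = frozenset(fragments)
    full_llg = score_llg(all_fragments)
    return {
        name: full_llg - score_llg(all_fragments - {name})
        for name in sorted(all_fragments)
    }


def weakest_fragment(contributions: Dict[str, float]) -> str:
    """Return the fragment with the lowest contribution, a candidate for rebuilding."""
    return min(contributions, key=contributions.get)
```

In practice the LLG is not additive over fragments, so the values are only rough relative indicators, which is how they are used in the text: a fragment scoring well below its neighbours is the one to reorient, resize, or replace with an alternative secondary structure element.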

The VSR1 structure contains a protease-associated (PA) domain (164 residues), a Central domain (209 residues), and three epidermal growth factor repeats (EGFs, 158 residues). The complexed form of the PA domain (PDBid 4txj; Luo et al., 2014) rendered a partial solution of VSR1 by MR with Phaser. No models were available for the Central domain at the time, but it shares 15% identity over 150 residues with the extracellular metalloproteinase from Aspergillus fumigatus (PDBid 4k90) (Fernandez et al., 2013) and Staphylococcus aureus DsbA (PDBid 3bci) (Heras et al., 2008), as identified in the HHPRED alignment (Söding et al., 2005). Both structures share a roughly parallel four-helix bundle orientation (Figure S3). This fold was placed in the initial density-modified map, rigid body refinement was performed, and all helices scored an LLG_contribution above 10, except the one shown in blue (3.8) (Figure 9a). Guided by this discrepancy, we optimized its orientation, and its LLG_contribution increased to 10.6 (Figure 9b). We manually fitted VSR1 helices to the DsbA model and, as the green helix matched a DsbA strand (Figure 9c), we modeled this alternative secondary structure, improving the LLG_contribution from 10.6 (for a 16-residue helix) to 21.6 (for a 10-residue strand) (Figure 9d).

After adding the Central domain fragments, whose confidence was supported by LLG calculation and DsbA superposition, we included them in map calculation to reveal new features. The green strand was expanded to a sheet of three strands, and the inclusion of side chains modeled by the SEQUENCE SLIDER function improved their LLG_contribution (Figure 9e). With the significant improvement in the Central domain fragments, we inspected the electron density of a region devoid of atoms, corresponding to the EGF domain. The first density-modified map generated solely from the PA domain was almost featureless in contrast to the map generated using also the Central domain fragments (57 residues), which shows continuous electron density (Figure 9f). This independent control supported fragment correctness. From the new features revealed in the map, we manually extended and joined different fragments of the Central domain and assigned their sequence with SEQUENCE SLIDER.

This methodology within the ARCIMBOLDO framework and its use to phase the VSR1 structure establish a low resolution verification for globular proteins. Building into the SHELXE maps with guidance only from the remote homologs was challenging and involved testing alternative, manually built fragment types. With one or more model alternatives, the Shred_LLG estimation in ARCIMBOLDO_SHREDDER SEQUENTIAL (Sammito et al., 2014) could be used to estimate the contribution of each residue or secondary structure element to the overall LLG, providing a basis for verification.

3 CONCLUSION

Low resolution phasing can exploit the ARCIMBOLDO method: the extension of a partial, accurate structure through interpretation of its density-modified electron density map. Fragments, rather than individual residues, need to be rated to give enough signal. This is illustrated for two globular proteins: the Zika virus NS5 protein, with models for both domains, and plant VSR1, where ab initio extension from one quarter of the structure is achieved through comparative scoring of alternative fragments by their estimated LLG_contribution, comparison with superposed low-identity templates, and inspection of a region devoid of atoms.

Coiled coils remain a class of proteins for which predicted models may fail, and generation of pseudo-solutions is a typical pitfall. ARCIMBOLDO has been adapted to solve and verify coiled-coil structures at resolutions up to 4 Å, both ab initio and exploiting predicted models. The verification step successfully ruled out 7 false positives in the case of ARCIMBOLDO_LITE and 4 false positives in the case of ARCIMBOLDO_SHREDDER. The development of a suitable map tracing algorithm in SHELXE for this scenario has been instrumental. Verification, defined as a mechanism aiming to disprove an apparent solution against competing alternatives, will also make it possible to establish the capacity of the data to back the structure, in other words, to validate the experimental character of the determination.

4 MATERIALS AND METHODS

4.1 Computing setup

Structure solution and tests were run on a local HTCondor v.8.7.10 (Tannenbaum et al., 2001) grid composed of 146 nodes totaling 237 GFlops, using as submitter a single workstation with one Intel i9-12900KF processor with 16 physical cores and 64 GB RAM, running Debian 11 Linux.

AlphaFold2 predictions were performed on a workstation with Intel Core i9-9980XE, GeForce GTX 1080 8 GB, 64 GB RAM, and Debian 10, with a local installation of the code distributed through https://github.com/deepmind/alphafold (Jumper et al., 2021).

4.2 Software versions and figures of merit used

ARCIMBOLDO_LITE and ARCIMBOLDO_SHREDDER are deployed for Linux and Macintosh and are accessible through the Python Package Index (PyPI) (https://pypi.org/project/arcimboldo/) for use in the XDSGUI interface (Brehm et al., 2023) or as part of the CCP4 program suite starting from release 7.0 (Agirre et al., 2023; Winn et al., 2011). They require Python 3, Phaser version 2.8 or higher from the CCP4 distribution for fragment placement, and SHELXE (Usón & Sheldrick, 2024) version 2024 or higher from the SHELX distribution server for density modification and autotracing.

SEQUENCE SLIDER models side chains on partial polypeptide traces in a brute force approach. All probable sequence assignments allowed by the known sequence may be assembled and individually tried. The sequence may be matched to the trace based on the secondary structure prediction to reduce the number of possibilities. Possible models are refined, and crystallographic indicators are used for discrimination. Model extension and improvement of phases for correct models promote solutions. Iteration reveals better discrimination among the possibilities evaluated.

XPREP v.2021/1 was used for data analysis (Sheldrick, 2008). Phaser (McCoy et al., 2007) was employed to calculate the anisotropic delta-B factor for the coiled coils and to perform molecular replacement for VSR1. Manual modeling of VSR1 was done with Coot (Emsley et al., 2010) and PyMOL (Schrödinger, LLC, n.d.). Figures were prepared with the PyMOL molecular graphics system (Schrödinger, LLC, n.d.) and Matplotlib v.1.5.3 (Hunter, 2007).

The figures of merit used in decision-making were the Phaser intensity-based log-likelihood gain (LLG; Read & McCoy, 2016) and the correlation coefficient between observed and calculated normalized intensities (CC; Fujinaga & Read, 1987), calculated by SHELXE (Sheldrick, 2010). Structure-amplitude weighted mean phase errors (wMPE; Lunin & Woolfson, 1993) were calculated with SHELXE against the models deposited in the PDB (Berman et al., 2000) to assess performance. The mean phase difference (MPD) is formally the wMPE calculated against a different structure.
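As an illustration of the two simpler figures of merit, the sketch below computes an amplitude-weighted mean phase error and a Pearson correlation coefficient over matched reflection lists. It is a simplified stand-in, not SHELXE code: function names are hypothetical, phases are assumed to be in degrees and already on a common origin and hand, and no special treatment of centric reflections or resolution shells is attempted.

```python
import math
from typing import Sequence


def weighted_mean_phase_error(
    amplitudes: Sequence[float],
    phases_deg: Sequence[float],
    reference_phases_deg: Sequence[float],
) -> float:
    """Amplitude-weighted mean of |phase difference|, folded into 0..180 degrees."""
    weighted_sum = 0.0
    weight_total = 0.0
    for f, phi, phi_ref in zip(amplitudes, phases_deg, reference_phases_deg):
        diff = abs(phi - phi_ref) % 360.0
        if diff > 180.0:
            diff = 360.0 - diff  # phases are periodic: 350 vs 10 differ by 20
        weighted_sum += f * diff
        weight_total += f
    if weight_total == 0.0:
        raise ValueError("no reflections with non-zero amplitude")
    return weighted_sum / weight_total


def correlation_coefficient(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation, e.g. between observed and calculated normalized intensities."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation undefined for constant data")
    return cov / math.sqrt(var_x * var_y)
```

Random phases give a weighted mean phase error near 90°, so values well below that against a reference model indicate a correct solution; computed against an alternative structure rather than the deposited one, the same quantity is the MPD mentioned above.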

4.3 Coiled-coil test set

The 30 coiled-coil crystal structures used in this study correspond to PDB entries: 6n6s, 6zff, 2oto, 3f6n, 1ovu, 4bxt, 6gbr, 3fx0, 3mqb, 3vem, 4w80, 6yek, 1t8b, 6dlc, 5f4y, 4zry, 4ut5, 3mtt, 5le0, 5zvk, 1s94, 6dma, 2xg7, 3rk3, 3wuq, 6c4x, 6jn2, 3frv, 6ixg, and 3frt.

The resolution limits of the datasets, as defined by the original authors, lie between 3 and 4 Å (23 from 3 to 3.5 Å and 7 from 3.5 to 4 Å). Their structure factors (CIF files) were obtained from the PDB. Upon submitting these data to the STARANISO Server, we observed that diffraction for several datasets was reported as likely truncated due to inappropriate isotropic or anisotropic cut-offs applied in previous processing (6zff, 1ovu, 4bxt, 3mqb, 3vem, 4w80, 1t8b, 6dlc, 5f4y, 4u5t, 5zvk, 6dma, 2xg7, 3rk3, 6jn2, 3frv, 6ixg, and 3frt); for others, no reflections were removed by the anisotropic cut-off (6n6s, 3f6n, 6gbr, 3fx0, 6yek, 3mtt, 5le0, 1s94, and 3wuq). We performed anisotropic correction using STARANISO on the remaining three datasets: 2oto, 6c4x, and 4zry.

Size lies between 78 and 640 residues, distributed in the asymmetric unit in one to eight chains. Twenty-one different space groups are represented, with P2₁2₁2₁ and P3₁21 predominating. Helical content above 75% was established with ALEPH (Medina et al., 2020). Furthermore, the coiled-coil domains were confirmed with SOCKET (Walshaw & Woolfson, 2001), whose algorithm recognizes their characteristic knobs-into-holes association, distinguishing coiled coils among the variety of helix–helix packing arrangements observed in globular domains. Otherwise, sequences and architectures of coiled coils are diverse within their characteristic association.

4.4 VSR1 data

Data to a diffraction limit of 3.5 Å were collected at beamline i03 at the Diamond Light Source to determine the functional N-terminus of the Vacuolar Sorting Receptor 1 (VSR1) grown at 4°C (Borges et al., 2024). Crystals belong to space group P2₁3, with cell constants a = b = c = 141.3 Å, and contain one monomer and 70% solvent in the asymmetric unit. The PDBid is 8r4y.
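The quoted solvent content can be cross-checked with a back-of-the-envelope Matthews calculation. The sketch below rests on several assumptions that are not stated in the text: the monomer is taken as the sum of the domain sizes given earlier (164 + 209 + 158 = 531 residues), an average residue mass of 110 Da is used, and the standard approximation of solvent fraction as 1 − 1.23/V_M is applied. Space group P2₁3 has 12 asymmetric units per cell.

```python
from typing import Tuple


def matthews_estimate(
    cell_volume_A3: float,
    asu_per_cell: int,
    residues_per_copy: int,
    copies_per_asu: int = 1,
    da_per_residue: float = 110.0,  # assumed average residue mass
) -> Tuple[float, float]:
    """Return (Matthews coefficient in A^3/Da, approximate solvent fraction)."""
    protein_mass = asu_per_cell * copies_per_asu * residues_per_copy * da_per_residue
    v_m = cell_volume_A3 / protein_mass
    solvent_fraction = 1.0 - 1.23 / v_m
    return v_m, solvent_fraction


# Cubic cell, a = b = c = 141.3 A, so the volume is a cubed.
a = 141.3
v_m, solvent = matthews_estimate(a ** 3, asu_per_cell=12, residues_per_copy=531)
print(f"V_M = {v_m:.2f} A^3/Da, solvent = {100 * solvent:.0f}%")
```

Under these assumptions the estimate comes out at roughly 4.0 Å³/Da and about 69% solvent, in line with the 70% stated for one monomer in the asymmetric unit; the exact figure depends on the true construct length and composition.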

AUTHOR CONTRIBUTIONS

Iracema Caballero: Investigation; writing – original draft; software; methodology; data curation; validation. Albert Castellví: Investigation; methodology. Josep Triviño: Software; investigation; methodology. Elisabet Jiménez: Investigation; methodology; software. Nicolas Soler: Investigation; methodology. Rafael Junqueira Borges: Writing – original draft; investigation; methodology; software. Isabel Usón: Software; investigation; writing – original draft; funding acquisition; supervision; methodology.

ACKNOWLEDGMENTS

This work was supported by STFC-UK/CCP4 “Agreement for the integration of methods into the CCP4 software distribution, ARCIMBOLDO_LOW” and Grant PID2021-128751NB-I00, RE2019-087953 (Ministry of Science and Innovation/Spanish State Research Agency/European Regional Development Fund/European Union). We thank George M. Sheldrick, Randy J. Read, Kay Diederichs, and Claudia Millán for helpful discussion.

Supporting Information

REFERENCES

Agirre J, Atanasova M, Bagdonas H, Ballard CB, Basle A, Beilsten-Edmands J, et al. The CCP4 suite: integrative software for macromolecular crystallography. Acta Crystallogr D. 2023; 79: 449–461.
10.1107/S2059798323003595
Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science. 2021; 373: 871–876.
10.1126/science.abj8754
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The protein data bank. Nucleic Acids Res. 2000; 28: 235–242.
10.1093/nar/28.1.235
Bibby J, Keegan RM, Mayans O, Winn MD, Rigden DJ. AMPLE: a cluster-and-truncate approach to solve the crystal structures of small proteins using rapidly computed ab initio models. Acta Crystallogr D. 2012; 68: 1622–1631.
10.1107/S0907444912039194
Borges RJ, Eldahshoury MK, Nettleship J, Louise Bird JA, Loureiro AJ, Levada NS, et al. Crystal structure of vacuolar sorting receptor 1 (VSR1) lumenal domain reveals a stable domain swapped trimer and a potential new cargo binding site. Nat Plants. 2024; In revision.
Borges RJ, Meindl K, Triviño J, Sammito M, Medina A, Millán C, et al. SEQUENCE SLIDER: expanding polyalanine fragments for phasing with multiple side-chain hypotheses. Acta Crystallogr D. 2020; 76: 221–237.
10.1107/S2059798320000339
Borges RJ, Salvador GHM, Pimenta DC, Dos Santos LD, Fontes MRM, Usón I. SEQUENCE SLIDER: integration of structural and genetic data to characterize isoforms from natural sources. Nucleic Acids Res. 2022; 50:e50.
10.1093/nar/gkac029
Brehm W, Triviño J, Krahn JM, Usón I, Diederichs K. Xdsgui: a graphical user interface for XDS, SHELX and ARCIMBOLDO. J Appl Cryst. 2023; 56: 1585–1594.
10.1107/S1600576723007057
Caballero I, Sammito M, Millán C, Lebedev A, Soler N, Usón I. ARCIMBOLDO on coiled coils. Acta Crystallogr D Struct Biol. 2018; 74: 194–204.
10.1107/S2059798317017582
Caballero I, Sammito MD, Afonine PV, Usón I, Read RJ, McCoy AJ. Detection of translational noncrystallographic symmetry in Patterson functions. Acta Crystallogr D Struct Biol. 2021; 77: 131–141.
10.1107/S2059798320016836
Chen VB, Arendall WB 3rd, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr. 2010; 66: 12–21.
10.1107/S0907444909042073
Coloma J, Jain R, Rajashankar KR, Garcia-Sastre A, Aggarwal AK. Structures of NS5 Methyltransferase from Zika Virus. Cell Rep. 2016; 16: 3097–3102.
10.1016/j.celrep.2016.08.091
Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of Coot. Acta Crystallogr D Biol Crystallogr. 2010; 66: 486–501.
10.1107/S0907444910007493
Fernandez D, Russi S, Vendrell J, Monod M, Pallares I. A functional and structural study of the major metalloprotease secreted by the pathogenic fungus Aspergillus fumigatus. Acta Crystallogr D Biol Crystallogr. 2013; 69: 1946–1957.
10.1107/S0907444913017642
Ferrero DS, Ruiz-Arroyo VM, Soler N, Usón I, Guarne A, Verdaguer N. Supramolecular arrangement of the full-length Zika virus NS5. PLoS Pathog. 2019; 15:e1007656.
10.1371/journal.ppat.1007656
Fujinaga M, Read RJ. Experiences with a new translation-function program. J Appl Cryst. 1987; 20: 517–521.
10.1107/S0021889887086102
Heras B, Kurz M, Jarrott R, Shouldice SR, Frei P, Robin G, et al. Staphylococcus aureus DsbA does not have a destabilizing disulfide. A new paradigm for bacterial oxidative folding. J Biol Chem. 2008; 283: 4261–4271.
10.1074/jbc.M707838200
Hunter JD. Matplotlib: a 2D graphics environment. Comput Sci Eng. 2007; 9: 90–95.
10.1109/MCSE.2007.55
Jenkins HT. Fragon: rapid high-resolution structure determination from ideal protein fragments. Acta Crystallogr D Struct Biol. 2018; 74: 205–214.
10.1107/S2059798318002292
Jorda J, Sawaya MR, Yeates TO. Progress in low resolution ab initio phasing with CrowdPhase. Acta Crystallogr D Struct Biol. 2016; 72: 446–453.
10.1107/S2059798316003405
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021; 596: 583–589.
10.1038/s41586-021-03819-2
Lara J, Diacovich L, Trajtenberg F, Larrieux N, Malchiodi EL, Fernandez MM, et al. Mycobacterium tuberculosis FasR senses long fatty acyl-CoA through a tunnel and a hydrophobic transmission spine. Nat Commun. 2020; 11: 3703.
10.1038/s41467-020-17504-x
Leonardo DA, Cavini IA, Sala FA, Mendonca DC, Rosa HVD, Kumagai PS, et al. Orientational ambiguity in septin coiled coils and its structural basis. J Mol Biol. 2021; 433:166889.
10.1016/j.jmb.2021.166889
Liu J, Guo Z, Wu T, Roy RS, Quadir F, Chen C, et al. Enhancing alphafold-multimer-based protein complex structure prediction with multicom in casp15. Commun Biol. 2023; 6: 1140.
10.1038/s42003-023-05525-3
Lu G, Gong P. Crystal structure of the full-length Japanese encephalitis virus NS5 reveals a conserved methyltransferase-polymerase interface. PLoS Pathog. 2013; 9:e1003549.
10.1371/journal.ppat.1003549
Lunin VY, Woolfson MM. Mean phase error and the map-correlation coefficient. Acta Crystallogr D Biol Crystallogr. 1993; 49: 530–533.
10.1107/S0907444993005852
Luo F, Fong YH, Zeng Y, Shen J, Jiang L, Wong K-B. How vacuolar sorting receptor proteins interact with their cargo proteins: crystal structures of apo and cargo-bound forms of the protease-associated domain from an Arabidopsis vacuolar sorting receptor. Plant Cell. 2014; 26: 3693–3708.
10.1105/tpc.114.129940
McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ. Phaser crystallographic software. J Appl Cryst. 2007; 40: 658–674.
10.1107/S0021889807021206
McCoy AJ, Oeffner RD, Millán C, Sammito M, Usón I, Read RJ. Gyre and gimble: a maximum-likelihood replacement for Patterson correlation refinement. Acta Crystallogr D Struct Biol. 2018; 74: 279–289.
10.1107/S2059798318001353
McCoy AJ, Oeffner RD, Wrobel AG, Ojala JR, Tryggvason K, Lohkamp B, et al. Ab initio solution of macromolecular crystal structures without direct methods. Proc Natl Acad Sci U S A. 2017; 114: 3637–3641.
10.1073/pnas.1701640114
Medina A, Jiménez E, Caballero I, Castellvi A, Triviño Valls J, Alcorlo M, et al. Verification: model-free phasing with enhanced predicted models in ARCIMBOLDO_SHREDDER. Acta Crystallogr D Struct Biol. 2022; 78: 1283–1293.
10.1107/S2059798322009706
Medina A, Triviño J, Borges RJ, Millán C, Usón I, Sammito MD. ALEPH: a network-oriented approach for the generation of fragment-based libraries and for structure interpretation. Acta Crystallogr D. 2020; 76: 193–208.
10.1107/S2059798320001679
Millán C, Jiménez E, Schuster A, Diederichs K, Usón I. ALIXE: a phase-combination tool for fragment-based molecular replacement. Acta Crystallogr D Struct Biol. 2020; 76: 209–220.
10.1107/S205979832000056X
Millán C, McCoy AJ, Terwilliger TC, Read RJ. Likelihood-based docking of models into cryo-EM maps. Acta Crystallogr Sect D. 2023; 79: 281–289.
10.1107/S2059798323001602
Millán C, Sammito M, Usón I. Macromolecular ab initio phasing enforcing secondary and tertiary structure. IUCrJ. 2015; 2: 95–105.
10.1107/S2052252514024117
Millán C, Sammito MD, McCoy AJ, Nascimento AFZ, Petrillo G, Oeffner RD, et al. Exploiting distant homologues for phasing through the generation of compact fragments, local fold refinement and partial solution combination. Acta Crystallogr D Struct Biol. 2018; 74: 290–304.
10.1107/S2059798318001365
Oeffner RD, Afonine PV, Millán C, Sammito M, Usón I, Read RJ, et al. On the application of the expected log-likelihood gain to decision making in molecular replacement. Acta Crystallogr D Struct Biol. 2018; 74: 245–255.
10.1107/S2059798318004357
Ramachandran GN, Srinivasan R. An apparent paradox in crystal structure analysis. Nature. 1961; 190: 159–161.
10.1038/190159a0
Rämisch S, Lizatovic R, Andre I. Automated de novo phasing and model building of coiled-coil proteins. Acta Crystallogr D. 2015; 71: 606–614.
10.1107/S1399004714028247
Read RJ. Pushing the boundaries of molecular replacement with maximum likelihood. Acta Crystallogr D Biol Crystallogr. 2001; 57: 1373–1382.
10.1107/S0907444901012471
Read RJ, McCoy AJ. A log-likelihood-gain intensity target for crystallographic phasing that accounts for experimental error. Acta Crystallogr D Struct Biol. 2016; 72: 375–387.
10.1107/S2059798315013236
Read RJ, Oeffner RD, McCoy AJ. Measuring and using information gained by observing diffraction data. Acta Crystallogr D. 2020; 76: 238–247.
10.1107/S2059798320001588
Richards LS, Flores MD, Millán C, Glynn C, Zee CT, Sawaya MR, et al. Fragment-based ab initio phasing of peptidic nanocrystals by microED. ACS Bio Med Chem Au. 2023; 3: 201–210.
10.1021/acsbiomedchemau.2c00082
Rodríguez D, Sammito M, Meindl K, de Ilarduya IM, Potratz M, Sheldrick GM, et al. Practical structure solution with ARCIMBOLDO. Acta Crystallogr D Biol Crystallogr. 2012; 68: 336–343.
10.1107/S0907444911056071
Rodríguez DD, Grosse C, Himmel S, Gonzalez C, de Ilarduya IM, Becker S, et al. Crystallographic ab initio protein structure solution below atomic resolution. Nat Methods. 2009; 6: 651–653.
10.1038/nmeth.1365
Rossmann M. The molecular replacement method. New York: Gordon & Breach; 1972.
Sammito M, Meindl K, de Ilarduya IM, Millán C, Artola-Recolons C, Hermoso JA, et al. Structure solution with ARCIMBOLDO using fragments derived from distant homology models. FEBS J. 2014; 281: 4029–4045.
10.1111/febs.12897
Sammito M, Millán C, Frieske D, Rodriguez-Freire E, Borges RJ, Usón I. ARCIMBOLDO_LITE: single-workstation implementation and use. Acta Crystallogr D Biol Crystallogr. 2015; 71: 1921–1930.
10.1107/S1399004715010846
Sammito M, Millán C, Rodríguez DD, de Ilarduya IM, Meindl K, De Marino I, et al. Exploiting tertiary structure through local folds for crystallographic phasing. Nat Methods. 2013; 10: 1099–1101.
10.1038/nmeth.2644
Sánchez Rodríguez F, Simpkin AJ, Davies OR, Keegan RM, Rigden DJ. Helical ensembles outperform ideal helices in molecular replacement. Acta Crystallogr D. 2020; 76: 962–970.
10.1107/S205979832001133X
Schrödinger, LLC. Forthcoming 2015 November. The PyMOL molecular graphics system, version 1.8.
Schwarzenbacher R, Godzik A, Grzechnik SK, Jaroszewski L. The importance of alignment accuracy for molecular replacement. Acta Crystallogr D Biol Crystallogr. 2004; 60: 1229–1236.
10.1107/S0907444904010145
Sheldrick GM. Macromolecular phasing with SHELXE. Z Kristallogr. 2002; 217: 644–650.
10.1524/zkri.217.12.644.20662
Sheldrick GM. Xprep. 2nd ed. Madison: Bruker AXS Inc.; 2008.
Sheldrick GM. Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Crystallogr D Biol Crystallogr. 2010; 66: 479–485.
10.1107/S0907444909038360
Simpkin AJ, Caballero I, McNicholas S, Stevenson K, Jiménez E, Sánchez Rodríguez F, et al. Predicted models and CCP4. Acta Crystallogr D Struct Biol. 2023; 79: 806–819.
10.1107/S2059798323006289
Söding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005; 33: W244–W248.
10.1093/nar/gki408
Tannenbaum T, Wright D, Miller K, Livny M. Condor: a distributed job scheduler. In: T Sterling, editor. Beowulf cluster computing with linux. Cambridge: MIT Press; 2001. p. 307–350.
Terwilliger TC. Using prime-and-switch phasing to reduce model bias in molecular replacement. Acta Crystallogr D Biol Crystallogr. 2004; 60: 2144–2149.
10.1107/S0907444904019535
Thomas JM, Keegan RM, Bibby J, Winn MD, Mayans O, Rigden DJ. Routine phasing of coiled-coil protein crystal structures with AMPLE. IUCrJ. 2015; 2: 198–206.
10.1107/S2052252515002080
Thomas JMH, Keegan RM, Rigden DJ, Davies OR. Extending the scope of coiled-coil crystal structure solution by AMPLE through improved ab initio modelling. Acta Crystallogr D. 2020; 76: 272–284.
10.1107/S2059798320000443
Tickle IJ, Flensburg C, Keller P, Paciorek W, Sharff A, Vonrhein C, et al. Staraniso. Cambridge: Global Phasing Ltd; 2018.
Urzhumtsev AG, Lunin VY. Introduction to crystallographic refinement of macromolecular atomic models. Crystallogr Rev. 2019; 25: 164–262.
10.1080/0889311X.2019.1631817
Urzhumtsev AG, Lunina NL, Skovoroda TP, Podjarny AD, Lunin VY. Density constraints and low-resolution phasing. Acta Crystallogr D Biol Crystallogr. 2000; 56: 1233–1244.
10.1107/S0907444900009331
Usón I, Sheldrick GM. Advances in direct methods for protein crystallography. Curr Opin Struct Biol. 1999; 9: 643–648.
10.1016/S0959-440X(99)00020-2
Usón I, Sheldrick GM. An introduction to experimental phasing of macromolecules illustrated by SHELX; new autotracing features. Acta Crystallogr D Struct Biol. 2018; 74: 106–116.
10.1107/S2059798317015121
Usón I, Sheldrick GM. Modes and model building in SHELXE. Acta Crystallogr D Struct Biol. 2024; 80: 4–15.
10.1107/S2059798323010082
Usón I, Stevenson CEM, Lawson DM, Sheldrick GM. Structure determination of the O-methyltransferase NovP using the ‘free lunch algorithm’ as implemented in SHELXE. Acta Crystallogr Sect D. 2007; 63: 1069–1074.
10.1107/S0907444907042230
Vostrukhina M, Popov A, Brunstein E, Lanz MA, Baumgartner R, Bieniossek C, et al. The structure of Aquifex aeolicus Ftsh in the ADP-bound state reveals a C2-symmetric hexamer. Acta Crystallogr D Biol Crystallogr. 2015; 71: 1307–1318.
10.1107/S1399004715005945
Walshaw J, Woolfson DN. SOCKET: a program for identifying and analysing coiled-coil motifs within protein structures. J Mol Biol. 2001; 307: 1427–1450.
10.1006/jmbi.2001.4545
Winn MD, Ballard CC, Cowtan KD, Dodson EJ, Emsley P, Evans PR, et al. Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr. 2011; 67: 235–242.
10.1107/S0907444910045749
Wlodawer A, Minor W, Dauter Z, Jaskolski M. Protein crystallography for non-crystallographers, or how to get the best (but not more) from published macromolecular structures. FEBS J. 2008; 275: 1–21.
10.1111/j.1742-4658.2007.06178.x

Volume 33, Issue 9, September 2024, e5136

This article also appears in:

Tools for Protein Science 2024

Filename	Description
pro5136-sup-0001-SupplementaryTable.xlsx (Excel 2007 spreadsheet, 36 KB)	Table S1. Dataset characteristics and results.
pro5136-sup-0002-Supplementarymaterial.pdf (PDF document, 222.3 KB)	Data S1.
