Volume 62, Issue 45 e202310801
Research Article
Open Access

Conception and Evaluation of a Library of Cleavable Mass Tags for Digital Polymers Sequencing

Thibault Schutz
Université de Strasbourg, CNRS, ISIS, 8 allée Gaspard Monge, 67000 Strasbourg, France
Université de Strasbourg, CNRS, Institut Charles Sadron UPR22, 23 rue du Loess, 67034 Strasbourg Cedex 2, France

Isaure Sergent
Aix Marseille Université, CNRS, UMR 7273, Institute of Radical Chemistry, 13397 Marseille Cedex 20, France

Georgette Obeid
Université de Strasbourg, CNRS, ISIS, 8 allée Gaspard Monge, 67000 Strasbourg, France

Laurence Oswald
Université de Strasbourg, CNRS, Institut Charles Sadron UPR22, 23 rue du Loess, 67034 Strasbourg Cedex 2, France

Dr. Abdelaziz Al Ouahabi
Université de Strasbourg, CNRS, Institut Charles Sadron UPR22, 23 rue du Loess, 67034 Strasbourg Cedex 2, France

Dr. Paul N. W. Baxter
Université de Strasbourg, CNRS, ISIS, 8 allée Gaspard Monge, 67000 Strasbourg, France
Université de Strasbourg, CNRS, Institut Charles Sadron UPR22, 23 rue du Loess, 67034 Strasbourg Cedex 2, France

Dr. Jean-Louis Clément
Aix Marseille Université, CNRS, UMR 7273, Institute of Radical Chemistry, 13397 Marseille Cedex 20, France

Dr. Didier Gigmes
Aix Marseille Université, CNRS, UMR 7273, Institute of Radical Chemistry, 13397 Marseille Cedex 20, France

Prof. Laurence Charles (Corresponding Author)
Aix Marseille Université, CNRS, UMR 7273, Institute of Radical Chemistry, 13397 Marseille Cedex 20, France

Dr. Jean-François Lutz (Corresponding Author)
Université de Strasbourg, CNRS, ISIS, 8 allée Gaspard Monge, 67000 Strasbourg, France
Université de Strasbourg, CNRS, Institut Charles Sadron UPR22, 23 rue du Loess, 67034 Strasbourg Cedex 2, France

First published: 22 September 2023
Citations: 6

Graphical Abstract

An optimal set of phosphoramidite monomers was synthesized herein for facilitating the mass spectrometry (MS) sequencing of digital poly(phosphodiester)s. These molecules contain a cleavable alkoxyamine group and various side-chain substituents with a distinct mass signature. When incorporated periodically in the polymers, these new molecules enable a controlled MS fragmentation and an easy identification of the resulting fragments.

Abstract

A library of phosphoramidite monomers containing a main-chain cleavable alkoxyamine and a side-chain substituent of variable molar mass (i.e. mass tag) was prepared in this work. These monomers can be used in automated solid-phase phosphoramidite chemistry and therefore incorporated periodically as spacers inside digitally-encoded poly(phosphodiester) chains. Consequently, the formed polymers contain tagged cleavable sites that guide their fragmentation in mass spectrometry sequencing and enhance their digital readability. The spacers were all prepared via a seven-step synthetic procedure. They were afterwards tested for the synthesis and sequencing of model digital polymers. Uniform digitally-encoded polymers were obtained as major species in all cases, even though some minor defects were sometimes detected. Furthermore, the polymers were decoded in pseudo-MS3 conditions, thus confirming the reliability and versatility of the spacer library.

Introduction

Synthetic digital polymers are a new class of functional sequence-defined macromolecules enabling molecular information storage.1 They are typically synthesized by a sequence-controlled multistep process2 and deciphered by a sequencing tool.3 Since the first report in 2014,4 about 20 different families of synthetic digital polymers have been described.1c, 5 The relevance of such information-containing polymers in applications such as cold data storage, anti-counterfeiting technologies, cryptography, materials traceability and plastic recycling has been evidenced.6 Despite this recent progress, synthetic polymers are still less explored than DNA for data storage.7 Yet, one major advantage of synthetic digital polymers is that their storage properties can be controlled by macromolecular design.8 For example, the molecular structure of synthetic polymers can be tailored to facilitate sequencing by nanopore sensing,9 controlled chain degradation10 or mass spectrometry.11

For mass spectrometry, which is currently the most used sequencing method for synthetic polymers, it was recently reported that digital poly(phosphodiester)s with a storage capacity as high as 440 bits/chain can be decoded in a routine instrument.12 To achieve such a record, the digital chains were carefully engineered to undergo programmed fragmentations in the mass spectrometer. As reported earlier and as schematized in Figure 1,13 the inclusion of periodically-distributed cleavable alkoxyamine sites inside a digital poly(phosphodiester) facilitates its decoding. This is because low dissociation energy NO−C bonds can be selectively cleaved in the presence of phosphate repeat units.14 For example, when these weak links are placed between bytes, the tandem mass spectrometry (MS/MS) fragmentation of the macromolecule leads to a library of cleaved oligomers (Figure 1a). Yet, in order to retrieve the position of each byte in the initial sequence, a mass tag with a specific MS signature is incorporated in all bytes with the exception of the first one.13 Each cleaved byte is then subjected to a further fragmentation (MS3) and decoded. Ultimately, the entire sequence of the initial polymer can be reconstructed either manually or using dedicated software.15 Main-chain NO−C bonds also influence the stability of digital polymers, which may degrade thermally. This interesting property can be exploited for the design of erasable or editable polymers.5a

Figure 1. (a) General concept used in previous works for the mass spectrometry sequencing of digital poly(phosphodiester)s.13 The polymers contain periodically-distributed cleavable spacers. Thus, in MS/MS conditions, they break into a library of predictable coded fragments. Except for the first one, all fragments contain a mass tag that allows their identification. They are sequenced individually in pseudo-MS3 conditions and ultimately the complete information sequence can be reconstructed, either manually or using software. (b) Molecular structure of previously-reported phosphoramidite spacers allowing the incorporation of cleavable alkoxyamine spacers in the digital poly(phosphodiester) chains.13, 18 (c) Molecular structure of the phosphoramidite spacers prepared in the present work that contain both a mass tag and a cleavable site.
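
The reconstruction step summarized in Figure 1a can be illustrated with a short Python sketch. It is only a schematic view under simplifying assumptions: the tag labels (`tag_A`, `tag_B`), the dictionary layout and the function name `reassemble` are hypothetical, and each byte is assumed to have already been decoded into a bit string, whereas the actual workflow derives these bits from MS3 spectra.

```python
from typing import Dict, List, Optional


def reassemble(fragments: Dict[Optional[str], str], tag_order: List[str]) -> str:
    """Rebuild the full bit sequence from an unordered library of bytes.

    fragments maps a mass-tag label to the decoded bits of the byte carrying it.
    The key None stands for the single untagged byte (first in reading direction).
    tag_order lists the tags in the order fixed when the polymer was designed.
    """
    if None not in fragments:
        raise ValueError("the untagged first byte is missing")
    ordered = [fragments[None]]
    for tag in tag_order:
        if tag not in fragments:
            raise ValueError(f"no fragment found for tag {tag!r}")
        ordered.append(fragments[tag])
    return "".join(ordered)


# Hypothetical byte library, listed in arbitrary order as in an MS/MS spectrum.
library = {"tag_B": "01000011", None: "01010000", "tag_A": "01001101"}
full_sequence = reassemble(library, ["tag_A", "tag_B"])
print(full_sequence)
```

The point of the sketch is that the tag, not the position of the peak in the spectrum, carries the ordering information; the untagged byte is unambiguous because it is the only one without a tag.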

Digital poly(phosphodiester)s are synthesized by phosphoramidite polymer chemistry (PPC), a technique that allows the use of classical phosphoramidite nucleosides16 but also of a wide variety of non-natural monomers.17 They are usually binary-encoded using two different monomers 0 and 1, leading to propyl- and 2,2-dimethylpropyl- phosphodiester repeat units, respectively (Figure 1a).5b In our original report,13 the phosphoramidite spacer L1 (Figure 1b) and commercial phosphoramidite nucleosides were respectively used to install alkoxyamine cleavable sites and mass tags in poly(phosphodiester) chains. However, the spacer L1 leads to reactive radical fragments that form intense secondary peaks series and prevent the use of a decoding software. It was therefore recently replaced by the spacer L2 (also known as RISC2), which leads to cleaner fragmentations and therefore allowed automated sequencing, as reported recently in this journal.18

Yet, the use of nucleoside mass tags was still required. In other words, when using a simple binary alphabet based on two different comonomers,5b, 8 ten phosphoramidite building blocks are required to synthesize a byte (i.e. 1 spacer, 1 mass tag and 8 coded monomers). One important issue in the field of digital polymers is atom economy, as exemplified by the recent development of expanded coding alphabets.5f, 5i, 19 In this context, it would be simpler and more useful if the alkoxyamine group and the mass tag could be combined in a single molecule. In the present communication, we report the design, synthesis and use of phosphoramidite spacers 2–10 (Figure 1c) containing both a cleavable site and a mass tag.

Results and Discussion

The design of the cleavable mass tags was derived from L2, which contains both a dimethoxytrityl (DMT)-protected OH group and a reactive phosphoramidite moiety, as required for PPC.17c Furthermore, in order to induce controlled MS fragmentations, L2 also contains a tetramethylpiperidinyloxy (TEMPO)-based alkoxyamine.13 Another important design feature of L2 is the presence of a rigid aromatic linker that prevents back-biting radical reactions and subsequent rearrangements after MS fragmentation.18 All these features were kept in the structures prepared in this work. Different types of chemistries can be considered for including mass tags in such spacers. Still, the chosen chemistry should be simple and versatile to allow incorporation of a wide variety of substituents. To this end, we developed a synthetic route involving a nucleophilic substitution on a secondary amine. It relies on tagged TEMPO intermediates that are derived in two steps from 4-oxo-TEMPO (Scheme 1). First, the intermediate 4-((3-hydroxypropyl)amino)-TEMPO T1 was prepared. The secondary amine of T1 was then reacted with bromo-derivatives to afford the tagged intermediates T2–T10 in good yields. The mass tags (i.e. N-substituents) of these derivatives were carefully selected in order to fulfil the previously-reported mass requirements for optimal MS3 sequencing.13

Scheme 1. Synthesis of tagged TEMPO intermediates. Experimental conditions: (i) Na(CH3COO)3BH, CH3COOH, amino propanol, anhydrous 1,2-dichloroethane, RT, overnight; (ii) Acetonitrile, K2CO3, 83 °C, overnight.

Afterwards, the synthesis of the alkoxyamine-containing phosphoramidite spacers was investigated. At first, a molecular design involving a main-chain alkyne was considered (11, Figure 2). This alkyne function was included because it could bring further rigidity in the final molecule. Furthermore, the synthetic route leading to 11 (Scheme S1) was tempting because it involves an easy Sonogashira coupling step. As a proof-of-feasibility, this route was only studied with intermediate T2. The spacer 11 was synthesized and incorporated in model poly(phosphodiester)s (data not shown) for MS sequencing. However, it was observed that the presence of an acetylene group in the cleavable alkoxyamine spacer complicated sequencing rather than simplifying it (data not shown). Therefore, a revised route (Scheme S1) involving the reduction of the alkyne was explored.

Figure 2. Molecular design originally considered for the cleavable mass tags. Molecular structures of the phosphoramidite spacers 11 and 12.

This alternative route was also only studied with intermediate T2. The corresponding saturated spacer 12 was obtained. However, the overall yield was low and, on the whole, this second route did not appear practical. Consequently, a third molecular design, which is closer to that of L2, was explored. Scheme 2 shows the synthetic route that was used in the present work for the synthesis of the library of tagged spacers 2–10. Steps (ii) to (v) have already been reported for the synthesis of the spacer L2.18 However, instead of starting the synthesis from commercial 4-acetylphenylacetic acid a as done in our prior work, compound a was first synthesized in step (i) from 4-iodoacetophenone following a literature procedure.20 Afterwards, steps (ii), (iii), (iv) and (v) were performed, as already reported,18 to afford the intermediates b, c, d and e, respectively. For the ATRA step (vi), the tagged TEMPO derivatives T2–T10 were reacted with e to afford the DMT-protected intermediates f2–f10. Ultimately, the phosphoramidite spacers 2–10 were obtained in step (vii).

Scheme 2. Synthetic route studied herein for the synthesis of the tagged phosphoramidite spacers 2–10. Experimental conditions: (i) PdCl2, AgOAc, NaOAc, AcOH, 130 °C; (ii) NaBH4, ethanol, RT, overnight; (iii) HCl 37 %, THF, 0 °C, 4 h; (iv) BH3 ⋅ SMe2, anhydrous THF, RT, overnight; (v) anhydrous pyridine, anhydrous THF, RT, 1 h; (vi) CuBr, Cu(0), PMDETA, anhydrous THF, RT, overnight; (vii) DIPEA, anhydrous DCM, RT, 1 h.

The reactivity of spacers 2–10 was then investigated. To do so, each spacer was incorporated in the middle of a short oligo(phosphodiester) sequence and its coupling efficiency was assessed by UV titration of DMT deprotection.21 The oligomers were synthesized by automated PPC on a thymidine-loaded solid support, leading after cleavage to a thymidine nucleotide residue noted T.21 As described in the introduction, the polymers were all encoded with a two-symbol alphabet, in which propyl phosphate and 2,2-dimethylpropyl phosphate motifs represent 0 and 1 bits, respectively.5b The model oligomers all contain the sequence T1010-x-1010 (the sequence is written in the synthesis direction), where x denotes a spacer of the TISP series 2–10. The coupling efficiency was calculated by comparing the absorbance Ax of the DMT released after spacer coupling to the absorbance Aref of the DMT released after coupling the previous monomer of the sequence (i.e. monomer 0 located in the fourth position from the thymidine nucleotide end-group). Table 1 shows the coupling efficiency estimated for spacers 2–10. The original UV spectra are displayed in Figure S1. On the whole, the spacers exhibited markedly different reactivities, depending on the chosen mass tag. Spacers 2, 3, 5 and 9 led to near quantitative coupling yields in standard PPC protocols. In comparison, derivatives 6 and 8 exhibited a lower efficiency. Still, the lowest results were obtained with 4, 7 and 10. Different reasons may explain the observed differences in coupling efficiencies. First, the bulkiness of the mass tag seems to influence phosphoramidite activation and coupling. However, other parameters such as solubility may also play a role. For instance, derivative 7 has a low solubility in acetonitrile due to classical pyrene aggregation. Thus, a few drops of DCM were always added to the reaction solution to improve the solubility of 7. Furthermore, optimized protocols can be used to improve the reactivity of bulky phosphoramidite monomers.19b One possibility is, for example, to increase the concentration of the spacer. Entries 6–8 in Table 1 compare the reactivity of 7 at a concentration of 0.1, 0.15 and 0.2 mol ⋅ l−1.22 At the highest concentration, a near quantitative coupling efficiency was observed. Using a similar strategy, it was also possible to increase the reactivity of 10, although with only moderate success. Overall, spacers 2–3 and 5–9 seem appropriate for the preparation of digital polymers, whereas derivatives 4 and 10 are less suitable.
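
The absorbance comparison described above can be written as a one-line calculation. The following Python sketch assumes that the coupling efficiency is simply the ratio Ax/Aref expressed in percent; the exact data treatment is not detailed in the text, and the function name and the absorbance values are illustrative only.

```python
def coupling_efficiency(a_x: float, a_ref: float) -> float:
    """Return the coupling efficiency in percent, taken here as 100 * A_x / A_ref.

    a_x   : absorbance of the DMT cation released after coupling the spacer
    a_ref : absorbance of the DMT cation released after the previous monomer
    """
    if a_ref <= 0:
        raise ValueError("reference absorbance must be positive")
    return 100.0 * a_x / a_ref


# Illustrative absorbance values only (not taken from Figure S1).
example = coupling_efficiency(a_x=0.45, a_ref=0.50)
print(f"{example:.1f} %")
```

Because the reference is the preceding coupling in the same chain, the ratio isolates the yield of the spacer step from losses accumulated earlier in the synthesis.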

Table 1. Coupling efficiencies measured for spacers 2–10.

Entry   Spacer   Coupling efficiency [%]
1       2        >99 [a]
2       3        >99 [a]
3       4        7.5 [a]
4       5        >99 [a]
5       6        89 [a]
6       7        24 [a]
7       7        91 [b]
8       7        99 [c]
9       8        78 [a]
10      9        >99 [a]
11      10       9.5 [a]
12      10       17 [b]

Spacer concentration: [a] 0.1 mol ⋅ l−1; [b] 0.15 mol ⋅ l−1; [c] 0.2 mol ⋅ l−1.

Table 2. Digital poly(phosphodiester)s that were synthesized in this work.

Polymer   Sequence [a]                          Composition            m/z th [b]   m/z exp [b]
P1        ω-00000000-2-11111111-8-00000000-α    C148H285N4O104P25      777.5007     777.4983
P2        ω-00000000-8-11111111-6-00000000-α    C152H287N4O104P25F6    785.8367     785.8356
P3        ω-00000000-7-11111111-9-00000000-α    C160H299N4O104P25      784.8539     784.8528
P4        ω-01010000-5-01001101-6-01000011-α    C158H297N4O104P25      780.5180     780.5139
P5        ω-01010000-8-01001101-6-01000011-α    C154H291N4O104P25F6    790.5086     790.5068
P6        ω-01010000-7-01001101-9-01000011-α    C162H303N4O104P25      789.5258     789.5225

[a] Sequences are written in the reading direction (i.e. from ω to α, which is opposite to the synthesis direction). See Figure 3 for the meaning of the numbers as well as the Greek letters α and ω. [b] Measured as [Pi-6H]6− at monoisotopic peak.
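
The agreement between theoretical and experimental m/z values in Table 2 can be expressed as a relative error in parts per million. The sketch below uses the usual definition of ppm error, which is a standard metric but is not quoted in the text; the dictionary simply transcribes the two m/z columns of Table 2.

```python
def ppm_error(mz_exp: float, mz_th: float) -> float:
    """Relative deviation of the measured m/z from the theoretical one, in ppm."""
    return (mz_exp - mz_th) / mz_th * 1e6


# (theoretical, experimental) m/z of the [Pi-6H]6- ions, transcribed from Table 2.
table2 = {
    "P1": (777.5007, 777.4983),
    "P2": (785.8367, 785.8356),
    "P3": (784.8539, 784.8528),
    "P4": (780.5180, 780.5139),
    "P5": (790.5086, 790.5068),
    "P6": (789.5258, 789.5225),
}

errors = {name: ppm_error(exp, th) for name, (th, exp) in table2.items()}
for name, err in errors.items():
    print(f"{name}: {err:+.1f} ppm")
```

With these values, every polymer falls within a few ppm of its theoretical m/z, the measured value being slightly lower in each case.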

The spacers were then tested for the synthesis and MS decoding of digital poly(phosphodiester)s. Figure 3 shows the general molecular structure of the synthesized polymers and Table 2 lists the different samples P1–P6 that have been studied herein. All polymers were prepared by automated PPC.21 As described in the previous section, the polymers were prepared using comonomers 0 and 1.5b Two different model sequences were investigated in this work, a triblock (0)8-b-(1)8-b-(0)8 and an ASCII-encoded digital sequence 01010000–01001101–01000011. Following our previously-established convention,13 the reading direction of the polymers was set opposite to the synthesis direction. After each sub-sequence of 8 coded monomers, a cleavable spacer was incorporated in the chain. The size of the blocks was selected to be long enough to contain one byte (i.e. 8 bits) but short enough to prevent extensive charging, which complicates recovery of the block sequence in pseudo-MS3.13 Thus, each byte is labelled with a tag with the exception of the last synthesized one (i.e. first byte for sequencing). The combination and sequence of the tags were selected according to pre-established rules for pseudo-MS3 sequencing.13 Another important feature in these polymers is the chain-end of the first synthesized byte (i.e. last byte for sequencing). In previous works,12, 13, 19b, 21 a thymidine-loaded solid support was used. After cleavage, it leads to poly(phosphodiester)s with a thymidine chain-end (see previous section). Although the initial reason for this choice was UV detection in HPLC,21 the thymidine nucleoside was afterwards exploited as a byte tag for pseudo-MS3 sequencing.13 However, since nucleoside tags are no longer used in this work, the presence of a terminal thymidine is not mandatory. Therefore, a universal solid support was selected herein. We chose to work with commercial UnySupport, which is a methylated version of the classical UnyLinker.23 Upon cleavage, it leads to a terminal dephosphorylation.23, 24 Consequently, the cleaved poly(phosphodiester)s do not contain a residual end-group inherited from the solid support, as shown in Figure 3.
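
As an illustration of how the sequence notation of Table 2 maps onto stored information, the following Python sketch splits a sequence string into its bytes and spacer labels and reads the bytes as ASCII characters. The helper names `parse_polymer` and `decode_ascii` are hypothetical and do not refer to the decoding software cited in the text.

```python
from typing import List, Tuple


def parse_polymer(notation: str) -> Tuple[List[str], List[str]]:
    """Split a Table 2 style string into 8-bit bytes and spacer labels."""
    parts = notation.replace("–", "-").split("-")
    parts = [p for p in parts if p not in ("ω", "α", "")]
    bytes_bits = [p for p in parts if len(p) == 8 and set(p) <= {"0", "1"}]
    spacers = [p for p in parts if p not in bytes_bits]
    return bytes_bits, spacers


def decode_ascii(bytes_bits: List[str]) -> str:
    """Read each 8-bit block as one ASCII character."""
    return "".join(chr(int(bits, 2)) for bits in bytes_bits)


# Sequence of P4, written in the reading direction (from ω to α).
bytes_bits, spacers = parse_polymer("ω-01010000-5-01001101-6-01000011-α")
message = decode_ascii(bytes_bits)
print(bytes_bits, spacers, message)
```

The three bytes correspond to the code points 80, 77 and 67, so the ASCII-encoded model sequence reads "PMC"; the spacer labels are returned separately because they carry positional information rather than data.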

Figure 3. General molecular structure of the polymers synthesized in this work. The Greek letters α and ω indicate the end-groups and chain-directionality. Synthesis starts from α whereas decoding starts from ω.

All the formed polymers were characterized by negative mode ESI-MS (Figures 5 and S2–S6). Some issues arose during the synthesis of the polymers. The first obstacle was due to the use of the commercial UnySupport. The standard cleavage procedure with this support includes a 2 h heating treatment at 55 °C. However, when applying this procedure, an abundant species with a mass increment of +293 Da as compared to the targeted polymers was often detected in mass spectrometry. This is probably due to an incomplete dephosphorylation and corresponds to polymers in which the leaving group of the support was still attached to the main-chain, as proposed in Figure 4. Therefore, the cleavage procedure was modified. A more efficient dephosphorylation could be achieved using slightly more elevated temperatures (60 or 70 °C), as exemplified in Figure S2a for P1 and Figure S4a for P4. However, these conditions are not optimal for the main-chain alkoxyamines of the polymers and may result in partial strand degradation. In this context, the best compromise was to achieve cleavage overnight at 50 °C (Figure S3a for P2).
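
For a multiply charged ion, a mass increment appears in the spectrum divided by the charge. The short Python sketch below estimates where the +293 Da species would be expected relative to a target ion; it assumes the defect is observed at the same charge state as the target and uses the nominal increment of 293 Da, so the result is only indicative. The function name is hypothetical.

```python
def shifted_mz(mz_target: float, delta_mass: float, charge: int) -> float:
    """m/z expected for a species heavier by delta_mass at the same charge state."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    return mz_target + delta_mass / abs(charge)


mz_p1 = 777.5007  # theoretical m/z of [P1-6H]6- from Table 2
defect_mz = shifted_mz(mz_p1, delta_mass=293.0, charge=6)
print(f"{defect_mz:.2f}")
```

At z = 6 the defect is therefore displaced by roughly 49 m/z units from the target peak, which makes it easy to distinguish from isotopic peaks and from truncated sequences.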

Figure 4. Different types of α-chain-ends observed for the formed polymers after cleavage from commercial UnySupport. Left: targeted hydroxyl chain-ends. Right: chains containing a residual group due to incomplete dephosphorylation.

In all cases, the targeted polymers appear as dominant species, even though some defects may be detected. As written above, defects due to incomplete dephosphorylation were significantly reduced in these samples. Thus, they were either not detected or appeared as minor species. Still, some truncated sequences due to missing (macro)monomers can be seen in some spectra. As expected, sequences containing spacers 2 and 5–9 could be easily achieved and detected. Despite its low solubility in acetonitrile, 7 could also be incorporated efficiently in polymers P3 and P6. Furthermore, all polymers could be sequenced. First, the MS/MS analysis of samples P1–P6 resulted in the controlled fragmentation of the backbone, thus indicating that the expected alkoxyamine cleavage occurs with all studied spacers. The MS/MS collision conditions were adapted in order to release all the bytes including the inner byte. As a result, MS/MS spectra highlighting a clear byte library were obtained in all cases (Figures 5 and S2–S6). For example, Figure 5c shows the MS/MS spectrum obtained for polymer P3. Although the same information can be obtained from any [P3-zH]z− ions (with z=4–9) observed in MS (Figure 5b), selecting the charge state of the dissociating precursor as a multiple of the number of bytes allows a single charge state for each fragment, which greatly improves the readability of MS/MS data. Importantly, the chosen combinations of mass tags permitted unequivocal identification of the cleaved bytes in all samples. These results indicate that the synthetic mass tags developed herein perform as efficiently as previously-used nucleotide tags. Consequently, complete sequencing could be obtained through the MS3 analysis of each fragmented byte (Figure S7).
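
The charge-state rule mentioned above can be made explicit with a small Python sketch. It assumes, as a simplification, that the charges of the precursor are shared equally between bytes of equal length, which is the situation in which a precursor charge that is a multiple of the number of bytes gives one charge state per fragment. The function name is hypothetical.

```python
from typing import Dict, Iterable


def single_charge_precursors(charge_states: Iterable[int], n_bytes: int) -> Dict[int, int]:
    """Map each suitable precursor charge to the charge carried by every byte.

    Only precursor charges that are multiples of the number of bytes are kept,
    because only those can be split evenly over the fragments.
    """
    return {z: z // n_bytes for z in charge_states if z % n_bytes == 0}


observed = range(4, 10)  # z = 4 to 9, as observed for P3 in MS
candidates = single_charge_precursors(observed, n_bytes=3)
print(candidates)
```

For a three-byte polymer, only z = 6 and z = 9 satisfy the rule within the observed range, which is consistent with the choice of the [P3-6H]6− precursor for the MS/MS experiment.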

Figure 5. Mass spectrometry analysis of polymer P3. (a) Molecular structure of P3. (b) Negative mode ESI-MS spectrum of P3. (c) MS/MS spectrum (11 eV laboratory frame) of P3 obtained by collision-induced dissociation of the [P3-6H]6− precursor ion.

Conclusion

In summary, eleven different phosphoramidite monomers containing a cleavable alkoxyamine and a mass tag were synthesized in this work. At first, a synthetic route involving an alkyne connection was explored but was not retained because of synthesis and sequencing issues. An optimized route was found and allowed successful preparation of nine different tagged monomers. They were then tested individually for the stepwise phosphoramidite synthesis of oligo(phosphodiester)s and their reactivity was assessed by UV monitoring. It was observed that their coupling yields depend on the chemical nature of the mass tags. For instance, monomers containing benzyl, 4-fluoro benzyl, 4-tert-butylbenzyl and biphenyl mass tags exhibited near quantitative reactivities in standard PPC conditions. However, other monomers exhibited lower or significantly lower coupling yields. This could be improved by increasing their concentration in PPC. Still, alternative molecular designs may also be considered in the future. For bulky substituents, for instance, the use of long linkers between the mass tag and the phosphoramidite group may improve coupling yields. Nevertheless, the best monomers were tested for the synthesis of digital polymers. Poly(phosphodiester)s with different monomer sequences were prepared and sequenced in pseudo-MS3 conditions. It was found that the main-chain alkoxyamines of the spacers enable controlled fragmentations and that the mass tags allow decoding. Furthermore, the spacer library allows some flexibility in the choice of the tags for PPC. In terms of atom economy, the cleavable mass tags reported herein correspond to synthons with a molar mass ranging from 532 (benzyl tag) to 668 Da (3,5-bis(trifluoromethyl)benzyl tag). In our previous strategy employing both a spacer and a nucleotide mass tag, the L2 synthon18 has a molar mass of 383 Da and the tags have a molar mass ranging from 289 (deoxycytidine monophosphate tag) to 415.9 Da (iodo-deoxyuridine monophosphate tag),13 thus a global contribution ranging from 672 to 798.9 Da. This means that for each cleavable site, the present system leads to a mass reduction ranging from 4 to 266.9 Da. In addition, the use of only one phosphoramidite monomer in PPC instead of two significantly improves atom economy. Yet, the present concept is of course not restricted to the mass tags studied herein and a broader variety of spacers can be envisioned in the future. Overall, this work considerably simplifies the design of digital polyphosphodiesters and opens up new opportunities for macromolecular data storage.
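
The mass comparison given above can be reproduced with a few lines of Python. The sketch only restates the arithmetic of the paragraph using the molar masses quoted in the text; the variable names are illustrative.

```python
# Molar masses in Da, as quoted in the text.
L2_SPACER = 383.0                # L2 synthon (previous strategy)
NUCLEOTIDE_TAGS = (289.0, 415.9) # lightest and heaviest nucleotide mass tags
CLEAVABLE_TAGS = (532.0, 668.0)  # lightest and heaviest cleavable mass tags

# Previous strategy: one spacer plus one nucleotide tag per cleavable site.
old_min = L2_SPACER + NUCLEOTIDE_TAGS[0]
old_max = L2_SPACER + NUCLEOTIDE_TAGS[1]

# Extremes of the saving: lightest old pair vs heaviest new tag, and the reverse.
smallest_saving = old_min - CLEAVABLE_TAGS[1]
largest_saving = old_max - CLEAVABLE_TAGS[0]

print(f"previous strategy: {old_min:.1f} to {old_max:.1f} Da per site")
print(f"mass reduction: {smallest_saving:.1f} to {largest_saving:.1f} Da per site")
```

The lower bound of the saving pairs the lightest previous combination with the heaviest new tag, and the upper bound pairs the heaviest previous combination with the lightest new tag, which is how the range of 4 to 266.9 Da is obtained.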

Acknowledgments

The authors thank the French National Research Agency for financial support (Project shapeNread, grant numbers ANR-19-CE29-0015-01 and ANR-19-CE29-0015-02). G.O. thanks the CSC Graduate School funded by the French National Research Agency (CSC-IGS ANR-17-EURE-0016) for a master fellowship. For the polymer synthesis part, this research work has been started at the Institut Charles Sadron (ICS) and finalized at the Institut de Science et d'Ingénierie Supramoléculaires (ISIS). The authors also thank Hava Aksoy and Cyril Antheaume (ISIS) for preliminary MS characterization.

Conflict of interest

The authors declare no conflict of interest.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.