Searching the Chemical Space of Bicyclic Dienes for Molecular Solar Thermal Energy Storage Candidates
Abstract
Photoswitches are molecular systems that are chemically transformed upon interaction with light, and they find potential application in many new technologies. The design and discovery of photoswitch candidates require intricate molecular engineering of a range of properties to optimize a candidate for a specific application, a task that can be tackled efficiently using quantum chemical screening procedures. In this paper, we perform a large-scale screening of approximately half a million bicyclic diene photoswitches in the context of molecular solar thermal energy storage using ab initio quantum chemical methods. We further devise an efficient strategy for scoring the systems based on their predicted solar energy conversion efficiency and elucidate potential pitfalls of this approach. Our search through the chemical space of bicyclic dienes reveals systems with unprecedented solar energy conversion efficiencies and storage densities that suggest promising design guidelines for next-generation molecular solar thermal energy storage systems.
Introduction
Molecules that can undergo chemical transformations and change their properties in response to light, so-called photoswitches, have been researched heavily in a wide variety of applications.1 The topics in which photoswitches have been investigated include molecular,2-5 supramolecular,6 and materials applications,7, 8 and the strength of applying photoswitches lies in the chemical control that is achieved when the properties of a material can be modulated with light. On a molecular level, photoswitches have, e.g., been used to incorporate on/off-switches into drugs,5, 9, 10 to activate molecular pumps that can drive systems out of equilibrium,11 and in molecular solar thermal energy storage.12, 13 Meanwhile, in materials, photoswitches can be incorporated as a light-responsive component that can, e.g., be used to encode information14 or as molecular transistors whose resistance can be varied by photoswitching.15, 16
A key challenge in the design and discovery of photoswitches is that there is no single photoswitch suitable for all applications. The required molecular properties vary between applications: information transfer, e.g., requires very short thermal half-lives of the photoproduct,14 while energy storage applications require long thermal half-lives.12, 13 Furthermore, solar energy storage applications require large energy differences between the different isomers of the molecular photoswitch and a spectral overlap with the irradiation of the sun. For solar energy storage applications, many different photoswitch moieties have been studied, including azobenzenes,1, 17-20 diaryl ethenes,21, 22 spiropyrans,23, 24 dihydroazulenes,25-27 bicyclooctadienes,28, 29 norbornadienes30, 31 and more.12, 13, 32 Considering the various substitution patterns that are possible for each of these, it is thus key to efficiently determine the properties of a large number of molecular photoswitch candidates to computationally evaluate their applicability in a given technology and to guide synthesis and experimental testing towards promising systems.
In the past, the usual workflow for computational evaluation of molecular photoswitches has been to predict molecular properties using quantum chemical methods such as density functional theory (DFT).33, 34 Yet, for the enormous number of possible photoswitch candidates, an exhaustive search using DFT is currently infeasible. To circumvent this bottleneck, recent studies have employed machine learning and genetic algorithms to accelerate the prediction of properties and chemical reactions such that enormous chemical spaces of photoswitch candidates can be explored rapidly to identify promising candidates.35-50 One caveat of machine learning approaches is the amount of effort required to establish a training database, to train and test the model, and to circumvent ambiguities arising from incompleteness of the training set. To avoid these problems, we recently presented an ab initio screening procedure based on extended tight binding (xTB) methods51 for norbornadiene/quadricyclane (NBD/QC) photoswitches in the context of molecular solar thermal energy storage (MOST),52 which we will expand and utilize in this study. For MOST applications, as depicted in Figure 1, the relevant quantities are the optical absorption of both the parent form and the photoproduct, the energy difference between these two isomers, which represents the storage energy, and the thermal back reaction barrier, all of which can be efficiently predicted using xTB methods.

Figure 1. Schematic of a molecular photoswitch in MOST applications.
Here, our focus is on performing an exhaustive search for MOST candidates in the chemical space of bicyclic dienes depicted in Figure 2. The chemical space consists of a series of bicyclic dienes including donor-acceptor type substitution patterns, which has previously shown great promise in NBD/QC derivatives.30, 31 The chemical space is constructed by generating all possible combinations of bridging units X and substitution patterns on positions A–D on the bicyclic diene core with the listed substituents and phenyl substituents generated from positions A1–5 in the space of substituents with a restriction of maximum three nonhydrogenic substituents on the phenyl ring. Furthermore, all compounds that are identical due to symmetry and rotation of the phenyl group are only included once to avoid performing redundant calculations. This check is done by comparing canonical SMILES strings of the generated compounds. The resulting chemical space consists of approximately 466,000 bicyclic dienes that we have screened for their potential applicability in MOST technology. The choice of substituents in Figure 2 and the associated neglect of other substituents are based on several considerations. Firstly, synthetic feasibility is taken into account by only considering substituents that have previously been used in the synthesis of novel NBD/QC derivatives.

Figure 2. The chemical space of bicyclic dienes that is searched for MOST candidates: (Top-left) The donor-acceptor substituted bicyclic diene core with bridging unit X and substituent positions A–D. (Bottom-left) Phenyl group with substituent positions A1–5. (Middle) Substituents in the chemical space. (Right) Bridging units considered in the chemical space.
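The enumeration of phenyl substitution patterns with removal of symmetry duplicates can be illustrated with a minimal sketch. It is not the procedure used in this work, which compares canonical SMILES strings of full compounds. Here a simple tuple reversal stands in for that check, the substituent labels are taken from Table 2, and `"H"` is an assumed label for an unsubstituted position. All function names are hypothetical.

```python
from itertools import product

# Assumption: substituent labels as listed in Table 2; "H" marks an empty position.
SUBSTITUENTS = ["F", "NO2", "Cl", "CN", "CHC(CN)2", "COH", "OCH2CH3", "CH3"]
MAX_NON_H = 3  # at most three non-hydrogen substituents on the phenyl ring


def canonical_pattern(pattern):
    """Return one representative for a pattern and its mirror image.

    Rotating the phenyl ring by 180 degrees about the bond to the core maps
    A1<->A5 and A2<->A4, so a pattern and its reverse describe the same compound.
    """
    return min(pattern, pattern[::-1])


def phenyl_patterns():
    """Enumerate symmetry-unique substitution patterns on positions A1-A5."""
    unique = set()
    for pattern in product(["H"] + SUBSTITUENTS, repeat=5):
        if sum(s != "H" for s in pattern) <= MAX_NON_H:
            unique.add(canonical_pattern(pattern))
    return sorted(unique)


patterns = phenyl_patterns()
print(len(patterns), "symmetry-unique phenyl patterns")
```

A canonical SMILES comparison generalizes the same idea to the whole molecule: two inputs that map to the same canonical key are treated as one compound and computed only once.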
Secondly, we have neglected substituents that only differ from those in Figure 2 by having a longer alkane chain, as the predicted thermochemical and optical properties will most likely be quite similar. One can then, e.g., swap the aldehyde for a ketone to increase stability once actual synthesis is undertaken. To predict MOST properties of the 466,282 different bicyclic dienes, we utilize the screening procedure that we recently developed and benchmarked.52 In brief, from a SMILES string representation of the bicyclic diene, the minimum energy conformations of the parent bicyclic diene and the photoproduct are generated from a systematic conformational search. Subsequently, GFN2-xTB53 is utilized to estimate the storage energy and the thermal back reaction barrier, while the optical properties of both the reactant and the photoproduct are determined using sTDA-xTB54 (see Supporting Information or Ref. [52] for more details). This setup allows efficient evaluation of the potential performance of each molecular system in a MOST application, as the results obtained display similar tendencies to those obtained using DFT and high-level coupled cluster calculations that have previously been shown to correlate with experimental results.
In addition, the screening procedure utilized here does not suffer from some of the limitations of, e.g., machine learning models, where potential pitfalls include lack of generality in the data set used for training, overfitting of the parameter space, and the time it takes to establish the machine learning algorithm. This undertaking, therefore, enables us to scrutinize whether each bicyclic diene system is a relevant MOST candidate and thus whether it is relevant for further experimental investigations. The screening procedure gave meaningful results for all of the roughly 466,000 systems, and below we analyze the results.
After generating the database for the 466,000 MOST candidates, it is a quite difficult task to sort the large data set to find the best MOST candidates by simultaneously comparing all of the predicted properties. To simplify the task of estimating the performance of a MOST candidate, different studies have introduced various ways of estimating the MOST performance of a system based on ab initio quantum chemical data. Mikkelsen and co-workers introduced a simulation framework that estimates the actual performance of a MOST candidate in a hybrid device and tested it on the dihydroazulene/vinylheptafulvene system.55 Arpa and Durbeej introduced an all-around performance descriptor that estimates the performance of dihydroazulene/vinylheptafulvene systems based on the product of the storage energy, the thermal back reaction barrier, and the spectral overlap of dihydroazulene and vinylheptafulvene in the solar spectrum.56 They consequently do not account for the actual intensity of the sunlight at a given photon energy. This aspect is, on the other hand, considered in other studies by Moth-Poulsen and co-workers57, 58 as well as Strubbe and Grossman59 where integration over the AM1.5G solar spectrum is also performed to estimate the solar energy conversion efficiency. Such models have been utilized in several studies to estimate the efficiency of MOST candidates from both experimental and ab initio quantum chemical data.58, 60, 61



with εR(ω), εP(ω), cR, and cP being the molar extinction coefficients and the molar concentrations of reactant and product, respectively. In the above, we have assumed a quantum yield of photoisomerization of 1 to obtain the upper theoretical limit of the solar conversion efficiency, and it is further assumed that no energy is lost to reorganization in the excited state subsequent to absorption. The η is thus a theoretical limit for solar energy conversion, as it assumes lossless conversion. In the following, we assume a total molar concentration of 1 M and a conversion of 50 % such that cR=cP=0.5 M. We further assume that the absorption of each molecule can be modelled by convolution of the first absorption with an oscillator strength of at least 0.01 using a Gaussian with a full width at half maximum of 0.25 eV.
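The band model described above (first transition with an oscillator strength of at least 0.01, broadened by a Gaussian with a full width at half maximum of 0.25 eV) can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the band is area-normalised and scaled by the oscillator strength, and no conversion to a molar extinction coefficient is attempted.

```python
import math

HC_EV_NM = 1239.841984  # h*c in eV*nm, converts wavelength to photon energy
FWHM_EV = 0.25          # full width at half maximum stated in the text
MIN_F = 0.01            # oscillator-strength threshold stated in the text


def first_bright_transition(transitions):
    """Return the lowest-energy (wavelength_nm, f) pair with f >= MIN_F, or None."""
    bright = [t for t in transitions if t[1] >= MIN_F]
    if not bright:
        return None
    # The longest wavelength corresponds to the lowest excitation energy.
    return max(bright, key=lambda t: t[0])


def gaussian_band(energy_ev, wavelength_nm, f):
    """Area-normalised Gaussian in energy, scaled by the oscillator strength f."""
    sigma = FWHM_EV / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    center = HC_EV_NM / wavelength_nm
    norm = f / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((energy_ev - center) / sigma) ** 2)


# Hypothetical list of computed transitions: (wavelength in nm, oscillator strength).
transitions = [(520.0, 0.002), (433.4, 0.1746), (350.0, 0.30)]
band = first_bright_transition(transitions)
print(band, gaussian_band(HC_EV_NM / band[0], *band))
```

In this example the weak transition at 520 nm is skipped because its oscillator strength falls below the threshold, so the band is centred on the 433.4 nm transition.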
Using η as our metric, we can now evaluate the predicted MOST performance of each of the 466,000 systems in the database by considering a single number. This number encompasses the ability of the system to absorb sunlight, taking competing absorption from the photoproduct into account, the capacity of the system to store solar energy through the storage energy, and the storage time through the thermal back reaction barrier. For the dataset, the largest predicted η value is 29.4 %, the mean value is 1.8 %, and the median value is 1.2 %. This shows that, although most of the studied systems have relatively low η values, there are systems with very large η values that could provide great MOST performance. However, there are some potential pitfalls that must be avoided before blindly using η as a MOST performance metric. It is necessary to strictly enforce the requirement that the excitation energy must be larger than the sum of the storage energy and the thermal back reaction barrier to ensure that sufficient energy is present to facilitate the storage cycle displayed in Figure 1. This is fortunately built into Eq. (1) via the energy cutoff in the integration that removes photons with insufficient energy. From the screening database, we note that approximately 9000 of the 466,000 systems have a vertical excitation energy of the reactant that is lower than the sum of the storage energy and the thermal back reaction barrier, which would lead to a vanishing η value. It is also necessary to investigate whether the η value favours specific properties without considering other properties adequately.
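The energy requirement stated above can be checked per system with a short sketch. The function names are hypothetical; the only physics used is the conversion of a wavelength in nm to a molar photon energy in kJ/mol. The first example uses the screening data for system 1 in Table 1, and the second uses a made-up absorption wavelength.

```python
HC_NA_KJ_NM = 119626.566  # h*c*N_A in kJ*nm/mol


def excitation_energy_kj_mol(wavelength_nm):
    """Convert an absorption wavelength in nm to a molar photon energy in kJ/mol."""
    return HC_NA_KJ_NM / wavelength_nm


def can_drive_cycle(wavelength_nm, storage_kj_mol, barrier_kj_mol):
    """True if the excitation energy exceeds storage energy plus back-reaction barrier."""
    return excitation_energy_kj_mol(wavelength_nm) > storage_kj_mol + barrier_kj_mol


# Screening data for system 1 (Table 1): 433.4 nm, 80.3797 and 123.055 kJ/mol.
print(can_drive_cycle(433.4, 80.3797, 123.055))
# Hypothetical system with the same energies but absorption at 650 nm.
print(can_drive_cycle(650.0, 80.3797, 123.055))
```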
To that end, we display η as a function of the different database parameters in Figure 3. From Figure 3, a multitude of trends are observed. Initially, we note that η appears to be strongly correlated with the storage energy, the thermal back reaction barrier, and the absorbance of both the reactant bicyclic diene as well as the photoproduct, which makes sense when the dependence of the η on these parameters is taken into account. Meanwhile, the strength of the predicted absorption does not appear to be critical to the predicted performance. The largest predicted η values of around 29.4 % are obtained for systems with a storage energy above 100 kJ/mol and absorption that is well above 400 nm. However, these systems all have thermal back reaction barriers of 100 kJ/mol or less. Coupling this observation with the tendency that xTB methods and other single reference methods such as DFT generally overestimate the thermal back reaction barrier of bicyclic dienes,29, 52 this corresponds to storage times of a few minutes at most. For those reasons, we have to introduce a lower bound on the thermal back reaction barrier of 120 kJ/mol to only consider systems that have storage times that have been shown to be relevant for MOST applications.12, 13 Apart from this, we do not find any other restrictions that need to be enforced when η is used as a metric for scoring the systems in the database.
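The link between a barrier and a storage time can be sketched with the Eyring equation. This is an assumption made for illustration, not the procedure of this work: the barrier is treated as a free energy of activation, the transmission coefficient is set to 1, and the temperature is 298.15 K. Because the half-life depends exponentially on the barrier, the absolute numbers change strongly with any correction applied to the computed barrier, so the sketch is mainly useful for comparing barriers.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 8.314462618      # gas constant, J/(mol*K)


def half_life_s(barrier_kj_mol, temperature_k=298.15):
    """First-order half-life in seconds from an Eyring rate constant.

    Assumes the barrier can be used as a free energy of activation.
    """
    rate = (K_B * temperature_k / H) * math.exp(
        -barrier_kj_mol * 1e3 / (R * temperature_k)
    )
    return math.log(2.0) / rate


for barrier in (100.0, 120.0):
    print(barrier, "kJ/mol ->", half_life_s(barrier), "s")
```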

Figure 3. Heatmaps of the predicted solar conversion efficiency, η, with respect to the different predicted MOST properties.
After discarding the systems with a predicted thermal back reaction barrier below 120 kJ/mol, the six systems with the highest η values are those shown in Figure 4. The screening data as well as results from M06-2X62/def2-SVPD63 calculations on the screening geometries in Orca64 for these systems are shown in Table 1 (see Supporting Information for data on the 20 best systems). It is noticeable that none of the six best systems is an NBD/QC system, while all six feature bridging units with a three-membered ring containing either a CH2 group or an amine at the terminal position. Furthermore, 18 of the 20 best performing systems have a modified bridging unit, while only two of them are NBD/QC systems. In line with previous studies, this observation indicates that improved MOST performance can possibly be obtained by modifying the bridging unit of NBD/QC. From the data in Table 1, it is also clear that systems 1–6 with the modified bridging unit gain their high efficiency by having a significantly higher storage energy than corresponding NBD/QC systems. In addition, systems 1–6 have a donor/acceptor type substitution pattern and significantly redshifted absorption compared to the unsubstituted version of the systems.
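The selection step described above (discard barriers below 120 kJ/mol, then rank by η) can be sketched as follows. The record layout and function name are hypothetical. The first three records use screening values from Table 1, and the last record is an invented system with a high η but a low barrier, added only to show the filter.

```python
MIN_BARRIER = 120.0  # kJ/mol, lower bound on the thermal back reaction barrier


def top_candidates(records, n=6):
    """Keep systems with a sufficient barrier and return the n highest-eta ones."""
    kept = [r for r in records if r["barrier"] >= MIN_BARRIER]
    return sorted(kept, key=lambda r: r["eta"], reverse=True)[:n]


records = [
    {"system": "2", "storage": 77.7282, "barrier": 120.693, "eta": 10.05},
    {"system": "5", "storage": 87.2355, "barrier": 122.892, "eta": 9.82},
    {"system": "1", "storage": 80.3797, "barrier": 123.055, "eta": 10.11},
    # Hypothetical system: high eta, but the barrier is too low to be kept.
    {"system": "hypothetical", "storage": 110.0, "barrier": 95.0, "eta": 29.4},
]

for rec in top_candidates(records):
    print(rec["system"], rec["eta"])
```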

Figure 4. Molecular structures of the six bicyclic dienes with the largest predicted η.
System | ΔES (kJ/mol) | ΔETBR (kJ/mol) | λR (nm) | fR | λP (nm) | fP | η |
---|---|---|---|---|---|---|---|
Screening Data | | | | | | | |
1 | 80.3797 | 123.055 | 433.4 | 0.1746 | 371.5 | 0.0786 | 10.11 % |
2 | 77.7282 | 120.693 | 434.5 | 0.0653 | 431.8 | 0.0632 | 10.05 % |
3 | 79.9903 | 122.670 | 431.6 | 0.0682 | 391.3 | 0.0221 | 9.91 % |
4 | 86.0069 | 122.001 | 421.4 | 0.1001 | 400.4 | 0.1538 | 9.83 % |
5 | 87.2355 | 122.892 | 417.5 | 0.2280 | 374.3 | 0.0267 | 9.82 % |
6 | 87.0140 | 123.366 | 434.1 | 0.0382 | 329.9 | 0.0675 | 9.62 % |
M06-2X/def2-SVPD Data | | | | | | | |
1 | 88.8463 | 151.276 | 406.4 | 0.0716 | 360.2 | 0.0075 | 4.82 % (7.22 %) |
2 | 98.3211 | 166.203 | 407.1 | 0.0808 | 370.4 | 0.1072 | 2.07 % (6.97 %) |
3 | 92.9671 | 158.398 | 430.6 | 0.0348 | 374.0 | 0.0196 | 3.14 % (8.43 %) |
4 | 86.0162 | 157.565 | 370.0 | 0.0396 | 363.8 | 0.0369 | 2.59 % (2.61 %) |
5 | 104.689 | 148.553 | 398.7 | 0.1302 | 356.6 | 0.0059 | 4.13 % (8.01 %) |
6 | 93.0832 | 169.812 | 423.2 | 0.0560 | 374.3 | 0.0150 | 2.30 % (7.30 %) |
Another relevant observation is that the -CH=C(CN)2 substituent is present in all of the top six systems and more interestingly attached directly to the bicyclic diene core in five of the six cases. The calculations indicate that the introduction of the -CH=C(CN)2 substituent provides significant redshifts in the absorption of the bicyclic diene. This promotes higher η values through a greater spectral overlap with the irradiation from the sun. Also, it appears that this substituent can be incorporated into the bicyclic diene structure without significantly compromising the storage capacity or the storage time which makes it a very appealing substituent to incorporate in future synthesis.
To validate the screening data, we performed single point M06-2X/def2-SVPD calculations on the screening geometries in Orca. By comparison of the data in Table 1, we see that the η values decrease quite dramatically going from the xTB screening data to DFT data. This mainly stems from two factors: the wavelength of absorption is overestimated by sTDA-xTB, as was also seen in Ref. [52], and the thermal back reaction barrier increases dramatically for DFT. However, the thermal back reaction barrier predicted by both the screening procedure and DFT is too large, as the transition states of these compounds have been shown to be of multi-reference character.29 The barriers will therefore be overestimated using DFT. Furthermore, the transition state is not relaxed to a saddle point, which means that both the screening and DFT barriers are upper limits. We therefore also report, in parentheses in Table 1, η values for the DFT data with a ΔETBR value scaled by 0.75 to give a better estimate of the η value. Apart from compound 4, the identified systems still retain large η values that are unprecedented for experimentally studied bicyclic dienes when DFT data is used rather than our screening data, which indicates that the screening does capture potentially interesting systems.
Overall, our search finds compound 1 to be the most promising MOST compound. The high η value for this compound likely stems from promising properties of the -CH=C(CN)2 substituent in the donor position as well as the modified bridging unit that provides a high storage energy, as the photoproduct most likely has a significant amount of ring-strain. Meanwhile, including a substituent in the ortho-position on the phenyl group has also previously been shown to increase the storage time of bicyclic dienes.28
Another point to discuss is whether it is even realistic to prepare these systems synthetically. We note that many of the systems with high η values, not only those in Figure 4, also have heavily substituted phenyl groups. This is going to be a challenge in actual synthesis of the proposed systems, as aromatic rings are inherently difficult to substitute multiple times, especially in adjacent positions with various substituents such as nitro groups. Furthermore, the aldehyde -COH and the -CH=C(CN)2 substituents are not necessarily sufficiently stable for synthesis or usage in actual MOST technology. This means that if the systems are synthesized, these substituents should most likely be changed to more stable versions. Even if the systems can be synthetically prepared, there is no guarantee that they are soluble in relevant solvents or that they will actually photoswitch in high yield or at all, as we have assumed in η. These aspects must be considered given that very few studies28 exist on the photoconversion of bicyclic dienes with modified bridging units relative to that of NBD/QC, and it cannot be assumed that these will photoswitch in high quantum yields.
To give a more general overview of the effects of different structural features, Table 2 summarizes the mean value of all the calculated properties for systems in which different bridging units and substituents are present, so that trends rather than single systems can be analyzed. Focusing on the effects of bridging units, it is initially very noticeable that the mean predicted storage energy is significantly smaller for NBD/QC systems than for all other types of compounds. Further, it is by far the largest for the bridging units that feature an additional carbon atom, while the storage energy of systems with a three-membered ring in the bridge attains intermediate values. Meanwhile, the thermal back reaction barrier is smaller and the excitation energies of the reactants are larger for all of the modified bridging units, showing that a destabilization of the photoproduct leads to a lower barrier and a blueshift of the absorption. Despite the blueshifted absorption and higher overlap of reactant and product absorption, the modified bridging units show higher η values on average, which can most likely be attributed to the increased storage energy. Lastly, the strength of the transitions does not appear to change with bridgehead modifications.
Structural feature | ΔES (kJ/mol) | ΔETBR (kJ/mol) | λR (nm) | fR | λP (nm) | fP | η |
---|---|---|---|---|---|---|---|
Bridging Units | | | | | | | |
-CH2- | 24.8 | 194.9 | 341.2 | 0.15 | 318.1 | 0.16 | 0.89 |
-CH2-CH2- | 115.7 | 162.2 | 323.8 | 0.14 | 330.4 | 0.16 | 1.93 |
-CH(CH3)-CH2- | 115.0 | 162.7 | 326.0 | 0.14 | 331.2 | 0.16 | 2.01 |
-CH(CH2CH3)-CH2- | 115.8 | 163.4 | 325.6 | 0.14 | 331.7 | 0.16 | 1.99 |
-CH(CH2)CH- | 89.3 | 173.0 | 328.6 | 0.14 | 326.3 | 0.16 | 1.90 |
-CH(CH3)-CH(CH3)- | 114.2 | 163.4 | 327.6 | 0.14 | 332.3 | 0.16 | 2.04 |
-CH(NH)CH- | 93.0 | 182.0 | 325.2 | 0.14 | 319.6 | 0.15 | 1.74 |
-CH(O)CH- | 95.0 | 185.3 | 322.8 | 0.13 | 317.0 | 0.14 | 1.61 |
Substituents | | | | | | | |
-F | 100.9 | 175.7 | 318.9 | 0.15 | 318.4 | 0.16 | 1.47 |
-NO2 | 97.6 | 153.7 | 341.9 | 0.11 | 349.6 | 0.15 | 2.43 |
-Cl | 101.2 | 177.1 | 315.7 | 0.14 | 314.0 | 0.15 | 1.36 |
-CN | 97.9 | 171.3 | 322.6 | 0.14 | 323.2 | 0.15 | 1.62 |
-CHC(CN)2 | 98.4 | 160.4 | 350.8 | 0.19 | 349.8 | 0.22 | 2.60 |
-COH | 96.7 | 162.8 | 332.7 | 0.11 | 333.9 | 0.13 | 2.00 |
-OCH2CH3 | 101.3 | 172.0 | 335.1 | 0.14 | 329.3 | 0.15 | 2.06 |
-CH3 | 99.3 | 170.2 | 327.3 | 0.14 | 327.0 | 0.16 | 1.81 |
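The per-feature averages in Table 2 can be illustrated with a small sketch that groups systems by the structural features they contain and averages one property per group. The record layout, the function name, and all values are hypothetical. A system that carries several different substituents contributes to each of the corresponding groups, and a substituent that occurs twice in one system is counted once.

```python
from collections import defaultdict
from statistics import mean


def feature_means(records, feature_key, value_key):
    """Average value_key over all systems in which each feature is present."""
    groups = defaultdict(list)
    for rec in records:
        # set() ensures a feature occurring twice in one system counts once.
        for feature in set(rec[feature_key]):
            groups[feature].append(rec[value_key])
    return {feature: mean(values) for feature, values in groups.items()}


# Hypothetical records; the eta values are made up for illustration only.
records = [
    {"bridge": ["-CH2-"], "substituents": ["-CN", "-CH3"], "eta": 1.0},
    {"bridge": ["-CH2-CH2-"], "substituents": ["-CN"], "eta": 2.0},
    {"bridge": ["-CH2-CH2-"], "substituents": ["-NO2", "-NO2"], "eta": 3.0},
]

print(feature_means(records, "bridge", "eta"))
print(feature_means(records, "substituents", "eta"))
```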
Turning attention to different substituents, we see that the main difference arises in the mean value of the predicted excitation energies. The excitation energies are significantly lower for systems in which the -CHC(CN)2 or -NO2 substituents are present for both the reactant and product forms. Furthermore, the -CHC(CN)2 also leads to a higher average oscillator strength and the increased absorption in the solar spectrum leads to higher η values for systems with the -CHC(CN)2 or -NO2 substituents. Consequently, our search does not necessarily reveal the optimal MOST candidates in an experimental setting, but it gives some clear indications that modified bridging units and new substituents that have not been considered in great detail previously can provide properties that are desirable in a MOST setup. Our results indicate that systems featuring a three-membered ring in the bridging unit and the -CHC(CN)2 substituent in the A position show great promise.
In conclusion, we have established and analyzed a screening database for 466,000 bicyclic diene systems for molecular solar thermal energy storage, which features molecular systems with unprecedented solar energy conversion efficiencies. The best systems reach predicted solar energy conversion efficiencies of over 10 % while having storage times that are long enough to be useful in actual energy storage applications. This is close to the estimated theoretical limit of 10.6 % for a system with a thermal back reaction barrier of 120 kJ/mol as outlined in previous studies.57, 58 The high solar energy conversion efficiencies are mainly found for systems that feature a modified bridging unit compared to that of the well-studied NBD/QC system, and the -CH=C(CN)2 substituent, which has rarely been used in NBD/QC systems, is featured in many of the top performing systems. Our work thus provides valuable new insights into the possible modifications of the NBD/QC moiety that can lead to increased performance in energy storage applications and provides candidates for next-generation MOST systems, which are relevant in the transition to renewable energy sources. Furthermore, our established database could be used as a training set for machine learning algorithms that would allow us to search a much larger chemical space of bicyclic dienes.
Acknowledgments
Financial support is acknowledged from the European Commission (Grant No. 765739), and the Danish Council for Independent Research, DFF-0136-00081B.
Conflict of interest
There are no conflicts to declare.
Open Research
Data Availability Statement
The data that support the findings of this study are available in the supplementary material of this article.